UTF-8
UTF-8 is a character encoding capable of encoding all possible characters defined by Unicode. It was originally designed by Ken Thompson and Rob Pike.[1]
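To make the idea concrete, the following minimal sketch encodes a few characters with Python's built-in `str.encode` and prints the resulting bytes. The sample characters and the helper name `utf8_hex` are chosen here for illustration only and do not come from the article.

```python
# Sample characters are illustrative assumptions, not taken from the article.
samples = ["A", "é", "€", "😀"]


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of a string as space-separated hex pairs."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


for ch in samples:
    # Show the code point next to its encoded byte sequence.
    print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")
```

The four samples need one, two, three and four bytes respectively, which shows that the encoded length grows with the code point value.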
Code page layout
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0_ | NUL 0000 0 | SOH 0001 1 | STX 0002 2 | ETX 0003 3 | EOT 0004 4 | ENQ 0005 5 | ACK 0006 6 | BEL 0007 7 | BS 0008 8 | HT 0009 9 | LF 000A 10 | VT 000B 11 | FF 000C 12 | CR 000D 13 | SO 000E 14 | SI 000F 15 |
| 1_ | DLE 0010 16 | DC1 0011 17 | DC2 0012 18 | DC3 0013 19 | DC4 0014 20 | NAK 0015 21 | SYN 0016 22 | ETB 0017 23 | CAN 0018 24 | EM 0019 25 | SUB 001A 26 | ESC 001B 27 | FS 001C 28 | GS 001D 29 | RS 001E 30 | US 001F 31 |
| 2_ | SP 0020 32 | ! 0021 33 | " 0022 34 | # 0023 35 | $ 0024 36 | % 0025 37 | & 0026 38 | ' 0027 39 | ( 0028 40 | ) 0029 41 | * 002A 42 | + 002B 43 | , 002C 44 | - 002D 45 | . 002E 46 | / 002F 47 |
| 3_ | 0 0030 48 | 1 0031 49 | 2 0032 50 | 3 0033 51 | 4 0034 52 | 5 0035 53 | 6 0036 54 | 7 0037 55 | 8 0038 56 | 9 0039 57 | : 003A 58 | ; 003B 59 | < 003C 60 | = 003D 61 | > 003E 62 | ? 003F 63 |
| 4_ | @ 0040 64 | A 0041 65 | B 0042 66 | C 0043 67 | D 0044 68 | E 0045 69 | F 0046 70 | G 0047 71 | H 0048 72 | I 0049 73 | J 004A 74 | K 004B 75 | L 004C 76 | M 004D 77 | N 004E 78 | O 004F 79 |
| 5_ | P 0050 80 | Q 0051 81 | R 0052 82 | S 0053 83 | T 0054 84 | U 0055 85 | V 0056 86 | W 0057 87 | X 0058 88 | Y 0059 89 | Z 005A 90 | [ 005B 91 | \ 005C 92 | ] 005D 93 | ^ 005E 94 | _ 005F 95 |
| 6_ | ` 0060 96 | a 0061 97 | b 0062 98 | c 0063 99 | d 0064 100 | e 0065 101 | f 0066 102 | g 0067 103 | h 0068 104 | i 0069 105 | j 006A 106 | k 006B 107 | l 006C 108 | m 006D 109 | n 006E 110 | o 006F 111 |
| 7_ | p 0070 112 | q 0071 113 | r 0072 114 | s 0073 115 | t 0074 116 | u 0075 117 | v 0076 118 | w 0077 119 | x 0078 120 | y 0079 121 | z 007A 122 | { 007B 123 | \| 007C 124 | } 007D 125 | ~ 007E 126 | DEL 007F 127 |
| 8_ | • +00 128 | • +01 129 | • +02 130 | • +03 131 | • +04 132 | • +05 133 | • +06 134 | • +07 135 | • +08 136 | • +09 137 | • +0A 138 | • +0B 139 | • +0C 140 | • +0D 141 | • +0E 142 | • +0F 143 |
| 9_ | • +10 144 | • +11 145 | • +12 146 | • +13 147 | • +14 148 | • +15 149 | • +16 150 | • +17 151 | • +18 152 | • +19 153 | • +1A 154 | • +1B 155 | • +1C 156 | • +1D 157 | • +1E 158 | • +1F 159 |
| A_ | • +20 160 | • +21 161 | • +22 162 | • +23 163 | • +24 164 | • +25 165 | • +26 166 | • +27 167 | • +28 168 | • +29 169 | • +2A 170 | • +2B 171 | • +2C 172 | • +2D 173 | • +2E 174 | • +2F 175 |
| B_ | • +30 176 | • +31 177 | • +32 178 | • +33 179 | • +34 180 | • +35 181 | • +36 182 | • +37 183 | • +38 184 | • +39 185 | • +3A 186 | • +3B 187 | • +3C 188 | • +3D 189 | • +3E 190 | • +3F 191 |
| 2-byte C_ | 0000 192 | 0040 193 | Latin 0080 194 | Latin 00C0 195 | Latin 0100 196 | Latin 0140 197 | Latin 0180 198 | Latin 01C0 199 | Latin 0200 200 | IPA 0240 201 | IPA 0280 202 | IPA 02C0 203 | accents 0300 204 | accents 0340 205 | Greek 0380 206 | Greek 03C0 207 |
| 2-byte D_ | Cyril 0400 208 | Cyril 0440 209 | Cyril 0480 210 | Cyril 04C0 211 | Cyril 0500 212 | Armeni 0540 213 | Hebrew 0580 214 | Hebrew 05C0 215 | Arabic 0600 216 | Arabic 0640 217 | Arabic 0680 218 | Arabic 06C0 219 | Syriac 0700 220 | Arabic 0740 221 | Thaana 0780 222 | N'Ko 07C0 223 |
| 3-byte E_ | Indic 0800* 224 | Misc. 1000 225 | Symbol 2000 226 | Kana, CJK 3000 227 | CJK 4000 228 | CJK 5000 229 | CJK 6000 230 | CJK 7000 231 | CJK 8000 232 | CJK 9000 233 | Asian A000 234 | Hangul B000 235 | Hangul C000 236 | Hangul D000 237 | PUA E000 238 | Forms F000 239 |
| 4-byte F_ | SMP, SIP 10000* 240 | 40000 241 | 80000 242 | SSP, SPUA C0000 243 | SPUA-B 100000 244 | 140000 245 | 180000 246 | 1C0000 247 | 5-byte 200000* 248 | 5-byte 1000000 249 | 5-byte 2000000 250 | 5-byte 3000000 251 | 6-byte 4000000* 252 | 6-byte 40000000 253 | 254 | 255 |
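The layout in the table above can be read as a set of byte ranges: rows 0_ to 7_ are single-byte characters, rows 8_ to B_ are continuation bytes, and rows C_ to F_ are lead bytes of multi-byte sequences. The sketch below encodes that reading in Python. The names `RANGES`, `classify_byte` and `first_code_point` are hypothetical helpers introduced here, and the code follows the table as shown, including the 5-byte and 6-byte lead values, which RFC 3629 (listed under the external links) no longer permits.

```python
# (last byte value in the range, kind, sequence length), in table order.
# The ranges are read off the code page layout table; names are illustrative.
RANGES = [
    (0x7F, "single byte (ASCII)", 1),  # rows 0_ to 7_
    (0xBF, "continuation byte", 0),    # rows 8_ to B_
    (0xDF, "lead byte", 2),            # rows C_ and D_
    (0xEF, "lead byte", 3),            # row E_
    (0xF7, "lead byte", 4),            # F0 to F7
    (0xFB, "lead byte", 5),            # F8 to FB (not valid under RFC 3629)
    (0xFD, "lead byte", 6),            # FC to FD (not valid under RFC 3629)
    (0xFF, "unused", 0),               # FE and FF
]


def classify_byte(b):
    """Return (kind, sequence_length) for a byte value 0..255."""
    if not 0 <= b <= 0xFF:
        raise ValueError("byte value must be in 0..255")
    for upper, kind, length in RANGES:
        if b <= upper:
            return kind, length


def first_code_point(lead):
    """Return the lowest code point whose encoding could start with `lead`,
    ignoring the shortest-form rule (all continuation payload bits zero)."""
    kind, length = classify_byte(lead)
    if kind != "lead byte":
        raise ValueError("not a lead byte")
    payload_bits = 7 - length                    # bits left after the length prefix
    payload = lead & ((1 << payload_bits) - 1)   # strip the prefix bits
    return payload << (6 * (length - 1))         # each continuation byte adds 6 bits
```

For example, `first_code_point(0xD8)` returns `0x600`, which matches the cell "Arabic 0600 216". For the starred cells the raw computation gives a lower value than the table; the stars presumably mark lead bytes whose lowest code points would otherwise be encoded in an overlong form.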
References
- [1] Email Subject: UTF-8 history, From: "Rob 'Commander' Pike", Date: Wed, 30 Apr 2003..., ...UTF-8 was designed, in front of my eyes, on a placemat in a New Jersey diner one night in September or so 1992...So that night Ken wrote packing and unpacking code and I started tearing into the C and graphics libraries. The next day all the code was done...
External links
- RFC 3629 / STD 63 (2003), which establishes UTF-8 as a standard Internet protocol element
- The Unicode Standard, Version 9.0, §3.9 D92, §3.10 D95 (2016)
- ISO/IEC 10646:2014 §9.1
They supersede the definitions given in the following obsolete works:
- ISO/IEC 10646-1:1993 Amendment 2 / Annex R (1996)
- The Unicode Standard, Version 6.0, §3.9 D92, §3.10 D95 (2010)
- The Unicode Standard, Version 5.0, §3.9–§3.10 (2006)
- The Unicode Standard, Version 2.0, Appendix A (1996)
- RFC 2044 (1996)
- RFC 2279 (1998)
- The Unicode Standard, Version 3.0, §2.3 (2000) plus Corrigendum #1: UTF-8 Shortest Form (2000)
- Unicode Standard Annex #27: Unicode 3.1 (2001)
- Original UTF-8 paper (or pdf) for Plan 9 from Bell Labs
- RFC 5198 defines UTF-8 NFC for Network Interchange
- UTF-8 test pages by Andreas Prilop (archived via user.uni-hannover.de), Jost Gippert and the World Wide Web Consortium
- Unix/Linux: UTF-8/Unicode FAQ, Linux Unicode HOWTO, UTF-8 and Gentoo
- The Unicode/UTF-8-character table displays UTF-8 in a variety of formats (with Unicode and HTML encoding information)
- Characters, Symbols and the Unicode Miracle – Computerphile on YouTube