HTML Character Sets List
April 22, 2023

Basic Character Sets List
Extended Character Sets List
Locale ID LCID List
Reference List between HTTP_ACCEPT_LANGUAGE, Locale ID (LCID) and Language

Character Sets List (charset=)

Some Basic Ones

charset=big5 – Chinese Traditional (Big5)
charset=euc-kr – Korean (EUC)
charset=iso-8859-1 – Western Alphabet
charset=iso-8859-2 – Central European Alphabet (ISO)
charset=iso-8859-3 – Latin 3 Alphabet (ISO)
charset=iso-8859-4 – Baltic Alphabet (ISO)
charset=iso-8859-5 – Cyrillic Alphabet (ISO)
charset=iso-8859-6 – Arabic Alphabet (ISO)
charset=iso-8859-7 – Greek Alphabet (ISO)
charset=iso-8859-8 – Hebrew Alphabet (ISO)
charset=koi8-r – Cyrillic Alphabet (KOI8-R)
charset=shift_jis – Japanese (Shift-JIS)
charset=x-euc – Japanese (EUC)
charset=utf-8 – Universal Alphabet (UTF-8)
charset=windows-1250 – Central European Alphabet (Windows)
charset=windows-1251 – Cyrillic Alphabet (Windows)
charset=windows-1252 – Western Alphabet (Windows)
charset=windows-1253 – Greek Alphabet (Windows)
charset=windows-1254 – Turkish Alphabet
charset=windows-1255 – Hebrew Alphabet (Windows)
charset=windows-1256 – Arabic Alphabet (Windows)
charset=windows-1257 – Baltic Alphabet (Windows)
charset=windows-1258 – Vietnamese Alphabet (Windows)
charset=windows-874 – Thai (Windows)

A Longer List

Arabic (ASMO 708) – charset=ASMO-708
Arabic (DOS) – charset=DOS-720
Arabic (ISO) – charset=iso-8859-6
Arabic (Mac) – charset=x-mac-arabic
Arabic (Windows) – charset=windows-1256
Baltic (DOS) – charset=ibm775
Baltic (ISO) – charset=iso-8859-4
Baltic (Windows) – charset=windows-1257
Central European (DOS) – charset=ibm852
Central European (ISO) – charset=iso-8859-2
Central European (Mac) – charset=x-mac-ce
Central European (Windows) – charset=windows-1250
Chinese Simplified (EUC) – charset=EUC-CN
Chinese Simplified (GB2312) – charset=gb2312
Chinese Simplified (HZ) – charset=hz-gb-2312
Chinese Simplified (Mac) – charset=x-mac-chinesesimp
Chinese Traditional (Big5) – charset=big5
Chinese Traditional (CNS) – charset=x-Chinese-CNS
Chinese Traditional (Eten) – charset=x-Chinese-Eten
Chinese Traditional (Mac) – charset=x-mac-chinesetrad – charset=950
Cyrillic (DOS) – charset=cp866
Cyrillic (ISO) – charset=iso-8859-5
Cyrillic (KOI8-R) – charset=koi8-r
Cyrillic (KOI8-U) – charset=koi8-u
Cyrillic (Mac) – charset=x-mac-cyrillic
Cyrillic (Windows) – charset=windows-1251
Europa – charset=x-Europa
German (IA5) – charset=x-IA5-German
Greek (DOS) – charset=ibm737
Greek (ISO) – charset=iso-8859-7
Greek (Mac) – charset=x-mac-greek
Greek (Windows) – charset=windows-1253
Greek, Modern (DOS) – charset=ibm869
Hebrew (DOS) – charset=DOS-862
Hebrew (ISO-Logical) – charset=iso-8859-8-i
Hebrew (ISO-Visual) – charset=iso-8859-8
Hebrew (Mac) – charset=x-mac-hebrew
Hebrew (Windows) – charset=windows-1255
IBM EBCDIC (Arabic) – charset=x-EBCDIC-Arabic
IBM EBCDIC (Cyrillic Russian) – charset=x-EBCDIC-CyrillicRussian
IBM EBCDIC (Cyrillic Serbian-Bulgarian) – charset=x-EBCDIC-CyrillicSerbianBulgarian
IBM EBCDIC (Denmark-Norway) – charset=x-EBCDIC-DenmarkNorway
IBM EBCDIC (Denmark-Norway-Euro) – charset=x-ebcdic-denmarknorway-euro
IBM EBCDIC (Finland-Sweden) – charset=x-EBCDIC-FinlandSweden
IBM EBCDIC (Finland-Sweden-Euro) – charset=x-ebcdic-finlandsweden-euro
IBM EBCDIC (France-Euro) – charset=x-ebcdic-france-euro
IBM EBCDIC (Germany) – charset=x-EBCDIC-Germany
IBM EBCDIC (Germany-Euro) – charset=x-ebcdic-germany-euro
IBM EBCDIC (Greek Modern) – charset=x-EBCDIC-GreekModern
IBM EBCDIC (Greek) – charset=x-EBCDIC-Greek
IBM EBCDIC (Hebrew) – charset=x-EBCDIC-Hebrew
IBM EBCDIC (Icelandic) – charset=x-EBCDIC-Icelandic
IBM EBCDIC (Icelandic-Euro) – charset=x-ebcdic-icelandic-euro
IBM EBCDIC (International-Euro) – charset=x-ebcdic-international-euro
IBM EBCDIC (Italy) – charset=x-EBCDIC-Italy
IBM EBCDIC (Italy-Euro) – charset=x-ebcdic-italy-euro
IBM EBCDIC (Japanese and Japanese Katakana) – charset=x-EBCDIC-JapaneseAndKana
IBM EBCDIC (Japanese and Japanese-Latin) – charset=x-EBCDIC-JapaneseAndJapaneseLatin
IBM EBCDIC (Japanese and US-Canada) – charset=x-EBCDIC-JapaneseAndUSCanada
IBM EBCDIC (Japanese katakana) – charset=x-EBCDIC-JapaneseKatakana
IBM EBCDIC (Korean and Korean Extended) – charset=x-EBCDIC-KoreanAndKoreanExtended
IBM EBCDIC (Korean Extended) – charset=x-EBCDIC-KoreanExtended
IBM EBCDIC (Multilingual Latin-2) – charset=CP870
IBM EBCDIC (Simplified Chinese) – charset=x-EBCDIC-SimplifiedChinese
IBM EBCDIC (Spain) – charset=X-EBCDIC-Spain
IBM EBCDIC (Spain-Euro) – charset=x-ebcdic-spain-euro
IBM EBCDIC (Thai) – charset=x-EBCDIC-Thai
IBM EBCDIC (Traditional Chinese) – charset=x-EBCDIC-TraditionalChinese
IBM EBCDIC (Turkish Latin-5) – charset=CP1026
IBM EBCDIC (Turkish) – charset=x-EBCDIC-Turkish
IBM EBCDIC (UK) – charset=x-EBCDIC-UK
IBM EBCDIC (UK-Euro) – charset=x-ebcdic-uk-euro
IBM EBCDIC (US-Canada) – charset=ebcdic-cp-us
IBM EBCDIC (US-Canada-Euro) – charset=x-ebcdic-cp-us-euro
Icelandic (DOS) – charset=ibm861
Icelandic (Mac) – charset=x-mac-icelandic
ISCII Assamese – charset=x-iscii-as
ISCII Bengali – charset=x-iscii-be
ISCII Devanagari – charset=x-iscii-de
ISCII Gujarati – charset=x-iscii-gu
ISCII Kannada – charset=x-iscii-ka
ISCII Malayalam – charset=x-iscii-ma
ISCII Oriya – charset=x-iscii-or
ISCII Panjabi – charset=x-iscii-pa
ISCII Tamil – charset=x-iscii-ta
ISCII Telugu – charset=x-iscii-te
Japanese (EUC) – charset=euc-jp – charset=x-euc-jp
Japanese (JIS) – charset=iso-2022-jp
Japanese (JIS-Allow 1 byte Kana – SO/SI) – charset=iso-2022-jp
Japanese (JIS-Allow 1 byte Kana) – charset=csISO2022JP
Japanese (Mac) – charset=x-mac-japanese
Japanese (Shift-JIS) – charset=shift_jis
Korean – charset=ks_c_5601-1987
Korean (EUC) – charset=euc-kr
Korean (ISO) – charset=iso-2022-kr
Korean (Johab) – charset=Johab
Korean (Mac) – charset=x-mac-korean
Latin 3 (ISO) – charset=iso-8859-3
Latin 9 (ISO) – charset=iso-8859-15
Norwegian (IA5) – charset=x-IA5-Norwegian
OEM United States – charset=IBM437
Swedish (IA5) – charset=x-IA5-Swedish
Thai (Windows) – charset=windows-874
Turkish (DOS) – charset=ibm857
Turkish (ISO) – charset=iso-8859-9
Turkish (Mac) – charset=x-mac-turkish
Turkish (Windows) – charset=windows-1254
Unicode – charset=unicode
Unicode (Big-Endian) – charset=unicodeFFFE
Unicode (UTF-7) – charset=utf-7
Unicode (UTF-8) – charset=utf-8
US-ASCII – charset=us-ascii
Vietnamese (Windows) – charset=windows-1258
Western European (DOS) – charset=ibm850
Western European (IA5) – charset=x-IA5
Western European (ISO) – charset=iso-8859-1
Western European (Mac) – charset=macintosh
Western European (Windows) – charset=Windows-1252

List of Locale ID (LCID) Values

Language – Country/Region LCID Hex LCID Dec
Afrikaans – South Africa 0436 1078
Albanian – Albania 041c 1052
Amharic – Ethiopia 045e 1118
Arabic – Saudi Arabia 0401 1025
Arabic – Algeria 1401 5121
Arabic – Bahrain 3c01 15361
Arabic – Egypt 0c01 3073
Arabic – Iraq 0801 2049
Arabic – Jordan 2c01 11265
Arabic – Kuwait 3401 13313
Arabic – Lebanon 3001 12289
Arabic – Libya 1001 4097
Arabic – Morocco 1801 6145
Arabic – Oman 2001 8193
Arabic – Qatar 4001 16385
Arabic – Syria 2801 10241
Arabic – Tunisia 1c01 7169
Arabic – U.A.E. 3801 14337
Arabic – Yemen 2401 9217
Armenian – Armenia 042b 1067
Assamese 044d 1101
Azeri (Cyrillic) 082c 2092
Azeri (Latin) 042c 1068
Basque 042d 1069
Belarusian 0423 1059
Bengali (India) 0445 1093
Bengali (Bangladesh) 0845 2117
Bosnian (Bosnia/Herzegovina) 141a 5146
Bulgarian 0402 1026
Burmese 0455 1109
Catalan 0403 1027
Cherokee – United States 045c 1116
Chinese – People’s Republic of China 0804 2052
Chinese – Singapore 1004 4100
Chinese – Taiwan 0404 1028
Chinese – Hong Kong SAR 0c04 3076
Chinese – Macao SAR 1404 5124
Croatian 041a 1050
Croatian (Bosnia/Herzegovina) 101a 4122
Czech 0405 1029
Danish 0406 1030
Divehi 0465 1125
Dutch – Netherlands 0413 1043
Dutch – Belgium 0813 2067
Edo 0466 1126
English – United States 0409 1033
English – United Kingdom 0809 2057
English – Australia 0c09 3081
English – Belize 2809 10249
English – Canada 1009 4105
English – Caribbean 2409 9225
English – Hong Kong SAR 3c09 15369
English – India 4009 16393
English – Indonesia 3809 14345
English – Ireland 1809 6153
English – Jamaica 2009 8201
English – Malaysia 4409 17417
English – New Zealand 1409 5129
English – Philippines 3409 13321
English – Singapore 4809 18441
English – South Africa 1c09 7177
English – Trinidad 2c09 11273
English – Zimbabwe 3009 12297
Estonian 0425 1061
Faroese 0438 1080
Farsi 0429 1065
Filipino 0464 1124
Finnish 040b 1035
French – France 040c 1036
French – Belgium 080c 2060
French – Cameroon 2c0c 11276
French – Canada 0c0c 3084
French – Democratic Rep. of Congo 240c 9228
French – Cote d’Ivoire 300c 12300
French – Haiti 3c0c 15372
French – Luxembourg 140c 5132
French – Mali 340c 13324
French – Monaco 180c 6156
French – Morocco 380c 14348
French – North Africa e40c 58380
French – Reunion 200c 8204
French – Senegal 280c 10252
French – Switzerland 100c 4108
French – West Indies 1c0c 7180
Frisian – Netherlands 0462 1122
Fulfulde – Nigeria 0467 1127
FYRO Macedonian 042f 1071
Gaelic (Ireland) 083c 2108
Gaelic (Scotland) 043c 1084
Galician 0456 1110
Georgian 0437 1079
German – Germany 0407 1031
German – Austria 0c07 3079
German – Liechtenstein 1407 5127
German – Luxembourg 1007 4103
German – Switzerland 0807 2055
Greek 0408 1032
Guarani – Paraguay 0474 1140
Gujarati 0447 1095
Hausa – Nigeria 0468 1128
Hawaiian – United States 0475 1141
Hebrew 040d 1037
Hindi 0439 1081
Hungarian 040e 1038
Ibibio – Nigeria 0469 1129
Icelandic 040f 1039
Igbo – Nigeria 0470 1136
Indonesian 0421 1057
Inuktitut 045d 1117
Italian – Italy 0410 1040
Italian – Switzerland 0810 2064
Japanese 0411 1041
Kannada 044b 1099
Kanuri – Nigeria 0471 1137
Kashmiri 0860 2144
Kashmiri (Arabic) 0460 1120
Kazakh 043f 1087
Khmer 0453 1107
Konkani 0457 1111
Korean 0412 1042
Kyrgyz (Cyrillic) 0440 1088
Lao 0454 1108
Latin 0476 1142
Latvian 0426 1062
Lithuanian 0427 1063
Malay – Malaysia 043e 1086
Malay – Brunei Darussalam 083e 2110
Malayalam 044c 1100
Maltese 043a 1082
Manipuri 0458 1112
Maori – New Zealand 0481 1153
Marathi 044e 1102
Mongolian (Cyrillic) 0450 1104
Mongolian (Mongolian) 0850 2128
Nepali 0461 1121
Nepali – India 0861 2145
Norwegian (Bokmål) 0414 1044
Norwegian (Nynorsk) 0814 2068
Oriya 0448 1096
Oromo 0472 1138
Papiamentu 0479 1145
Pashto 0463 1123
Polish 0415 1045
Portuguese – Brazil 0416 1046
Portuguese – Portugal 0816 2070
Punjabi 0446 1094
Punjabi (Pakistan) 0846 2118
Quechua – Bolivia 046b 1131
Quechua – Ecuador 086b 2155
Quechua – Peru 0c6b 3179
Rhaeto-Romanic 0417 1047
Romanian 0418 1048
Romanian – Moldova 0818 2072
Russian 0419 1049
Russian – Moldova 0819 2073
Sami (Lappish) 043b 1083
Sanskrit 044f 1103
Sepedi 046c 1132
Serbian (Cyrillic) 0c1a 3098
Serbian (Latin) 081a 2074
Sindhi – India 0459 1113
Sindhi – Pakistan 0859 2137
Sinhalese – Sri Lanka 045b 1115
Slovak 041b 1051
Slovenian 0424 1060
Somali 0477 1143
Sorbian 042e 1070
Spanish – Spain (Modern Sort) 0c0a 3082
Spanish – Spain (Traditional Sort) 040a 1034
Spanish – Argentina 2c0a 11274
Spanish – Bolivia 400a 16394
Spanish – Chile 340a 13322
Spanish – Colombia 240a 9226
Spanish – Costa Rica 140a 5130
Spanish – Dominican Republic 1c0a 7178
Spanish – Ecuador 300a 12298
Spanish – El Salvador 440a 17418
Spanish – Guatemala 100a 4106
Spanish – Honduras 480a 18442
Spanish – Latin America e40a 58378
Spanish – Mexico 080a 2058
Spanish – Nicaragua 4c0a 19466
Spanish – Panama 180a 6154
Spanish – Paraguay 3c0a 15370
Spanish – Peru 280a 10250
Spanish – Puerto Rico 500a 20490
Spanish – United States 540a 21514
Spanish – Uruguay 380a 14346
Spanish – Venezuela 200a 8202
Sutu 0430 1072
Swahili 0441 1089
Swedish 041d 1053
Swedish – Finland 081d 2077
Syriac 045a 1114
Tajik 0428 1064
Tamazight (Arabic) 045f 1119
Tamazight (Latin) 085f 2143
Tamil 0449 1097
Tatar 0444 1092
Telugu 044a 1098
Thai 041e 1054
Tibetan – Bhutan 0851 2129
Tibetan – People’s Republic of China 0451 1105
Tigrigna – Eritrea 0873 2163
Tigrigna – Ethiopia 0473 1139
Tsonga 0431 1073
Tswana 0432 1074
Turkish 041f 1055
Turkmen 0442 1090
Uighur – China 0480 1152
Ukrainian 0422 1058
Urdu 0420 1056
Urdu – India 0820 2080
Uzbek (Cyrillic) 0843 2115
Uzbek (Latin) 0443 1091
Venda 0433 1075
Vietnamese 042a 1066
Welsh 0452 1106
Xhosa 0434 1076
Yi 0478 1144
Yiddish 043d 1085
Yoruba 046a 1130
Zulu 0435 1077
HID (Human Interface Device) 04ff 1279

Reference List between HTTP_ACCEPT_LANGUAGE, Locale ID (LCID) and Language

HTTP LCID Dec Language
LCID = 2048 Default
“af” LCID = 1078 Afrikaans
“sq” LCID = 1052 Albanian
“ar-sa” LCID = 1025 Arabic(Saudi Arabia)
“ar-iq” LCID = 2049 Arabic(Iraq)
“ar-eg” LCID = 3073 Arabic(Egypt)
“ar-ly” LCID = 4097 Arabic(Libya)
“ar-dz” LCID = 5121 Arabic(Algeria)
“ar-ma” LCID = 6145 Arabic(Morocco)
“ar-tn” LCID = 7169 Arabic(Tunisia)
“ar-om” LCID = 8193 Arabic(Oman)
“ar-ye” LCID = 9217 Arabic(Yemen)
“ar-sy” LCID = 10241 Arabic(Syria)
“ar-jo” LCID = 11265 Arabic(Jordan)
“ar-lb” LCID = 12289 Arabic(Lebanon)
“ar-kw” LCID = 13313 Arabic(Kuwait)
“ar-ae” LCID = 14337 Arabic(U.A.E.)
“ar-bh” LCID = 15361 Arabic(Bahrain)
“ar-qa” LCID = 16385 Arabic(Qatar)
“eu” LCID = 1069 Basque
“bg” LCID = 1026 Bulgarian
“be” LCID = 1059 Belarusian
“ca” LCID = 1027 Catalan
“zh-tw” LCID = 1028 Chinese(Taiwan)
“zh-cn” LCID = 2052 Chinese(PRC)
“zh-hk” LCID = 3076 Chinese(Hong Kong)
“zh-sg” LCID = 4100 Chinese(Singapore)
“hr” LCID = 1050 Croatian
“cs” LCID = 1029 Czech
“da” LCID = 1030 Danish
“nl” LCID = 1043 Dutch(Standard)
“nl-be” LCID = 2067 Dutch(Belgian)
“en” LCID = 9 English
“en-us” LCID = 1033 English(United States)
“en-gb” LCID = 2057 English(British)
“en-au” LCID = 3081 English(Australian)
“en-ca” LCID = 4105 English(Canadian)
“en-nz” LCID = 5129 English(New Zealand)
“en-ie” LCID = 6153 English(Ireland)
“en-za” LCID = 7177 English(South Africa)
“en-jm” LCID = 8201 English(Jamaica)
“en” LCID = 9225 English(Caribbean)
“en-bz” LCID = 10249 English(Belize)
“en-tt” LCID = 11273 English(Trinidad)
“et” LCID = 1061 Estonian
“fo” LCID = 1080 Faeroese
“fa” LCID = 1065 Farsi
“fi” LCID = 1035 Finnish
“fr” LCID = 1036 French(Standard)
“fr-be” LCID = 2060 French(Belgian)
“fr-ca” LCID = 3084 French(Canadian)
“fr-ch” LCID = 4108 French(Swiss)
“fr-lu” LCID = 5132 French(Luxembourg)
“mk” LCID = 1071 FYRO Macedonian
“gd” LCID = 1084 Gaelic(Scots)
“gd-ie” LCID = 2108 Gaelic(Irish)
“de” LCID = 1031 German(Standard)
“de-ch” LCID = 2055 German(Swiss)
“de-at” LCID = 3079 German(Austrian)
“de-lu” LCID = 4103 German(Luxembourg)
“de-li” LCID = 5127 German(Liechtenstein)
“el” LCID = 1032 Greek
“he” LCID = 1037 Hebrew
“hi” LCID = 1081 Hindi
“hu” LCID = 1038 Hungarian
“is” LCID = 1039 Icelandic
“in” LCID = 1057 Indonesian
“it” LCID = 1040 Italian(Standard)
“it-ch” LCID = 2064 Italian(Swiss)
“ja” LCID = 1041 Japanese
“ko” LCID = 1042 Korean
“ko” LCID = 2066 Korean(Johab)
“lv” LCID = 1062 Latvian
“lt” LCID = 1063 Lithuanian
“ms” LCID = 1086 Malaysian
“mt” LCID = 1082 Maltese
“no” LCID = 1044 Norwegian(Bokmal)
“no” LCID = 2068 Norwegian(Nynorsk)
“pl” LCID = 1045 Polish
“pt-br” LCID = 1046 Portuguese(Brazil)
“pt” LCID = 2070 Portuguese(Portugal)
“rm” LCID = 1047 Rhaeto-Romanic
“ro” LCID = 1048 Romanian
“ro-mo” LCID = 2072 Romanian(Moldavia)
“ru” LCID = 1049 Russian
“ru-mo” LCID = 2073 Russian(Moldavia)
“sz” LCID = 1083 Sami(Lappish)
“sr” LCID = 3098 Serbian(Cyrillic)
“sr” LCID = 2074 Serbian(Latin)
“sk” LCID = 1051 Slovak
“sl” LCID = 1060 Slovenian
“sb” LCID = 1070 Sorbian
“es” LCID = 1034 Spanish(Spain – Traditional Sort)
“es-mx” LCID = 2058 Spanish(Mexican)
“es” LCID = 3082 Spanish(Spain – Modern Sort)
“es-gt” LCID = 4106 Spanish(Guatemala)
“es-cr” LCID = 5130 Spanish(Costa Rica)
“es-pa” LCID = 6154 Spanish(Panama)
“es-do” LCID = 7178 Spanish(Dominican Republic)
“es-ve” LCID = 8202 Spanish(Venezuela)
“es-co” LCID = 9226 Spanish(Colombia)
“es-pe” LCID = 10250 Spanish(Peru)
“es-ar” LCID = 11274 Spanish(Argentina)
“es-ec” LCID = 12298 Spanish(Ecuador)
“es-cl” LCID = 13322 Spanish(Chile)
“es-uy” LCID = 14346 Spanish(Uruguay)
“es-py” LCID = 15370 Spanish(Paraguay)
“es-bo” LCID = 16394 Spanish(Bolivia)
“es-sv” LCID = 17418 Spanish(El Salvador)
“es-hn” LCID = 18442 Spanish(Honduras)
“es-ni” LCID = 19466 Spanish(Nicaragua)
“es-pr” LCID = 20490 Spanish(Puerto Rico)
“sx” LCID = 1072 Sutu
“sv” LCID = 1053 Swedish
“sv-fi” LCID = 2077 Swedish(Finland)
“th” LCID = 1054 Thai
“ts” LCID = 1073 Tsonga
“tn” LCID = 1074 Tswana
“tr” LCID = 1055 Turkish
“uk” LCID = 1058 Ukrainian
“ur” LCID = 1056 Urdu
“ve” LCID = 1075 Venda
“vi” LCID = 1066 Vietnamese
“xh” LCID = 1076 Xhosa
“ji” LCID = 1085 Yiddish
“zu” LCID = 1077 Zulu
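
As a rough illustration of how the reference list might be used, the sketch below picks a decimal LCID from an HTTP_ACCEPT_LANGUAGE value. The names TAG_TO_LCID, parse_accept_language, lcid_for and split_lcid are hypothetical and not part of any library, and the table is only a small excerpt of the list. The fallback from a regional tag such as "de-at" to its primary tag "de" is an assumption of this sketch, not something the list prescribes. split_lcid relies on the documented Windows layout of a language ID, where the low 10 bits hold the primary language and the next 6 bits hold the sublanguage; this is why "en" is listed with LCID = 9 while "en-us" is 1033 (hex 0409).

```python
# Hypothetical excerpt of the reference list: language tag -> decimal LCID.
TAG_TO_LCID = {
    "en-us": 1033,
    "en-gb": 2057,
    "fr": 1036,
    "fr-ca": 3084,
    "de": 1031,
    "ja": 1041,
}
DEFAULT_LCID = 2048  # the "Default" row of the reference list


def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:  # q=0 means "not acceptable", so skip it
            weighted.append((-quality, position, tag.strip().lower()))
    # Sort by descending quality, keeping the header order for ties.
    return [tag for _, _, tag in sorted(weighted)]


def lcid_for(header):
    """Pick a decimal LCID for the header, or the default when nothing matches."""
    for tag in parse_accept_language(header):
        if tag in TAG_TO_LCID:
            return TAG_TO_LCID[tag]
        primary = tag.split("-")[0]  # assumption: fall back to the primary tag
        if primary in TAG_TO_LCID:
            return TAG_TO_LCID[primary]
    return DEFAULT_LCID


def split_lcid(lcid):
    """Split the 16-bit language ID into (primary language, sublanguage)."""
    lang_id = lcid & 0xFFFF
    return lang_id & 0x3FF, lang_id >> 10


if __name__ == "__main__":
    chosen = lcid_for("fr-CA,fr;q=0.8,en-US;q=0.5")
    print(chosen, format(chosen, "04x"), split_lcid(chosen))
```

A real lookup would load the full table rather than this excerpt, and may need its own rule for tags that appear twice in the list, such as "es" or "no".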