CodePage and CharSet


Code Page  charset          Language
708        ASMO-708         Arabic (ASMO 708)
720        DOS-720          Arabic (DOS)
28596      iso-8859-6       Arabic (ISO)
1256       windows-1256     Arabic (Windows)
1257       windows-1257     Baltic (Windows)
852        ibm852           Central European (DOS)
28592      iso-8859-2       Central European (ISO)
1250       windows-1250     Central European (Windows)
936        gb2312           Chinese Simplified (GB2312)
950        big5             Chinese Traditional (Big5)
862        DOS-862          Hebrew (DOS)
866        cp866            Cyrillic (DOS)
874        windows-874      Thai (Windows)
932        shift_jis        Japanese (Shift-JIS)
949        ks_c_5601-1987   Korean
1251       windows-1251     Cyrillic (Windows)
1252       iso-8859-1       Western European
1253       windows-1253     Greek (Windows)
1254       iso-8859-9       Turkish (Windows)
1255       windows-1255     Hebrew (Windows)
1258       windows-1258     Vietnamese (Windows)
20866      koi8-r           Cyrillic (KOI8-R)
21866      koi8-ru          Cyrillic (KOI8-U)
28595      iso-8859-5       Cyrillic (ISO)
28597      iso-8859-7       Greek (ISO)
28598      iso-8859-8       Hebrew (ISO-Visual)
38598      iso-8859-8-i     Hebrew (ISO-Logical)
50932      _autodetect      Japanese (Auto-Select)
51932      euc-jp           Japanese (EUC)
52936      hz-gb-2312       Chinese Simplified (HZ)
65001      utf-8            Unicode (UTF-8)

Code page (Info.CodePage), name (Info.Name, the CharSet) and display name (Info.DisplayName):

Code page  Name (CharSet)           Display name
37         IBM037                   IBM EBCDIC (US-Canada)
437        IBM437                   OEM United States
500        IBM500                   IBM EBCDIC (International)
708        ASMO-708                 Arabic (ASMO 708)
720        DOS-720                  Arabic (DOS)
737        ibm737                   Greek (DOS)
775        ibm775                   Baltic (DOS)
850        ibm850                   Western European (DOS)
852        ibm852                   Central European (DOS)
855        IBM855                   OEM Cyrillic
857        ibm857                   Turkish (DOS)
858        IBM00858                 OEM Multilingual Latin I
860        IBM860                   Portuguese (DOS)
861        ibm861                   Icelandic (DOS)
862        DOS-862                  Hebrew (DOS)
863        IBM863                   French Canadian (DOS)
864        IBM864                   Arabic (864)
865        IBM865                   Nordic (DOS)
866        cp866                    Cyrillic (DOS)
869        ibm869                   Greek, Modern (DOS)
870        IBM870                   IBM EBCDIC (Multilingual Latin-2)
874        windows-874              Thai (Windows)
875        cp875                    IBM EBCDIC (Greek Modern)
932        shift_jis                Japanese (Shift-JIS)
936        gb2312                   Chinese Simplified (GB2312)
949        ks_c_5601-1987           Korean
950        big5                     Chinese Traditional (Big5)
1026       IBM1026                  IBM EBCDIC (Turkish Latin-5)
1047       IBM01047                 IBM Latin-1
1140       IBM01140                 IBM EBCDIC (US-Canada-Euro)
1141       IBM01141                 IBM EBCDIC (Germany-Euro)
1142       IBM01142                 IBM EBCDIC (Denmark-Norway-Euro)
1143       IBM01143                 IBM EBCDIC (Finland-Sweden-Euro)
1144       IBM01144                 IBM EBCDIC (Italy-Euro)
1145       IBM01145                 IBM EBCDIC (Spain-Euro)
1146       IBM01146                 IBM EBCDIC (UK-Euro)
1147       IBM01147                 IBM EBCDIC (France-Euro)
1148       IBM01148                 IBM EBCDIC (International-Euro)
1149       IBM01149                 IBM EBCDIC (Icelandic-Euro)
1200       utf-16                   Unicode
1201       UnicodeFFFE              Unicode (Big-Endian)
1250       windows-1250             Central European (Windows)
1251       windows-1251             Cyrillic (Windows)
1252       Windows-1252             Western European (Windows)
1253       windows-1253             Greek (Windows)
1254       windows-1254             Turkish (Windows)
1255       windows-1255             Hebrew (Windows)
1256       windows-1256             Arabic (Windows)
1257       windows-1257             Baltic (Windows)
1258       windows-1258             Vietnamese (Windows)
1361       Johab                    Korean (Johab)
10000      macintosh                Western European (Mac)
10001      x-mac-japanese           Japanese (Mac)
10002      x-mac-chinesetrad        Chinese Traditional (Mac)
10003      x-mac-korean             Korean (Mac)
10004      x-mac-arabic             Arabic (Mac)
10005      x-mac-hebrew             Hebrew (Mac)
10006      x-mac-greek              Greek (Mac)
10007      x-mac-cyrillic           Cyrillic (Mac)
10008      x-mac-chinesesimp        Chinese Simplified (Mac)
10010      x-mac-romanian           Romanian (Mac)
10017      x-mac-ukrainian          Ukrainian (Mac)
10021      x-mac-thai               Thai (Mac)
10029      x-mac-ce                 Central European (Mac)
10079      x-mac-icelandic          Icelandic (Mac)
10081      x-mac-turkish            Turkish (Mac)
10082      x-mac-croatian           Croatian (Mac)
20000      x-Chinese-CNS            Chinese Traditional (CNS)
20001      x-cp20001                TCA Taiwan
20002      x-Chinese-Eten           Chinese Traditional (Eten)
20003      x-cp20003                IBM5550 Taiwan
20004      x-cp20004                TeleText Taiwan
20005      x-cp20005                Wang Taiwan
20105      x-IA5                    Western European (IA5)
20106      x-IA5-German             German (IA5)
20107      x-IA5-Swedish            Swedish (IA5)
20108      x-IA5-Norwegian          Norwegian (IA5)
20127      us-ascii                 US-ASCII
20261      x-cp20261                T.61
20269      x-cp20269                ISO-6937
20273      IBM273                   IBM EBCDIC (Germany)
20277      IBM277                   IBM EBCDIC (Denmark-Norway)
20278      IBM278                   IBM EBCDIC (Finland-Sweden)
20280      IBM280                   IBM EBCDIC (Italy)
20284      IBM284                   IBM EBCDIC (Spain)
20285      IBM285                   IBM EBCDIC (UK)
20290      IBM290                   IBM EBCDIC (Japanese katakana)
20297      IBM297                   IBM EBCDIC (France)
20420      IBM420                   IBM EBCDIC (Arabic)
20423      IBM423                   IBM EBCDIC (Greek)
20424      IBM424                   IBM EBCDIC (Hebrew)
20833      x-EBCDIC-KoreanExtended  IBM EBCDIC (Korean Extended)
20838      IBM-Thai                 IBM EBCDIC (Thai)
20866      koi8-r                   Cyrillic (KOI8-R)
20871      IBM871                   IBM EBCDIC (Icelandic)
20880      IBM880                   IBM EBCDIC (Cyrillic Russian)
20905      IBM905                   IBM EBCDIC (Turkish)
20924      IBM00924                 IBM Latin-1
20932      EUC-JP                   Japanese (JIS 0208-1990 and 0212-1990)
20936      x-cp20936                Chinese Simplified (GB2312-80)
20949      x-cp20949                Korean Wansung
21025      cp1025                   IBM EBCDIC (Cyrillic Serbian-Bulgarian)
21866      koi8-u                   Cyrillic (KOI8-U)
28591      iso-8859-1               Western European (ISO)
28592      iso-8859-2               Central European (ISO)
28593      iso-8859-3               Latin 3 (ISO)
28594      iso-8859-4               Baltic (ISO)
28595      iso-8859-5               Cyrillic (ISO)
28596      iso-8859-6               Arabic (ISO)
28597      iso-8859-7               Greek (ISO)
28598      iso-8859-8               Hebrew (ISO-Visual)
28599      iso-8859-9               Turkish (ISO)
28603      iso-8859-13              Estonian (ISO)
28605      iso-8859-15              Latin 9 (ISO)
29001      x-Europa                 Europa
38598      iso-8859-8-i             Hebrew (ISO-Logical)
50220      iso-2022-jp              Japanese (JIS)
50221      csISO2022JP              Japanese (JIS-Allow 1 byte Kana)
50222      iso-2022-jp              Japanese (JIS-Allow 1 byte Kana - SO/SI)
50225      iso-2022-kr              Korean (ISO)
50227      x-cp50227                Chinese Simplified (ISO-2022)
51932      euc-jp                   Japanese (EUC)
51936      EUC-CN                   Chinese Simplified (EUC)
51949      euc-kr                   Korean (EUC)
52936      hz-gb-2312               Chinese Simplified (HZ)
54936      GB18030                  Chinese Simplified (GB18030)
57002      x-iscii-de               ISCII Devanagari
57003      x-iscii-be               ISCII Bengali
57004      x-iscii-ta               ISCII Tamil
57005      x-iscii-te               ISCII Telugu
57006      x-iscii-as               ISCII Assamese
57007      x-iscii-or               ISCII Oriya
57008      x-iscii-ka               ISCII Kannada
57009      x-iscii-ma               ISCII Malayalam
57010      x-iscii-gu               ISCII Gujarati
57011      x-iscii-pa               ISCII Punjabi
65000      utf-7                    Unicode (UTF-7)
65001      utf-8                    Unicode (UTF-8)
65005      utf-32                   Unicode (UTF-32)
65006      utf-32BE                 Unicode (UTF-32 Big-Endian)

The following Windows code pages exist:

874 – Thai
932 – Japanese
936 – Chinese (simplified) (PRC, Singapore)
949 – Korean
950 – Chinese (traditional) (Taiwan, Hong Kong)
1200 – Unicode (BMP of ISO 10646, UTF-16LE)
1201 – Unicode (BMP of ISO 10646, UTF-16BE)
1250 – Latin (Central European languages)
1251 – Cyrillic
1252 – Latin (Western European languages, replacing Code page 850)
1253 – Greek
1254 – Turkish
1255 – Hebrew
1256 – Arabic
1257 – Latin (Baltic languages)
1258 – Vietnamese
65000 – Unicode (BMP of ISO 10646, UTF-7)
65001 – Unicode (BMP of ISO 10646, UTF-8)

SBCS (Single Byte Character Set) Codepages

DBCS (Double Byte Character Set) Codepages

Table 2-3 ISO 8859 Character Sets

Standard    Languages Supported

ISO 8859-1

Western European (Albanian, Basque, Breton, Catalan, Danish, Dutch, English, Faeroese, Finnish, French, German, Greenlandic, Icelandic, Irish Gaelic, Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swedish)

ISO 8859-2

Eastern European (Albanian, Croatian, Czech, English, German, Hungarian, Latin, Polish, Romanian, Slovak, Slovenian, Serbian)

ISO 8859-3

Southeastern European (Afrikaans, Catalan, Dutch, English, Esperanto, German, Italian, Maltese, Spanish, Turkish)

ISO 8859-4

Northern European (Danish, English, Estonian, Finnish, German, Greenlandic, Latin, Latvian, Lithuanian, Norwegian, Sámi, Slovenian, Swedish)

ISO 8859-5

Eastern European (Cyrillic-based: Bulgarian, Byelorussian, Macedonian, Russian, Serbian, Ukrainian)

ISO 8859-6

Arabic

ISO 8859-7

Greek

ISO 8859-8

Hebrew

ISO 8859-9

Western European (Albanian, Basque, Breton, Catalan, Cornish, Danish, Dutch, English, Finnish, French, Frisian, Galician, German, Greenlandic, Irish Gaelic, Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swedish, Turkish)

ISO 8859-10

Northern European (Danish, English, Estonian, Faeroese, Finnish, German, Greenlandic, Icelandic, Irish Gaelic, Latin, Lithuanian, Norwegian, Sámi, Slovenian, Swedish)

ISO 8859-13

Baltic Rim (English, Estonian, Finnish, Latin, Latvian, Norwegian)

ISO 8859-14

Celtic (Albanian, Basque, Breton, Catalan, Cornish, Danish, English, Galician, German, Greenlandic, Irish Gaelic, Italian, Latin, Luxemburgish, Manx Gaelic, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swedish, Welsh)

ISO 8859-15

Western European (Albanian, Basque, Breton, Catalan, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Frisian, Galician, German, Greenlandic, Icelandic, Irish Gaelic, Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swedish) 

Code Page Identifiers


The following table defines the available code page identifiers.

Note   ANSI code pages can be different on different computers, or can be changed for a single computer, leading to data corruption. For the most consistent results, applications should use Unicode, such as UTF-8 or UTF-16, instead of a specific code page.
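The corruption the note warns about can be illustrated with a minimal Python sketch. It assumes Python's `cp1251` and `cp1252` codecs as stand-ins for the Windows ANSI code pages 1251 and 1252; the sample word is arbitrary.

```python
# Text saved on a machine whose ANSI code page is 1251 (Cyrillic).
original = "Привет"
raw = original.encode("cp1251")

# The same bytes read back under two different ANSI code pages.
as_1251 = raw.decode("cp1251")   # correct: the writer's code page
as_1252 = raw.decode("cp1252")   # wrong code page: silently yields other letters

# With UTF-8 there is no per-machine setting to get wrong.
utf8_round_trip = original.encode("utf-8").decode("utf-8")

print(as_1251)          # Привет
print(as_1252)          # Ïðèâåò
print(utf8_round_trip)  # Привет
```

No error is raised in the wrong-code-page case, which is why this kind of corruption tends to go unnoticed until the text is displayed.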

Identifier .NET Name                Additional information
037        IBM037                   IBM EBCDIC US-Canada
437        IBM437                   OEM United States
500        IBM500                   IBM EBCDIC International
708        ASMO-708                 Arabic (ASMO 708)
709                                 Arabic (ASMO-449+, BCON V4)
710                                 Arabic - Transparent Arabic
720        DOS-720                  Arabic (Transparent ASMO); Arabic (DOS)
737        ibm737                   OEM Greek (formerly 437G); Greek (DOS)
775        ibm775                   OEM Baltic; Baltic (DOS)
850        ibm850                   OEM Multilingual Latin 1; Western European (DOS)
852        ibm852                   OEM Latin 2; Central European (DOS)
855        IBM855                   OEM Cyrillic (primarily Russian)
857        ibm857                   OEM Turkish; Turkish (DOS)
858        IBM00858                 OEM Multilingual Latin 1 + Euro symbol
860        IBM860                   OEM Portuguese; Portuguese (DOS)
861        ibm861                   OEM Icelandic; Icelandic (DOS)
862        DOS-862                  OEM Hebrew; Hebrew (DOS)
863        IBM863                   OEM French Canadian; French Canadian (DOS)
864        IBM864                   OEM Arabic; Arabic (864)
865        IBM865                   OEM Nordic; Nordic (DOS)
866        cp866                    OEM Russian; Cyrillic (DOS)
869        ibm869                   OEM Modern Greek; Greek, Modern (DOS)
870        IBM870                   IBM EBCDIC Multilingual/ROECE (Latin 2); IBM EBCDIC Multilingual Latin 2
874        windows-874              ANSI/OEM Thai (ISO 8859-11); Thai (Windows)
875        cp875                    IBM EBCDIC Greek Modern
932        shift_jis                ANSI/OEM Japanese; Japanese (Shift-JIS)
936        gb2312                   ANSI/OEM Simplified Chinese (PRC, Singapore); Chinese Simplified (GB2312)
949        ks_c_5601-1987           ANSI/OEM Korean (Unified Hangul Code)
950        big5                     ANSI/OEM Traditional Chinese (Taiwan; Hong Kong SAR, PRC); Chinese Traditional (Big5)
1026       IBM1026                  IBM EBCDIC Turkish (Latin 5)
1047       IBM01047                 IBM EBCDIC Latin 1/Open System
1140       IBM01140                 IBM EBCDIC US-Canada (037 + Euro symbol); IBM EBCDIC (US-Canada-Euro)
1141       IBM01141                 IBM EBCDIC Germany (20273 + Euro symbol); IBM EBCDIC (Germany-Euro)
1142       IBM01142                 IBM EBCDIC Denmark-Norway (20277 + Euro symbol); IBM EBCDIC (Denmark-Norway-Euro)
1143       IBM01143                 IBM EBCDIC Finland-Sweden (20278 + Euro symbol); IBM EBCDIC (Finland-Sweden-Euro)
1144       IBM01144                 IBM EBCDIC Italy (20280 + Euro symbol); IBM EBCDIC (Italy-Euro)
1145       IBM01145                 IBM EBCDIC Latin America-Spain (20284 + Euro symbol); IBM EBCDIC (Spain-Euro)
1146       IBM01146                 IBM EBCDIC United Kingdom (20285 + Euro symbol); IBM EBCDIC (UK-Euro)
1147       IBM01147                 IBM EBCDIC France (20297 + Euro symbol); IBM EBCDIC (France-Euro)
1148       IBM01148                 IBM EBCDIC International (500 + Euro symbol); IBM EBCDIC (International-Euro)
1149       IBM01149                 IBM EBCDIC Icelandic (20871 + Euro symbol); IBM EBCDIC (Icelandic-Euro)
1200       utf-16                   Unicode UTF-16, little endian byte order (BMP of ISO 10646); available only to managed applications
1201       unicodeFFFE              Unicode UTF-16, big endian byte order; available only to managed applications
1250       windows-1250             ANSI Central European; Central European (Windows)
1251       windows-1251             ANSI Cyrillic; Cyrillic (Windows)
1252       windows-1252             ANSI Latin 1; Western European (Windows)
1253       windows-1253             ANSI Greek; Greek (Windows)
1254       windows-1254             ANSI Turkish; Turkish (Windows)
1255       windows-1255             ANSI Hebrew; Hebrew (Windows)
1256       windows-1256             ANSI Arabic; Arabic (Windows)
1257       windows-1257             ANSI Baltic; Baltic (Windows)
1258       windows-1258             ANSI/OEM Vietnamese; Vietnamese (Windows)
1361       Johab                    Korean (Johab)
10000      macintosh                MAC Roman; Western European (Mac)
10001      x-mac-japanese           Japanese (Mac)
10002      x-mac-chinesetrad        MAC Traditional Chinese (Big5); Chinese Traditional (Mac)
10003      x-mac-korean             Korean (Mac)
10004      x-mac-arabic             Arabic (Mac)
10005      x-mac-hebrew             Hebrew (Mac)
10006      x-mac-greek              Greek (Mac)
10007      x-mac-cyrillic           Cyrillic (Mac)
10008      x-mac-chinesesimp        MAC Simplified Chinese (GB 2312); Chinese Simplified (Mac)
10010      x-mac-romanian           Romanian (Mac)
10017      x-mac-ukrainian          Ukrainian (Mac)
10021      x-mac-thai               Thai (Mac)
10029      x-mac-ce                 MAC Latin 2; Central European (Mac)
10079      x-mac-icelandic          Icelandic (Mac)
10081      x-mac-turkish            Turkish (Mac)
10082      x-mac-croatian           Croatian (Mac)
12000      utf-32                   Unicode UTF-32, little endian byte order; available only to managed applications
12001      utf-32BE                 Unicode UTF-32, big endian byte order; available only to managed applications
20000      x-Chinese-CNS            CNS Taiwan; Chinese Traditional (CNS)
20001      x-cp20001                TCA Taiwan
20002      x-Chinese-Eten           Eten Taiwan; Chinese Traditional (Eten)
20003      x-cp20003                IBM5550 Taiwan
20004      x-cp20004                TeleText Taiwan
20005      x-cp20005                Wang Taiwan
20105      x-IA5                    IA5 (IRV International Alphabet No. 5, 7-bit); Western European (IA5)
20106      x-IA5-German             IA5 German (7-bit)
20107      x-IA5-Swedish            IA5 Swedish (7-bit)
20108      x-IA5-Norwegian          IA5 Norwegian (7-bit)
20127      us-ascii                 US-ASCII (7-bit)
20261      x-cp20261                T.61
20269      x-cp20269                ISO 6937 Non-Spacing Accent
20273      IBM273                   IBM EBCDIC Germany
20277      IBM277                   IBM EBCDIC Denmark-Norway
20278      IBM278                   IBM EBCDIC Finland-Sweden
20280      IBM280                   IBM EBCDIC Italy
20284      IBM284                   IBM EBCDIC Latin America-Spain
20285      IBM285                   IBM EBCDIC United Kingdom
20290      IBM290                   IBM EBCDIC Japanese Katakana Extended
20297      IBM297                   IBM EBCDIC France
20420      IBM420                   IBM EBCDIC Arabic
20423      IBM423                   IBM EBCDIC Greek
20424      IBM424                   IBM EBCDIC Hebrew
20833      x-EBCDIC-KoreanExtended  IBM EBCDIC Korean Extended
20838      IBM-Thai                 IBM EBCDIC Thai
20866      koi8-r                   Russian (KOI8-R); Cyrillic (KOI8-R)
20871      IBM871                   IBM EBCDIC Icelandic
20880      IBM880                   IBM EBCDIC Cyrillic Russian
20905      IBM905                   IBM EBCDIC Turkish
20924      IBM00924                 IBM EBCDIC Latin 1/Open System (1047 + Euro symbol)
20932      EUC-JP                   Japanese (JIS 0208-1990 and 0212-1990)
20936      x-cp20936                Simplified Chinese (GB2312); Chinese Simplified (GB2312-80)
20949      x-cp20949                Korean Wansung
21025      cp1025                   IBM EBCDIC Cyrillic Serbian-Bulgarian
21027                               (deprecated)
21866      koi8-u                   Ukrainian (KOI8-U); Cyrillic (KOI8-U)
28591      iso-8859-1               ISO 8859-1 Latin 1; Western European (ISO)
28592      iso-8859-2               ISO 8859-2 Central European; Central European (ISO)
28593      iso-8859-3               ISO 8859-3 Latin 3
28594      iso-8859-4               ISO 8859-4 Baltic
28595      iso-8859-5               ISO 8859-5 Cyrillic
28596      iso-8859-6               ISO 8859-6 Arabic
28597      iso-8859-7               ISO 8859-7 Greek
28598      iso-8859-8               ISO 8859-8 Hebrew; Hebrew (ISO-Visual)
28599      iso-8859-9               ISO 8859-9 Turkish
28603      iso-8859-13              ISO 8859-13 Estonian
28605      iso-8859-15              ISO 8859-15 Latin 9
29001      x-Europa                 Europa 3
38598      iso-8859-8-i             ISO 8859-8 Hebrew; Hebrew (ISO-Logical)
50220      iso-2022-jp              ISO 2022 Japanese with no halfwidth Katakana; Japanese (JIS)
50221      csISO2022JP              ISO 2022 Japanese with halfwidth Katakana; Japanese (JIS-Allow 1 byte Kana)
50222      iso-2022-jp              ISO 2022 Japanese JIS X 0201-1989; Japanese (JIS-Allow 1 byte Kana - SO/SI)
50225      iso-2022-kr              ISO 2022 Korean
50227      x-cp50227                ISO 2022 Simplified Chinese; Chinese Simplified (ISO 2022)
50229                               ISO 2022 Traditional Chinese
50930                               EBCDIC Japanese (Katakana) Extended
50931                               EBCDIC US-Canada and Japanese
50933                               EBCDIC Korean Extended and Korean
50935                               EBCDIC Simplified Chinese Extended and Simplified Chinese
50936                               EBCDIC Simplified Chinese
50937                               EBCDIC US-Canada and Traditional Chinese
50939                               EBCDIC Japanese (Latin) Extended and Japanese
51932      euc-jp                   EUC Japanese
51936      EUC-CN                   EUC Simplified Chinese; Chinese Simplified (EUC)
51949      euc-kr                   EUC Korean
51950                               EUC Traditional Chinese
52936      hz-gb-2312               HZ-GB2312 Simplified Chinese; Chinese Simplified (HZ)
54936      GB18030                  Windows XP and later: GB18030 Simplified Chinese (4 byte); Chinese Simplified (GB18030)
57002      x-iscii-de               ISCII Devanagari
57003      x-iscii-be               ISCII Bengali
57004      x-iscii-ta               ISCII Tamil
57005      x-iscii-te               ISCII Telugu
57006      x-iscii-as               ISCII Assamese
57007      x-iscii-or               ISCII Oriya
57008      x-iscii-ka               ISCII Kannada
57009      x-iscii-ma               ISCII Malayalam
57010      x-iscii-gu               ISCII Gujarati
57011      x-iscii-pa               ISCII Punjabi
65000      utf-7                    Unicode (UTF-7)
65001      utf-8                    Unicode (UTF-8)

 

IBM PC (OEM) code pages

These code pages were originally embedded directly in the text-mode hardware of the graphics adapters used with the IBM PC and its clones, including the original MDA and CGA adapters, whose character sets could only be changed by physically replacing a ROM chip that contained the font. The interface of those adapters (emulated by all later adapters such as VGA) was typically limited to single-byte character sets with only 256 characters in each font/encoding (although VGA added partial support for slightly larger character sets). Since the original IBM PC code page (number 437) was not really designed for international use, several partially compatible country- or region-specific variants emerged. Microsoft refers to these as the OEM code pages because they were defined by the OEMs who licensed MS-DOS for distribution with their hardware, not by Microsoft or a standards body. Examples include:

437 – Original IBM PC hardware code page
667 – Polish (Mazovia)
668 – Slavic
720 – Arabic/Middle East
737 – Greek
770 – Lithuanian
773 – Lithuanian
775 – Estonian, Lithuanian and Latvian
790 – Polish (Mazovia)
819 – ISO 8859-1
850 – "Multilingual (Latin-1)" (Western European languages)
851 – Greek
852 – "Slavic (Latin-2)" (Central and Eastern European languages)
853 – Turkish (Latin-2)
854 – Spanish
855 – Cyrillic
857 – Turkish
858 – "Multilingual" with euro symbol
860 – Portuguese
861 – Icelandic
862 – Hebrew
863 – French (Quebec French)
864 – Arabic/Middle East
865 – Danish/Norwegian. Differs from 437 in the letters Ø and ø in place of ¥ and ¢, and in ¤ in place of »
866 – Cyrillic
867 – Czech (Kamenický)
868 – Arabic/Middle East
869 – Greek
872 – Cyrillic
874 – Thai[7]
895 – Czech (Kamenický) (conflicting ID)
912
915
932 – Japanese (DBCS)
991 – Polish (Mazovia)
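The relationship between code pages 437 and 865 noted in the list can be checked byte by byte. The sketch below is illustrative only; it assumes Python's bundled `cp437` and `cp865` codecs match the OEM code pages, and `differing_bytes` is a helper name made up for this example.

```python
def differing_bytes(codec_a: str, codec_b: str) -> dict:
    """Map each byte value that decodes differently to its (a, b) characters."""
    diffs = {}
    for value in range(256):
        char_a = bytes([value]).decode(codec_a)
        char_b = bytes([value]).decode(codec_b)
        if char_a != char_b:
            diffs[value] = (char_a, char_b)
    return diffs


diffs = differing_bytes("cp437", "cp865")
for value, (in_437, in_865) in sorted(diffs.items()):
    print(f"0x{value:02X}: cp437={in_437!r} cp865={in_865!r}")
```

The same helper can be pointed at any pair of single-byte codecs whose 256 positions are all defined, for example `cp437` and `cp850`.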

When dealing with older hardware, protocols and file formats, it is often necessary to support these code pages, but use of newer code pages, in particular Unicode, is encouraged for new designs.

Code page 819 is identical to Latin-1 (ISO/IEC 8859-1) and, with slightly modified commands, permits MS-DOS machines to use that encoding. It was used with IBM AS/400 minicomputers.

Code pages for DBCS character sets

These code pages represent DBCS character encodings for various CJK languages. In Microsoft operating systems, these are used as both the "OEM" and "ANSI" code page for the applicable locale.

932 – Supports Japanese
936 – GBK Supports Simplified Chinese
949 – Supports Korean
950 – Supports Traditional Chinese
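What "double byte" means in practice can be seen in a short Python sketch: ASCII characters stay one byte long while CJK characters take two. It assumes Python's `cp932`, `cp936`, `cp949` and `cp950` codecs correspond to the Windows code pages of the same number; `encoded_lengths` is a hypothetical helper.

```python
# Assumed mapping from Windows code page number to Python codec name.
DBCS_CODECS = {932: "cp932", 936: "cp936", 949: "cp949", 950: "cp950"}


def encoded_lengths(text: str, codec: str) -> list:
    """Return the number of bytes each character occupies in the given codec."""
    return [len(ch.encode(codec)) for ch in text]


# "A" is ASCII; "中" is a CJK ideograph present in all four character sets.
lengths = {cp: encoded_lengths("A中", codec) for cp, codec in DBCS_CODECS.items()}
for cp, per_char in lengths.items():
    print(cp, per_char)   # each line shows [1, 2]
```

Because lead bytes and single-byte characters share one stream, code that splits or truncates such strings has to work on characters rather than raw bytes.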

Microsoft code page numbers for various other character encodings

The following code page numbers are specific to Microsoft Windows. IBM may use different numbers for these code pages.

1200 – UTF-16LE Unicode little-endian
1201 – UTF-16BE Unicode big-endian
65000 – UTF-7 Unicode
65001 – UTF-8 Unicode
10000 – Macintosh Roman encoding (followed by several other Mac character sets)
10007 – Macintosh Cyrillic encoding
10029 – Macintosh Central European encoding
20127 – US-ASCII The classic US 7 bit character set with no char larger than 127
28591 – ISO-8859-1
28592 – ISO-8859-2
28593 – ISO-8859-3
28594 – ISO-8859-4
28595 – ISO-8859-5
28596 and 38596 – ISO-8859-6
28597 – ISO-8859-7
28598 and 38598 – ISO-8859-8
28599 – ISO-8859-9
28600 – ISO-8859-10
28601 – ISO-8859-11
(28602 – ISO-8859-12)
28603 – ISO-8859-13
28604 – ISO-8859-14
28605 – ISO-8859-15
28606 – ISO-8859-16
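The numbering above is regular enough to translate into codec names mechanically: the ISO 8859 parts sit at 28590 plus the part number. The following Python sketch shows one way to do that; `python_codec_for` and the `FIXED` table are hypothetical names for this example, the codec names are those of Python's standard library, and only the Unicode, ASCII and ISO 8859 entries are covered.

```python
import codecs

# Code pages with no arithmetic pattern, mapped to Python codec names.
FIXED = {
    1200: "utf_16_le",
    1201: "utf_16_be",
    65000: "utf_7",
    65001: "utf_8",
    20127: "ascii",
}


def python_codec_for(code_page: int) -> str:
    """Return a Python codec name for a Microsoft code page number."""
    if code_page in FIXED:
        return FIXED[code_page]
    part = code_page - 28590          # 28591 is ISO 8859-1, 28592 is ISO 8859-2, ...
    if 1 <= part <= 16 and part != 12:  # ISO 8859-12 was never published
        return f"iso8859_{part}"
    raise LookupError(f"no mapping for code page {code_page}")


# Every name returned is a codec Python can actually load.
for cp in list(FIXED) + [28590 + n for n in range(1, 17) if n != 12]:
    codecs.lookup(python_codec_for(cp))

print(python_codec_for(28592))                  # iso8859_2
print("ą".encode(python_codec_for(28592)))      # b'\xb1'
```

The 38596 and 38598 variants and the Macintosh code pages are left out on purpose, since they do not follow the same offset rule.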

Miscellaneous

(number missing) – ASMO449+ Supports Arabic
(number missing) – MIK Supports Bulgarian and Russian as well

Windows (ANSI) code pages

Microsoft defined a number of code pages known as the ANSI code pages (the first one, 1252, was based on an apocryphal ANSI draft of what became ISO 8859-1). Code page 1252 is built on ISO 8859-1 but uses the range 0x80-0x9F for extra printable characters rather than the C1 control codes used in ISO-8859-1. Some of the others are based in part on other parts of ISO 8859 but are often rearranged to make them closer to 1252.
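The 0x80-0x9F difference is easy to observe in Python, assuming its `cp1252` and `latin-1` codecs stand for code page 1252 and ISO-8859-1 respectively. The three sample bytes are chosen because code page 1252 assigns them printable characters.

```python
data = bytes([0x80, 0x93, 0x94])

cp1252_text = data.decode("cp1252")    # printable: euro sign and curly quotes
latin1_text = data.decode("latin-1")   # C1 control characters U+0080, U+0093, U+0094

print(cp1252_text)                              # €“”
print([f"U+{ord(c):04X}" for c in latin1_text]) # ['U+0080', 'U+0093', 'U+0094']

# Outside 0x80-0x9F the two encodings agree on every byte.
outside = list(range(0x00, 0x80)) + list(range(0xA0, 0x100))
same_elsewhere = all(
    bytes([b]).decode("cp1252") == bytes([b]).decode("latin-1") for b in outside
)
print(same_elsewhere)                           # True
```

This is why text labelled ISO-8859-1 but actually written as 1252 usually looks fine except for quotes, dashes and the euro sign.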

1250 – Central and East European Latin
1251 – Cyrillic
1252 – West European Latin
1253 – Greek
1254 – Turkish
1255 – Hebrew
1256 – Arabic
1257 – Baltic
1258 – Vietnamese
874 – Thai

Microsoft recommends applications use UTF-8 or UCS-2/UTF-16 instead of these code pages.[8]

 

Reposted from: https://www.cnblogs.com/shangdawei/archive/2013/05/14/3078205.html

