GB18030

Area                    Range          Codepoints  Characters
GB 11383                A0 - FE        128         128
Graphical characters    A1A1 - A9FE    846         718
Graphical characters    A840 - A9A0    192         166
Chinese characters      B0A1 - F7FE    6768        6763
Chinese characters      8140 - A0FE    6080        6080
Chinese characters      AA40 - FEA0    8160        8160
User defined area       AAA1 - AFFE    564
User defined area       F8A1 - FEFE    658
User defined area       A140 - A7A0    672
GB 13000.1 CJK extension A
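The codepoint counts of the two-byte rows in the table above can be checked from the range bounds alone. The following is a minimal sketch, assuming each range is a rectangle of lead bytes by trail bytes and that the trail byte 0x7F is never used; the function name `count_code_points` is hypothetical and chosen only for illustration.

```python
def count_code_points(start: int, end: int) -> int:
    """Count the two-byte codes in the rectangle start..end.

    Assumption: the range covers every lead byte from the high byte of
    `start` to the high byte of `end`, combined with every trail byte from
    the low byte of `start` to the low byte of `end`, skipping 0x7F.
    """
    lead_lo, trail_lo = start >> 8, start & 0xFF
    lead_hi, trail_hi = end >> 8, end & 0xFF
    trail_bytes = [b for b in range(trail_lo, trail_hi + 1) if b != 0x7F]
    return (lead_hi - lead_lo + 1) * len(trail_bytes)


# Two-byte ranges as listed in the table.
TABLE_RANGES = [
    ("Graphical characters", 0xA1A1, 0xA9FE),
    ("Graphical characters", 0xA840, 0xA9A0),
    ("Chinese characters", 0xB0A1, 0xF7FE),
    ("Chinese characters", 0x8140, 0xA0FE),
    ("Chinese characters", 0xAA40, 0xFEA0),
    ("User defined area", 0xAAA1, 0xAFFE),
    ("User defined area", 0xF8A1, 0xFEFE),
    ("User defined area", 0xA140, 0xA7A0),
]

if __name__ == "__main__":
    for area, start, end in TABLE_RANGES:
        print(f"{area:22} {start:04X} - {end:04X} {count_code_points(start, end)}")
```

Under these assumptions the computed values agree with the Codepoints column for all eight two-byte rows.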


This character set was published on March 17, 2000 by the Ministry of Information Industry (zhōnghuá rénmín gònghéguó xìnxī chǎnyèbù) under the name Information technology - Chinese ideograms coded character set for information interchange - Extension for the basic set (信息技术 - 信息交换用汉字编码字符集 - 基本集的扩充 xìnxī jìshù - xìnxī jiāohuàn yòng hànzì biānmǎ zìfújí - jīběnjí de kuòchōng).
The purpose of this character set is to combine CJK Unified Ideographs Extension A with the previous GB character sets, and also to provide enough code points for the Unicode BMP. To achieve this, some characters are encoded in one byte (0x00 to 0x7F), some in two bytes (0x81 to 0xFE for the first byte and 0x40 to 0xFE, excluding 0x7F, for the second byte), and some in four bytes (0x81308130 to 0xFE39FE39; put otherwise, 0x8130 to 0xFE39 for the first two bytes and 0x8130 to 0xFE39 for the last two bytes, where the first and third bytes run from 0x81 to 0xFE and the second and fourth bytes from 0x30 to 0x39).
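Because the second byte of a two-byte code (0x40 to 0xFE) never overlaps the second byte of a four-byte code (0x30 to 0x39), the length of each sequence can be decided from its first two bytes. The following is a minimal sketch of that rule; the function names `sequence_length` and `split_sequences` are hypothetical, and the sketch only checks byte ranges, not whether a code is actually assigned to a character.

```python
def sequence_length(data: bytes, pos: int = 0) -> int:
    """Return 1, 2 or 4: the length of the GB18030 sequence starting at pos."""
    first = data[pos]
    if first <= 0x7F:
        return 1  # single byte, 0x00 to 0x7F
    if not 0x81 <= first <= 0xFE:
        raise ValueError(f"invalid lead byte 0x{first:02X} at offset {pos}")
    if pos + 1 >= len(data):
        raise ValueError(f"truncated sequence at offset {pos}")
    second = data[pos + 1]
    if 0x30 <= second <= 0x39:
        # Four-byte form: the third byte is 0x81 to 0xFE, the fourth 0x30 to 0x39.
        if pos + 3 >= len(data):
            raise ValueError(f"truncated four-byte sequence at offset {pos}")
        third, fourth = data[pos + 2], data[pos + 3]
        if 0x81 <= third <= 0xFE and 0x30 <= fourth <= 0x39:
            return 4
        raise ValueError(f"invalid four-byte sequence at offset {pos}")
    if 0x40 <= second <= 0xFE and second != 0x7F:
        return 2  # two-byte form
    raise ValueError(f"invalid second byte 0x{second:02X} at offset {pos + 1}")


def split_sequences(data: bytes) -> list[bytes]:
    """Split a GB18030 byte string into its individual byte sequences."""
    sequences = []
    pos = 0
    while pos < len(data):
        length = sequence_length(data, pos)
        sequences.append(data[pos:pos + length])
        pos += length
    return sequences


if __name__ == "__main__":
    # Python's built-in "gb18030" codec produces the sample bytes:
    # an ASCII letter, a GB 2312 character, and a CJK Extension A character.
    encoded = "A中㐀".encode("gb18030")
    for seq in split_sequences(encoded):
        print(len(seq), seq.hex().upper(), seq.decode("gb18030"))
```

The sample string is chosen so that each of the three forms appears once; any other text encoded with the same codec can be passed to `split_sequences` in the same way.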

link : Ask Dr.International, #15

© Seba - contact at seba at ulyssis dot org