This distribution grew out of several projects that involved text encoded in many obscure character encodings. Many of these encodings are not supported by the most frequently used character set conversion tools (e.g. iconv), so this package was put together to provide the encoding information in a simple, consistent format.
No program is provided to actually do the conversion between character sets, because of the wide variety of text file formats they appear in. It is up to the developer or user to write their own conversion programs using this data.
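As a rough illustration of what such a conversion program might look like, here is a minimal Python sketch. It assumes the tables follow the common two-column layout of a hexadecimal byte value followed by a hexadecimal Unicode code point, with `#` starting a comment; that layout is an assumption and should be checked against the actual files. The names `parse_mapping`, `decode`, and `SAMPLE` are hypothetical, and the sample entries are invented for the example rather than copied from any table in this distribution.

```python
def parse_mapping(text):
    """Parse lines such as '0xC1<TAB>0x0430<TAB># comment' into {byte: code point}.

    Assumed layout: two whitespace-separated hex fields per line,
    with '#' introducing a comment. Lines without both fields are skipped.
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # byte listed without a Unicode mapping
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def decode(data, table, fallback="\ufffd"):
    """Convert single-byte encoded data to a Unicode string.

    Bytes missing from the table become the fallback character.
    """
    return "".join(chr(table[b]) if b in table else fallback for b in data)


# Invented sample entries, for illustration only.
SAMPLE = """\
# hypothetical excerpt
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0xC1\t0x0430\t# CYRILLIC SMALL LETTER A
"""

if __name__ == "__main__":
    table = parse_mapping(SAMPLE)
    print(decode(b"A\xc1", table))
```

This sketch only covers single-byte encodings. Tables that map multi-byte or multi-character sequences (for example the transliteration encodings) would need a parser and decoder that match on sequences rather than on individual bytes.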
Individual mapping tables are available by clicking on the file link and complete archives are available as:

Current Mapping Tables - Version 2.1 - 28 May 2008

| No. | File | Description | Notes | |
|---|---|---|---|---|
| 1 | 8859-16.TXT | Expanded Latin alphabet 10. | Added 11 January 2000. | |
| 2 | ALTVAR.TXT | Alternativnyj Variant Russian. | ||
| 3 | ARMSCII-7.TXT | Armenian Standard Code for Information Interchange 1999, 7-bit encoding for transmission | Added 13 November 2000 | |
| 4 | ARMSCII-8.TXT | Armenian Standard Code for Information Interchange 1999, 8-bit encoding for Windows and Unix. | Added 13 November 2000 | |
| 5 | ARMSCII-8A.TXT | Armenian Standard Code for Information Interchange 1999, alternative 8-bit encoding for DOS and Macintosh. | Added 13 November 2000 | |
| 6 | AST166-7.TXT | Armenian national standard AST166.1997, 7-bit encoding for transmission. | ARMSCII-7 is more current. | |
| 7 | AST166-8.TXT | Armenian national standard AST166.1997, 8-bit encoding for Windows and Unix. | ARMSCII-8 is more current. | |
| 8 | AST166-A.TXT | Armenian national standard AST166.1997, "A" encoding for DOS and MacOS. | ARMSCII-8A is more current. | |
| 9 | ATEX.TXT | ATeX Arabic transliteration. | ||
| 10 | BRM.TXT | Buddhist Relief Mission transliteration encoding for Pali. | Added 25 July 2005 | |
| 11 | CP1133.TXT | IBM CP1133 Lao mapping. | Added 06 December 1999. | |
| 12 | CSCD.TXT | Chattha Sangayana CD Pali transliteration encoding. | Added 25 July 2005 | |
| 13 | CSCSX.TXT | Classical Sanskrit eXtended transliteration encoding. | Added 25 July 2005 | |
| 14 | CSXPLUS.TXT | Classical Sanskrit eXtended Plus transliteration encoding. | Added 25 July 2005 | |
| 15 | DECMCS.TXT | DEC Multinational Character Set 1987. | ||
| 16 | EGAF.TXT | EGA Farsi (Persian). | Visual encoding. | |
| 17 | GEO-ITA.TXT | Georgian InfoTech/Academy encoding. | ||
| 18 | GEO-PS.TXT | Georgian Parliament encoding. | ||
| 19 | GN-LINUX.TXT | Linux console Guarani encoding. | Added 22 September 2005. | |
| 20 | GN-TIMESG.TXT | A Times New Roman based variant encoding of Guarani. | Added 18 November 2005. | |
| 21 | GN-WIN.TXT | WIN-GN Guarani encoding for (La)TeX. | Added 22 September 2005. | |
| 22 | HAMSH.TXT | Hamshahri Persian encoding. | Visual encoding. | |
| 23 | IRANSYSTEM.TXT | Common Persian encoding. | Visual encoding. | Updated 21 January 2000. |
| 24 | IRNA.TXT | IRNA Persian encoding. | Visual encoding. | |
| 25 | ISIRI2900.TXT | Older Persian encoding. | Visual encoding. | |
| 26 | ISIRI3342.TXT | Mapping actually used in Iran. | ||
| 27 | ISO002.TXT | ISO 646 (IRV) mapping. | ||
| 28 | ISO006.TXT | ISO 646-1991 mapping. | Added 14 November 2000. | |
| 29 | ISO053.TXT | ISO 5426-1980 Extended Latin for Bibliographic use. | Added 03 November 2000. | |
| 30 | ISOIR111.TXT | ISO IR 111/ECMA Cyrillic. | Added 03 November 2000. | |
| 31 | JAGHBUB.TXT | Latin transliteration encoding for Middle Eastern languages. | Added 03 February 2006. | |
| 32 | KOI8RU.TXT | Obsoleted Ukrainian. | Updated 20 December 1999. | |
| 33 | KOI8U.TXT | KOI8 Ukrainian (RFC2319). | Added 20 December 1999. | |
| 34 | KOI8UNI.TXT | Fingertip Software Unified Cyrillic. | Updated 20 December 1999. | |
| 35 | KZ1048.TXT | Kazakh national standard. | Added 14 June 2007. | |
| 36 | MOZPALI.TXT | Pali transliteration encoding. | Added 25 July 2005. | |
| 37 | MULELAO1.TXT | Mule G1 Lao mapping. | Added 06 December 1999. | |
| 38 | NAVLS.TXT | Linguist's Software Laser Navajo mapping. | Added 25 July 2005. | |
| 39 | NBSC.TXT | Nota Bene SerboCroat Latin (partial mapping). | ||
| 40 | NORMYN.TXT | Normyn transliteration encoding for Pali. | Added 25 July 2005. | |
| 41 | PAFOR1.TXT | Foreign1 transliteration encoding for Pali. | Added 25 July 2005. | |
| 42 | PAKEW.TXT | Kew transliteration encoding for Pali. | Added 25 July 2005. | |
| 43 | PAKH2SKJ.TXT | KH2S_KJ transliteration encoding for Pali. | Added 25 July 2005. | |
| 44 | PALBIT.TXT | LeedsBit transliteration encoding for Pali. | Added 25 July 2005. | |
| 45 | PATRA.TXT | Times Roman A transliteration encoding for Pali. | Added 25 July 2005. | |
| 46 | PAVELT.TXT | Velthuis' (La)TeX sequences for Pali. | Added 25 July 2005. | |
| 47 | PAVRI.TXT | VRI transliteration encoding for Pali. | Added 25 July 2005. | |
| 48 | OSNOVAR.TXT | Osnovnoj Variant Russian. | ||
| 49 | PTCP154.TXT | Paratype Cyrillic Asian. | Added 04 May 2005. | |
| 50 | RISCOS.TXT | Acorn RISC OS. | Added 09 May 2003. | |
| 51 | SEASCII.TXT | Stanford Extended ASCII (from RFC 698). | ||
| 52 | SHIFTGB.TXT | Shifted GB2312.1980. | Updated 06 December 1999. | |
| 53 | SOCNET-C.TXT | Cyrillic font encoding used by http://www.serbianorthodoxchurch.net. | Added 27 May 2008. | |
| 54 | SOCNET-L.TXT | Latin font encoding used by http://www.serbianorthodoxchurch.net. | Added 27 May 2008. | |
| 55 | TEX-CMMI.TXT | TeX mapping for the Computer Modern Math Italic fonts. | ||
| 56 | TEX-CMR.TXT | TeX mapping for the Computer Modern Roman fonts. | ||
| 57 | TEX-CMSY.TXT | TeX mapping for the Computer Modern Symbol fonts. | ||
| 58 | TEX-CMTI.TXT | TeX mapping for the Computer Modern Text Italic fonts. | ||
| 59 | TEX-CMTT.TXT | TeX mapping for the Computer Modern Typewriter fonts. | ||
| 60 | TIS620.TXT | TCCII 2533 1009 / TIS 620 Thai. | ||
| 61 | UCODE.TXT | U-Code Russian. | ||
| 62 | VIQRI.TXT | Vietnamese Quoted Readable Implicit. | ||
| 63 | VISCII.TXT | VISCII 1.1 Vietnamese. | ||
| 64 | VN5712-1.TXT | TCVN 5712-1 1993 Vietnamese. | ||
| 65 | VN5712-2.TXT | TCVN 5712-2 1993 Vietnamese. | ||
| 66 | VNI.TXT | VNISoft encoded Vietnamese. | Added 25 July 2005 | |
| 67 | VPS.TXT | VPS encoded Vietnamese. | Added 25 July 2005 |

libeth library available at ftp://ftp.geez.org/pub/libeth/libeth-0.34e.tar.gz

iconv program found on many distributions of Linux and Unix these days.