5.2.12.3. Character Encoding in EDI/X12 files
The default character encoding used to read X12 files is “CP1252”. (The “CP1252” character set matches the standard “ISO-8859-1” for all printable characters and additionally defines a few extra characters, among them the Euro symbol, which is quite useful.) If the X12 file starts with a BOM (Byte-Order-Mark), Anatella uses the character encoding indicated by the BOM (i.e. UTF-8, UTF-16 or UTF-32). If the X12 file does not contain a BOM, Anatella can still detect UTF-16 and UTF-32 files and open them accordingly.
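The BOM rule can be illustrated with a short Python sketch. This is not Anatella's implementation: the function name `encoding_from_bom` and the table `BOMS` are hypothetical, and the detection of UTF-16/UTF-32 files that have no BOM is deliberately left out because the manual does not describe how Anatella does it.

```python
import codecs

# Hypothetical lookup table (an assumption made for this sketch).
# The 4-byte UTF-32 marks are listed before the 2-byte UTF-16 marks because
# the UTF-32 LE BOM (FF FE 00 00) starts with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def encoding_from_bom(data: bytes, default: str = "cp1252") -> str:
    """Return a Python codec name based on the BOM at the start of `data`.

    Falls back to `default` (CP1252, as documented for X12 files) when the
    data does not start with a BOM.
    """
    for bom, codec_name in BOMS:
        if data.startswith(bom):
            return codec_name
    return default


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + "ISA*00*".encode("utf-8")
    name = encoding_from_bom(sample)
    # The "utf-8-sig", "utf-16" and "utf-32" codecs strip the BOM on decode.
    print(name, repr(sample.decode(name)))
```

The codec names returned here are Python's own; they are chosen so that decoding also removes the BOM from the text.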
For EDI files, the same rules as for X12 files apply (see above). If there is no BOM and the file is neither UTF-16 nor UTF-32, Anatella looks at the first “Data Element” of the “UNB” Segment:
| First “Data Element” of the “UNB” Segment | Character encoding used by Anatella to read the EDI file |
|---|---|
| UNOA, UNOB, UNOC | CP1252 (Latin1 - Western Europe and Americas) |
| UNOD | ISO-8859-2 (Latin2 - Slavic and Central European languages) |
| UNOE | ISO-8859-5 (Latin - Cyrillic) |
| UNOF | ISO-8859-7 (Latin - Greek) |
| UNOG | ISO-8859-3 (Latin3 - Esperanto, Galician, Maltese, and Turkish) |
| UNOH | ISO-8859-4 (Latin4 - Scandinavia/Baltic) |
| UNOI | ISO-8859-6 (Latin - Arabic) |
| UNOJ | ISO-8859-8 (Latin - Hebrew) |
| UNOK | ISO-8859-9 (Latin5 - Same as Latin1 except for Turkish) |
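The lookup described by this table could be sketched in Python as follows. This is only an illustration, not Anatella's code: the names `UNB_ENCODINGS` and `encoding_from_unb` are hypothetical, the sketch assumes the file starts with either an optional 9-character “UNA” service string or directly with the “UNB” Segment, and it assumes that unknown identifiers fall back to CP1252.

```python
# Mapping from the UNB syntax identifier to a Python codec name,
# transcribed from the table above.
UNB_ENCODINGS = {
    "UNOA": "cp1252",
    "UNOB": "cp1252",
    "UNOC": "cp1252",
    "UNOD": "iso-8859-2",
    "UNOE": "iso-8859-5",
    "UNOF": "iso-8859-7",
    "UNOG": "iso-8859-3",
    "UNOH": "iso-8859-4",
    "UNOI": "iso-8859-6",
    "UNOJ": "iso-8859-8",
    "UNOK": "iso-8859-9",
}


def encoding_from_unb(data: bytes, default: str = "cp1252") -> str:
    """Guess the codec of an EDI file from the first data element of UNB.

    Assumption of this sketch: any identifier not listed in the table,
    or a missing UNB Segment, yields `default`.
    """
    # The header only contains ASCII characters, so latin-1 decodes it safely.
    head = data[:64].decode("latin-1").lstrip()
    component_sep, element_sep = ":", "+"  # standard EDIFACT separators

    # An optional "UNA" service string (3 letters + 6 separator characters)
    # may redefine the component and data element separators.
    if head.startswith("UNA") and len(head) >= 9:
        component_sep, element_sep = head[3], head[4]
        head = head[9:].lstrip()

    if not head.startswith("UNB" + element_sep):
        return default

    fields = head.split(element_sep)
    if len(fields) < 2:
        return default

    # The first data element looks like "UNOC:3" (identifier:version).
    syntax_id = fields[1].split(component_sep)[0].strip().upper()
    return UNB_ENCODINGS.get(syntax_id, default)


if __name__ == "__main__":
    raw = "UNB+UNOF:3+SENDER+RECEIVER+240101:1200+1'".encode("iso-8859-7")
    print(encoding_from_unb(raw))
```

Such a function would only be consulted after the BOM and UTF-16/UTF-32 checks have failed, which matches the order given in the text.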