5.2.12.3. Character Encoding in EDI/X12 files

The default character encoding used to read X12 files is “CP1252”. (The “CP1252” character set is nearly identical to the standard “ISO-8859-1”, but it additionally contains the Euro symbol, which is quite useful.) If the X12 file has a BOM (Byte-Order-Mark), Anatella uses the character encoding indicated by the BOM (i.e. UTF-8, UTF-16 or UTF-32). If the X12 file does not contain a BOM, Anatella is still able to detect UTF-16 and UTF-32 files and open them accordingly.
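The BOM rule and the “CP1252” fallback described above can be illustrated with a minimal Python sketch. This is not Anatella's implementation: the names `encoding_from_bom` and `decode_x12` are hypothetical, and the detection of UTF-16/UTF-32 files that have no BOM is left out because the method Anatella uses for it is not described here.

```python
import codecs
from typing import Optional

# BOM signatures paired with Python codec names (hypothetical helper table).
# UTF-32 is tested before UTF-16 because the UTF-32 LE BOM starts with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

DEFAULT_ENCODING = "cp1252"  # default stated in this section


def encoding_from_bom(data: bytes) -> Optional[str]:
    """Return the codec announced by a BOM, or None if there is no BOM."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def decode_x12(data: bytes) -> str:
    """Decode with the BOM's encoding if present, else with CP1252."""
    return data.decode(encoding_from_bom(data) or DEFAULT_ENCODING)
```

For example, `decode_x12(b"PRICE \x80 10")` yields a string containing the Euro symbol, because byte 0x80 is the Euro symbol in CP1252, whereas ISO-8859-1 maps it to a control character.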

For EDI files, the same rules as for X12 files apply (see the previous paragraph), with one addition: if there is no BOM and the file is neither UTF-16 nor UTF-32, Anatella looks at the first “Data Element” of the “UNB” Segment:

First “Data Element” of the “UNB” Segment	Character encoding used by Anatella to read the EDI file
UNOA, UNOB, UNOC	CP1252 (Latin1 - Western Europe and Americas)
UNOD	ISO-8859-2 (Latin2 - Slavic and Central European languages)
UNOE	ISO-8859-5 (Latin - Cyrillic)
UNOF	ISO-8859-7 (Latin - Greek)
UNOG	ISO-8859-3 (Latin3 - Esperanto, Galician, Maltese, and Turkish)
UNOH	ISO-8859-4 (Latin4 - Scandinavia/Baltic)
UNOI	ISO-8859-6 (Latin - Arabic)
UNOJ	ISO-8859-8 (Latin - Hebrew)
UNOK	ISO-8859-9 (Latin5 - Same as Latin1 except for Turkish)
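
The lookup in the table above can be sketched in Python as follows. This is only an illustration under stated assumptions, not Anatella's actual code: the names `UNB_ENCODINGS` and `encoding_from_unb` are hypothetical, the sketch assumes that the “UNB” Segment appears within the first 256 bytes of the file and that a single separator character stands between “UNB” and the identifier, and it assumes a fallback to “CP1252” when no recognizable identifier is found.

```python
import re

# Mapping copied from the table above, using Python codec names.
UNB_ENCODINGS = {
    "UNOA": "cp1252",
    "UNOB": "cp1252",
    "UNOC": "cp1252",
    "UNOD": "iso-8859-2",
    "UNOE": "iso-8859-5",
    "UNOF": "iso-8859-7",
    "UNOG": "iso-8859-3",
    "UNOH": "iso-8859-4",
    "UNOI": "iso-8859-6",
    "UNOJ": "iso-8859-8",
    "UNOK": "iso-8859-9",
}

# "UNB", one separator character (usually "+"), then the identifier.
_UNB_SYNTAX_ID = re.compile(rb"UNB.(UNO[A-K])")


def encoding_from_unb(data: bytes, default: str = "cp1252") -> str:
    """Pick an encoding from the first data element of the UNB segment."""
    match = _UNB_SYNTAX_ID.search(data[:256])  # assumption: header is near the start
    if match is None:
        return default  # assumption: fall back to CP1252
    return UNB_ENCODINGS[match.group(1).decode("ascii")]


# Usage: a made-up interchange header that declares UNOF (Greek).
sample = b"UNA:+.? 'UNB+UNOF:3+SENDER+RECEIVER+200101:1200+1'"
text = sample.decode(encoding_from_unb(sample))
```

The search works on raw bytes because the segment tag and the identifier consist of ASCII characters, which are encoded identically in all the character sets listed in the table.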