Files
c3c/test/unit
konimarti 40e6a2c4a3 codepage: add single-byte code page support (#2891)
* codepage: add single-byte code page support

Add std::encoding::codepage with a shared engine for converting between
single-byte code pages and UTF-8 using table-driven mappings.

Introduce generated tables and wrappers for several code pages[1] each
exposing encode/decode helpers built on a common CodePageTable
structure.

The mapping data is generated by cpgen[2] from the Unicode Consortium’s
published code page mapping files and follows the Unicode standard’s
interpretation of control characters (abstract characters) rather than
historical VGA glyph shapes.

[1] Code page overview/groups:

    DOS/OEM code pages (legacy PC):
    cp437 cp737 cp775 cp850 cp852 cp855 cp857 cp860 cp861 cp862 cp863
    cp864 cp865 cp866 cp869 cp874

    Windows code pages (ANSI/Windows):
    cp1250 cp1251 cp1252 cp1253 cp1254 cp1255 cp1256 cp1257 cp1258

    ISO/IEC 8859 series (Latin/Regional):
    iso_8859_1 iso_8859_2 iso_8859_3 iso_8859_4 iso_8859_5 iso_8859_6
    iso_8859_7 iso_8859_8 iso_8859_9 iso_8859_10 iso_8859_11 iso_8859_13
    iso_8859_14 iso_8859_15 iso_8859_16

[2] github.com/konimarti/cpgen

Signed-off-by: Koni Marti <koni.marti@gmail.com>

* codepage: change encoding format, streamline api

* Use enum to collect the data.

---------

Signed-off-by: Koni Marti <koni.marti@gmail.com>
Co-authored-by: Christoffer Lerno <christoffer@aegik.com>
2026-02-11 01:10:12 +01:00
..
2025-07-21 02:40:17 +02:00