P5-text-unidecode

Jul 20, 2023

US-ASCII transliterations of Unicode text

It often happens that you have non-Roman text data in Unicode, but you can’t display it – usually because you’re trying to show it to a user via an application that doesn’t support Unicode, or because the fonts you need aren’t accessible. You could represent the Unicode characters as “???????” or “\15BA\15A0\1610…”, but that’s nearly useless to the user who actually wants to read what the text says.

What Text::Unidecode provides is a function, unidecode(…), that takes Unicode data and tries to represent it in US-ASCII characters.
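
To make the idea concrete, here is a minimal sketch in Python of table-driven transliteration to US-ASCII. It is not the Perl module's code or API: the function name `to_ascii`, the `unknown` parameter, and the three-entry `TABLE` are hypothetical and chosen only for illustration. The fallback that strips accents through Unicode decomposition is a substitute technique used to keep the sketch short.

```python
import unicodedata

# Hypothetical, hand-picked table for illustration only.
# A real transliteration library ships far larger tables.
TABLE = {
    "\u5317": "Bei ",   # CJK character, given a romanized reading
    "\u4eac": "Jing ",  # CJK character, given a romanized reading
    "\u00df": "ss",     # LATIN SMALL LETTER SHARP S
}


def to_ascii(text: str, unknown: str = "?") -> str:
    """Return a best-effort US-ASCII rendering of `text`."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Already US-ASCII: keep as-is.
            out.append(ch)
        elif ch in TABLE:
            # Known character: use the table entry.
            out.append(TABLE[ch])
        else:
            # Fallback (assumption of this sketch): decompose the character
            # and keep only its ASCII parts, which drops combining accents.
            decomposed = unicodedata.normalize("NFKD", ch)
            ascii_part = "".join(c for c in decomposed if ord(c) < 128)
            out.append(ascii_part or unknown)
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii("caf\u00e9"))      # accented Latin letter
    print(to_ascii("\u5317\u4eac"))   # characters found in TABLE
    print(to_ascii("\u15ba"))         # not in TABLE, no ASCII decomposition
```

Characters that appear neither in the table nor in an ASCII-bearing decomposition come out as the `unknown` placeholder, which is exactly the "???????" problem a full transliteration table is meant to avoid.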



Check out these related ports:
  • Zbase32 - Base32 Encoder/Decoder
  • Ytnef - Unpack data in MS Outlook TNEF format
  • Yj - Convert between YAML, TOML, JSON, and HCL
  • Yj-bruceadams - Command line tool that converts YAML to JSON
  • Xml2c - Convert an XML file into C struct/string declarations
  • Xdeview - X11 program for uu/xx/Base64/BinHex/yEnc de-/encoding
  • Wkhtmltopdf - Convert HTML (or live webpages) to PDF or image
  • Uulib - Library for uu/xx/Base64/BinHex/yEnc de-/encoding
  • Uudeview - Program for uu/xx/Base64/BinHex/yEnc de-/encoding
  • Unix2dos - Convert ASCII newlines between CR/LF and LF
  • Tuc - Text to Unix Conversion
  • Trans - Character encoding converter generator
  • Tnef - Unpack data in MS Outlook TNEF format
  • Ta2as - TASM to AT&T asm syntax converter (GNU AS)
  • Showkey - Display cooked key sequences (keycap-to-keystrokes mappings)