Py-langid

Jul 20, 2023

Standalone Language Identification (LangID) tool

langid.py is a standalone Language Identification (LangID) tool.

The design principles are as follows:

Fast
Pre-trained over a large number of languages (currently 97)
Not sensitive to domain-specific features (e.g. HTML/XML markup)
Single .py file with minimal dependencies
Deployable as a web service

Note: the main script, langid/langid.py, is compatible with both Python 2 and Python 3, but the accompanying training tools are still Python 2 only and are therefore not installed by this port.
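
As a rough illustration of how the installed module might be used from Python, the sketch below labels a batch of strings with their detected language. It relies on langid.classify, which returns a (language code, score) pair; the helper names load_classifier and label_texts are hypothetical and exist only for this example. The classifier is passed in as an argument so the helper can also be exercised without the package installed.

```python
from typing import Callable, Iterable, List, Tuple

# A classifier maps a text to a (language code, score) pair,
# which is the shape returned by langid.classify.
Classifier = Callable[[str], Tuple[str, float]]


def load_classifier() -> Classifier:
    """Return langid.classify (hypothetical helper; requires this port)."""
    import langid  # third-party module installed by this port

    return langid.classify


def label_texts(texts: Iterable[str], classify: Classifier) -> List[Tuple[str, str]]:
    """Pair each input text with the language code chosen by `classify`."""
    labelled = []
    for text in texts:
        lang, _score = classify(text)  # the score is ignored in this sketch
        labelled.append((text, lang))
    return labelled
```

With the port installed, label_texts(["Bonjour tout le monde"], load_classifier()) would return each string paired with the language code that langid selects for it.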

See also the port textproc/py-langdetect for a similar program.


