P5-lingua-zh-wordsegmenter
Jul 20, 2023
Simplified Chinese Word Segmentation
This is a Perl implementation of simplified Chinese word segmentation.
At each point in the text, the segmenter searches for the longest dictionary word from both the left and the right directions, then chooses the result whose word frequencies give the higher product.
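The idea can be sketched in a few lines of Python. This is only one reading of the description above, not the module's actual Perl code: it runs a longest-match pass from the left and another from the right over the whole string, then keeps the pass with the higher frequency product. The dictionary `FREQ`, its frequency values, the `UNKNOWN_FREQ` fallback and all function names are assumptions made for illustration.

```python
from math import prod

# Toy dictionary: word -> relative frequency (hypothetical values).
FREQ = {
    "研究": 50, "研究生": 20, "生命": 40, "命": 5, "的": 100, "起源": 30,
}
MAX_LEN = max(len(w) for w in FREQ)
UNKNOWN_FREQ = 1  # assumed weight for characters missing from the dictionary


def forward_match(text):
    """Scan left to right, taking the longest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            cand = text[i:i + size]
            # Fall back to a single character when no dictionary word matches.
            if size == 1 or cand in FREQ:
                words.append(cand)
                i += size
                break
    return words


def backward_match(text):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(MAX_LEN, j), 0, -1):
            cand = text[j - size:j]
            if size == 1 or cand in FREQ:
                words.append(cand)
                j -= size
                break
    words.reverse()
    return words


def score(words):
    """Product of the word frequencies of one segmentation."""
    return prod(FREQ.get(w, UNKNOWN_FREQ) for w in words)


def segment(text):
    """Return whichever pass has the higher frequency product (ties go to forward)."""
    fwd, bwd = forward_match(text), backward_match(text)
    return fwd if score(fwd) >= score(bwd) else bwd


print(segment("研究生命的起源"))
```

With this toy dictionary the forward pass greedily takes 研究生 and is left with the rare single character 命, while the backward pass finds 研究 / 生命. The backward result has the larger frequency product, so it is the one returned. For long inputs a sum of log-frequencies would be a safer score than a raw product.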
The original program comes from the CPAN module Lingua::ZH::WordSegment (https://metacpan.org/author/CHENYR). I made the following changes: 1) made the interface object-oriented; 2) converted the internal strings to UTF-8; 3) used Sogou's dictionary (http://www.sogou.com/labs/dl/w.html) as the default dictionary.
Check out these related ports:
- Wordpress-zh_tw
- Wordpress-zh_cn
- Wenju - Collection of writing tools in Chinese
- Ve - NTHU-CS Maple BBS 2.36 BBS-like editor
- Ttfm - Big5/GB enhanced TrueType Font Manager
- Ttf2pt1 - True Type Font to Postscript Type 1 converter with Chinese maps
- Tintin++
- Tin
- Taipeisanstc - Taipei Sans TC
- Sourcehanserif-tc-otf
- Sourcehanserif-sc-otf
- Sourcehansans-tc-otf
- Sourcehansans-sc-otf
- Scim-tables - SCIM table based Chinese input methods
- Scim-pinyin - SCIM Chinese Smart Pinyin input method