Perl extension to extract main content of a web page
HTML::ExtractMain is a Perl module that takes HTML content and uses the Readability algorithm to detect the main body of the page, usually skipping headers, footers, navigation, and similar boilerplate.
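A minimal usage sketch may make the description concrete. It assumes the module's `extract_main_html` function, imported on request, and treats an undefined return value as "no main content found". The sample page is hypothetical, and the exact markup returned depends on the module version, so no expected output is shown.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# extract_main_html is imported on request from HTML::ExtractMain.
use HTML::ExtractMain qw( extract_main_html );

# Hypothetical sample page, invented for illustration only.
my $html = <<'END_HTML';
<html>
  <body>
    <div id="header">Example Site</div>
    <div id="nav"><a href="/">Home</a> <a href="/about">About</a></div>
    <div id="body">
      <p>This is the first paragraph of the article, with enough text,
      commas, and detail for a content detector to score it.</p>
      <p>This is the second paragraph, continuing the article with more
      sentences, more words, and a little more punctuation.</p>
    </div>
    <div id="footer">Copyright notice</div>
  </body>
</html>
END_HTML

# Ask the module for the main content of the page.
my $main_html = extract_main_html($html);

if ( defined $main_html ) {
    # Main content was identified; print the extracted HTML.
    print $main_html, "\n";
}
else {
    # Assumption: undef signals that no main body could be identified.
    warn "No main content could be identified\n";
}
```

The check on `defined` matters because content detection is heuristic; short or unusual pages may not yield a result.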
$ pkg install p5-HTML-ExtractMain
Origin
www/p5-HTML-ExtractMain
Size
14.7KiB
License
ART10, GPLv1+
Maintainer
jnlin@freebsd.cs.nctu.edu.tw
Dependencies
3 packages
Required by
0 packages