May 26, 2018

Produces summaries from the textual content of web pages

The HTML::Summary module produces summaries from the textual content of web pages. It does so using the location heuristic, which determines the value of a given sentence based on its position and status within the document; for example, headings, section titles and opening paragraph sentences may be favoured over other textual content. A LENGTH option can be used to restrict the length of the summary produced.
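As a rough illustration of the location heuristic described above, the following Python sketch scores sentences by where they occur and then fills a summary up to a length limit. It is not the module's implementation: the weights, the (kind, text) block representation, the naive sentence splitter and the function names are all assumptions made for this example, and the length limit is counted in characters only as a stand-in for the LENGTH option.

```python
import re

# Hypothetical weights, chosen only for illustration; the module's real
# scoring is not described in this README.
HEADING_BONUS = 2.0
OPENING_BONUS = 1.0


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentences(blocks):
    """Score every sentence in a list of (kind, text) blocks.

    kind is "heading" or "paragraph". Earlier blocks score higher,
    headings get a bonus, and so does the first sentence of a paragraph.
    Returns (score, document_order, sentence) tuples.
    """
    scored = []
    order = 0
    for block_index, (kind, text) in enumerate(blocks):
        for sentence_index, sentence in enumerate(split_sentences(text)):
            score = 1.0 / (1 + block_index)
            if kind == "heading":
                score += HEADING_BONUS
            elif sentence_index == 0:
                score += OPENING_BONUS
            scored.append((score, order, sentence))
            order += 1
    return scored


def summarize(blocks, length):
    """Pick the highest-scoring sentences that fit in `length` characters,
    then return them joined in their original document order."""
    chosen = []
    used = 0
    ranked = sorted(score_sentences(blocks), key=lambda t: (-t[0], t[1]))
    for _score, order, sentence in ranked:
        cost = len(sentence) + (1 if chosen else 0)  # +1 for the joining space
        if used + cost <= length:
            chosen.append((order, sentence))
            used += cost
    return " ".join(sentence for _order, sentence in sorted(chosen))


if __name__ == "__main__":
    page = [
        ("heading", "Perl Modules"),
        ("paragraph", "CPAN hosts many modules. Some are old. Others are new."),
        ("paragraph", "Summaries help readers. They save time."),
    ]
    print(summarize(page, 65))
```

In this sketch the heading and the opening sentence of each paragraph are selected before any later sentences, which mirrors the preference for headings and opening sentences described above; a smaller limit simply drops the lower-ranked sentences.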

This distribution contains the HTML::Summary module and some supporting modules. The full list of modules is:

HTML::Summary
Text::Sentence
Lingua::JA::Jcode
Lingua::JA::Jtruncate