py311-ocrmypdf

16.11.1

textproc

Adds an OCR text layer to scanned PDF files

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted.

Main features:

* Generates a searchable PDF/A file from a regular PDF
* Places OCR text accurately below the image to ease copy / paste
* Keeps the exact resolution of the original embedded images
* When possible, inserts OCR information as a "lossless" operation without disrupting any other content
* Optimizes PDF images, often producing files smaller than the input file
* If requested, deskews and/or cleans the image before performing OCR
* Validates input and output files
* Distributes work across all available CPU cores
* Uses Tesseract OCR engine to recognize more than 100 languages
* Scales properly to handle files with thousands of pages
* Battle-tested on millions of PDFs

$ pkg install py311-ocrmypdf
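
Once the package is installed, the features listed in the description map onto options of the `ocrmypdf` command. The following is a minimal sketch, not an official usage guide: `build_ocrmypdf_command` is a hypothetical helper written for this illustration, the file names `scan.pdf` and `scan_ocr.pdf` are placeholders, and the `deu` language is assumed to be available in the local Tesseract installation. The helper only assembles the argument list, so it can be inspected before anything is run.

```python
import os
import shutil
import subprocess
from typing import List, Optional, Sequence


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    languages: Sequence[str] = ("eng",),
    deskew: bool = False,
    clean: bool = False,
    jobs: Optional[int] = None,
) -> List[str]:
    """Assemble an ocrmypdf argument list (hypothetical helper, not part of OCRmyPDF)."""
    # Request PDF/A output explicitly, matching the "searchable PDF/A" feature.
    cmd = ["ocrmypdf", "--output-type", "pdfa"]
    if languages:
        # Tesseract language codes are joined with "+", e.g. "eng+deu".
        cmd += ["-l", "+".join(languages)]
    if deskew:
        cmd.append("--deskew")  # straighten pages before OCR
    if clean:
        cmd.append("--clean")  # clean pages before OCR (uses unpaper)
    if jobs is not None:
        cmd += ["--jobs", str(jobs)]  # limit the number of worker processes
    cmd += [input_pdf, output_pdf]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace with real paths.
    cmd = build_ocrmypdf_command(
        "scan.pdf", "scan_ocr.pdf", languages=["eng", "deu"], deskew=True
    )
    print(" ".join(cmd))
    # Only run the tool when it is installed and the input file exists.
    if shutil.which("ocrmypdf") and os.path.exists("scan.pdf"):
        subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the sketch safe to execute on a machine where the package or the input file is missing.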

github.com/ocrmypdf/OCRmyPDF

Origin

textproc/py-ocrmypdf

Size

1.73MiB

License

MPL20

Maintainer

kai@FreeBSD.org

Dependencies

14 packages

Required by

2 packages

Dependencies (14)

unpaper tesseract python311 py311-rich py311-pluggy py311-pillow-heif py311-pillow py311-pikepdf py311-pdfminer.six py311-packaging py311-img2pdf py311-deprecation pngquant ghostscript10

Required By (2)

py311-paperless-ngx readur