How I converted a PDF to text on Ubuntu.
The OCRmyPDF tool seems to suit my needs. Let's try it: $ sudo apt install ocrmypdf
Check which languages are installed: $ tesseract --list-langs
Install the one you need: $ sudo apt install tesseract-ocr-spa
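The package name follows the language code that tesseract reports. A minimal sketch, assuming the Debian/Ubuntu naming convention tesseract-ocr-<code>, with underscores in the code turned into hyphens (chi_sim becomes tesseract-ocr-chi-sim). The helper name tess_pkg is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: map a tesseract language code to its apt package name.
# Assumes the Debian/Ubuntu naming scheme tesseract-ocr-<code>.
tess_pkg() {
    # Package names use hyphens where language codes use underscores.
    printf 'tesseract-ocr-%s\n' "$(printf '%s' "$1" | tr '_' '-')"
}

# Print the install command instead of running it, so nothing changes by accident.
echo "sudo apt install $(tess_pkg spa)"
```

Swap spa for the code of the language your PDF is written in.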
$ ocrmypdf -v -l spa old.pdf new.pdf // Got errors and no output
I needed the -f (--force-ocr) option: $ ocrmypdf -v -f -l spa old.pdf new.pdf // Replace spa with the language of your PDF
Then I got: PIL.Image.DecompressionBombError: Image size (1115186111 pixels) exceeds limit of 256,000,000 pixels, could be decompression bomb DOS attack.
Try setting --max-image-mpixels to 1300, which is above the roughly 1115 megapixels of this image: $ ocrmypdf -v -f -l spa --max-image-mpixels 1300 old.pdf new.pdf
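The value for --max-image-mpixels can be derived from the pixel count in the error message instead of being guessed. A small sketch, assuming one megapixel means 1,000,000 pixels and rounding up. The name mpixels_for is illustrative, not part of OCRmyPDF:

```shell
#!/bin/sh
# Hypothetical helper: smallest whole number of megapixels that covers
# a given pixel count (assumes 1 megapixel = 1,000,000 pixels).
mpixels_for() {
    echo $(( ($1 + 999999) / 1000000 ))
}

# Pixel count copied from the DecompressionBombError message.
limit=$(mpixels_for 1115186111)

# Print the resulting command rather than running it.
echo "ocrmypdf -v -f -l spa --max-image-mpixels $limit old.pdf new.pdf"
```

For the image in the error message this gives 1116, so 1300 leaves some headroom.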
Install LibreOffice Draw: $ sudo apt install libreoffice-draw // It started working after a reboot
$ sudo apt install libreoffice-gnome libreoffice -y // libreoffice-gnome is for GNOME; -y answers yes to every prompt
Try to install Scribus: $ sudo apt install scribus // I don't have the needed dependency
$ sudo apt install inkscape // Suitable only for editing single-page PDFs
Try PDF-Shuffler for editing multipage PDFs: $ sudo apt install pdfshuffler // It is a good idea to use pdfshuffler from the CLI
1. Install gImageReader from snap.
2. Got an error: No tesseract languages are available for use. Recognition will not work.
2.1. Try $ sudo apt install tesseract-ocr-rus // This helped
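To confirm that a language pack is visible to tesseract after installing it, the output of tesseract --list-langs can be checked for the language code. A sketch only: has_lang is a made-up helper, and it assumes the list prints one language code per line after a header line:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the language code given as $1 appears
# as a whole line in `tesseract --list-langs` style output read from stdin.
has_lang() {
    grep -qx -- "$1"
}

# Only query tesseract if it is installed; stderr is merged in case the
# list is printed there instead of on stdout.
if command -v tesseract >/dev/null 2>&1; then
    if tesseract --list-langs 2>&1 | has_lang rus; then
        echo "rus is available"
    else
        echo "rus is missing: sudo apt install tesseract-ocr-rus"
    fi
fi
```

Replace rus with whatever language code you need.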