Convert PDF to TEXT

How I converted a PDF to text on Ubuntu.


RECOGNIZE TEXT ON LINUX

 

  1. The OCRmyPDF tool seems to suit my needs. Let's try it: $ sudo apt install ocrmypdf

  2. List the available languages: $ tesseract --list-langs

  3. Install the one you need: $ sudo apt install tesseract-ocr-spa

  4. $ ocrmypdf -v -l spa old.pdf new.pdf // Got errors and no output

  5. I need the -f (--force-ocr) option: $ ocrmypdf -v -f -l spa old.pdf new.pdf // Replace spa with your PDF's language code

  6. This time I got PIL.Image.DecompressionBombError: Image size (1115186111 pixels) exceeds limit of 256,000,000 pixels, could be decompression bomb DOS attack.

  7. Raise the limit with --max-image-mpixels. The image is about 1115 megapixels, so 1300 is enough: $ ocrmypdf -v -f -l spa --max-image-mpixels 1300 old.pdf new.pdf
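
The steps above can be collected into two small shell helpers. This is only a sketch under a few assumptions: the names has_lang and ocr_pdf and the variables TESSERACT_CMD, OCR_CMD and MAX_MPIXELS are made up for illustration and are not part of ocrmypdf or Tesseract. The flags themselves are the ones used in steps 2 to 7.

```shell
#!/bin/sh
# Sketch of steps 2-7 as two small helpers.
# TESSERACT_CMD, OCR_CMD and MAX_MPIXELS are hypothetical variables
# introduced here for illustration; they are not ocrmypdf settings.

# Succeed if the Tesseract language pack (e.g. spa) is installed.
has_lang() {
    # Some Tesseract versions print the list on stderr, so merge both streams.
    "${TESSERACT_CMD:-tesseract}" --list-langs 2>&1 | grep -qxF "$1"
}

# Run OCR on a scanned PDF and write a searchable PDF.
ocr_pdf() {
    if [ "$#" -ne 3 ]; then
        echo "usage: ocr_pdf LANG INPUT.pdf OUTPUT.pdf" >&2
        return 2
    fi
    # -v verbose, -f force OCR on every page, -l OCR language.
    # --max-image-mpixels raises the image size limit (in megapixels)
    # that caused the DecompressionBombError in step 6.
    "${OCR_CMD:-ocrmypdf}" -v -f -l "$1" \
        --max-image-mpixels "${MAX_MPIXELS:-1300}" \
        "$2" "$3"
}
```

After sourcing the file, a run might look like: $ has_lang spa && ocr_pdf spa old.pdf new.pdf. Note that the result is a searchable PDF. If a plain .txt file is the goal, ocrmypdf also has a --sidecar option that writes the recognized text to a separate file; that option was not tried in the steps above.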

 

TO EDIT A PDF

  1. Install LibreOffice Draw: $ sudo apt install libreoffice-draw // It started working after a reboot

  2. $ sudo apt install libreoffice-gnome libreoffice -y // For GNOME; -y answers yes to every prompt

  3. Try to install Scribus: $ sudo apt install scribus // I was missing a required dependency

  4. $ sudo apt install inkscape // Suitable only for editing single-page PDFs

  5. Try PDF-Shuffler for editing multipage PDFs: $ sudo apt install pdfshuffler // It is a good idea to use pdfshuffler from the CLI