PDF to text with OCR
When a PDF file contains only an image of the text (for example, a scan), you cannot select or copy that text. You can extract it from the image with OCR as follows:
- convert -density 300 foo.pdf -normalize foo.png
- tesseract foo.png foo -l eng (this writes foo.txt; tesseract appends the .txt extension to the output name itself)
You need to install the packages imagemagick, tesseract-ocr and tesseract-ocr-eng (on Debian: aptitude install imagemagick tesseract-ocr tesseract-ocr-eng). ImageMagick relies on Ghostscript to read PDF files, so that must be installed as well.
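
For a PDF with more than one page, convert writes one image per page, so the two steps have to be repeated for each page. The following is only a minimal sketch of how that could look, assuming a POSIX shell and the packages listed above; the function name ocr_pdf and the page-NNN file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: OCR every page of a PDF and collect the result in one text file.
# Assumes ImageMagick's `convert` and `tesseract` (with English data) are installed.
# `ocr_pdf` is a hypothetical helper name, not part of either tool.

ocr_pdf() {
    pdf=$1
    base=${pdf%.pdf}                 # foo.pdf -> foo
    tmpdir=$(mktemp -d) || return 1

    # -density is given before the input so the PDF is rasterized at 300 dpi.
    # %03d numbers the page images (page-000.png, page-001.png, ...) so they sort in order.
    convert -density 300 "$pdf" -normalize "$tmpdir/page-%03d.png" || return 1

    for png in "$tmpdir"/page-*.png; do
        # tesseract appends .txt to the output base, giving page-000.txt etc.
        tesseract "$png" "${png%.png}" -l eng || return 1
    done

    # Join the per-page text files in page order.
    cat "$tmpdir"/page-*.txt > "$base.txt"
    rm -r "$tmpdir"
}

# Run on the file given as first argument, if any.
if [ "$#" -gt 0 ]; then
    ocr_pdf "$1"
fi
```

Saved as, say, ocr-pdf.sh (a hypothetical name), it would be called as sh ocr-pdf.sh foo.pdf and leave the recognized text in foo.txt. Intermediate images go to a temporary directory so that they do not clutter the working directory.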