PdfToText is a lightweight solution contained in a single PHP source file; its purpose is to extract text from your PDF files.

Written in pure PHP, the PdfToText class does not depend on tools that are available only as external binary packages. This makes it especially convenient on shared servers, since no installation and no configuration are required.

The PdfToText class currently supports the following features:
  • The text contents of a PDF file are available as a whole through the Text string property, or page by page through the Pages array property
  • You can use class methods to search for text within your document and retrieve the corresponding page
  • You can extract JPEG images from your PDF input and save them to external files
  • The class has been carefully optimized to reduce both memory usage and execution time
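
As a rough illustration of how the Text and Pages properties might be used together, here is a minimal sketch. It runs on stand-in data rather than a real PDF. The helper findPagesContaining() is hypothetical and is not one of the class's own search methods; the constructor call shown in the comments and the assumption that Pages is keyed by page number are also assumptions, not confirmed API details.

```php
<?php
// Hypothetical helper (NOT part of PdfToText): given an array shaped like the
// Pages property (assumed here to map page number => page text), return the
// page numbers whose text contains $needle.
function findPagesContaining(array $pages, string $needle): array
{
    // An empty search string matches nothing in this sketch.
    if ($needle === '') {
        return [];
    }

    $matches = [];
    foreach ($pages as $pageNumber => $pageText) {
        // Case-insensitive search; stripos() returns false when nothing is found.
        if (stripos($pageText, $needle) !== false) {
            $matches[] = $pageNumber;
        }
    }
    return $matches;
}

// Stand-in data so the sketch runs without a PDF file. With the real class,
// the array would presumably come from something like:
//   $pdf   = new PdfToText('sample.pdf');   // assumed constructor signature
//   $pages = $pdf->Pages;
$pages = [
    1 => "Introduction to the report",
    2 => "Quarterly results and figures",
    3 => "Appendix: detailed results",
];

// Whole-document text, analogous to what the Text property exposes.
$text = implode("\n", $pages);

print_r(findPagesContaining($pages, 'results'));
```

Keeping the search logic in a plain function that takes an array makes it easy to try out without a PDF at hand; for real documents, the class's own search methods are likely the better choice.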

And more to come:
  • Handling of languages written from right-to-left (RTL)
  • Handling of CID fonts (which were designed by Adobe before the Unicode standard emerged)
  • Handling of additional image formats (CCITT Fax, etc.)
  • Better handling of character positioning on the page

Try the online demo page, then take a look at the documentation. Finally, go ahead and download it!
Latest news
Version 1.3.0 has been released
Check the Downloads section!