Top of the morning, reporters. I’m in Scotland right now, and today’s post is inspired by this article that advertised an “interactive map” but delivered… a PDF. *facepalm*
So we’re gonna do a special PDF edition, with all the tools and techniques and blowtorches I can think of.
– SmallPDF converts filetypes. It also has the cutest website.
– CometDocs does the same, but can also OCR documents (pull the text out of scanned pages)
Some other converters and PDF tools are:
– Able2Extract (paid)
– Tabula was the subject of its own TFR post; it scrapes data from highlighted sections of PDFs
– Overview has its own TFR as well; it can handle millions of documents
– DocumentCloud hosts, OCRs, embeds, analyzes and does a ton of other cool stuff
– PDFTables just converts to Excel format, but it was made by ScraperWiki, so it’s probably pretty good
– pdftk edits and manipulates PDFs (merging, splitting, rotating), but you need to run it from the command line
– Combine PDFs is a super simple app I use to forge multiple docs into one
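If the command line part of pdftk sounds scary, here is a rough sketch of what a merge looks like, wrapped in a little Python. The `cat` and `output` keywords are pdftk's own; the function names and the file names (`part1.pdf`, `part2.pdf`, `combined.pdf`) are made up for illustration, and the script only runs pdftk if it finds it installed.

```python
import shutil
import subprocess


def pdftk_merge_command(inputs, output):
    """Build the pdftk argument list that joins `inputs` into `output`."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    # Equivalent to typing: pdftk part1.pdf part2.pdf cat output combined.pdf
    return ["pdftk", *inputs, "cat", "output", output]


def merge_pdfs(inputs, output):
    """Run the merge if pdftk is installed; return True when it ran."""
    if shutil.which("pdftk") is None:
        return False  # pdftk is not on the PATH, so there is nothing to run
    subprocess.run(pdftk_merge_command(inputs, output), check=True)
    return True


# Hypothetical file names, just to show the command that would be run.
print(" ".join(pdftk_merge_command(["part1.pdf", "part2.pdf"], "combined.pdf")))
```

Printing the command first is a cheap way to check what you're about to run before you point it at real documents.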
So! Did I miss any?