Clean those PDFs with Tabula

Hello, reporters! This week, we’re featuring guest contributor Teddy Maiorca again:

Have you ever looked at a PDF full of data and wished those tables would just turn into a spreadsheet, without you having to enter each row by hand?

Look no further than Tabula, a free, open-source tool that lets you extract data locked inside PDF files in a matter of seconds.

Ditch the hassle of trying to copy rows from a table and paste them into Excel. Just upload your PDF to Tabula, click and drag to draw a box around the table you want to pull data from and, voilà, you can export your data as a CSV or Excel file.

If you have an especially large PDF and don’t have time to draw a box around every table, hit the “Autodetect Tables” button. Tabula can even save your selections as a template, which you can reuse across multiple PDFs with similar layouts.
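If you are comfortable with a little Python and have a whole folder of PDFs to get through, the same extraction can be scripted. The sketch below is one possible approach, not part of the Tabula app itself: it assumes the separate tabula-py package (a Python wrapper around Tabula's extraction engine) is installed along with Java, and the folder name and the helper names csv_path_for and convert_folder are made up for illustration.

```python
from pathlib import Path


def csv_path_for(pdf_path: Path) -> Path:
    """Return the CSV path that sits next to the given PDF."""
    return pdf_path.with_suffix(".csv")


def convert_folder(folder: str) -> list[Path]:
    """Convert every PDF in `folder` to a CSV file beside it."""
    # Imported here so the helper above works even without tabula-py.
    # Assumption: `pip install tabula-py` has been run and Java is available.
    import tabula

    written = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        out = csv_path_for(pdf)
        # pages="all" asks for tables on every page, not only the first one.
        tabula.convert_into(str(pdf), str(out), output_format="csv", pages="all")
        written.append(out)
    return written
```

Calling convert_folder("pdfs") on a hypothetical folder named pdfs would write one CSV next to each PDF. Automatic detection can miss or mangle tables, so it is worth spot-checking a few results against the originals.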

It works on Mac, Windows and Linux and was created by journalists, for journalists. Liberate that data, reporters!

Teddy Maiorca is a University of Missouri student who is currently working as a graphics editor for the Columbia Missourian.