Saving time and rearranging websites

Hi, reporters! I spend a lot of time teaching “scraping” (resourceful ways to get data) because journalists are interested in it. I rarely recommend out-of-the-box tools, because learning the basics of HTML is ultimately way more useful for scraping.

But! There is one tool that has finally won my heart, and it is called ParseHub.

I recommend ParseHub because it does something that’s genuinely difficult, at least with the Google Sheets method I usually preach. It can grab a website element, then the sub-parts of that element, correctly and in order, and pop them all in a spreadsheet for you.

I used it for an award-winning (yes I said it) story on education grants in Georgia. Basically, I had a long list of hundreds of schools, and I wanted to group them by city, so we could start digging for sources from there. ParseHub found each school, its city and website, and plopped them into Google Sheets, my home away from home.
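If you're curious what "grab each element, then its sub-parts, in order" looks like in code, here is a rough Python sketch. To be clear, this is not how ParseHub works under the hood, and it is not the page I actually scraped: the HTML, the class names (`school`, `name`, `city`), and the school names are all made up for illustration. It only uses Python's standard library.

```python
import csv
import io
from collections import defaultdict
from html.parser import HTMLParser

# Made-up sample markup. A real page will use different tags and class names.
SAMPLE_HTML = """
<ul>
  <li class="school"><span class="name">Example Elementary</span>
      <span class="city">Atlanta</span>
      <a href="https://example.org/elementary">Website</a></li>
  <li class="school"><span class="name">Sample Middle School</span>
      <span class="city">Savannah</span>
      <a href="https://example.org/middle">Website</a></li>
  <li class="school"><span class="name">Placeholder High</span>
      <span class="city">Atlanta</span>
      <a href="https://example.org/high">Website</a></li>
</ul>
"""


class SchoolParser(HTMLParser):
    """Collects one row per <li class="school">, keeping page order."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished rows, in the order they appear
        self.current = None   # the row being built, or None outside a school
        self.field = None     # which sub-part ("name" or "city") we are inside

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "school":
            self.current = {"name": "", "city": "", "website": ""}
        elif self.current is not None:
            if tag == "span" and attrs.get("class") in ("name", "city"):
                self.field = attrs["class"]
            elif tag == "a":
                self.current["website"] = attrs.get("href", "")

    def handle_data(self, data):
        # Only keep text that sits inside a name or city span.
        if self.current is not None and self.field:
            self.current[self.field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.field = None
        elif tag == "li" and self.current is not None:
            self.rows.append(self.current)
            self.current = None


def scrape_schools(html):
    """Return a list of {"name", "city", "website"} dicts from the markup."""
    parser = SchoolParser()
    parser.feed(html)
    return parser.rows


def group_by_city(rows):
    """Map each city to the list of school names found there."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["city"]].append(row["name"])
    return dict(groups)


def to_csv(rows):
    """Render the rows as CSV text, ready to paste into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "city", "website"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = scrape_schools(SAMPLE_HTML)
print(to_csv(rows))
print(group_by_city(rows))
```

The part that makes this fiddly by hand is keeping each school's name, city and website together on one row, which is exactly the chore ParseHub took off my plate.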

Even if you feel like you’re not using “scraping,” ParseHub can be a great way to take some pesky web information and reorder it for you. Let’s all pray that it stays free, at least for another few years!

One more thing...

Did you miss the last TFR? Google’s Colab is an awesome tool for experimenting with code when you’re too scared to really dive in.