Tag Archives | scraping

Digging Olympic Data at Londinium MMXII

This is a guest post by Makoto Inoue, one of the organisers of this weekend’s Londinium MMXII hackathon. The Olympics! Only a few days to go until seemingly every news camera on the planet is pointed at the East End of London, for a month of sporting coverage. But for data diggers everywhere, this is […]

Three hundred thousand tonnes of gold

On 2 July 2012, the US Government debt to the penny was quoted at $15,888,741,858,820.66. So I wrote this scraper to read the daily US government debt for every day back to 1996. Unfortunately such a large number has more significant digits than the double-precision floating-point format in the database can hold exactly, and this same number gets expressed as […]
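To make the precision problem concrete, here is a minimal Python sketch of one way to keep a figure like this exact: parse it as a decimal and store it as an integer number of cents. The variable names are illustrative assumptions, not taken from the scraper itself.

```python
from decimal import Decimal

# The figure quoted above, as it appears in the text (with thousands separators).
debt_text = "15,888,741,858,820.66"

# Decimal keeps every digit exactly as written.
exact = Decimal(debt_text.replace(",", ""))

# An integer number of cents fits comfortably in a 64-bit integer column.
cents = int(exact * 100)

# A binary double cannot represent the .66 exactly, so the value
# read back from a float differs slightly from the quoted figure.
as_float = float(exact)
float_is_exact = Decimal(as_float) == exact

print(cents)
print(float_is_exact)
```

Storing whole cents (or the original text) sidesteps floating point entirely; converting back for display is a division by 100 with Decimal.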

Software Archaeology and the ScraperWiki Data Challenge at #europython

There’s a term in technical circles called “software archaeology” – it’s when you spend time studying and reverse-engineering badly documented code, to make it work, or make it better. Scraper writing involves a lot of this stuff. ScraperWiki’s data scientists are well accustomed to a bit of archaeology here and there. But now, we want […]

Local ScraperWiki Library

It quite annoyed me that you can only use the scraperwiki library on a ScraperWiki instance; most of it could work fine elsewhere. So I’ve pulled it out (well, for Python at least) so you can use it offline. How to use: pip install scraperwiki_local. You can then import scraperwiki in scripts run on your […]
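As a rough illustration of what saving scraped rows to a local database can look like, here is a sketch that uses only Python's standard sqlite3 module. It is not the scraperwiki package's code or API: the helper name save_rows, the table name swdata and the in-memory database are all assumptions made for this example.

```python
import sqlite3


def save_rows(conn, table, unique_keys, rows):
    """Hypothetical helper: insert dict rows, replacing any row with the same keys."""
    columns = sorted({key for row in rows for key in row})
    col_list = ", ".join(f'"{c}"' for c in columns)
    key_list = ", ".join(f'"{k}"' for k in unique_keys)
    # Untyped columns plus a UNIQUE constraint on the key columns.
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" ({col_list}, UNIQUE ({key_list}))'
    )
    placeholders = ", ".join("?" for _ in columns)
    # INSERT OR REPLACE overwrites a row whose unique keys already exist.
    conn.executemany(
        f'INSERT OR REPLACE INTO "{table}" ({col_list}) VALUES ({placeholders})',
        [tuple(row.get(c) for c in columns) for row in rows],
    )
    conn.commit()


# In-memory database for the example; a file path would persist the data.
conn = sqlite3.connect(":memory:")
save_rows(conn, "swdata", ["date"], [{"date": "2012-07-02", "note": "first"}])
save_rows(conn, "swdata", ["date"], [{"date": "2012-07-02", "note": "updated"}])
rows = conn.execute("SELECT date, note FROM swdata").fetchall()
print(rows)
```

Keying on a unique column means a scraper can be re-run without piling up duplicate rows.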

How to get along with an ASP webpage

Fingal County Council of Ireland recently published a number of sets of Open Data, in nice clean CSV, XML and KML formats. Unfortunately, the one set of Open Data that was difficult to obtain was the list of sets of open data. That’s because the list was split across four separate pages. The important thing […]

Scraping guides: Excel spreadsheets

Following on from the CSV scraping guide, we’ve now added one about scraping Excel spreadsheets. You can get to them from the documentation page. The Excel scraping guide is available in Ruby, Python and PHP. Just as with all documentation, you can choose which at the top right of the page. As with CSV files, at first […]

Access government in a way that makes sense to you? Surely not!

alpha.gov.uk uses ScraperWiki, a cutting-edge data-gathering tool, to deliver the results that citizens want. And radically for government, rather than tossing a finished product out onto the web with a team of defenders, this is an experiment in customer engagement. If you’re looking to renew your passport, find out about student loans or how […]