Tag Archives | pdf

The Tyranny of the PDF

Got a PDF you want to get data from? Try our easy web interface over at pdftables.com! Why is ScraperWiki so interested in PDF files? Because the world is full of PDF files. The treemap above shows the scale of their dominance. In the treemap, the area a segment covers is proportional to the number of […]

Table Scraping Is Hard

The Problem: NHS trusts have been required to publish data on their expenditure over £25,000 in a bid for greater transparency. A well-known B2B publisher came to us to aggregate that data and provide them with information spanning the hundreds of different trusts, such as: who are the biggest contractors across the NHS? […]