Author Archives: Francis Irving
Our friendly competitors / partners
A few months ago I made this diagram (for VCs) showing the world of online data collaboration and scraping from a ScraperWiki point of view. It shows the kind of companies and technologies that, if ScraperWiki were to not … Continue reading
Job advert: Lead programmer
Oil wells, marathon results, planning applications… ScraperWiki is a Silicon Valley-style startup in Liverpool, in the North West of England. We’re changing the world of open data, and how data science is done together on the Internet. We’re looking for a … Continue reading
Lots of new libraries
We’ve had lots of requests recently for new 3rd party libraries to be accessible from within ScraperWiki. For those of you who don’t know, yes, we take requests for installing libraries! Just send us word on the feedback form and … Continue reading
Scraping guides: Dates and times
Working with dates and times in scrapers can get really tricky. So we’ve added a brand new scraping guide to the ScraperWiki documentation page, giving you copy-and-paste code to parse dates and times, and save them in the datastore. To … Continue reading
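As a rough illustration of the kind of date handling the guide is about, here is a minimal Python sketch using only the standard library. It is not the code from the guide itself: the helper name parse_date, the list of formats and the example row are all assumptions made up for this illustration, and keeping the result as an ISO 8601 string is just one plausible way to hold a date before saving it.

```python
from datetime import datetime, date

# Hypothetical helper (not from the ScraperWiki guide): try a few known
# formats in turn and return a datetime.date, or None if nothing matches.
def parse_date(text, formats=("%d %B %Y", "%Y-%m-%d", "%d/%m/%Y")):
    for fmt in formats:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # this format did not match, try the next one
    return None

# Example row as a scraper might build it; the field names are assumptions.
parsed = parse_date("21 June 2012")
row = {"title": "Example event", "date": parsed.isoformat()}
```

Note that "%d/%m/%Y" assumes day-first dates; a source that writes month-first dates would need a different format string, which is exactly the sort of thing that makes dates tricky.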
New backend now fully rolled out
The new faster, safer sandbox that powers ScraperWiki is now fully rolled out to all users. You should find running and developing scrapers and views faster than before, and that you’re using much more recent versions of Ruby, Python and associated … Continue reading
Scraping guides: Parsing HTML using CSS selectors
We’ve added a new scraping copy-and-paste guide, so you can quickly get the lines of code you need to parse an HTML file using CSS selectors. Get to it from the documentation page: The HTML parsing guide is available in Ruby, Python … Continue reading
Four data trends to rule them all, the data scientist king to bind them
My favourite soundbite from O’Reilly’s Strata data conference was a definition of big data. John Rauser, Amazon’s main data scientist, said to me that “data is big data when you can’t process it on one machine”. And naturally, small data is … Continue reading
Make RSS with an SQL query
Lots of people have asked for it to be easier to get data out of ScraperWiki as RSS feeds. The Julian has made it so. The Web API now has an option to make RSS feeds as a format (i.e. … Continue reading
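To give a feel for the general idea of turning SQL query results into a feed, here is a small self-contained Python sketch. It does not use the ScraperWiki Web API: the table name posts, its columns (title, link, description) and the function rows_to_rss are all assumptions invented for this example, and it builds a bare-bones RSS 2.0 document with the standard library.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical helper: build a minimal RSS 2.0 document from
# (title, link, description) rows returned by an SQL query.
def rows_to_rss(rows, feed_title, feed_link):
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_title
    for title, link, description in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        ET.SubElement(item, "description").text = description
    return ET.tostring(rss, encoding="unicode")

# An in-memory SQLite table stands in for a scraper's datastore (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (title TEXT, link TEXT, description TEXT)")
conn.execute(
    "INSERT INTO posts VALUES (?, ?, ?)",
    ("First post", "http://example.com/1", "Hello"),
)
rows = conn.execute("SELECT title, link, description FROM posts").fetchall()
feed = rows_to_rss(rows, "Example feed", "http://example.com/")
```

The point of the sketch is that the SELECT decides which columns become the title, link and description of each item, so changing the query changes the feed.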
Scraping guides: Excel spreadsheets
Following on from the CSV scraping guide, we’ve now added one about scraping Excel spreadsheets. You can get to them from the documentation page. The Excel scraping guide is available in Ruby, Python and PHP. Just as with all documentation, you can … Continue reading
A faster, safer sandbox to play in
When programmers first hear about ScraperWiki, their initial reaction is often “what! you let anyone edit general purpose code and run it on your servers!”. The answer is that, yes, we do, but in an isolated environment. Your own “sandbox” … Continue reading