Archive by Author

Book review: Data Science at the Command Line by Jeroen Janssens

In the mixed environment of ScraperWiki we make use of a broad variety of tools for data analysis. Data Science at the Command Line by Jeroen Janssens covers tools available at the Linux command line for doing data analysis tasks. The book is divided thematically into chapters on Obtaining, Scrubbing, Modeling, Interpreting Data with “intermezzo” […]

Book review: Remote Pairing by Joe Kutner

Pair programming is an important part of the Agile process but sometimes the programmers are not physically co-located. At ScraperWiki we have staff who do both scheduled and ad hoc remote working therefore methods for working together remotely are important to us. A result of a casual comment on Twitter, I picked up “Remote Pairing” […]

NewsReader: the developers story

ScraperWiki has been a partner in NewsReader, an EU Framework 7 research project, for the last couple of years. The aim of NewsReader is to give computers the power to “understand” the news; to extract from a myriad of news articles the underlying events which gave rise to those articles; the who, the where, the […]

Exploring the ONS

The Office for National Statistics (ONS) is the United Kingdom statistical body charged by the government with the task of collecting and publishing  statistics related to the economy, population and society of England and Wales at national, regional and local levels. The data is typically published in the form of Excel spreadsheets. The ONS is […]

Book review: Network Graph Analysis and visualization with Gephi by Ken Cherven

I generally follow the rule that if I haven’t got anything nice to say about something then I shouldn’t say anything at all. Network Graph Analysis and visualization with Gephi by Ken Cherven challenges this principle. Gephi is a system for producing network visualisations, as such it doesn’t have a great many competitors. Fans of […]

Inordinately fond of beetles… reloaded!

Some time ago, in the era before I joined ScraperWiki I had a play with the Science Museums object catalogue. You can see my previous blog post here. It was at a time when I was relatively inexperienced with the Python programming language and had no access to Tableau, the visualisation software. It’s a piece […]