As I’ve previously written, the COINS database is not just a very significant release of open data, it’s also quite hard to get your head round.
With this in mind, Her Majesty’s Treasury is holding a seminar on July 2 2010, from 9.30am to 12.30pm for coders and others who want to understand more about the database, which provides details of all government spending. If you contact firstname.lastname@example.org by Friday June 25 2010 you should be able to sign up for the event.
Visit this link to the data.gov.uk site for more information on the event.
A couple of weeks ago the government did something really significant. No, it wasn’t anything to do with the Big Society – significant though that may be. It was the release of one enormous database, known as COINS.
Essentially COINS (and I do really like writing it in capitals) is the detail on everything the government in the UK spends at every single level.
The treasure trove of information is so big it’s caused people considerable trouble to download, but if you can make sense of the thousands and thousands of entries, almost every expert agrees COINS is the grand prize, the data release most likely to transform our understanding of government.
I won’t bother saying much more, because there’s, frankly, not that much to say. It’s going to take a long time to get our heads round the database. But, if you’re interested in finding out more about how it will be used and is being used, here are some interesting links to keep you going.
- This helpful explanation from HM Treasury should clear up some of the basic questions about COINS – what it is, how the data has been presented and who might find it useful. Since there’s so much complex information, HM Treasury reckons it might be more useful to organisations, rather than individuals. This, gulp, 31-page PDF gives you a full description of how to understand COINS.
- The Guardian, which to its enormous credit has been at the vanguard of the fight to open up government data, has followed the release closely. This post does a good job of summing up what it’s all about.
- Here’s Alpine Interactive’s visualisation of spending by department that gives you some impression of what COINS can be used for, too.
- But, perhaps most impressively and most excitingly, you can get to grips with COINS by looking at Wheredoesmymoneygo’s online search tool for the database.
- This excellent blog post will explain what the most important fields in COINS actually mean. And the Guardian has a glossary of terms here.
Thanks to Paul Bradshaw I recently came across a post by Tony Hirst who has very helpfully provided an extraordinary lesson in the joys (and I do mean joys) of scraping data using Google Docs, which I strongly recommend following if you’re at all interested in these sorts of things and haven’t seen it (it was written nearly two years ago).
In just a few very easy steps, as Tony very helpfully points out, it’s possible to get data to appear magically in your spreadsheet (from Wikipedia) and then turn that data into a map, with Google Maps via Yahoo Pipes. It’s just a matter of sitting down and reading Tony’s post – and carefully following it.
What’s remarkable is that much of this is so straightforward. Admittedly, Yahoo Pipes – an extraordinarily powerful tool – does take some getting used to. But then there are people like Tony who can help you get used to some of its quirks.
I followed the tutorial word for word and got almost everything working, but had a little trouble with the map part. It turned out there have been a few problems with the Yahoo Pipes location module, which appears to be a bit temperamental. I’ve since learned from Mary Hamilton that you can concatenate a bit of postcode in to get it working, but have yet to try that.
Still, this blog post is very much recommended. I’ve started to muck around, as a result, with scraping data from Birmingham City Council’s website. In particular, I’ve had a bit of a go at getting swimming pool opening times off the site.
Helpfully (although I’m quite sure it wasn’t deliberate) the website is organised into tables. It’s relatively straightforward to grab the contents of the table and stick it in the spreadsheet. However, the tables don’t all follow the same rules. Some have a Monday to Friday field, for example, while others have a separate entry for each day of the week.
Nonetheless, it opens up the possibility of gaining a greater understanding of swimming pool opening times in Birmingham as part of my ongoing investigation into swimming pool provision in the city.
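One way to deal with tables that don’t all follow the same rules is to tidy them up once they’re out of the spreadsheet. What follows is only a minimal sketch in Python, not something from Tony’s tutorial or the council’s site: the function names, the row layout and the opening times are all made up for illustration. It assumes each scraped row is a pair of a day label and an hours string, and it expands a range such as “Monday to Friday” into one entry per day, so every pool ends up in the same shape.

```python
# Hypothetical helper names and sample data, for illustration only.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def expand_days(label):
    """Turn 'Monday to Friday' (or a single day) into a list of day names."""
    label = label.strip()
    if " to " in label:
        start, end = (part.strip() for part in label.split(" to ", 1))
        # Assumes the range runs forwards within one week (no wrap-around).
        return DAYS[DAYS.index(start):DAYS.index(end) + 1]
    return [label]


def normalise(rows):
    """Map (day label, hours) rows to a dict with one entry per day."""
    times = {}
    for label, hours in rows:
        for day in expand_days(label):
            times[day] = hours
    return times


# Two made-up pools whose tables follow different rules.
pool_a = [("Monday to Friday", "7am-9pm"),
          ("Saturday", "8am-5pm"),
          ("Sunday", "8am-4pm")]
pool_b = [(day, "9am-6pm") for day in DAYS]

tidy_a = normalise(pool_a)
tidy_b = normalise(pool_b)
```

Once both tables are in the same per-day form they can be compared directly, for example to see which pools are open on a given day. A range that wraps past Sunday would need extra handling that this sketch leaves out.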
As Chris points out it is something of a surprise that election data isn’t just available freely. But it isn’t. Well, at least until Chris got involved.
I got a chance to speak to Chris about why he was interested in providing this information and how he went about the process. He makes some really valuable points:
- The only database of election results is commercial, but it should be a matter for public record – and therefore free.
- While local authorities do publish results on their websites, they do so in lots of different ways – with different formats and styles.
- Chris is on the local public data panel. Councils have approached them but don’t know how to do it.
- It can be intimidating for public authorities to get involved in open data.
As a result, Chris had the idea that he could get councils to produce their results on the web in a consistent machine-readable form. Then it’d be possible to build up a database, while councils could begin to learn about how to use open data.
- While it took some time to get the project off the ground, a number of people have now started to get interested. Many councils have recently become involved, and Chris says that councils have started to help each other.
- There are at least 30 people in local authorities who are beginning to understand how open data can be useful to them.
- Now they’ll have a database with between 20 and 40 local authorities’ election results, some going as far back as 1998.
Now Chris says it can be demonstrated that local authorities can be part of the open data movement in a fairly simple way. Chris talks passionately about how big, monolithic projects often turn out to be useless. Instead, he’s much more interested in small projects that can be demonstrably useful – and will help people to learn.
Rob Benson handles E-communications for Birmingham East and North Primary Care Trust. During SpeedData Rob spoke about how it might be quite difficult to get some people within organisations to appreciate the reasons for making data available, particularly within an institution like the NHS where issues of privacy are particularly important.
But Rob was taken with Jon Bounds’ cat-based explanation of why giving people things they can use for themselves can mean your data gets used much more – and in much more interesting ways – than if you don’t. As Rob is quick to point out, by just using a few simple tools, with some data and a bit of imagination, you can make a powerful point on your own – rather as Dave Harte demonstrated with his gritting map.
Alex Burrows is the head of strategy at Centro, the public transport authority for the West Midlands. He explained that Centro deals with enormous amounts of data, but has until now perhaps not been as aware of some of the interesting things that other people might be prepared to do with their data.
I spoke to Alex as SpeedData drew to a close about how he’d found the event and what value he felt he’d got from it. We’d just been chatting to Matthew Somerville, who had been showing Alex his traintimes.org.uk site – an example of how a motivated individual can take a source of data and do something interesting and very useful with it.
Jon explained why people do stuff for free on the web by talking about Cliche Kitty and Domo Kun.
For those who don’t know, Cliche Kitty and Domo Kun were brought together on the internet. Have a look here.
Jon described how people started to use these images and add to them and create new and sometimes very clever things. Jon says that this process is very simple. It doesn’t take much skill to add captions to images, for example. And Jon also introduced Richard Dawkins’ Meme Theory. This states that ideas only survive if they are able to compete and therefore develop and evolve.
Jon then went on to describe how the LOLcats phenomenon began to develop, with pictures and captions – what are known as image macros. He then showed a graph that demonstrated how, by releasing your data in one single format, you’ll only get a sudden short surge. By allowing your data to be played with, you will attract a far greater audience.
Jon described how open movements can grow quickly, including the uksnow Twitter tag and map and his own Twitpanto project, which started from nothing more than a message on Twitter.
Big City Plan
Jon described how a group of Birmingham bloggers got together because they were frustrated by the ‘monolithic’ nature of the consultation document for the Big City Plan.
He described how the bloggers translated it into plain English and made a really simple website, Big City Plan Talk. The site included the council document and a plain English translation. It was also particularly easy to comment on. They vaguely knew that if people were aware of it they’d use it. Without any publicity, a quarter of the total responses to the consultation came to their site. It was a genuinely useful thing, Jon said, because it has been useful to other people who’ve used the model since.