A site about how community groups and charities can make the most of data and open data to do something useful. Focused on Birmingham, relevant everywhere.

How to visualise data in new ways using word clouds


A word cloud visualises the words in a text according to how often each one appears. I know this is very basic stuff for some people, but I think it is an important post to help people understand that data can indeed be seen and used in different ways.
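To make the idea of "number of appearances" concrete, here is a minimal Python sketch of the counting step that sits behind any word cloud. It is not how Wordle or TagCrowd work internally; the function name `word_counts` and the tiny stop-word list are assumptions made purely for illustration.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list (an assumption; real tools use longer ones).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}


def word_counts(text, top=10):
    """Return the `top` most frequent words in `text`, ignoring stop words."""
    # Lower-case the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top)


if __name__ == "__main__":
    sample = "Data is useful. Open data is data that anyone can use."
    for word, count in word_counts(sample, top=3):
        print(word, count)
```

The more often a word appears, the bigger it would be drawn in the cloud. Leaving out common words such as "the" and "and" stops them from drowning out the interesting ones.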

I won’t go through all the ins and outs of mashups and data visualisation; we’ve already got a page that explains that in more detail. But here are a few general pointers stolen from that article:

  • A mashup is made by combining two or more data sources with a tool to create something new.
  • Data visualisation is when you take one data source and display the content in a new way.
  • The data source can be anything from a spreadsheet to a website.
  • The tool can also be described as an application or a widget, like online maps and graphs.

Obtaining the data

In this example we will create a word cloud based on the maiden speech made by James Arbuthnot on 20 July 1987. The speech is made available by TheyWorkForYou, a service that lets you search public activity in the UK Parliament.

Your data source can be any text you like. Most services also let you use a web address, but this is less reliable. Another popular method is to use the RSS feeds of blogs. These are helpful for quickly getting an overview of what the author is actually talking about, if you don’t believe the about page ;-).
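If you are curious what "using an RSS feed" means in practice, the sketch below pulls the plain text out of a feed so it could be pasted into a word cloud tool. It assumes a simple RSS 2.0 layout with `item`, `title` and `description` elements, and the function name `feed_text` and the sample feed are made up for this example. Real feeds often put HTML inside the description, which this sketch does not strip out.

```python
import xml.etree.ElementTree as ET


def feed_text(rss_xml):
    """Join the title and description of every item in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    parts = []
    for item in root.iter("item"):
        for tag in ("title", "description"):
            node = item.find(tag)
            # Skip elements that are missing or empty.
            if node is not None and node.text:
                parts.append(node.text.strip())
    return "\n".join(parts)


if __name__ == "__main__":
    # A made-up feed, used only to show the shape of the input.
    sample = """<rss version="2.0"><channel>
      <item><title>Open data day</title><description>Maps and charts</description></item>
      <item><title>Word clouds</title></item>
    </channel></rss>"""
    print(feed_text(sample))
```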

Choosing your tool

There are several ways to visualise your data, depending on what you want to show. In this example we want to create a cloud of words, and I would recommend two tools for achieving this: Wordle and TagCrowd. Both are good choices and provide different options for customising your final cloud. Wordle gives you more visual control and TagCrowd gives you better editorial control.

If your data were of a different nature, another tool might fit your needs better. For example, spreadsheet data would normally work better in charts and graphs, or even on maps. But that is something we will cover in a later article.

Visualising it

Now that the hard part is over, you simply have to combine the two to generate your word cloud. Copy the speech and paste it into either Wordle or TagCrowd. In Wordle you have to select ‘Create’ from the navigation, and once it is finished, you select ‘Save to public gallery’. Wordle then generates the embed code for you, and you can show the word cloud in any web page (although you’ll only get a small thumbnail). TagCrowd provides an HTML-based output, which is more flexible but also more prone to errors.

[Image: TagCrowd word cloud]

Here is the Wordle equivalent of the same cloud.
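For the curious, an HTML-based cloud boils down to giving frequent words a bigger font size. The sketch below shows one simple way to do that in Python. It is an illustration only, not TagCrowd's actual method; the function name `html_cloud` and the pixel sizes are assumptions.

```python
from html import escape


def html_cloud(counts, min_px=12, max_px=48):
    """Build an HTML snippet where more frequent words get a larger font."""
    if not counts:
        return ""
    low = min(counts.values())
    high = max(counts.values())
    spread = (high - low) or 1  # avoid dividing by zero when all counts match
    spans = []
    for word in sorted(counts):  # alphabetical order
        # Scale the count linearly between the smallest and largest font size.
        size = min_px + (counts[word] - low) * (max_px - min_px) // spread
        spans.append(f'<span style="font-size:{size}px">{escape(word)}</span>')
    return " ".join(spans)


if __name__ == "__main__":
    print(html_cloud({"data": 3, "open": 1, "speech": 2}))
```

Because the output is ordinary HTML, you can restyle it with your own fonts and colours, which is the flexibility mentioned above. It also means a stray character can break the page, so the sketch escapes each word before writing it out.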

Good luck and let us know how you get on. Please put a link in the comments to any Word Clouds you generate.

Update: In response to Jon Bounds’ comments (see below) we have changed the wording in this article slightly.

2 Responses to “How to visualise data in new ways using word clouds”

  1. Jon Bounds says:

    Great intro this. I wouldn’t call it data mashing tho’; I’d say that involves combining data from two sources somehow (for example http://portwiture.com/, which combines Twitter statuses with Flickr photos).

    This is more of an interesting visualisation technique.

    • I think you’re right, a better word would probably have been just ‘mashup’. We constantly struggle with the terminology. We don’t want to put people off by using technical terms, but at the same time we want to educate people about their meaning.
