Chapter 4

Lessons from the London Datastore

I’ve worked for local government in London since 2005. In March 2009, I moved to City Hall to undertake a yearlong research project funded by the Greater London Authority (GLA), Capital Ambition, and the Department for Communities and Local Government. The purpose of the project was to examine how policy was working across the London Boroughs, particularly regarding their use of new media and technology. It also meant analyzing their use of qualitative research methodologies. This project built on research previously undertaken by Leo Boland, who had recently taken on the role of Chief Executive of the Greater London Authority.

Boland and I co-authored an article published in 2008 in the journal Public Money & Management entitled “What Lies Beyond Service Delivery? Leadership Behaviours for Place Shaping in Local Government” (Boland, L. & Coleman, E., 2008). We noted how governments were struggling to create cognitive shifts around areas such as waste minimization and obesity, as well as the co-production of services. We were particularly influenced by the view that:

A public sector that does not utilize the power of user-generated content will not just look old, outdated, and tired. It will also be far less productive and effective in creating public goods. (Leadbeater, C. & Cottam, H., 2007)

We accepted that the big challenge for public service reform was not just to make public services more efficient and reliable, like next-generation consumer web services, but to make them communal and collective, which means inviting and encouraging citizens to participate. To us, open data seemed a vital component of that invitation to participation.

Some important policy milestones had paved the way. In 2008, at the central government level in the United Kingdom, the Power of Information Taskforce (headed by then Labour Minister Tom Watson) ran a competition asking, “What would you create with public information?” The competition offered a substantial prize fund for the winner. In London, Boris Johnson, as part of his election manifesto for mayor, had committed to publishing an open register of interests for all mayoral advisors, and providing a search function on the mayor’s website that would enable all Londoners to instantly find information about all grants, contracts, and programs over £1,000. And on his first day in office in 2009, President Obama made great inroads by issuing the Memorandum on Transparency and Open Government, which committed to three principles (transparency, participation, and collaboration) as the cornerstone of his administration and led to the Open Government Directive later that year.

The City Hall Perspective

The importance of strong political leadership in the drive to open up public data should not be underestimated. In the process, however, it is interesting to see how public officials can sometimes undermine that leadership. Mayor Boris Johnson had brought with him to City Hall a cadre of mayoral advisors, all of whom had close ties with the Conservative Party (then in opposition nationally) and all of whom were of a generation that understood the power of technology.

Individuals like Guto Harri, Communications Director, and Dan Ritterband, Marketing Director, were close to Steve Hilton, former Director of Strategy for David Cameron. Hilton is also the husband of Rachel Whetstone, the Global VP of Public Affairs and Communications for Google. This group of people all encouraged the Mayor to support an official open data portal for London, called the London Datastore, in order to fulfill his manifesto pledges. They were also keeping a keen eye on the national position being adopted by Conservative Campaign Headquarters before the 2010 General Election.

The 2010 Conservative Party Manifesto made explicit reference to open data under the heading “Change Society to Make Government More Transparent,” though it made no mention of open data in two other related manifesto categories. Reading between the lines from a policy point of view, it seemed that the Conservatives’ open data focus was on transparency rather than on the disruptive opportunities that open data offered. The manifesto didn’t address open data’s potential role in stimulating economic activity or harnessing disruptive technologies that could benefit citizens. However, there was enough of an open door, and the winds of change were favorable enough, to make the establishment of the London Datastore a possibility.


In 2007, a working group convened by Tim O’Reilly and Carl Malamud offered a definition of what constitutes open data. The resulting document cited eight principles that are widely quoted in the open data movement to determine whether data is open or not: complete, timely, accessible, able to be processed by a machine, non-discriminatory, available without registration, non-proprietary, and free of any copyright or patent regulations (Open Government Working Group, 2007). David Eaves, the Canadian open data activist, simplified the definition somewhat in his influential Three Laws of Open Government Data:

  1. If it can’t be spidered or indexed, it doesn’t exist.

  2. If it isn’t available in open and machine-readable formats, it can’t engage.

  3. If a legal framework doesn’t allow it to be repurposed, it doesn’t empower. (Eaves, D., 2009)

I was heavily influenced by Eaves’ definition because it offered a very simple explanation of open data, especially in the early days when little was understood about its potential and it was hard to find actual, practical examples to point to.

The Beginnings

I established a small internal group to begin the scoping process for the establishment of the London Datastore. It included members of both the Data Management and Analysis Group (DMAG) and the Technology Group (TG) within the GLA. The initial proposition by DMAG and TG was to develop a “web portal” using proprietary software. An initial prototype for this had already been built. Given my interest in ensuring that policy development should be a two-way process and mindful of the invitation to participate, I argued for a shift in approach to open up the scoping process to those most likely to use the data in the first place: technologists and those active in the open data movement.

The role of social media, particularly Twitter, is something not to be underestimated when trying to develop a successful model of engagement around government data. Our call to “Help Us Free London’s Data” was sent via the London Datastore Twitter account (@londondatastore) on October 20, 2009, linking to the following invitation:

The Greater London Authority is currently in the process of scoping London’s Datastore. Initially, we propose to release as much GLA data as possible and to encourage other public agencies in London to do the same, and we’d like your help! We want the input of the developer community from the outset prior to making any decisions on formats or platforms. We would, therefore, like to invite interested developers to City Hall, so that we can talk to you about what we want to do, get your views, and seek your input on the best way to deliver for London. (“Help Us Free London’s Data,” 2009)

This invitation drew over sixty developers to our open workshop on the following Saturday in London’s Living Room in City Hall. We got some clear messages from the technology community that helped us manage expectations in the months to follow. We heard their deep level of frustration and cynicism from the many years they had spent trying to get public data released, most specifically in the areas of Transport and Crime. We also heard their concerns that the current structures of government might stop the project from going much further.

More importantly, we listened to them when they told us to “go ugly early” and not make the mistake that government often does of allowing perfection to be the enemy of good. They told us that, as long as the data was not in PDF form, they would take it, and they would help us clean it up at no cost to the state. By working together, we could make things better—it was a powerful moment in the data release journey.

I believe that being open from the very beginning was a crucial element of the success of the London Datastore. As technologist Chris Thorpe, a former engagement strategist for The Guardian’s Open Platform initiative, wrote in a subsequent blog post:

Being invited into an organization’s home for the start of something suggests a good open relationship to come. The presence on a Saturday of several GLA staff involved in the process also shows me they care deeply about it. (Thorpe, 2009)

Until very recently, I worked in government for thirteen years, largely in communications and engagement, and later, in policy and strategy. Many of those years were spent trying to articulate difficult government propositions to an often apathetic or hostile electorate. The emergence of that kind of third-party endorsement for a government initiative, from a respected member of the technology community like Chris (or any community for that matter), is something I found very powerful. I believe that is something that government needs a lot more of if it is to have any hope of repairing the democratic deficit that exists around the world. Open is the only way to achieve this.

Once you move into the open, though, you have to continue in the open, and this can end up being where the real tensions of data release play out. Following the launch of the London Datastore on January 7, 2010, I wrote in a blog post:

On [January] 7, we promised to increase the datasets from 50 to 200 by [January 29], and thanks to the good work of Gareth Baker in the DMAG team in the GLA, we did just that. Since then, we have had a more or less continuous stream of meetings with the functional bodies Transport for London, Met Police, London Development Agency, Olympic Development Authority, and LFEPA. These meetings have been held with the developer community variously represented by Professor Jonathan Raper, Chris Taggart, and Tom Loosemore—and it’s been exciting to see the interchange between those in the developer community and public servants—coming as they do from different cultures.

All of the functional bodies have agreed that the Datastore is a good idea and have committed to freeing up data in the coming weeks and months. We realize that this might not meet the sense of urgency in the developer community—but let’s not pull any punches. We knew that negotiations were always going to be time-consuming and, in some cases, difficult. And let’s not be coy about it—being comfortable about releasing data requires huge cultural shifts in the public sector. But we have left all of our meetings encouraged and with the definite feeling that the agenda is changing fast and for the better. (Coleman, 2010)

Reading Between the Lines

In reality, however, things were a little less rosy behind the scenes. My blog post was trying to hold a fine line between managing developer expectations and being honest about the challenges I was experiencing at an official level. As a public servant working for the GLA group, I could not possibly be publicly critical of the reluctance and resistance that I was getting at the official level to the release of their data. To do so would have potentially undermined the authority of the mayor and suggested divisions within the Greater London Authority group.

Since I had very publicly raised expectations in the stakeholder group to a high level, I now faced the reality that the timeframe for the release of Transport and Crime data was going to be quite long. Even though the mayor had clearly signaled his intent to presume openness by default at the launch of the London Datastore, it was becoming clear that many of the public servants charged with implementing his policy were not inclined to comply with his wishes.

The resistance that I was experiencing does not emerge in isolation in response to a particular initiative, but rather is hard-wired into the bureaucracy. It has to do both with cultures of government secrecy generally (Bennett, 1985; Worthy, 2008) and with progressive attempts by governments to monetize state data (Burkert, 2004).

Other commentators suggest that there is a three-tiered driver at play in the release of open government data. “Three groups of actors can be distinguished: civil society, mid and top level public servants. All actors must be engaged in order to ensure the success of the open data project” (Hogge, 2010). Interestingly, within civil society, Hogge identified “civic hackers” as particularly important.

While I agree with her point about civic hackers (I like to call them digital disrupters), I disagree with her selected drivers and would suggest that the three actors that must be engaged are the state, civil society, and the media. When I, as a public official, was unable to state publicly the resistance to data release at an official level, I could brief the digital disrupters in the Datastore network. They could raise issues on their blogs and ask questions publicly through their networks (social and otherwise) that brought external pressure to bear on their local and central government contacts.

Equally, the role of the media should not be underestimated. Charles Arthur, technology editor for The Guardian, played an essential part in the establishment of the London Datastore. He epitomizes the potential of a new relationship between government and media. A long-time proponent of open data, Charles understood the state’s nervousness in entering this territory and the importance of reinforcing the good aspects. He gave praise where it was due, rather than adopting the “gotcha stance” that many in the media take when government takes new steps in new directions. He wrote a seminal piece in September 2010 that praised both Transport for London and the Mayor:

You might think that Boris Johnson’s presence pushing this along is just a bit of grandstanding, but that wouldn’t be correct. He’s actually been in the vanguard of politicians introducing open data. If you have a long memory for public data-related stories, you’ll recall that he did a rather neat end-run around the Labour administration’s Home Office in 2008, when, as part of his manifesto while running for the office of London mayor, he declared that he would publish crime maps... Johnson did go on to publish them, and London has been in the forefront of cities, which have tried to do innovative things with the data that its local government and authorities collect. (Arthur, 2010)

From a UK perspective, The Guardian publicly praising a Conservative mayor is notable: while The Guardian regards itself as the paper of record, Conservative commentators perceive it as the home of the left.

And Then There Is Happenstance

The world of data release is neither linear nor always planned. For example, we always knew that the release of bus data in London would be a game changer for the city, since so many Londoners rely heavily on the bus network. Our many discussions with TfL over several months about releasing bus data demonstrated that the state was still struggling with the speed of technology (even though we were quite far along on the data release journey). In one meeting between me, transport developers, and Transport for London, the TfL officer responsible for releasing the bus dataset laid out a timeframe that included publishing the data on TfL’s website, then waiting six months to enable the data to “bed down” before releasing an official API for developers.

There was quite a heated exchange between the developers and the official as the developers explained that, as soon as the data went live on TfL’s website, they would simply scrape the data and build their apps anyway. A few days later, I received a call from the TfL official telling me that he had considered the discussion and would shorten the data release deadline by three months (bearing in mind that TfL had a whole marketing campaign ordered and paid for to coincide with their release).

However, within hours of that conversation ending, I started noticing some interesting tweets suggesting that TfL had released their bus data. What followed was a rather surreal conversation with TfL. It turns out that the link to the data was available internally on the TfL intranet all along, and someone had simply emailed the link externally, whereupon the developers descended and immediately started building their apps.

“You’ve got to tell those developers to stop accessing that data,” the beleaguered TfL official pleaded with me, bemoaning the loss of his planned marketing campaign and worried about the impact the data load would have on the TfL servers. I had to explain that I didn’t know all the developers and ask if he had ever heard of the whack-a-mole principle.

Once the genie is out of the bottle, there are effects in the system that you simply cannot control. Given that TfL had released so much of its real-time data at that point, whoever made the link to the bus data public probably saw no reason not to do so. Open begets more open, and the levers of command and control in the organization can suddenly cease to have the power that they once did.

Three Years On

I have to say that I’m now a poacher turned gamekeeper. I recently left government to join Transport API, a startup that is building its business on open data, including that released by Transport for London. As an aggregator of open data, we are at the coalface of building the businesses we predicted could exist if the state released its public data. Our platform provides data to the incredibly successful Citymapper app in London and, with our partner Elgin (another open data company), we are providing intelligent transport solutions to local authorities around the UK at vastly more cost-effective prices and better terms and conditions than those offered by the incumbents.

We are surrounded by many small companies working on similar issues in the Open Data Institute offices in London where we are based. Our colleagues at Mastodon C, who work with big data and health data, were recently lauded by The Economist for their prescription analytics demonstrating the vast sums of money that the National Health Service could save by using generic versions of commonly prescribed drugs. There is also mySociety, a nonprofit organization that continues to develop innovative technology solutions, like Mapumental, FixMyStreet, and FixMyTransport. Of course, there is also Chris Taggart, who is building OpenCorporates, the open database of the corporate world. We hope that all our companies will make the world a better, more open, and more efficient place for citizens, and we believe in it so strongly that we are putting our own money where our mouths were almost three years ago.

I also work on a consultancy basis with the Connected Digital Economy Catapult, an initiative of the Technology Strategy Board (TSB) tasked with supporting the acceleration of the UK’s best digital ideas to market. Supporting the interoperability of open data is one of its key targets, building platforms to create multiplier effects with that data along the digital economy value chain. That, along with the work of the Future Cities Catapult, also established by the TSB, provides an important emergent new infrastructure, which I hope will give further impetus, support, and capability for even more tranches of data release in the coming years.

I think there are challenges for the private sector that are different from, but not dissimilar to, those the public sector faced when open data became part of the policy landscape. The race for ownership of the smart city has been on for quite some time. A 2010 report commissioned by the Rockefeller Foundation (The Future of Cities, Information and Inclusion) bears out my experiences as Director of Digital Projects in City Hall. The report suggests that there is a potential battle between Jane Jacobs-inspired hacktivists pushing for self-serve governance and latter-day Robert Moses types carving out monopolies for IBM or Cisco. It also argues that without a delicate balance between the scale of big companies and the DIY spirit of government 2.0, the urban poor could be the biggest losers.

Big companies, as well as government, need to learn that you have to collaborate to compete, you have to operate on a presumption of openness, and you have to move away from the idea of first-to-market advantage. There is profit to be made, of course, but that profit need not be at the expense of a better deal for citizens. All of us, state and private sector alike, need to first ask what is best for public value, and then we need to share our assets to achieve those public goods. The next part of this journey is going to be exciting.

About the Author

Emer Coleman was the architect of The London Datastore. She also works as a journalist and consultant and writes about how technology impacts organizational development. She is the founder of DSRPTN, a consultancy specializing in leadership and change, and also the Business Development Director at TransportAPI, a startup powering innovation and change in transport. She was named in Wired Magazine’s Top 100 Digital Power Influencers List 2011. She holds a BA in History and Sociology from University College Cork and an MPA from Warwick Business School.


References

  • Arthur, C. (2010, September 3). Another Data Win: TfL Opens Up Bus and Tube Timetables for Developers. The Guardian.
  • Bennett, C. (1985). From the Dark to the Light: The Open Government Debate in Britain. Journal of Public Policy, 5(2), 187-213. doi: 10.1017/S0143814X00003020.
  • Boland, L. & Coleman, E. (2008). New Development: What Lies Beyond Service Delivery? Leadership Behaviours for Place Shaping in Local Government. Public Money & Management, 28(5), 313-318.
  • Burkert, H. (2004). The Mechanics of Public Sector Information. In G. Aichholzer and H. Burkert (Eds.) Public Sector Information in the Digital Age: Between Markets, Public Management and Citizens’ Rights. Cheltenham: Edward Elgar Publishing Limited.
  • Coleman, E. (2010, February 15). Hectic Times.
  • Eaves, D. (2009, September 30). The Three Laws of Open Government Data.
  • Greater London Authority. (2009). Help Us Free London’s Data.
  • Institute for the Future & Rockefeller Foundation. (2010). A Planet of Civic Laboratories: The Future of Cities, Information, and Inclusion. Palo Alto, CA: Townsend, A., Maguire, R., Liebhold, M., & Crawford, M.
  • Leadbeater, C. & Cottam, H. (2007). The User-Generated State: Public Services 2.0.
  • Open Government Working Group. (2007, December 8). 8 Principles of Open Government Data.
  • Open Society Foundations. (2010). Open Data Study. London: Hogge, B.
  • The Conservative Party. (2010). The Conservative Manifesto 2010. Uckfield, East Sussex: The Conservative Party.
  • Thorpe, C. (2009, October 25). A Good Way to Start Building a Data Store for London.
  • Worthy, B. (2008). The Future of Freedom of Information in the UK. The Political Quarterly, 79(1), 100-108. doi: 10.1111/j.1467-923X.2008.00907.x.
Emer Coleman
Former Director of Digital Engagement
Government Digital Services