Governments are often quite good, when they want to be, at measuring the “supply side” of open data, mainly because it is relatively easy to track: counting the datasets available, identifying which agencies supply data to the portal, and checking whether the government publishes an inventory of its data are all straightforward. Measuring genuine demand for open data is harder, but ultimately far more important, since the value of data lies in its use.
Typical “demand side” metrics associated with open data portals, including the number of downloads, page views, and unique visits, are imperfect. They capture visitation and a narrow slice of use, but shed little light on how the data is actually used. What happens to a dataset after it is downloaded is unknown, as are a visitor’s interactions with the data while on the open data portal.
Open data has a multitude of uses, from mobile apps, blog posts, and visualizations to citation in research, reports, and news stories. Most of these uses occur outside of the open data site, which makes them difficult to track, even if the data are API-enabled. In short, measuring value generation from open data is a challenge. And even if they could track that reuse, do governments or agencies really have the resources to do so?
API calls, the requests that applications make to retrieve data programmatically, are perhaps the best available metric for gauging reuse. Someone who takes the trouble to automate their data retrieval is more likely to actually use that data. Still, API enablement is no guarantee of reuse. Too much data today is not API-enabled, or, more significantly, the API is not designed for the easy, customized retrieval a developer or researcher needs to run more complex queries.
If APIs are an unreliable metric for reuse, what’s the best way to measure it? Maybe the answer is to look at “demand” for open data from a broader perspective. Before someone reuses data, they engage with it in some way, even if only to see whether that data will serve their purposes.
Open data has multiple value propositions—and reuse only speaks to some of them. And since reuse is so hard to measure, maybe we should focus more on people’s engagement with open data. Engagement comes in many forms, from downloads to commenting on data; even the act of clicking on a dataset to inspect it represents some initial type of engagement.
Open data should steal a page from the larger world of Web content. After more than twenty years of experience with the Internet and “customer optimization,” conversion is the most important metric for Internet companies. In its simplest form, conversion is the number of visitors who complete some desired action or activity divided by the total number of visitors to a website. In other words, it measures the percentage of visitors who actually do something while visiting.
In many ways, conversion acts as a proxy for the usability, performance, user experience, and content value of a website.
For an open data portal, conversion might relate to a wide variety of actions by visitors—downloads of course, but also comments, questions, requests, viewing a dataset, searches, polls, and more. When a person moves from mere arrival at an open data portal to actively doing something onsite, he or she changes from a casual visitor to an engaged citizen. And that is worth measuring.
In terms of an open data portal, conversion might be called citizen engagement. So what would a metric for engagement for open data portals look like?
One simple formula might be:

engagement rate = (downloads + comments + questions + requests + dataset views + searches) / unique visits

This is a more elaborate way of saying:

engagement rate = total visitor actions / total visits
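In code, the portal-wide version of this calculation is simple. The sketch below is a minimal Python illustration; the metric names and counts are hypothetical placeholders for whatever a portal’s analytics platform actually reports.

```python
# Minimal sketch of a portal-wide engagement-rate calculation.
# All metric names and counts below are hypothetical examples.

def engagement_rate(actions: dict, unique_visits: int) -> float:
    """Total visitor actions divided by total unique visits."""
    if unique_visits == 0:
        return 0.0
    return sum(actions.values()) / unique_visits

# Hypothetical one-month totals for an open data portal.
monthly_actions = {
    "downloads": 1200,
    "comments": 85,
    "questions": 40,
    "dataset_views": 5600,
    "searches": 3100,
}

rate = engagement_rate(monthly_actions, unique_visits=25000)
print(f"Engagement rate: {rate:.2%}")
```

The same function works unchanged whether “actions” counts five kinds of activity or fifty, which makes it easy to broaden the definition of engagement later without rebuilding the metric.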
Even this metric could be segmented to get a more granular view of how people engage with an open data portal. One could look at the number of downloads over the number of page views to get a sense of how quickly people find something worth taking away with them. Or track the number of searches over the number of unique visits to gauge how many people are actively looking for something, as opposed to the merely curious or casual visitors (who are also welcome!). Knowing what percentage of searches result in opening a dataset would provide a still more fine-grained way to see whether people are finding data that interests them.
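The segmented views described above are just ratios of different counters. A quick sketch, again with hypothetical numbers standing in for real analytics data:

```python
# Sketch of the segmented engagement metrics described above.
# The counts are hypothetical; real values would come from web analytics.

def ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

# Hypothetical monthly counts for one portal.
page_views = 48000
unique_visits = 25000
downloads = 1200
searches = 3100
searches_opening_dataset = 930

# How quickly do people find something worth taking away?
download_rate = ratio(downloads, page_views)
# How many visitors are actively looking for something?
search_rate = ratio(searches, unique_visits)
# Are searchers actually finding data that interests them?
search_success_rate = ratio(searches_opening_dataset, searches)

print(f"downloads / page views:    {download_rate:.2%}")
print(f"searches / unique visits:  {search_rate:.2%}")
print(f"searches opening datasets: {search_success_rate:.2%}")
```

Each ratio answers a different question, so tracking them side by side over time reveals more than any single portal-wide number.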
Measuring the engagement rate on a portal-wide basis is a good start, but it lacks the precision needed to understand how visitors actually engage with an open data site. Better still is to measure the engagement rate for each component and page of an open data portal: the homepage, the other pages on the site, and each section on them. That is the only way to really know how people are engaging with it.
Improving your conversion (engagement) rate is not a one-time event: it requires iteration and many small and large changes to a portal to increase and sustain greater engagement. Some changes can be A/B tested; others cannot, and require tracking over time to judge their impact. While a conversion rate may not be easy to increase, it is definitely measurable.
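For the changes that can be A/B tested, a standard two-proportion z-test (a general statistical technique, not anything specific to open data) can indicate whether an observed difference in engagement rates is likely real. The experiment below is a hypothetical illustration:

```python
# Sketch of a two-proportion z-test for comparing engagement rates
# between two portal variants. The experiment numbers are hypothetical.
import math

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Z-statistic for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical test: variant B adds a "featured datasets" section.
z = two_proportion_z(conv_a=400, n_a=10000, conv_b=480, n_b=10000)
print(f"z = {z:.2f}")  # |z| > 1.96 suggests significance at the 5% level
```

A significant z-statistic justifies keeping the change; an insignificant one means collecting more traffic or moving on to the next experiment.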
For open data portals, as with every other kind of website, conversion offers a key insight into how engaging a site is for citizens. In combination with other metrics—like growth in traffic and API calls—tracking a portal’s engagement rate will help governments continuously improve the user experience and value of their open data initiatives.