© Melisback, Fotolia

Open Data
04/17/2012

DataMarket: "App Store for numbers"

The Icelandic start-up "DataMarket" is well on its way to establishing itself as a data hub for companies and private users. To that end, the young company pre-processes open data released by public authorities. Futurezone interviewed "DataMarket" founder Hjalmar Gislason about his "App Store for numbers" and the economic potential of open data.

by Patrick Dax

Four years ago, Hjalmar Gislason founded the start-up "DataMarket". The young company runs a marketplace for data, which has been aimed at an international audience for a little over a year. "DataMarket" supports companies in pre-processing, publishing and selling data, as well as in exploring datasets. To that end, it provides visualization tools that make it possible to compare and combine data in interactive charts.

At the moment, "DataMarket" offers 23,000 free and paid datasets that can be visualised, analysed and downloaded. They come from the World Bank, Eurostat, the UN, the Swedish Gapminder foundation, Wikileaks, the world football association FIFA, and many other suppliers such as market research institutes and financial data providers.

"DataMarket" is regarded as a classic example of how administrative and government data published for public use can flow into businesses. Futurezone asked the Icelandic company founder about the economic opportunities of open data.

Governments - on a local and state level - are increasingly opening up their data. How has open data changed the business opportunities for data marketplaces like DataMarket?
When we launched DataMarket with an international offering in January last year, we knew we had to have a compelling data offering from day one to attract users. We identified the most important sources of statistics and quantitative data (that's the type of data we focus on) available under open licenses to "seed" our databases. Among the sources we started out with were the UN's UNdata, the World Bank's data repository and Eurostat's entire statistics collection. We've since continued to add more, such as data from the IMF and the fantastic data collection from the Federal Reserve of St. Louis, and we will continue to add open data to our repository as it becomes available and as our bandwidth allows.

So, open data played a key role for us in the beginning, and continues to do so. Our philosophy is that data that is open and free to begin with will remain open and free on DataMarket.com - it will just be easier to find and use through us. I actually believe that DataMarket.com is the largest collection of statistical data that is available for free online.

For us, the availability of open data has been key to attracting users and attention, and we believe we are adding significant value to this data in return. The business value for us comes through selling our back-end data services, such as our API, and indirectly as users become aware of premium data or data publishers realize the potential of our system for their own needs.

How would you rate the quality of the data published by public authorities?
I applaud every public organization that makes its data available in any format, period. I think the first step should - almost without exception - be to take whatever exists, in whatever format it exists, and publish it online along with the minimum required explanations. People too easily fall into the trap of wanting both the data and the publishing system to be perfect before releasing anything. Such things always take longer and cost more than people think, and the result is that nothing gets published.

That said, there are simple ways to make things a lot easier for the users of the data. Tim Berners-Lee's five-star system is a good yardstick, and one that public organizations should strive to meet. In all honesty, though, anything above three stars is usually not going to give organizations a lot of additional value right away.

My advice to public sector organizations looking to open up their data would be: aim to reach three stars fast (data available in a structured, non-proprietary format), and then start planning how to evolve your infrastructure to reach five stars and keep your data offering up to date online at all times.

So, kudos to all those that have made their data available in any format already. The only thing I'd criticize in some cases is the lack of explanations, meta-data and other context, because in the world of data, context is everything.

How can companies use open data?
There are a lot of examples, but obviously there is way more untapped potential. I believe most businesses are unaware of the amount of data relevant to their businesses that is available out there. As is often the case, the real value is created when data that earlier existed in separate silos meets for the first time to create something that is bigger than the sum of its parts.

This can range from simple examples like comparing the number of customers (private data) to the population or number of households in a given ZIP code (public data), to more elaborate ones like using traffic data from the road administration to select the location for your new fast food restaurant.
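The simpler of these two examples could be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ZIP codes, the customer counts and the household counts are invented, and the helper `penetration_by_zip` is a hypothetical name, not part of any DataMarket tool or API.

```python
# Hypothetical figures for illustration only; none of these numbers come from
# the interview or from DataMarket.
customers_by_zip = {"10001": 420, "10002": 150, "10003": 90}  # private data
households_by_zip = {  # public data, e.g. from a statistics office
    "10001": 12000,
    "10002": 30000,
    "10003": 4500,
    "10004": 800,
}


def penetration_by_zip(customers, households):
    """Return customers per household for each ZIP code found in both datasets."""
    shared = customers.keys() & households.keys()
    return {
        zip_code: customers[zip_code] / households[zip_code]
        for zip_code in sorted(shared)
        if households[zip_code] > 0  # skip ZIP codes without households
    }


penetration = penetration_by_zip(customers_by_zip, households_by_zip)

# List ZIP codes from highest to lowest share of households that are customers.
for zip_code, share in sorted(penetration.items(), key=lambda item: item[1], reverse=True):
    print(f"{zip_code}: {share:.1%} of households are customers")
```

Only ZIP codes present in both datasets are compared, so an area with households but no recorded customers (here "10004") is left out of the result rather than reported as zero.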

Which kind of data is most interesting for "DataMarket" users?
We are in the business of "delivering business data to decision makers", so we put a lot of emphasis on user-friendly and visual access to the data right on the DataMarket.com site. The site allows people to search, visualize, compare and download data from a wide variety of data sources - both public and private - all in one place. We want to change the dynamics of data-driven decision making from "let's ask research and reconvene in 2 weeks" to someone in a meeting playing DataMarket like the piano and saying "let me bring that up for you".

This kind of usage then sparks interest in lower-level services such as API access and integration with Excel, R or BI tools, all of which we cater to, but the end-user interface is where people learn about DataMarket.com.

Do you recommend companies publish their data?
It depends. The initial reaction of most businesses is probably that it makes no sense for them to give anybody else access to any of their data, but there may indeed be several good reasons to do so. I often point to airlines and their booking and price information as a good example of this. As recently as 5 years ago, some major airlines were fighting off attempts to collect their data using both technical and legal measures. Today, most airlines pride themselves on providing good, machine-readable access to their systems to enable travel search engines such as Kayak, Dohop and Opodo to bring them traffic and sales.

And there are other reasons besides sales and marketing. Transparency about a company's data, such as server uptimes, customer support waiting times and even key financial performance, may be a way to build trust or show corporate responsibility.

In presentations and lectures you talked about "DataMarket" wanting to become a "Google for statistics".
Within 4 years we want to live up to the following promise to our end-users: To find the best numbers to base your plans and decisions on, come to DataMarket, type in your keywords (as you would on Google) and we'll give you structured, actionable data right away. In fact, "App store for numbers" might be a more accurate description, as some of the data you will find, such as market research data or data from the financial markets, may not be free of charge.

Last December the EU Commission announced a revision of the "Directive on the Re-Use of Public Sector Information" and has wide-ranging plans for the release of data within the EU. What expectations do you have for this?
My take when it comes to access to public sector information is simple: Any data that is gathered or created with taxpayers' money should be open and free of charge unless higher priorities such as privacy or national security indicate otherwise. I hope and expect this to be the rule across Europe in the next 4 to 5 years. Open access spurs innovation, increases accountability and, frankly speaking, it is only fair that taxpayers be able to access what they paid to create in the first place.