Estonian Information Society Yearbook 2011/2012 - Karin Kastehein

CHAPTER 1
OPEN DATA

No country in the world can afford to disregard the topic of open data. Such data has become a part of our everyday life – already now, public sector institutions generate various kinds of data in digital form – a tantalizing source of raw input for all sorts of new services and products. Open data refers to machine-readable data that is available to everyone to use freely and publicly, with no restrictions on use and distribution. In fact, Estonia’s Public Information Act obliges the public sector to make information available to the public either through websites, document registers or databases. This chapter looks at the topic of open data and the principles governing the domain both in Estonia and in other countries.

Open data – a step toward the Internet of the future

Uuno Vallner

uuno.vallner@riso.ee

Ministry of Economic Affairs and Communications


The principles of open data and open government became buzzwords at the dawn of the 21st century. Since then they have become areas that affect all of society. Open data is the first stage in moving toward the so-called Internet of Things and an interlinked world. Movement of data is seen as a way of dealing with “big data” in the future.

Here in Estonia, too, the goal could be a “linked Estonia” – an interoperable Estonian-ICT-developed network of individuals, organizations, devices, knowledge, information systems and linked data. Creating it will require a breakthrough on such fronts as high-speed data networks and smart devices, interoperable information systems, knowledge networks, semantic networks, linked data, open government technologies (open standards and data, free software).

Open data for re-use

Public sector institutions generate, collect or retain a large quantity of data and information, such as statistics, spatial data, economic figures, environment data, archive materials, books and art collections. Today these resources are to a very great extent digitized and represent a major asset for development of new products and services where they are used as raw input. There is particular interest in re-use of dynamic data in public sector registers.

According to a study commissioned by the European Union [1], if public sector information in the EU’s 27 countries moves toward greater openness and easier access, it will be possible to achieve economic benefits of around 40 billion euros per year. The market for re-use in Europe is growing by 7 to 40 percent a year. Vice-President of the European Commission Neelie Kroes has called open data the “new gold” [2]: “If oil was black gold, re-use of data could be new gold for Europe.” Opening public sector data will allow the private sector to mash it up with other data and create new commercial services with added value. The public sector could focus on its main activities and stop competing with the private sector. But presenting data in a re-usable form will entail expenditures. Based on the European Commission study, 1.4 billion euros of public sector investment would increase Europe’s GDP by 140 billion euros. Thus every cent invested would increase GDP by a euro.

Open data: a rediscovered gold mine

The topic of data re-use is not new. It began to be discussed in the late 1950s. During the Soviet era, the Information Institute re-used one million report records a year held on magnetic media [3]. The first instance of re-use of a register in re-independent Estonia was the technological solution for the State Gazette developed in 1995-1996. In this case, WordPerfect office software was used to publish the gazette on paper. The WordPerfect files were converted into SGML (XML is a derivative of SGML), digitally signed and made available via an FTP server for free public use. The more active re-users of legal acts were the Government Office’s document management system, the State Gazette’s online database, the IBS search system and EstLex. In recent years, many countries have discovered that opening data is a path to economic stimulus, and they have launched extensive projects that support re-use of data:

• The European Commission’s policy on open data [4]

• Study commissioned by the Commission: “Towards a pan EU data portal – data.gov.eu” [5]

• Principles of open data in Great Britain [6]

• W3C principles of open data [7]

• Recommendations of the Open Government Data development group [8]

• US and UK recommendations to the OECD with regard to open data policy [9]

• OKFN Open Data manual (legal affairs, social affairs, technology)

Many countries, regions and local governments have created frameworks for re-use and websites that simplify re-use:

• European open data directory http://publicdata.eu

• US open data directory http://data.gov

• UK open data directory http://data.gov.uk

• Australian open data directory http://data.gov.au

• Canadian open data directory http://data.gc.ca

• Kenyan open data directory http://opendata.go.ke

• Norwegian open data directory http://data.norge.no

• Dutch open data directory http://data.overheid.nl

• New Zealand open data directory http://data.govt.nz

• Italian open data directory http://data.gov.it

• French open data directory http://data.gouv.fr

• Swedish open data directory (initiative of an individual) http://www.opengov.se

• Philadelphia area open data directory http://opendataphilly.org

• Helsinki Region Infoshare http://www.hri.fi/en

• CKAN open data directory repository http://thedatahub.org

Although opening data for re-use will result in additional expenditures, politicians have recognized the strong influence it will have on national economies and have started actively investing in the creation and development of open data. On his first day in office, US President Barack Obama signed a memorandum on an open and transparent government under which the public sector opened its data for re-use. By autumn 2011, the US open data directory consisted of 390,000 datasets.

What is open data?

Open data and data sets. Data published for re-use is called open data. This term covers machine-readable data that is freely available for everyone over websites and which is not protected by patents or restrictions on use or distribution. If legislation does not specify a fee for obtaining the data, the open data can be obtained free of charge and without access restrictions.

Formats that can be opened and modified by freeware applications are also well-suited for re-use.

The Public Information Act [10] makes it obligatory for a government department to release its unrestricted information to the public via the department’s website, document register and databases. In addition, the public sector has the obligation to release information in response to requests for information. Here we take open data to mean information that is proactively published in open formats. As a rule, no request for information needs to be submitted in order to download open data.

Publication of officially generated data has several important objectives, the most specific one being to serve the interest of individuals, companies and the third sector in simply viewing existing data or using it in software development to generate added value in some field.

All data generated by government departments and local governments is subject to being made public, provided that its public use is not expressly prohibited and that it contains data other than personal data. Where data consists of both personal data and other data, only the latter part is made public.
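The rule that only the non-personal part of mixed data is published can be illustrated with a minimal sketch. The field names and the choice of which fields count as personal data are assumptions made for illustration only; in practice that decision follows from legislation, not from code.

```python
# Hypothetical field names: which fields count as personal data is an
# assumption made for this illustration.
PERSONAL_FIELDS = {"name", "personal_id", "home_address"}


def strip_personal_data(record, personal_fields=PERSONAL_FIELDS):
    """Return a copy of the record without the fields treated as personal data."""
    return {key: value for key, value in record.items() if key not in personal_fields}


# Invented register rows mixing personal and non-personal data.
register_rows = [
    {"permit_no": "A-17", "issued": "2011-09-01", "name": "Person One", "personal_id": "0001"},
    {"permit_no": "A-18", "issued": "2011-09-03", "name": "Person Two", "personal_id": "0002"},
]

# Only the non-personal part of each row is kept for publication.
public_rows = [strip_personal_data(row) for row in register_rows]
print(public_rows)
```

Removing named fields is only the simplest case; whether the remaining fields could still identify a person is a separate question that this sketch does not address.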

In the context of open data, data that forms an integral whole is called a dataset. This includes contract texts, regulation texts, collections of metadata on correspondence, budget and statistics files, databases converted to open format or open network services that issue data from registers. It is not reasonable to treat an individual agreement or regulation as a dataset, whereas an individual database can be one. In the case of some datasets, it is sufficient for the user to have access to the data (for reading or copying), while in the case of others there may be a strong interest in re-using the data. Below, the fields are arranged (pursuant to the OECD’s 2006 analysis) according to their re-use value in ascending order:

• culture (libraries, archives, museums, broadcasting)

• politics (press releases, strategies, green books)

• education (lectures, textbooks and study materials)

• science and research (research at universities, institutes and public sector)

• legal information (court, legal acts, patents, trademarks, rights and obligations)

• nature (biology, ecological, geological and geophysical information, information on energy resources)

• agriculture, forestry, fishing

• tourism, accommodations and entertainment

• traffic, transport

• social information (statistics, demographics, health, education)

• business and the economy

• meteorology, environmental information

• spatial data

Technically, a published dataset may be a collection of human-readable text files (such as a collection of legislation or regulations, official notices or contracts) or machine-readable data (such as a database exported to files in CSV or XML format, or a web service that allows all data to be searched for and downloaded in JSON or XML format).
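As a rough illustration of the machine-readable case, the sketch below exports the same records to CSV and JSON using only the Python standard library. The records, field names and function names are invented for the example and do not come from any actual register.

```python
import csv
import io
import json

# Invented sample records standing in for rows exported from a database.
rows = [
    {"station": "Tallinn", "date": "2011-10-01", "temp_c": 9.5},
    {"station": "Tartu", "date": "2011-10-01", "temp_c": 8.1},
]


def to_csv(records):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize the same records to JSON text."""
    return json.dumps(records, ensure_ascii=False, indent=2)


csv_text = to_csv(rows)
json_text = to_json(rows)
print(csv_text)
print(json_text)
```

Both outputs carry the same content; CSV suits spreadsheet tools, while JSON preserves value types such as numbers and is convenient for web services.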

The user must be able to do the following:

• browse and search public datasets for a dataset of interest;

• download a dataset, once found, either as a whole or in parts via the search system the service offers, without having to negotiate for rights or obtain passwords. In exceptional cases, a fee may be charged for downloading a dataset;

• continue to use the dataset freely, with the right to download it to one’s computer and use it in applications (both free and paid) without having to pay additionally for it or needing permission to do so.
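The browse-and-search requirement in the list above can be sketched as a keyword search over a small catalogue of dataset descriptions. The catalogue entries, the example.org URLs and the function name are hypothetical; real directories such as those listed earlier expose their own search interfaces.

```python
# Hypothetical catalogue entries; titles, keywords and URLs are invented.
CATALOGUE = [
    {"title": "State budget 2011", "keywords": ["budget", "finance"],
     "url": "http://example.org/budget-2011.csv"},
    {"title": "Weather observations", "keywords": ["meteorology", "environment"],
     "url": "http://example.org/weather.json"},
    {"title": "Municipal budget execution", "keywords": ["budget", "local government"],
     "url": "http://example.org/municipal.csv"},
]


def search_catalogue(catalogue, term):
    """Return entries whose title or keywords contain the term (case-insensitive)."""
    term = term.lower()
    return [
        entry for entry in catalogue
        if term in entry["title"].lower()
        or any(term in keyword.lower() for keyword in entry["keywords"])
    ]


matches = search_catalogue(CATALOGUE, "budget")
for entry in matches:
    # Each match carries a direct download URL; no password or negotiation step.
    print(entry["title"], entry["url"])
```

Under the principles above, fetching a matched URL would be a plain download with no credentials involved, which is why each entry stores a direct link rather than a request form.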

A public sector institution that creates and publishes a dataset has no obligation to offer data users additional amenities such as conversion to a suitable format, building a special network service, translation etc. Nor do officials have the obligation to ensure that data is correct or up to date. Instead, the publisher has to explain in brief the nature of the data and document the expected frequency of updates.

Licence and fee for dataset. An open dataset must have a licence that allows it to be used, processed and distributed without restrictions; redistribution may be free of charge or for a fee, at the user’s discretion. Specifically, we recommend that a Creative Commons licence be selected as the licensing option [11]. Of these, we recommend above all the CC BY 3.0 licence [12]. In licensing a work this way, the licensor is the author or the copyright holder, and the licensee is the public at large. You have the right to copy (reproduce) a work, to distribute it, perform it and communicate it to the public, and to adapt, arrange and otherwise develop it, including by creating derivative works, on condition that the author is credited.

It is advisable to publish open data for free download, but the publisher has the right to charge a fee for downloading the data in the cases set forth in legislation.
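The publisher’s obligations described above (a brief description of the data, the expected update frequency, a licence and any fee) could be recorded in a small machine-readable descriptor. The sketch below is one possible shape, not a prescribed format; the field names and the list of required fields are assumptions.

```python
import json

# Assumed minimum set of fields a publisher fills in; not an official schema.
REQUIRED_FIELDS = ("title", "description", "update_frequency", "licence")


def missing_fields(descriptor):
    """List the required fields that are absent or empty in a descriptor."""
    return [field for field in REQUIRED_FIELDS if not descriptor.get(field)]


# Invented example descriptor; the licence URL is the CC BY 3.0 address cited in the text.
descriptor = {
    "title": "Weather observations",
    "description": "Hourly air temperature readings from weather stations.",
    "update_frequency": "daily",
    "licence": "http://creativecommons.org/licenses/by/3.0",
    "fee": None,  # None here means the dataset is offered free of charge
}

print(missing_fields(descriptor))
print(json.dumps(descriptor, indent=2))
```

A check like `missing_fields` lets a directory reject incomplete descriptors before listing a dataset, without placing any further burden on the publisher.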

Principles for publishing datasets. When publishing data, a compromise between two objectives must be found:

[1] http://epsiplatform.eu/content/review-recent-psi-re-use-studies-published

[2] Speech by Neelie Kroes, “Data is the new gold”, 12.12.2011: http://europa.eu/rapid/pressReleasesAction.do?reference=SPEECH/11/872&format=HTML&aged=0&language=EN&guiLanguage=en

[3] Uuno Vallner. Retrospektiivsed otsisüsteemid (Retrospective search systems). Tallinn, Estonian Information Institute, 1985 (in Estonian)

[4] http://ec.europa.eu/information_society/policy/psi/index_en.htm

[5] http://ec.europa.eu/information_society/policy/psi/docs/pdfs/towards_an_eu psi_portals_v4_final.pdf

[6] http://data.gov.uk/opendataconsultation

[7] http://www.w3.org/TR/gov-data

[8] http://www.opengovdata.org

[9] https://usoecd.cms.getusinfo.com/data.html

[10] https://www.riigiteataja.ee/akt/122032011010?leiaKehtiv (in Estonian)

[11] http://creativecommons.org

[12] http://creativecommons.org/licenses/by/3.0
