About the Cities2030 data repository
The ICT team of WP6 launched the Cities2030 data repository in September 2022. This component is part of the Single Click CRFS Platform (S2CP), a CRFS management platform for data collection, analysis, and representation in multiple interfaces. The platform will be used as a data-driven decision-support tool and will include various components, which are detailed in the following link:
Publishing and consuming open data is a cornerstone of application development and of building an innovation ecosystem. This section explains how the Cities2030 data repository was created and how users can expose their data by publishing it in this S2CP component.
The Cities2030 data repository is an open-source solution based on CKAN (Comprehensive Knowledge Archive Network), a well-known open data publication platform widely used by cities, public authorities, and organizations. The repository enables the publication, management, and consumption of open data, usually, but not only, as static datasets (CSV, XLSX, etc.). It allows users to catalogue, upload, and manage open datasets and data sources, and supports searching, browsing, visualizing, and accessing open data.
Easy data access
This component incorporates a central keyword search that can be faceted by tags, location, format, license, publishing organization, etc. The embedded browser allows searching by groups, keywords, and publishers, providing a standardized interface for viewing datasets and downloading them through links and direct access. Preview and data exploration are possible for some data formats.
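Because the repository is built on CKAN, the same faceted keyword search should also be reachable through CKAN's Action API (package_search). The following is a minimal sketch under stated assumptions, not the project's own client: the base URL is a placeholder, the filter values are invented for illustration, and the facet fields listed are CKAN's usual defaults, which this installation may have changed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder address (assumption): replace with the real repository URL.
BASE_URL = "https://data.example.org"

# Facet fields that a default CKAN installation exposes (assumption).
DEFAULT_FACETS = ("tags", "organization", "res_format", "license_id")


def build_search_url(keyword, filters=None, facets=DEFAULT_FACETS, rows=10):
    """Build a package_search URL for a keyword query narrowed by facet filters."""
    params = {"q": keyword, "rows": rows}
    if filters:
        # fq restricts the results, e.g. tags:"community" AND res_format:"CSV"
        params["fq"] = " AND ".join(
            f'{field}:"{value}"' for field, value in filters.items()
        )
    if facets:
        # facet.field takes a JSON list of the fields to count values for
        params["facet.field"] = json.dumps(list(facets))
    return f"{BASE_URL}/api/3/action/package_search?{urlencode(params)}"


def search(keyword, filters=None):
    """Run the query and return the total hit count and the dataset names."""
    with urlopen(build_search_url(keyword, filters)) as response:
        result = json.load(response)["result"]
    return result["count"], [dataset["name"] for dataset in result["results"]]
```

For example, search("gardening", {"res_format": "CSV"}) would ask for datasets that match the keyword and offer a CSV resource. Keeping the URL builder separate from the network call makes the query easy to inspect before it is sent.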
This S2CP component allows users to control the visibility of their datasets, enabling the creation of private datasets that only certain users can access. This is central to supporting access control and GDPR compliance.
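The visibility rule can be modelled in a few lines. This is only an illustrative sketch of the rule stated later in this section (public datasets are visible to everyone, private ones only to logged-in members of the owning organization). The Dataset and User classes and the can_view function are hypothetical names, not CKAN's real authorization code, which handles further cases such as administrators.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Dataset:
    name: str
    organization: str      # the single organization that owns the dataset
    private: bool = False  # public unless marked otherwise


@dataclass
class User:
    name: str
    organizations: Set[str] = field(default_factory=set)  # memberships


def can_view(dataset: Dataset, user: Optional[User]) -> bool:
    """Return True if the user (None means not logged in) may see the dataset."""
    if not dataset.private:
        return True  # public datasets are visible to everyone
    # private datasets require a logged-in member of the owning organization
    return user is not None and dataset.organization in user.organizations
```

An anonymous visitor is represented by None, so can_view(private_dataset, None) is always False.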
The data repository has advanced geospatial features covering data preview, search, and discovery. Where structured data with location information is loaded into the Data Store, the data can be plotted on an interactive map. The screenshot shows a map view of a sample dataset, with markers for individual data points and full details shown for each record as it is selected. A user searching for datasets can filter the results by geographical location, specifying a bounding box to limit the area of interest. Different coordinate geometries and formats are supported. To integrate datasets with other systems, metadata can be encoded according to the INSPIRE standard and major metadata schemas (ISO 19139 and GEMINI 2.1), and OGC’s CSW standard is also supported. The architecture is extensible, making it easy to support other standards and distribution services.
As an example, the following figure shows a dataset containing community gardening sites in Salford, UK, and its preview in the Data Repository component.
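The bounding-box filter can be illustrated with a small sketch. It assumes the repository uses the ckanext-spatial extension, whose dataset search accepts an ext_bbox parameter in min_lon,min_lat,max_lon,max_lat order; whether this installation exposes it that way is an assumption. The base URL is a placeholder, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Placeholder address (assumption): replace with the real repository URL.
BASE_URL = "https://data.example.org"


def make_bbox(min_lon, min_lat, max_lon, max_lat):
    """Validate WGS84 corner coordinates and return them as a tuple.

    Boxes that cross the antimeridian are not handled in this sketch.
    """
    if not -180 <= min_lon <= max_lon <= 180:
        raise ValueError("longitudes must satisfy -180 <= min <= max <= 180")
    if not -90 <= min_lat <= max_lat <= 90:
        raise ValueError("latitudes must satisfy -90 <= min <= max <= 90")
    return (min_lon, min_lat, max_lon, max_lat)


def contains(bbox, lon, lat):
    """Return True if the point lies inside the box (edges included)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def bbox_search_url(bbox, keyword=None):
    """Build a dataset search URL limited to the given bounding box."""
    params = {"ext_bbox": ",".join(str(value) for value in bbox)}
    if keyword:
        params["q"] = keyword
    return f"{BASE_URL}/api/3/action/package_search?{urlencode(params)}"
```

The contains helper mirrors, for a single point, the kind of test the server applies to dataset extents; it can be used locally to check that a known location falls inside the box before sending the query.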
Single Sign On
The OAuth2 extension allows site visitors to log in using an authentication server. In this way, other S2CP components, such as Cities2030 Community, can be used as the identity provider, enabling the single sign-on approach required for access control management of CKAN datasets. This feature is still under development.
Thanks to the well-established CKAN framework, the Cities2030 data repository is translated into over 10 languages and supports international characters, multilingual search, string translations, and more, as in the CKAN-based European Commission Open Data Portal.
Cities2030 data repository usage
The concepts of Organizations, Datasets, and Groups have been defined for the project, according to Cities2030 actors.
- Organizations are entities that control who can see, create, and update datasets in the Data Repository. Each dataset can belong to only one organization, and each organization controls access to its datasets.
- Datasets are collections of data. Datasets can be marked as public or private. Public datasets are visible to everyone. Private datasets can only be seen by logged-in users who are members of the dataset’s organization. Private datasets are not shown in general dataset searches but are shown in dataset searches within the organization.
- Groups are created to manage collections of datasets. They are used to catalogue datasets by CRFS location, such as Labs. This is a simple way to help data consumers find and search published datasets.
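The three concepts above map onto fields of CKAN's package_create action: owner_org for the organization, private for visibility, and groups for the catalogue entries. The following is a hedged sketch of how a dataset could be registered through the API, not the project's documented procedure. The base URL is a placeholder, the function names are hypothetical, and the API token is assumed to belong to a user allowed to create datasets in the organization.

```python
import json
from urllib.request import Request, urlopen

# Placeholder address (assumption): replace with the real repository URL.
BASE_URL = "https://data.example.org"


def build_dataset_payload(name, title, organization_id, group=None,
                          private=True, tags=()):
    """Assemble the JSON body for CKAN's package_create action."""
    payload = {
        "name": name,                  # URL-friendly unique identifier
        "title": title,                # human-readable title
        "owner_org": organization_id,  # the single owning organization
        "private": private,            # True hides it from general searches
        "tags": [{"name": tag} for tag in tags],
    }
    if group:
        # groups must already exist; each is identified here by its name
        payload["groups"] = [{"name": group}]
    return payload


def create_dataset(payload, api_token):
    """Send the payload to the repository and return the created dataset."""
    request = Request(
        f"{BASE_URL}/api/3/action/package_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": api_token},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)["result"]
```

In this sketch datasets default to private, so nothing becomes publicly visible until the publisher passes private=False; that default is a design choice of the example, not a stated policy of the repository.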