Talk

Designing a modern Data Management System in Django

Thursday, May 28

12:25 - 12:55
Room: Lasagna
Language: English
Audience level: Intermediate
Elevator pitch

Design and technical challenges of building a data management system in Django for the Norwegian Institute for Nature Research, following the FAIR principles. The solution relies on GDAL and on cloud-native file formats and technologies.

Abstract

We present the journey of designing and developing the Data Management System for the Norwegian Institute for Nature Research (NINA): the challenges we encountered, the solutions we evaluated, the decision to develop a new system, the enabling technologies we chose to build upon, and why we released it as open-source software.

https://github.com/NINAnor/dms: NINA Data Management System

The system links datasets with existing data sources (scientific data, the administrative ERP for projects, users, web services) and lets users share datasets via PyCSW and PyGeoAPI, both within the institution and in national data catalogs.

The system is designed to be format agnostic, letting users bring their own storage backends and protocols, but it provides additional functionality when cloud-native formats such as Apache Parquet and COG (Cloud Optimized GeoTIFF) are used. GDAL is used to query both spatial and non-spatial files. The NINA DMS supports various metadata schemas (ISO 19115, ISO 19139, DataCite) and harvesting from different sources (such as IPT).
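As a rough illustration of how a format-agnostic system might hand remote files to GDAL, the sketch below rewrites a storage URI into a GDAL virtual file system path (`/vsicurl/`, `/vsis3/`, `/vsigs/`, `/vsiaz/`). It is not taken from the NINA DMS code base: the helper name `to_gdal_path`, the `VSI_PREFIXES` table, and the example URLs are all hypothetical, and only a handful of schemes are covered.

```python
from urllib.parse import urlparse

# Hypothetical lookup table: URI scheme -> GDAL virtual file system prefix.
# The prefixes themselves are standard GDAL VSI handlers.
VSI_PREFIXES = {
    "s3": "/vsis3/",
    "gs": "/vsigs/",
    "az": "/vsiaz/",
}


def to_gdal_path(uri: str) -> str:
    """Translate a storage URI into a path that GDAL can open.

    Illustrative helper, not part of the NINA DMS.
    """
    parsed = urlparse(uri)
    scheme = parsed.scheme.lower()

    # Plain HTTP(S) resources are read through /vsicurl/, which keeps the full URL.
    if scheme in ("http", "https"):
        return "/vsicurl/" + uri

    # Object stores drop the scheme: s3://bucket/key -> /vsis3/bucket/key
    if scheme in VSI_PREFIXES:
        return VSI_PREFIXES[scheme] + parsed.netloc + parsed.path

    # Anything else (for example a local path) is passed through unchanged.
    return uri


if __name__ == "__main__":
    for example in (
        "https://example.org/data/dem.tif",
        "s3://my-bucket/observations/birds.parquet",
        "/data/local/sites.gpkg",
    ):
        print(example, "->", to_gdal_path(example))
```

The resulting string could then be passed to `gdal.OpenEx` from the `osgeo` package. Reading Parquet this way assumes a GDAL build that includes the Parquet driver.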

Integration and synchronization with other services are managed through a set of data pipelines (https://github.com/NINAnor/miljodata-datasync: A set of pipelines to move data from different sources).

Tags: GEO and GIS, Web Frameworks
Participant

Niccolò Cantù

Python Developer @ Norwegian Institute for Nature Research (NINA). I’m interested in maps and catalogs, web development, GIS, databases, and data engineering.