Research Data Management

The LRF attaches great importance to efficient research data management (RDM), which ensures the quality, traceability and reusability of our research data. The core of our approach is early, structured data planning before the experimental phase begins. Together with the research teams, we define data types, data structures and documentation strategies. Collecting and versioning metadata consistently in parallel with data collection ensures a transparent data history and enables comprehensive documentation of data creation.

To ensure the interoperability of our data, we use an abstract JSON data model that is maintained and continuously developed by our data stewardship team. This model forms the basis for our institute-wide data structure and enables efficient data management across the entire research process.

This approach ensures that our RDM meets the requirements of the German Research Foundation (DFG) and is based on the FAIR principles (Findable, Accessible, Interoperable, Reusable).

Data planning

Based on the data planning at the beginning of the research project, we define a clear data hierarchy that takes into account the entire life cycle of the research data.

Raw data is mainly stored on local drives, which are regularly synchronized with network storage accessible to the chair. Depending on the data type (e.g. code for models or evaluation tools), the data is also managed in a Git repository.

In the course of finalizing the project, the raw data is processed, which increases data quality and reduces data volume. All processed data intended for long-term storage is stored on the DataHub at RPTU.

Once the project is complete, the target dataset and the associated scientific results are published. We aim for open access publication via conference papers and preprints, and we submit the dataset to a Zenodo repository. In the case of particularly large datasets that cannot be published in full due to their size (e.g. extensive image data from high-speed cameras), we publish the evaluation algorithms used together with a representative raw dataset. Finally, the results are published in a peer-reviewed journal article.
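
Submitting a dataset to Zenodo can be scripted via Zenodo's REST deposition API. The following is a minimal sketch, not our production workflow; it uses the third-party requests library, and the access token, file name and metadata values are placeholders:

    import requests

    ZENODO_API = "https://zenodo.org/api/deposit/depositions"
    TOKEN = "..."  # placeholder: a personal access token

    # 1. Create an empty deposition.
    r = requests.post(ZENODO_API, params={"access_token": TOKEN}, json={})
    r.raise_for_status()
    deposition = r.json()

    # 2. Upload the dataset file into the deposition's file bucket.
    bucket = deposition["links"]["bucket"]
    with open("target_dataset.zip", "rb") as fp:  # placeholder file name
        requests.put(f"{bucket}/target_dataset.zip", data=fp,
                     params={"access_token": TOKEN}).raise_for_status()

    # 3. Attach minimal descriptive metadata.
    metadata = {"metadata": {
        "title": "Example target dataset",        # placeholder
        "upload_type": "dataset",
        "description": "Processed target dataset of the project.",
        "creators": [{"name": "Doe, Jane"}],      # placeholder
    }}
    requests.put(f"{ZENODO_API}/{deposition['id']}",
                 params={"access_token": TOKEN}, json=metadata).raise_for_status()

    # A final POST to .../actions/publish would make the record public
    # and mint a DOI.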

Data history

The properties of the data are consistently stored in metadata records (JSON templates). This structured description enables clear identification and referencing of data records.

Where possible, data collection and relevant processing steps are linked to an electronic lab notebook (eLabFTW) via unique identifiers, which also serves as an interface to the laboratory organization. In this way, the origin, the processing steps and the tools used, such as sensors, software, test setups, models or calibration procedures, are documented in a traceable manner via the data history.

The metadata thus depicts both the individual processing steps and the methods and tools used along the data creation chain. Storage locations and archiving times are systematically recorded and managed.
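
For illustration, a metadata record along these lines might look as follows. All field names and values are invented for the example and are not the institute's actual template:

    {
      "dataset_id": "LRF-2024-EXP-042",
      "title": "Hot-wire velocity measurements, test rig A",
      "elabftw_entry": "eLabFTW-1234",
      "created": "2024-03-15",
      "creator": "J. Doe",
      "instruments": ["hot-wire anemometer HW-3", "calibration unit C-1"],
      "processing_steps": [
        {"step": "outlier removal", "tool": "filter_outliers.py", "version": "1.2"},
        {"step": "calibration", "tool": "calibrate.py", "version": "2.0"}
      ],
      "storage_location": "DataHub at RPTU",
      "archiving_period": "10 years"
    }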

Structuring

Our research data management is based on the institute's own metadata model, which is implemented as a hierarchical JSON template. The structure is conceptually based on ISA (Investigation-Study-Assay) modeling and combines a fixed project folder structure with machine-readable metadata.

 

Project structure

For each research project, a fixed folder structure is created in which all employees work. The project folder contains:

  • an administration folder (Project_Administration),
  • several thematic subprojects (Subproject_1, Subproject_2, ...),
  • a Python script (Assembler.py),
  • and all meta information belonging to the project in the MetaData.json file.

The subprojects (Subproject_1, Subproject_2, ...) in turn contain data folders for raw data and processed data.
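
For illustration, such a layout could be scaffolded with a short Python script. This is a sketch, not part of the institute's tooling; the project name and the names of the data folders (RawData, ProcessedData) are assumptions, since the text only specifies folders for raw and processed data:

    from pathlib import Path

    def create_project(root: str, n_subprojects: int = 2) -> None:
        """Scaffold the fixed project folder layout described above."""
        project = Path(root)
        (project / "Project_Administration").mkdir(parents=True, exist_ok=True)
        for i in range(1, n_subprojects + 1):
            sub = project / f"Subproject_{i}"
            (sub / "RawData").mkdir(parents=True, exist_ok=True)
            (sub / "ProcessedData").mkdir(parents=True, exist_ok=True)
        # MetaData.json is generated later by Assembler.py.
        (project / "Assembler.py").touch()

    create_project("Example_Project")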

 

Metadata snippets

The metadata is stored decentrally as JSON snippets in the respective project folders. This keeps metadata maintenance simple, accessible and transparent.

The administration folder contains metadata on the project organization, structured according to the DFG requirements (see the example snippet after this list):

  1. Data description
  2. Documentation and data quality
  3. Storage and technical backup
  4. Legal framework
  5. Data exchange and long-term accessibility
  6. Responsibilities and resources
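
A Project_Administration snippet organized along these six categories could look as follows; this is a sketch with illustrative field names and values, not the actual template:

    {
      "data_description": {
        "data_types": ["measurement data", "simulation results"],
        "expected_volume": "2 TB"
      },
      "documentation_and_data_quality": {
        "standards": ["institute JSON metadata model"],
        "quality_assurance": "review of processed data"
      },
      "storage_and_technical_backup": {
        "locations": ["local drives", "network storage", "DataHub at RPTU"]
      },
      "legal_framework": {"licenses": ["CC BY 4.0"], "gdpr_relevant": false},
      "data_exchange_and_long_term_accessibility": {"repository": "Zenodo"},
      "responsibilities_and_resources": {"data_steward": "T. Reviol"}
    }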

The subprojects contain additional metadata (see the example snippet after this list):

  • Project content and responsible parties
  • Datasets (raw data, simulations, models, derived data)
  • Sensors, software, models or calibrations used
  • References to publications or external datasets
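
A corresponding subproject snippet might look like this; again, all names, paths and the DOI are placeholders invented for the example:

    {
      "subproject": "Subproject_1",
      "topic": "Flow measurements on test rig A",
      "responsible": ["J. Doe"],
      "datasets": [
        {"type": "raw data", "path": "RawData/run_01"},
        {"type": "derived data", "path": "ProcessedData/run_01_filtered"}
      ],
      "equipment": ["hot-wire anemometer", "calibration procedure C-1"],
      "references": ["doi:10.xxxx/placeholder"]
    }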

 

Automatic metadata aggregation

The Python script Assembler.py recursively searches the entire project structure and identifies all existing JSON snippets. These are automatically merged into a complete metadata file. The resulting MetaData.json is archived together with the project and documents the complete data history even if individual raw data is no longer available for storage or data protection reasons.
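
The following sketch illustrates how such an aggregation script could work in principle; the actual Assembler.py may differ in detail:

    import json
    from pathlib import Path

    def assemble(project_root: str) -> None:
        """Collect all JSON snippets below project_root into one MetaData.json."""
        root = Path(project_root)
        merged = {}
        for snippet in sorted(root.rglob("*.json")):
            if snippet.name == "MetaData.json":
                continue  # skip a previously assembled output file
            # Key each snippet by its path relative to the project root so the
            # merged file records where each piece of metadata came from.
            key = snippet.relative_to(root).as_posix()
            merged[key] = json.loads(snippet.read_text(encoding="utf-8"))
        out = root / "MetaData.json"
        out.write_text(json.dumps(merged, indent=2, ensure_ascii=False),
                       encoding="utf-8")

    assemble("Example_Project")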


Data stewardship

Our data steward is responsible for the strategic development of the institute's own research data management. By designing and maintaining the metadata model and developing structured data processes, he contributes significantly to the traceability and long-term usability of our research data.

His tasks include coordinating data management plans, further developing the institute-wide data infrastructure, and providing guidance on the structuring, documentation and archiving of research data. He also supports the publication of scientific datasets. The aim is to ensure compliance with established standards based on the DFG guidelines and the FAIR principles.

 

Contact

PD Dr.-Ing. habil. Thomas Reviol

 

Responsibilities
  • Supervision of the institute's own metadata model
  • Support in the creation of data management plans (DMP)
  • Advice on the DFG guidelines
  • Advice on FAIR principles
  • Support with data publication and DOI allocation
  • Quality assurance of metadata
  • Interface to IT infrastructure and repositories
  • Alignment of data models with the General Data Protection Regulation (GDPR)