General Documentation

General

The Goethe University Data Repository (GUDe) is the institutional research data repository of Goethe University.

  • GUDe offers all members of the university the opportunity to permanently store (for at least 10 years), share with a limited group of people or publish research data.
  • There is no peer review of the submitted research data in GUDe.
  • GUDe is particularly suitable for archiving curated research data that represent an important (intermediate) outcome of a research project.
  • The publication of research data on GUDe can be done independently or in addition to publication on other platforms/publishers.

Terminology

The following terms are used in this documentation and in the Contract Portal to describe the different data packages or information units and their function in the context of GUDe:

| Term | Refers to ... |
| --- | --- |
| Dataset | Public datasets without research data (e.g., projects, institutions, and persons) |
| Digital Object | Research data with accompanying metadata |
| Submission | Both datasets and digital objects |
| Publication Data | Another term for the metadata of a submission |

Organization model

In GUDe, all submissions are organized into 'Communities' and subordinate 'Collections' within these communities. A community can contain further communities as well as individual collections. Individual submissions are in turn assigned to collections, the vast majority of which hold research data.

The hierarchy of GUDe's communities and collections is based on the organisational structure of Goethe University (see the following diagram).

    flowchart TD
    TOP["Goethe University Frankfurt"] --> A1[Central Facilities]
    TOP["Goethe University Frankfurt"] --> A2["Cooperating Institutions<br>and Scientific Centers"]
    TOP["Goethe University Frankfurt"] --> A3[Faculties]

    A1[Central Facilities] --> A1a[...]
    A1[Central Facilities] --> A1b[...]

    A2["Cooperating Institutions<br>and Scientific Centers"] --> A2a[...]
    A2["Cooperating Institutions<br>and Scientific Centers"] --> A2b[...]

    A3[Faculties] --> A3a[F01 - Faculty of Law]
    A3[Faculties] --> A3b[F02 - Faculty of<br>Economics<br>and Business]
    A3[Faculties] --> A3c[...]

    A3a[F01 - Faculty of Law] --> A3aa[Faculty of Law: Research Data]
    A3aa[Faculty of Law: Research Data] --> A3aaa[Digital Object]
    A3aa[Faculty of Law: Research Data] --> A3aab[...]
    A3aaa[Digital Object] --> F1aaMeta[Metadata]
    A3aaa[Digital Object] --> F1aaData[Research Data]

    A3b[F02 - Faculty of<br>Economics<br>and Business] --> A3ba[Faculty of<br>Economics<br>and Business:<br>Research Data]
    A3ba[Faculty of<br>Economics<br>and Business:<br>Research Data] --> A3baa[Digital Object]
    A3ba[Faculty of<br>Economics<br>and Business:<br>Research Data] --> A3bab[...]
    A3baa[Digital Object] --> F1baMeta[Metadata]
    A3baa[Digital Object] --> F1baData[Research Data]

    style TOP fill:#ff7474,stroke:#ffcc10;

    style A1 fill:#ff7474,stroke:#ffcc10;
    style A2 fill:#ff7474,stroke:#ffcc10;
    style A3 fill:#ff7474,stroke:#ffcc10;
    style A1a fill:#ff7474,stroke:#ffcc10;
    style A1b fill:#ff7474,stroke:#ffcc10;
    style A2a fill:#ff7474,stroke:#ffcc10;
    style A2b fill:#ff7474,stroke:#ffcc10;
    style A3a fill:#ff7474,stroke:#ffcc10;
    style A3b fill:#ff7474,stroke:#ffcc10;
    style A3c fill:#ff7474,stroke:#ffcc10;

    style A3aa fill:#ffc474,stroke:#ffcc10;
    style A3ba fill:#ffc474,stroke:#ffcc10;

    style A3aaa fill:#96ff74,stroke:#ffcc10;
    style A3aab fill:#96ff74,stroke:#ffcc10;
    style A3baa fill:#96ff74,stroke:#ffcc10;
    style A3bab fill:#96ff74,stroke:#ffcc10;

A special collection exists in the GUDe organisational model for the data sets on persons, institutions and projects.

    flowchart TD
    TOP["Goethe University Frankfurt"] --> A4[Organizations]
    TOP["Goethe University Frankfurt"] --> A5[Persons]
    TOP["Goethe University Frankfurt"] --> A6[Projects]

    A4[Organizations] --> A4a[Dataset]
    A4a[Dataset] --> A4aMeta[Metadata]
    A4[Organizations] --> A4b[...]

    A5[Persons] --> A5a[Dataset]
    A5a[Dataset] --> A5aMeta[Metadata]
    A5[Persons] --> A5b[...]

    A6[Projects] --> A6a[Dataset]
    A6a[Dataset] --> A6aMeta[Metadata]
    A6[Projects] --> A6b[...]

    style TOP fill:#ff7474,stroke:#ffcc10;

    style A4 fill:#ffc474,stroke:#ffcc10;
    style A5 fill:#ffc474,stroke:#ffcc10;
    style A6 fill:#ffc474,stroke:#ffcc10;

    style A4a fill:#96ff74,stroke:#ffcc10;
    style A4b fill:#96ff74,stroke:#ffcc10;
    style A5a fill:#96ff74,stroke:#ffcc10;
    style A5b fill:#96ff74,stroke:#ffcc10;
    style A6a fill:#96ff74,stroke:#ffcc10;
    style A6b fill:#96ff74,stroke:#ffcc10;

Legend (for both diagrams):

| Color | Meaning |
| --- | --- |
| Red | Community |
| Yellow | Collection |
| Green | Submission |

Research Data

Structure

Before submitting research data in GUDe, the data should be prepared for easy re-use and all files should be given a meaningful designation without special characters.
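The naming recommendation above can be checked with a small script before upload. The following is a minimal sketch, not a GUDe tool: GUDe does not state exactly which characters count as "special", so the allowed set (ASCII letters, digits, dot, hyphen, underscore) and the helper names `is_safe_filename` and `suggest_filename` are assumptions made for illustration.

```python
import re
import unicodedata

# Assumption: "special characters" means anything outside ASCII letters,
# digits, dot, hyphen and underscore. GUDe does not publish an exact rule.
SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def is_safe_filename(name: str) -> bool:
    """Return True if the file name uses only the assumed safe characters."""
    return bool(SAFE_NAME.match(name))


def suggest_filename(name: str) -> str:
    """Propose a safe variant of a file name (hypothetical helper)."""
    # Transliterate common German characters before stripping accents.
    for src, dst in (("ä", "ae"), ("ö", "oe"), ("ü", "ue"),
                     ("Ä", "Ae"), ("Ö", "Oe"), ("Ü", "Ue"), ("ß", "ss")):
        name = name.replace(src, dst)
    # Decompose accented characters and drop everything that is not ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of disallowed characters with a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_name)
    # Remove underscores directly in front of a dot (e.g. before the extension).
    cleaned = re.sub(r"_+(?=\.)", "", cleaned)
    return cleaned.strip("_")
```

A suggested name should still be reviewed by hand, because a meaningful designation cannot be derived automatically.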

The scope and number of submissions are at the discretion of the submitter, within the limits of what is technically possible. Overlaps and multiple submissions of the same data are to be avoided. Submissions can be linked to each other or to external resources (in the metadata) to clarify connections between digital objects.

For structuring files, it is recommended to use *.zip or, even better, *.tar archives (without compression). Folder structures are not suitable for structuring in GUDe, because they are not preserved when the data is uploaded (i.e. without an archive format, all data is on the same structural level, even if it was organised in subfolders on the local file system).
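As an illustration of the archive recommendation, the sketch below packs a local folder into an uncompressed *.tar file with Python's standard `tarfile` module, so that the subfolder structure survives the upload. The function name `pack_folder` and the example paths are assumptions, not part of GUDe.

```python
import tarfile
from pathlib import Path


def pack_folder(folder: str, archive: str) -> list[str]:
    """Write `folder` into an uncompressed tar archive and return the member names."""
    root = Path(folder).resolve()
    # Mode "w" (instead of "w:gz" or "w:bz2") writes a plain tar without compression.
    with tarfile.open(archive, "w") as tar:
        # arcname keeps paths relative to the folder itself, so the archive
        # contains e.g. "my_dataset/raw/run1.csv" rather than an absolute path.
        tar.add(root, arcname=root.name)
    # Re-open the archive read-only to list what was actually stored.
    with tarfile.open(archive, "r") as tar:
        return tar.getnames()
```

The archive should be written to a location outside the folder being packed, for example `pack_folder("my_dataset", "my_dataset.tar")`, so that it does not try to include itself.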

Versioning

GUDe provides a feature for versioning digital objects. This allows users to update previous submissions without creating unnecessary redundancies in the system and without automatically affecting previous versions (to ensure traceability and compliance with the standards of good scientific practice).

File Formats

Depending on the specific collection and methodology, research data are available in various file formats, which are differently suited for long-term preservation. In accordance with good scientific practice, they should be saved in such a way that they are compatible with different programmes and can be losslessly converted into alternative formats. Other good criteria are human and machine readability as well as the assumed long-term stability (standardised format - known and widely used) of a file format.

The following table provides an up-to-date assessment of the suitability of commonly used data formats for long-term archiving. It is based on a detailed evaluation by ETH Zurich (ETH-Bibliothek) of recommendations and guidelines from international institutions with an archiving responsibility.

If unsuitable formats (fourth column) can be converted, but only with reduced functionality or loss of information, it is recommended to store the data in both formats. If it is not possible to use a recommended file format, it cannot be assumed that the data will still be usable 10 years later.

File Type Recommended Conditionally Suitable Not Suitable
Text
  • PDF/A (*.pdf, preferred subtypes 2b and 2u)
  • Plain Text (*.txt or source code, etc.) encoded as ASCII, UTF-8, or UTF-16 with Byte Order Mark (BOM)
  • XML (including XSD/XSL/XHTML, etc.; schema & character encoding included)
  • (*.pdf) with embedded fonts
  • Plain Text (*.txt, *.asc, *.c, *.h, *.cpp, *.m, *.py, *.r, etc.) (encoded in ISO 8859-1)
  • Rich Text Format (*.rtf)
  • HTML and XML (without external content)
  • Word *.docx
  • PowerPoint *.pptx
  • LaTeX and TeX (including open-source software packages with special fonts and resulting PDF)
  • OpenDocument Formats (*.odm, *.odt, *.odg, *.odc, *.odf)
  • Word *.doc
  • PowerPoint *.ppt
  • Conversion: Convert to PDF/A-2b (or PDF/A-2u)
Spreadsheets and Tables
  • Comma- or Tab-Delimited Text Files (*.csv)
  • Excel *.xlsx
  • OpenDocument Formats (*.odm, *.odt, *.odg, *.odc, *.odf)
  • Excel *.xls, *.xlsb
  • Conversion: Convert to .xlsx
Raw Data and Workspace
  • Plain Text (encoded in ASCII)
  • S-Plus (*.sdd)
  • Conversion: Text format.
  • Matlab (*.mat) from v7.3 MAT-File
  • Network Common Data Format or NetCDF (*.nc, *.cdf)
  • Hierarchical Data Format (HDF5) (*.h5, *.hdf5, *.he5)
  • Matlab Files *.mat (binary)
  • Conversion: HDF5 format
  • R Files *.RData
  • Conversion: HDF5 format (using the rhdf5 package)
Raster Graphics (Bitmap)

  • TIFF (*.tif, uncompressed, preferably TIFF 6.0+)
  • Portable Network Graphics (*.png, uncompressed)
  • JPEG2000 (*.jp2, lossless compression)
  • Digital Negative Format (*.dng)
  • TIFF (*.tif, compressed)
  • GIF (*.gif)
  • BMP (*.bmp)
  • JPEG/JFIF (*.jpg)
  • JPEG2000 (*.jp2, lossy compression)
Vector Graphics
  • SVG without JavaScript (*.svg)
  • Graphics InDesign (*.indd), Illustrator (*.ait)
  • Encapsulated Postscript (*.eps)
  • Photoshop (*.psd)
CAD
  • AutoCAD Drawing (*.dwg)
  • Drawing Interchange Format, AutoCAD (*.dxf)
  • Extensible 3D, X3D (*.x3d, *.x3dv, *.x3db)
Sound and Audio
  • WAV (*.wav) (uncompressed, pulse-code modulated)
  • Advanced Audio Coding (*.mp4)
  • MP3 (*.mp3)
Video
  • FFV1 Codec (from version 3) in Matroska Container (*.mkv)
  • MPEG-2 (*.mpg,*.mpeg)
  • MP4, also known as MPEG-4 Part 14 (*.mp4)
  • Audio Video Interleave (*.avi)
  • Motion JPEG 2000 (*.mj2, *.mjp2)
  • Windows Media Video (*.wmv)
  • QuickTime Movie (*.mov)

For file formats that either do not appear in the recommendations or are deemed unsuitable, it is advisable to first check whether an alternative format from the list of recommendations can be used. Storing embedded objects (such as images, tables, etc.) as separate files can also improve the chances of long-term usability. After converting, it is recommended to visually inspect the result carefully; for texts, for example, pay special attention to formulas, special characters, umlauts, and any unusual fonts.

Note

Linking research data with other data sets such as organisations, projects and persons is strongly recommended!

In GUDe, a core function is the contextualisation and linking of research data with projects, institutions and persons. By maintaining corresponding entries, a semantic network is created which contributes to the findability and creation of publication lists. Before submitting research data, plan how entries are to be linked.

Note:

  • Links are only possible between already accepted submissions.
  • When submitting, you can only create a one-way link to an accepted dataset (i.e. create projects and institutions for linking research data first).
  • For two-way links, the metadata can be edited afterwards via the option Request a Correction.

Metadata

The metadata schema used by GUDe to describe research data is based on the Dublin Core, OpenAIRE and DataCite standards and offers a number of optional fields in addition to the required information. Each digital object can be classified in terms of subject matter and form using keywords, subject classifications and categorisations as well as standardised terms and language codes. The metadata fields for digital objects and for the records of persons, projects and institutions are described below and supplement the help texts and notes in GUDe.

Legend (for all tables):

| Symbol | Description |
| --- | --- |
| (*) | These fields are mandatory in GUDe. |
| (○) | These fields can be used to create a link to other records in GUDe. If the link is successful, the circle in GUDe is displayed in green. |
| (⟳) | Some metadata fields are repeatable, i.e. can be filled in several times. |

Attention!

Only regular font and Unicode symbols may be used in all metadata free text fields! Text formatting via HTML, Markdown or similar systems is not supported!

People

Note

A personal data record or profile is automatically created for a person when registering in GUDe. For reasons of data protection, profiles can only be created by the registered person and not by third parties.

Personal data records consist of the following metadata:

Field Description
Name* The name of the person without academic titles. GUDe expects the following format: Family Name(s), Given Name(s). When providing middle names, double names, etc., consistency with previous publications should be ensured. It is recommended to use the same formatting as in a possibly existing ORCID profile.
ORCID The ORCID iD of the person. This field cannot be filled out directly. To fill out this field, the corresponding GUDe account must be linked to the respective ORCID account (under Account > Profile > ORCID Settings).
GND The GND-ID referring to the person's entry in the Integrated Authority File (Gemeinsame Normdatei).
Affiliation* The organizational affiliation(s) of the person. Here, the institute or working group should be selected at which the person is employed or at which the person wrote their thesis.
Projects The project or projects with which the person is associated.

Recommendation

Goethe University strongly recommends that all researchers register with ORCID and include the ORCID iD with all name and affiliation information (cf. the guidelines for standardized affiliation naming at Goethe University).

Organizations

Users can create data records for institutions as required. The faculties and their institutes are already created in GUDe.

  • The name of the institution is - if possible - to be given in English.
  • By default, data records for institutions are published directly and without further editing.

Internal institutions of Goethe University must be linked according to their organisational affiliation. A model based on the QIS/LSF - Hochschulportal is implemented for this purpose. Links between institutions that skip one or more levels are not permitted. Although it is technically possible to link external organisations and their units, this serves no purpose in GUDe, and corresponding entries are deleted by the editors.

    flowchart TD
    A["Scientific Institutions (e.g., Goethe University or external institutions)"] --> B[Cooperations and Centers]
    A["Scientific Institutions (e.g., Goethe University or external institutions)"] --> C[Central Units]
    C[Central Units] --> D[Organizational Units]
    A["Scientific Institutions (e.g., Goethe University or external institutions)"] --> E[Faculties]
    E[Faculties] --> F[Institutes and Departments]
    F[Institutes and Departments] --> G[Working Groups and Research Teams]
    style A fill:#ff7474,stroke:#ff7474;
    style B fill:#f5b642,stroke:#f5b642;
    style C fill:#f5b642,stroke:#f5b642;
    style D fill:#96ff74,stroke:#96ff74;
    style E fill:#f5b642,stroke:#f5b642;
    style F fill:#96ff74,stroke:#96ff74;

Records for organisations are described by the following fields:

Field Description
Name* The full name of the organizational unit, preferably in English.
Acronym or short form The most common acronym or abbreviation for the institution.
Type* The fundamental type of the institution. Here, it is selected whether the institution is a faculty, a research group, a central unit, a scientific institution, a cooperation, a center/institute/department/unit, or an external institution.
Parent Organisation The corresponding next higher organizational level. The organizational entry can be linked to the next organizational level. If necessary, several links are useful.
Identifier* Type of Identifier* For the unique identification of organizations, three different identifier options are available: GND-ID (Integrated Authority File), Wikidata ID, and ROR ID (Research Organization Registry). The GND-ID and Wikidata ID can be used for all organizational areas. However, the ROR ID exists only at the top level and can therefore only be used for external institutions.
Identifier* Depending on the type, an identifier usually consists of a unique alphanumeric string.
Website The URL to the website of the organizational unit. The shortest unique and functional link should be used. URL shorteners should not be used.
Editorial Contact The main responsible contact person for the organizational unit. Depending on the structure of the organizational unit, this field can be used for the spokesperson, the responsible person of the organization, or the data steward in GUDe. If available, indicating the data steward in the context of GUDe is usually more appropriate. For entering the name, GUDe expects the format: Family Name(s), Given Name(s).

Projects

All users can submit datasets for projects. Projects can also be created for coordinated programmes (e.g. DFG) at Goethe University.

Field(s) Description
Name* Full name of the project.
Acronym or short form The most common acronym or abbreviation for the project.
URL The URL to the website of the project. The shortest unique and functional link should be used. URL shorteners should not be used.
Sponsors* Name* The full name of the sponsor(s) as well as the funding ID of the project. To find the correct spelling, the Crossref Funding Registry or the Research Organization Registry (ROR) can be used. Multiple sponsors can be provided.
Award Number* The grant number under which the project is funded by the funding organization (e.g., the DFG project number).
Editorial Contact The main responsible contact person for the project. Depending on the structure of the project, this field can be used for the spokesperson, the responsible person of the overall project, or the data steward in GUDe. If available, indicating the data steward in the context of GUDe is usually more appropriate. For entering the name, GUDe expects the format: Family Name(s), Given Name(s).

Research Data

Research data (cf. the Research Data Management Policy of Goethe University) are often very heterogeneous and, depending on their context of origin, can be tables, images, audio files, video files, models, software etc. Accordingly, GUDe provides a general form for describing the research data to be submitted. The selection of form fields (metadata) follows common library and scientific standards and is divided into six main categories: General, Description, Related Resources, Funding, Notes and Licensing.

The following tables provide an overview as well as hints and tips on how to fill in the form.

Note

Subject-specific metadata for individual or all files of a submission can - if appropriate - be uploaded in the form of structured (machine-readable) text files together with the research data.

General

In this section, general information about the research data can be entered.

Field(s) Description
Title* Title* The full main title of the publication and any common translations of that title.
Language* The language in which the main title of the publication was written. If the language is not present in the list, (Other) should be selected.
Other Titles Other Titles Additional common titles for the publication can be listed here.
Language The language in which the additional title of the publication was written. If the language is not present in the list, (Other) should be selected.
Author* Author* Author's name without academic titles. GUDe expects the following format: Family Name(s), Given Name(s). Consistency with previous publications should be maintained when providing middle names, double names, etc. It is recommended to use the same spelling as in a potentially existing ORCID profile.
If the person has a personal dataset in GUDe, a link is created with this dataset and further information is automatically taken from the personal dataset. If a link is created, this is indicated by a green ring on the right edge of the field.
ORCID The ORCID iD of the author. This field can be filled in directly, unlike the ORCID iD in personal records.
GND The GND-ID referring to the author's entry in the [Integrated Authority File](https://explore.gnd.network) (Gemeinsame Normdatei).
Affiliation* The organizational affiliation(s) of the author. Members of Goethe University should, according to the Affiliation Guidelines of Goethe University, always indicate either a Level 3 organizational unit (the institute or department) or, if applicable, Level 4 (the research group). For external contributors, however, an affiliation at Level 1 can always be provided. Unlike a personal record, only one affiliation can be specified here! Therefore, the most relevant organizational unit for the submission should be selected.
If the organizational unit has a record in GUDe, a link to that record will be established. When a link is created, this will be indicated by a green ring at the right edge of the field.
As this is a mandatory field, researchers without affiliation to an academic institution can enter No Affiliation here.
Contributor Contributor* Name of the contributor without academic titles. GUDe expects the following format: Family Name(s), Given Name(s). Consistency with previous publications should be maintained when providing middle names, double names, etc. It is recommended to use the same spelling as in a potentially existing ORCID profile.
If the person has a personal dataset in GUDe, a link is created with this dataset and further information is automatically taken from the personal dataset. If a link is created, this is indicated by a green ring on the right edge of the field.
ORCID The ORCID iD of the contributor. This field can be filled in directly, unlike the ORCID iD in personal records.
GND The GND-ID of the contributor, referring to the contributor's entry in the [Integrated Authority File](https://explore.gnd.network) (Gemeinsame Normdatei).
Affiliation* The organizational affiliation(s) of the contributor. Members of Goethe University should, according to the Affiliation Guidelines of Goethe University, always indicate either a Level 3 organizational unit (the institute or department) or, if applicable, Level 4 (the research group). For external contributors, however, an affiliation at Level 1 can always be provided. Unlike a personal record, only one affiliation can be specified here! Therefore, the most relevant organizational unit for the submission should be selected.
If the organizational unit has a record in GUDe, a link to that record will be established. When a link is created, this will be indicated by a green ring at the right edge of the field.
As this is a mandatory field, researchers without affiliation to an academic institution can enter No Affiliation here.
Type of Relation* This entry specifies how the person contributed to the publication. The following relations are available:
  • ContactPerson
  • DataCollector
  • DataCurator
  • DataManager
  • ProjectManager
  • RightsHolder
  • Supervisor
  • Other (for contribution types not listed above)
Project(s) The full name of the project that produced the research data.
If the project has a dataset in GUDe, a link is created with said dataset. If a link is created, this is indicated by a green ring on the right edge of the field.
Faculty The name of the faculty at Goethe University to which the research data should be assigned.
DFG-Subject The subject from the DFG Classification System to which the research data should be assigned.
MeSH (Medical Subject Headings) Relevant MeSH terms related to the research data can be provided here.
Date of Issue* Year* The year of publication. If the submission has not been previously published elsewhere, the current date should be used.
Month The month of publication. If the submission has not been previously published elsewhere, the current date should be used.
Day The day of publication. If the submission has not been previously published elsewhere, the current date should be used.
Publisher The full name of the publisher of the research data. Goethe University Frankfurt is automatically entered as the publisher and cannot be removed.
DOI The DOI of the research data set. To assign a new DOI to a submission, this field must be left empty, and the access restriction of the dataset must be set to either Public or Embargo. If the submission has been previously published with a DOI, it can be provided here instead.

Description

This section can contain descriptive metadata, such as a summary or the available file types of the submission.

Field(s) Description
Type of Data* Various file types that make up the research data can be selected here. The following file types are available for selection:
  • Audiovisual
  • Computational Notebook
  • Data Paper
  • Dataset
  • Image
  • Interactive Resource
  • Model
  • Output Management Plan
  • Software
  • Sound
  • Standard
  • Text
  • Workflow
  • Other (for file types not listed above)
Language* The language used within the research data. If the language is not present in the list, (Other) should be selected.
Subject Keywords Keywords relevant to the submission can be assigned here. Keywords should be separated by commas.
Abstract* Abstract A summary of the content of the submission or the associated publication.
Language The language in which the summary was written. If the language is not present in the list, (Other) should be selected.
Description Description An additional description of the research data can be entered here.
Language The language in which the description was written. If the language is not present in the list, (Other) should be selected.

Related Resources

In this section, metadata for related resources can be entered, specifying how these resources are related to the submission.

Field(s) Description
Related Resource Type of Identifier The type of identifier to be assigned. The following options are available:
  • DOI
  • ISBN
  • URL
Identifier The actual value that identifies the resource to be linked using the identifier type.
For DOI, enter a value starting with 10. followed by additional characters.
For ISBN, both ISBN-10 (format: x-xxx-xxxxx-x) and ISBN-13 (format: xxx-x-xx-xxxxxx-x) can be provided.
For URL, use the shortest unique and functional link pointing to the resource. Do not use URL shorteners here.
Type of publication The type of publication from which the related resource originates. The following publication types are available for selection:
  • Audiovisual
  • Book
  • Book chapter
  • Collection
  • Computational Notebook
  • Conference Paper
  • Conference Proceeding
  • Data Paper
  • Dataset
  • Dissertation
  • Image
  • Interactive Resource
  • Journal Article
  • Model
  • Output Management Plan
  • Peer Review
  • Preprint
  • Report
  • Service
  • Software
  • Sound
  • Standard
  • Text
  • Workflow
  • Other (for publication types not listed above)
Type of relation The way the related resource is connected to the submitted research data. The following relation types are available, where A represents the submission and B represents the related resource:
  • Is Cited By
  • Cites
  • Is Supplement To
  • Is Supplemented By
  • Is Documented By
  • Documents
  • Is Reviewed By
  • Reviews
  • Is Required By
  • Requires
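
The identifier formats described in the table above can be pre-checked before submission. The following is a minimal sketch under stated assumptions: the DOI pattern (prefix `10.`, a registrant code of 4 to 9 digits, a slash, and a suffix) is stricter than the rule given above and is an assumption, the ISBN check uses the standard ISBN-10 and ISBN-13 check-digit rules, and the function names are hypothetical. It is not the validation GUDe itself performs.

```python
import re

# Assumption: a DOI is "10." + registrant code (4-9 digits) + "/" + suffix.
# The documentation only requires that the value starts with "10.".
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(value: str) -> bool:
    """Return True if the value has the bare DOI form, without a URL or 'doi:' prefix."""
    return bool(DOI_PATTERN.match(value.strip()))


def is_valid_isbn(value: str) -> bool:
    """Check the check digit of an ISBN-10 or ISBN-13, ignoring hyphens and spaces."""
    chars = value.replace("-", "").replace(" ", "").upper()
    if len(chars) == 10:
        # ISBN-10: nine digits plus a check character that may be "X" (value 10).
        if not re.fullmatch(r"\d{9}[\dX]", chars):
            return False
        # Weights run from 10 down to 1; the weighted sum must be divisible by 11.
        total = sum((10 - i) * (10 if c == "X" else int(c))
                    for i, c in enumerate(chars))
        return total % 11 == 0
    if len(chars) == 13:
        if not re.fullmatch(r"\d{13}", chars):
            return False
        # Weights alternate 1, 3, 1, 3, ...; the weighted sum must be divisible by 10.
        total = sum(int(c) * (1 if i % 2 == 0 else 3)
                    for i, c in enumerate(chars))
        return total % 10 == 0
    return False
```

A passing check only means that the value is well-formed; it does not confirm that the DOI or ISBN is registered or points to the intended resource.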

Funding

Field(s) Description
Funder Funder The unabbreviated name of the funding organization. For finding the correct spelling and persistent identifier, one may refer to Crossref's Funding Registry or the Research Organization Registry (ROR).
Type of Identifier The following persistent identification types are available for selection:
  • Crossref Funder ID
  • GRID
  • ISNI
  • ROR
  • Other (for identification types not listed above)
Funder Identifier The actual value that identifies the funding organization using the selected identifier type.
Award Number The award number under which the research is funded by the funding organization (e.g., DFG project number).
Award Title The title of the award by which the research is funded by the funding organization.
Award URL (landing page provided by the funder) The URL leading to the funding organization's webpage. Do not use a URL shortener here.

Notes

In this section, a free-text message to the editorial team can be recorded.

Field Description
Notes for the Editor Annotations directed to the editorial team.

Licensing

In this section, the license under which the submission is published in GUDe can be selected.

Field Description
License The terms for reusing the published research data are regulated by assigning a license. A list of selectable licenses and more information about each license can be found under Legal Information > Licenses.

README Files

A README file contains information about the files in the Goethe University Data Repository (GUDe) that are published or preserved as research data. It provides clear and concise details about the data collection, processing, and analysis. The README file is a plain text file, often written in a simple markup language like Markdown, and is located at the root level of a research dataset. For extensive datasets, multiple README files may be used or the documentation may be integrated directly into the files themselves.

The use of a README file is particularly beneficial in the case of an institutional repository like GUDe, as the repository offers only a generic application profile and does not support domain-specific metadata schemas. A README file can serve as a strategy for extended documentation in this context. It also helps to identify research data again after they have been downloaded and used, because downloading and cataloguing the metadata together with the data is not yet common practice.

When using a README file to capture domain-specific metadata, scientific conventions for taxonomic, geographic, as well as geological names and keywords should also be observed. Where possible, standardized taxonomies and vocabularies should be used.

When creating the README file, the principle of 'as much as necessary - as little as possible' should be adhered to. Therefore, only the relevant entries from the following list that are important for the correct interpretation, evaluation, and reuse of the data should be implemented. For numerous examples of README files and additional guidance, a search for 'readme research data' on the internet may be helpful.

Points Guidelines
Overview & General Information
  1. Dataset title
  2. Citation suggestion (including DOI)
  3. Contact information: names and roles of contributors – also with ORCID iD if available – affiliated institutions, etc.
  4. Information about the funding of the dataset or underlying research (funding organizations, program, project, etc.)
Reuse & Context
  1. Licenses and restrictions for reuse of the dataset (or, where applicable, of parts of it)
  2. All sources used for data collection
  3. Citations to publications based on this dataset (including links/DOIs)
  4. Relationships to other publications and related datasets (citation including links/DOIs)
Data Collection
  1. Period of data collection
  2. Geographical location of collection
  3. Used methods (if applicable, including references, documentation, links)
  4. Experimental and environmental conditions of data collection (including standards and calibration)
  5. Uncertainty, precision, and accuracy in data collection
  6. Known issues and caveats (sampling, missing values, etc.)
Organization
  1. Explanations of the hierarchy and structure of individual files and folders
  2. File naming system (with examples)
  3. Relationships and dependencies between files
  4. Additional documentation files within the dataset (notes, companion files, etc.)
  5. For central files, a brief description of their content
Codebook
  1. List of all used codes, symbols, abbreviations, variables, etc. (including complete designation and definition)
  2. For tabular data: Definition of column headings
  3. Explanations of units of measurement and data formats
Processing & Quality Management
  1. Applied procedures for quality control
  2. Overview of methods used in data collection, processing, and analysis, including the software used (with version numbers)
  3. List of file formats used (if applicable, with software recommendations for opening the files)
  4. Handling of missing data
Change Log
  1. Overview of changes in the dataset in subsequent versions when published
  2. Notes and explanations of version control systems used (e.g., git)
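A README skeleton based on the checklist above can be generated with a few lines of Python. The following is only a minimal sketch: the function name `build_readme`, the shortened entry labels, and the `TODO` placeholders are illustrative assumptions and not part of GUDe. Sections that are not relevant for a dataset are simply left out of the selection.

```python
# Section names follow the checklist above; the entry labels are shortened
# and only a subset of the entries is shown (illustrative assumption).
README_SECTIONS = {
    "Overview & General Information": [
        "Dataset title",
        "Citation suggestion (including DOI)",
        "Contact information",
        "Funding",
    ],
    "Reuse & Context": ["Licenses and restrictions", "Sources used"],
    "Data Collection": ["Period", "Location", "Methods"],
    "Organization": ["Folder structure", "File naming system"],
    "Codebook": ["Codes and abbreviations", "Column headings", "Units"],
    "Processing & Quality Management": ["Quality control", "Software and versions"],
    "Change Log": ["Changes between versions"],
}


def build_readme(selected=None):
    """Return a plain-text README skeleton for the selected sections."""
    names = list(README_SECTIONS) if selected is None else selected
    lines = []
    for name in names:
        lines.append(name)
        lines.append("-" * len(name))  # underline the section heading
        for entry in README_SECTIONS[name]:
            lines.append(f"{entry}: TODO")  # placeholder to be filled in by hand
        lines.append("")  # blank line between sections
    return "\n".join(lines)


if __name__ == "__main__":
    # Print a skeleton with only the two sections needed for a small dataset.
    print(build_readme(["Overview & General Information", "Codebook"]))
```

The printed text can be saved as the README file of the dataset and then completed manually.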

Correction

In GUDe, the metadata of digital objects can be corrected after submission. This allows users to fix spelling errors and other mistakes in the metadata of previous submissions without having to submit a new version of the digital object.

Access Permissions

When submitting research data, access permissions can be set both for the submission as a whole and for individual files of the digital object. This makes it possible to hide individual files, to publish only the metadata, or to keep the entire entry from public view.

The following access restrictions are available:

Property Description
Public The metadata or the specified file(s) are publicly visible immediately after the submission is accepted.
Embargo The metadata or the specified file(s) are visible to the public only after an embargo period has expired. The user sets the end of the embargo by specifying a date.
Private The metadata or the specified file(s) are not publicly visible and can only be viewed by users with the appropriate permissions (usually the submitting person themselves).

If no specific restrictions have been activated, the digital object inherits the default access permissions of the collection to which it is submitted. These default access permissions are displayed in the submission form in the table directly above the access permission input mask.
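The interplay of the three access restrictions and the collection defaults can be illustrated with a small model. This is a sketch under stated assumptions and not GUDe's actual implementation: the class `Access` and the function `effective_access` are hypothetical names, a file-level setting is assumed to take precedence over the setting of the submission, and embargoed content is assumed to become visible on the embargo date itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Access:
    """Hypothetical model of one access restriction."""
    kind: str  # "public", "embargo" or "private"
    embargo_until: Optional[date] = None

    def publicly_visible(self, today: date) -> bool:
        if self.kind == "public":
            return True
        if self.kind == "embargo":
            # Assumption: visible from the embargo date onwards.
            return self.embargo_until is not None and today >= self.embargo_until
        return False  # "private" is never publicly visible


def effective_access(collection_default: Access,
                     submission: Optional[Access] = None,
                     file_level: Optional[Access] = None) -> Access:
    """Return the most specific setting, else the collection default."""
    if file_level is not None:  # assumed to take precedence
        return file_level
    if submission is not None:
        return submission
    return collection_default  # inherited when nothing was activated


if __name__ == "__main__":
    default = Access("public")
    embargo = Access("embargo", date(2030, 1, 1))
    chosen = effective_access(default, submission=embargo)
    print(chosen.kind, chosen.publicly_visible(date(2029, 12, 31)))
```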

Note

If a collection with finer-grained default access permissions than the three options mentioned above is needed, the contact form can be used to ask for a new collection with such permissions to be created.

Sensitive Data

According to the terms of use, sensitive personal data (such as passwords or medical data containing personal information) must not be uploaded to GUDe. If present, such files must be completely removed from the respective dataset before submitting the research data to GUDe.
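Before submission, a simple check of the file names in a dataset may help to spot obvious candidates for removal. The following sketch is purely illustrative: the name patterns and the function names are assumptions, and a name-based check cannot detect personal data inside files, so it does not replace a manual review.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Hypothetical name patterns; they should be adapted to the project's
# own naming conventions.
SUSPICIOUS_PATTERNS = ["*.pem", "*.key", ".env", "id_rsa*", "*password*", "*patient*"]


def is_suspicious(filename: str) -> bool:
    """Return True if the file name matches one of the patterns."""
    name = filename.lower()  # compare case-insensitively
    return any(fnmatchcase(name, pattern) for pattern in SUSPICIOUS_PATTERNS)


def find_suspicious_files(root):
    """Return all files below root whose names look suspicious."""
    return sorted(
        path for path in Path(root).rglob("*")
        if path.is_file() and is_suspicious(path.name)
    )


if __name__ == "__main__":
    # Scan the current directory and list files that need a closer look.
    for path in find_suspicious_files("."):
        print(f"Please review before submitting: {path}")
```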

Licenses

For successful reuse of research data, a legal agreement between the authors or rights holders and interested researchers ('users') is necessary if the research data reach the creative threshold required by copyright law.

In the Goethe University Data Repository (GUDe), the reuse of published research data can either be enabled under Creative Commons (CC) licenses or be restricted to the narrow boundaries of copyright law. The CC licenses are standardized contracts that allow precise control over the usage rights granted and are available in the following variants:

Icon Central Contractual Terms
CC0 The research data can be freely used without further conditions or attributions.
CC-BY Attribution of authors
CC-BY-ND Attribution of authors & no derivatives allowed
CC-BY-NC Attribution of authors & non-commercial use
CC-BY-SA Attribution of authors & share-alike
CC-BY-NC-SA Attribution of authors & non-commercial use & share-alike
CC-BY-NC-ND Attribution of authors & non-commercial use & no derivatives allowed
All Rights Reserved This object is protected by copyright and/or related rights. You are free to use it in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses, you may need to obtain permission from the rights holder(s).
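The identifiers of the CC licenses in the table above are composed of recurring building blocks, which can be decoded mechanically. The following sketch shows this for illustration only; the function name `license_terms` is an assumption, and 'All Rights Reserved' is deliberately not covered because it is not a CC license.

```python
# Building blocks of the CC license identifiers listed in the table above.
TERMS = {
    "BY": "attribution of authors",
    "NC": "non-commercial use",
    "ND": "no derivatives allowed",
    "SA": "share-alike",
}


def license_terms(identifier: str) -> list:
    """Return the contractual terms encoded in a CC license identifier."""
    if identifier == "CC0":
        return []  # no conditions are imposed by the license itself
    prefix, *parts = identifier.split("-")
    if prefix != "CC" or not parts or any(part not in TERMS for part in parts):
        raise ValueError(f"not a CC license identifier: {identifier!r}")
    return [TERMS[part] for part in parts]


if __name__ == "__main__":
    for name in ["CC0", "CC-BY", "CC-BY-NC-SA"]:
        print(name, license_terms(name))
```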

Warning!

The CC0 Public Domain license is independent of the rules of good scientific practice, which continue to require the correct attribution of the origin of reused data!

For further advice on open access and open licenses, the Open Access team of the University Library is available for contact.

Advanced Application Usage

Warning!

Active support for the usage options listed in this section is not guaranteed!

GUDe REST Client

The GUDe REST Client allows users to create new submissions in DSpace-CRIS 7 repositories such as GUDe using a Python 3 script and the REST API. It is accessible here via the internal network of Goethe University.

The DSpace Python REST Client Library was created by Kim Shepherd with support from the University of Hohenheim for The Library Code GmbH. The GUDe REST Client fork was developed by the IT Services of Goethe University Frankfurt for use in the Goethe University Data Repository (GUDe), and both projects are released under the BSD 3-Clause License.

Installation on Linux-Based Systems

To install the GUDe REST Client on a Debian/Ubuntu-based system, pip for Python 3 needs to be installed first. To do this, the following may be entered in the Bash console (as root or via sudo):

apt install python3-pip

For Arch Linux-based systems, the following command should be used instead:

pacman -S python-pip

After that, the GUDe REST Client git repository needs to be cloned to the local machine:

git clone https://gitlab.rz.uni-frankfurt.de/gude/gude-client.git gude-client

Warning!

The repository can only be cloned from within the internal network of Goethe University or through a VPN connection to it!

Next, the dependencies of the GUDe REST Client need to be installed with pip from within the directory of the cloned repository:

cd gude-client
pip install -r dependencies.txt

Now the GUDe REST Client is ready to use.

Usage

After installing the GUDe REST Client, the create-submission.py script needs to be configured or modified according to personal preferences. For successful authentication against the REST API of GUDe, we recommend generating a token in the account settings of GUDe.
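When customizing the script, it may be preferable to read the token from the environment instead of writing it into the source file. The following is a minimal sketch with assumed names: the environment variable `GUDE_API_TOKEN` and the helper functions are hypothetical and not part of the GUDe REST Client. It also assumes that the token is sent as a Bearer token in the `Authorization` header, as the DSpace 7 REST API does for its login tokens; how create-submission.py actually expects the token should be checked in the script itself.

```python
import os

# Hypothetical variable name, not defined by the GUDe REST Client.
TOKEN_ENV_VAR = "GUDE_API_TOKEN"


def load_token(environ=os.environ):
    """Read the API token from the environment and fail early if it is missing."""
    token = environ.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before running the script")
    return token


def auth_headers(token):
    """Build request headers (assumption: Bearer token authentication)."""
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    try:
        headers = auth_headers(load_token())
        print("Token found, headers prepared:", list(headers))
    except RuntimeError as error:
        print(error)
```

Keeping the token out of the script avoids publishing it by accident, for example when the modified script is shared or committed to a git repository.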

After customizing the script, it can be run to create a submission. On most systems, this may be done using one of the following commands:

python3 create-submission.py
python create-submission.py


License Note

All content licensed under: CC0 1.0