
FAQs

Terminology

What is the Goethe University Data Repository (GUDe)?

The Goethe University Data Repository (GUDe) is the institutional repository of Goethe University. It gives all members and affiliates of the university the opportunity to store research data generated in collaboration with researchers from Goethe University for the long term (at least 10 years) in accordance with the FAIR criteria, to share it with a limited group of people, or to publish it.

GUDe is particularly suitable for archiving curated research data that represent an important (intermediate) result of a research project. GUDe does not conduct a content-based peer review of the submitted research data. Research data can be published on GUDe independently of, or in addition to, publication on other platforms or with other publishers. The repository is operated jointly by the University Computer Center (HRZ) and the University Library Johann Christian Senckenberg of Goethe University and is based on the open source software DSpace-CRIS v7.

What is the Goethe University Data Repository not?

The Goethe University Data Repository (GUDe) is not a discipline-specific repository, but rather a supplementary offering for members and affiliates of Goethe University. It is a good choice when no suitable discipline-specific repository exists for the submission of research data or when such a repository cannot be used for various reasons. The directory re3data is available to find discipline-specific repositories.

GUDe is not intended for day-to-day work with research data. For this purpose, sync-share services such as Hessenbox are more suitable. Instead, GUDe is meant for storing validated and consistently curated research data.

GUDe is not intended for qualification theses and traditional open access publications such as journal articles (including preprints and postprints), monographs, edited volumes, research reports, working papers, series of writings, annual reports, journals, conference contributions, or posters. Such publications should instead be submitted to the University Publication System of the University Library.

What are research data?

The Research Data Management Policy of Goethe University defines research data as data that are collected, observed, simulated, derived, or generated in the course of research (e.g., as a result of research processes, experiments, measurements, simulations, software development, source research, surveys, and questionnaires). Research data also include the necessary software and documentation for reproducibility. Where possible, digital preservation of research data is recommended. When digitizing, regulations for archiving original data should be observed.

What are metadata?

Metadata are data about data. They serve to describe the research data and, together with them, form a digital object in the context of GUDe. When submitting data, metadata such as the submission title, authors, keywords, and related publications are entered through a form.

For the successful reuse of research data, good documentation of the deposited research data in accordance with the FAIR principles is essential. Therefore, submitting to GUDe requires the entry of accompanying metadata to contextualize the research data in terms of subject and content. Further documentation may be provided, depending on the type, scope, and discipline, in an associated publication (which should also be linked to in the metadata), in the form of an additional README file, or in discipline-specific metadata.

The metadata schema used by GUDe for describing research data is based on the Dublin Core, OpenAIRE, and DataCite standards. It includes both mandatory as well as a number of optional metadata fields. Each digital object can be classified substantively and formally by keywords, subject classifications, standardized terms, and language codes.

What is the relationship between digital objects, datasets, submissions, and publication data in GUDe?

The relationship between these terms is also illustrated in the following Euler diagram:

[Figure: Euler diagram]

What are Communities and Collections in GUDe?

In GUDe, all research data are organized into 'Communities' and subordinate 'Collections'. Both communities and collections can be assigned to superordinate communities. Collections, in turn, have individual datasets assigned to them. Collections contain research datasets, but also, as auxiliary structures, datasets for persons, institutions, and projects. The basic structure of GUDe is aligned with the organizational structure of Goethe University, as shown in the diagram below:

    flowchart TD
    TOP["Goethe University Frankfurt"] --> A1[Central Facilities]
    TOP["Goethe University Frankfurt"] --> A2["Cooperating Institutions<br>and Scientific Centers"]
    TOP["Goethe University Frankfurt"] --> A3[Faculties]
    TOP["Goethe University Frankfurt"] --> A4[Organizations]
    TOP["Goethe University Frankfurt"] --> A5[Persons]
    TOP["Goethe University Frankfurt"] --> A6[Projects]

    A1[Central Facilities] --> A1a[...]
    A1[Central Facilities] --> A1b[...]

    A2["Cooperating Institutions<br>and Scientific Centers"] --> A2a[...]
    A2["Cooperating Institutions<br>and Scientific Centers"] --> A2b[...]

    A3[Faculties] --> A3a[F01 - Faculty of Law]
    A3[Faculties] --> A3b[F02 - Faculty of<br>Economics<br>and Business]
    A3[Faculties] --> A3c[...]

    A4[Organizations] --> A4a[Dataset]
    A4a[Dataset] --> A4aMeta[Metadata]
    A4[Organizations] --> A4b[...]

    A5[Persons] --> A5a[Dataset]
    A5a[Dataset] --> A5aMeta[Metadata]
    A5[Persons] --> A5b[...]

    A6[Projects] --> A6a[Dataset]
    A6a[Dataset] --> A6aMeta[Metadata]
    A6[Projects] --> A6b[...]

    A3a[F01 - Faculty of Law] --> A3aa[Faculty of Law: Research Data]
    A3aa[Faculty of Law: Research Data] --> A3aaa[Digital Object]
    A3aa[Faculty of Law: Research Data] --> A3aab[...]
    A3aaa[Digital Object] --> F1aaMeta[Metadata]
    A3aaa[Digital Object] --> F1aaData[Research Data]

    A3b[F02 - Faculty of<br>Economics<br>and Business] --> A3ba[Faculty of<br>Economics<br>and Business:<br>Research Data]
    A3ba[Faculty of<br>Economics<br>and Business:<br>Research Data] --> A3baa[Digital Object]
    A3ba[Faculty of<br>Economics<br>and Business:<br>Research Data] --> A3bab[...]
    A3baa[Digital Object] --> F1baMeta[Metadata]
    A3baa[Digital Object] --> F1baData[Research Data]

    style TOP fill:#ff7474,stroke:#ffcc10;

    style A1 fill:#ff7474,stroke:#ffcc10;
    style A2 fill:#ff7474,stroke:#ffcc10;
    style A3 fill:#ff7474,stroke:#ffcc10;
    style A1a fill:#ff7474,stroke:#ffcc10;
    style A1b fill:#ff7474,stroke:#ffcc10;
    style A2a fill:#ff7474,stroke:#ffcc10;
    style A2b fill:#ff7474,stroke:#ffcc10;
    style A3a fill:#ff7474,stroke:#ffcc10;
    style A3b fill:#ff7474,stroke:#ffcc10;
    style A3c fill:#ff7474,stroke:#ffcc10;

    style A4 fill:#ffc474,stroke:#ffcc10;
    style A5 fill:#ffc474,stroke:#ffcc10;
    style A6 fill:#ffc474,stroke:#ffcc10;
    style A3aa fill:#ffc474,stroke:#ffcc10;
    style A3ba fill:#ffc474,stroke:#ffcc10;

    style A4a fill:#96ff74,stroke:#ffcc10;
    style A4b fill:#96ff74,stroke:#ffcc10;
    style A5a fill:#96ff74,stroke:#ffcc10;
    style A5b fill:#96ff74,stroke:#ffcc10;
    style A6a fill:#96ff74,stroke:#ffcc10;
    style A6b fill:#96ff74,stroke:#ffcc10;
    style A3aaa fill:#96ff74,stroke:#ffcc10;
    style A3aab fill:#96ff74,stroke:#ffcc10;
    style A3baa fill:#96ff74,stroke:#ffcc10;
    style A3bab fill:#96ff74,stroke:#ffcc10;

Legend:

| Color | Meaning |
| --- | --- |
| Red | Community |
| Yellow | Collection |
| Green | Submission |

In which Community and which Collections should I submit my research data?

Note

For the release of research data by the editorial team, at least one of the submitting individuals must belong to the respective organizational unit or project.

Institutional affiliation within Goethe University is the essential criterion for selecting the appropriate community. For submissions at the faculty level, a generic collection is available: "name of department: Research Data". If the research data were created as part of a collaboration between faculties and/or other organizational units, the submitter may choose between the associated communities at their own discretion.

Members of scientific centers (e.g. C3S, CSC, ...) or coordinated programs (e.g. LOEWE Centres/Focal Points or Collaborative Research Centres) can also submit research data in the section Cooperating Institutions and Scientific Centers. If not already available, the creation of corresponding communities and/or collections can be requested directly via the Contact Form by the principal investigators or project coordinators.

When do I need a dedicated community or collection for my research data?

Communities and collections do not primarily exist to structure and summarize research data, but rather to manage access rights and the standard visibility of submissions in GUDe.

A dedicated subcommunity or collection is useful, for example, when an interdisciplinary and cross-institutional research project wants to give external cooperation partners access to (many) unpublished research data, or when research data need to be uploaded (semi-)automatically into a collection in which research data are generally not published.

For linking research data to projects, persons, and organizations, GUDe uses the corresponding datasets; on this basis, suitable lists (bibliographies) can also be created. Additional collections for (transregional) projects or interdisciplinary collaborations can be requested directly via the contact form by the principal investigators or project coordinators.

Can I release research data in my own community or collection without the editorial team?

The GUDe editorial team checks submitted research data against standards and recommendations (documentation standards) for formal criteria and obviously illegal content. If this is not desired for the sake of faster processing or (semi-)automated submission, separate arrangements and agreements must be made between the rights holders and Goethe University, depending on the specific application. In this case, please contact us via the contact form.

General Questions

Who can use GUDe?

External Collaborators

Currently, external collaborators cannot submit data, but they can be granted read access to collections with restricted permissions. If needed, GUDe Support can be contacted via the contact form.

GUDe can be used by members and affiliates of Goethe University as well as external collaborators. Registration in GUDe is done using the respective HRZ account for Goethe University members and affiliates.

External collaborators can use either Shibboleth or ORCID for login.

During registration, our data protection information must be observed.

What are the costs for using GUDe?

During the testing phase from 1 March 2023 until (expected) 31 December 2024, the use of GUDe is free of charge according to the terms of use.

Note

For all submissions accepted during the testing phase, no further costs will be incurred after the testing phase ends.

During the current testing phase, a cost model is being developed, which will be implemented after the testing phase concludes.

What ongoing costs can be expected for further administration and use?

During the testing phase from 1 March 2023 until (expected) 31 December 2024, the use of GUDe is free of charge according to the Terms of Use.

Note

For all submissions accepted during the testing phase, no further costs will be incurred after the testing phase ends.

During the current testing phase, a cost model is being developed, which will be implemented after the testing phase concludes.

What costs can I indicate in a DFG proposal?

To find out which costs can be indicated in a DFG proposal, we refer internal staff to the Research Support department and external institutions to the respective DFG guidelines.

Are there maintenance times?

To keep GUDe up to date, we perform maintenance work every Monday from 9 to 11 a.m. During this time, there may be usage impairments or a complete unavailability of GUDe.

In addition to this maintenance work, GUDe is sporadically unavailable between 4 and 5 a.m. because backups of the data are being made, which requires a temporary shutdown of GUDe. Users will be logged out during this time.

Attention!

Data should therefore either not be uploaded overnight or be uploaded early enough that the upload is certain to finish, with a safety margin, before these periods begin.
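The timing advice above can be checked before starting a long upload. The following is a minimal sketch, assuming the windows stated in this FAQ (Mondays 9 to 11 a.m. and daily 4 to 5 a.m.) and local time; the function name `overlaps_downtime` is hypothetical and not part of GUDe.

```python
from datetime import datetime, timedelta


def overlaps_downtime(start: datetime, duration: timedelta) -> bool:
    """Return True if the interval [start, start + duration) touches a downtime window.

    Assumed windows (taken from the FAQ text, local time):
    Mondays 09:00-11:00 (maintenance) and daily 04:00-05:00 (backups).
    """
    end = start + duration
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    # Check every calendar day that the planned upload spans.
    while day < end:
        windows = [(day + timedelta(hours=4), day + timedelta(hours=5))]
        if day.weekday() == 0:  # 0 is Monday
            windows.append((day + timedelta(hours=9), day + timedelta(hours=11)))
        for w_start, w_end in windows:
            # Two half-open intervals overlap if each starts before the other ends.
            if start < w_end and w_start < end:
                return True
        day += timedelta(days=1)
    return False
```

The estimated duration should include a generous safety margin, since actual upload speed is hard to predict.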

How do I submit research data to GUDe?

Research data can be added either through MyGude > ⨁ > Research Data or directly via Drag & Drop into MyGude. In doing so, one will be asked to which area the research data should be assigned. Additional files can be added, and there is the option to edit the visibility and the metadata of the digital object. After reviewing (and potentially correcting) the provided information and agreeing to the Publication Agreement, the submission request can be sent.

I cannot find the collection to which I want to assign my research data listed directly when creating my submission. What can I do?

During the creation of a submission, the display of GUDe collections is unsorted. Therefore, it can be cumbersome to find the corresponding collection by scrolling. To associate research data with a collection that is not displayed directly, it is advisable to use the search function that is located above the list.

Note

One can just start typing directly to use the search function without needing to click on the search field first.

Note

It is planned that the display of collections in GUDe will be sorted through a future update.

Can I create my own organizations or projects for my submission?

Yes. Under MyGude > ⨁ > Organization and MyGude > ⨁ > Project, one can create one's own organization and project datasets. When creating these datasets, the following guidelines should be observed.

What should I consider when creating organizations?

For describing institutions and their connections, a simple model based on the QIS (Quality Improvement System) is used:

    flowchart TD
    A["Scientific Institutions (e.g., Goethe University or external institutions)"] --> B[Collaborations and Centers]
    A["Scientific Institutions (e.g., Goethe University or external institutions)"] --> C[Central Facilities]
    C[Central Facilities] --> D[Organizational Units]
    A["Scientific Institutions (e.g., Goethe University or external institutions)"] --> E[Departments]
    E[Departments] --> F[Institutes and Departments]
    F[Institutes and Departments] --> G[Working Groups and Research Groups]
    style A fill:#FFC000,stroke:#FFC000;
    style B fill:#ff0000,stroke:#ff0000;
    style C fill:#ff0000,stroke:#ff0000; 
    style D fill:#92D050,stroke:#92D050; 
    style E fill:#ff0000,stroke:#ff0000; 
    style F fill:#92D050,stroke:#92D050;
    style G fill:#C4BD97,stroke:#C4BD97; 

The University Library, HRZ (Hochschulrechenzentrum or University Computing Center), and the departments of Goethe University already exist as entities. All other organizational entities (e.g., institutes or working groups) should be created by the submitters themselves as needed.

Attention!

It should be noted that, whenever possible, the name of the organizational entity should be provided in English.

It is possible to link new organizational entities with additional organizational entities at the next higher level (for external institutions, this field should be left empty).

Attention!

Linking institutions across multiple levels should be avoided. In case of doubt, one can contact the respective editorial team.

Note

The publication of institutions is done directly and without further editorial review by default.

Note

In the event of restructuring or changes to institution names and designations, a new version of the associated institution entries can be created, or one can seek assistance from the editorial team.

Who can create profiles for individuals?

Due to data protection regulations, only the respective individual is allowed to create their own profile.

Attention!

A profile (or a personal dataset) is automatically created when the respective individual registers in GUDe. There is no separate activation process! If desired, the profile can be deleted through the user's account; note that all existing connections will be irreversibly lost.

It should be ensured that the following format is used for names: Family name(s), Given Name(s). If available, when providing middle names, double names, or similar, consistent spelling should be maintained with previous publications.

Attention!

Prior to a joint submission, it is necessary to take appropriate communication measures: all involved individuals (especially authors) who want their submission to be linked to their profile must log in to the system and create or activate their profile. It is not possible to select contact persons from a predefined list of individuals.

Should I use GUDe or a discipline-specific repository for my research data?

In general, a discipline-specific repository is more suitable than an institutional repository like GUDe for publishing research data, as discipline-specific repositories typically offer better options for indexing and search.

Note

When choosing a discipline-specific repository, it is important to consider criteria such as regular backup cycles, data security, and sustainable service. The portal re3data provides support in finding suitable discipline-specific repositories.

Note

When choosing a repository, one should also consider the respective terms of use and publication agreements.

Are submissions subject to peer review?

No. There is no peer review of the submitted research data by Goethe University or by the editorial team of the university library.

How will the submitted research data be reviewed? Which criteria does the editorial staff of the university library use to check submitted research data?

The approval for the publication of digital objects in GUDe is based on standards and recommendations (documentation standards) developed in collaboration with the faculties.

GUDe may only be used for the storage and publication of digital objects and their contextualization with datasets. The rights holder affirms that submissions do not contain unlawful content, such as glorification of violence, incitement of hatred, or defamation. Goethe University only examines, controls, and monitors digital objects stored and published in the repository for obviously unlawful content.

All submissions intended for public access in GUDe are also reviewed to determine if they contain confidential/internal information or data subject to GDPR processing. Especially refer to Section 4 (Data Protection) of the Publication Agreement for questions regarding data protection.

How can I assess the "reuse value" of my submitted research data by myself?

https://github.com/DataCurationNetwork/data-primers/

Can I submit Open Educational Resources (OER) to GUDe?

OER materials are only accepted in exceptional cases in GUDe, as GUDe cannot distinguish between OER materials and research data! This poses a significant problem when GUDe's content is queried by search engines and indexes. Therefore, submissions in GUDe are not listed in common OER repositories and would be considerably harder for interested parties to find.

Recommendation

For such purposes, we recommend using a free discipline-specific platform: Twillo (formerly OER-Portal Niedersachsen)

Who can I contact for questions?

We provide support for every step of the research data submission process and are available for all inquiries through the contact form.

I have more than one personal dataset, how did this happen and can I correct it?

This can occur when an individual has used different login methods that are not automatically linkable (e.g., ORCID/Shibboleth and an HRZ employee account). This happens because creating a new account automatically generates a corresponding profile. Since the system cannot automatically determine that two different login methods belong to the same individual, this results in multiple accounts and therefore multiple profiles.

In such situations, we recommend, if available, using the HRZ employee account as the primary one. If there is no HRZ employee account available, we recommend using the account with which more datasets are linked as the primary one.

Redundant individual entries/entities can be deleted by users themselves. There is an option under Account > Profile > Delete for this purpose. Note that all personal data will remain in existing submissions, but all existing connections between these submissions and the user profile will be permanently deleted.

Attention!

If multiple user accounts exist, each with its own connections, it is recommended to submit a support ticket before deleting a user profile, requesting that the existing connections be transferred to the profile that will be kept!

How do I cite my research data submitted in GUDe?

To cite research data submitted in GUDe, the DOI (Digital Object Identifier) should be used, which is automatically generated for publicly accessible submissions.

Attention!

When submitting research data, one will also receive a GUDe internal Handle link that leads to the submitted data. This link is not suitable for citation purposes, as we cannot guarantee that these internal Handles will remain unchanged over time.

Submitting Research Data

What steps are necessary to use GUDe?

GUDe can be used once the contractual terms have been agreed to during the first login.

Do any separate contractual agreements need to be established before submission?

No. As part of the submission process, the Publication Agreement must be agreed to. Together with the contractual terms that must be agreed to during the first login, no further contractual arrangements are needed for a submission.

What kind of data can I archive?

GUDe has been designed for the submission of research data. Accordingly, GUDe is suitable for submitting raw data as well as analyzed data: possible data types include, among others, tables, text, graphics, CAD, audio, and video files. A more detailed list of recommended file formats can be found under the question "Which data formats are suitable?"

Can I publish software on GUDe?

GUDe is only suitable for publishing research software to a limited extent.

If a license other than Creative Commons is to be used for the software (Creative Commons licenses are unusual for software), a corresponding support ticket must be submitted before data submission. We will then adjust the license selection dialog accordingly.

Attention!

When selecting Type of Data, one should choose Software!

Note

The following link may help in choosing an appropriate license: Choose a License.

Which data formats are suitable?

Research data can exist in various data formats, depending on the type of collection and the method used. These formats have different suitability for long-term use. When securing research data in the scientific field, it is particularly important to consider compatibility and the possibility of lossless conversion to alternative formats. Other important criteria include transparency for both humans and machines as well as the expected long-term stability of the file format (keywords: standardization and dissemination).

The following table from the ETH Zurich Library [Source] provides a current assessment of the suitability of commonly used data formats. This assessment is based on experience as well as a comprehensive evaluation of recommendations and guidelines from international institutions involved in archiving. The table also includes recommendations for converting data formats that are either unsuitable or only conditionally suitable. When conversion might result in reduced functionality or information loss, it is recommended to store both formats in GUDe. Additionally, it is advisable to carefully review the quality of the outcome, especially for texts, to ensure that formulas, special characters, umlauts, and special fonts are accurately reproduced.

File Type Recommended Conditionally Suitable Not Suitable
Text
  • PDF/A (*.pdf, preferred subtypes 2b and 2u)
  • Plain Text (*.txt or source code, etc.) encoded as ASCII, UTF-8, or UTF-16 with Byte Order Mark (BOM)
  • XML (including XSD/XSL/XHTML, etc.; schema & character encoding included)
  • (*.pdf) with embedded fonts
  • Plain Text (*.txt, *.asc, *.c, *.h, *.cpp, *.m, *.py, *.r, etc.) (encoded in ISO 8859-1)
  • Rich Text Format (*.rtf)
  • HTML and XML (without external content)
  • Word *.docx
  • PowerPoint *.pptx
  • LaTeX and TeX (including open-source software packages with special fonts and resulting PDF)
  • OpenDocument Formats (*.odm, *.odt, *.odg, *.odc, *.odf)
  • Word *.doc
  • PowerPoint *.ppt
  • Conversion: Convert to PDF/A-2b (or PDF/A-2u)
Spreadsheets and Tables
  • Comma- or Tab-Delimited Text Files (*.csv)
  • Excel *.xlsx
  • OpenDocument Formats (*.odm, *.odt, *.odg, *.odc, *.odf)
  • Excel *.xls, *.xlsb
  • Conversion: Convert to .xlsx
Raw Data and Workspace
  • Plain Text (encoded in ASCII)
  • S-Plus (*.sdd)
  • Conversion: Text format.
  • Matlab (*.mat) from v7.3 MAT-File
  • Network Common Data Format or NetCDF (*.nc, *.cdf)
  • Hierarchical Data Format (HDF5) (*.h5, *.hdf5, *.he5)
  • Matlab Files *.mat (binary)
  • Conversion: HDF5 format
  • R Files *.RData
  • Conversion: HDF5 format (using the rhdf5 package)
Raster Graphics (Bitmap)

  • TIFF (*.tif, uncompressed, preferably TIFF 6.0+)
  • Portable Network Graphics (*.png, uncompressed)
  • JPEG2000 (*.jp2, lossless compression)
  • Digital Negative Format (*.dng)
  • TIFF (*.tif, compressed)
  • GIF (*.gif)
  • BMP (*.bmp)
  • JPEG/JFIF (*.jpg)
  • JPEG2000 (*.jp2, lossy compression)
Vector Graphics
  • SVG without JavaScript (*.svg)
  • Graphics InDesign (*.indd), Illustrator (*.ait)
  • Encapsulated Postscript (*.eps)
  • Photoshop (*.psd)
CAD
  • AutoCAD Drawing (*.dwg)
  • Drawing Interchange Format, AutoCAD (*.dxf)
  • Extensible 3D, X3D (*.x3d, *.x3dv, *.x3db)
Sound and Audio
  • WAV (*.wav) (uncompressed, pulse-code modulated)
  • Advanced Audio Coding (*.mp4)
  • MP3 (*.mp3)
Video
  • FFV1 Codec (from version 3) in Matroska Container (*.mkv)
  • MPEG-2 (*.mpg,*.mpeg)
  • MP4, also known as MPEG-4 Part 14 (*.mp4)
  • Audio Video Interleave (*.avi)
  • Motion JPEG 2000 (*.mj2, *.mjp2)
  • Windows Media Video (*.wmv)
  • QuickTime Movie (*.mov)

In addition to suitability for long-term reuse of data formats, there are security restrictions on uploading certain file formats in GUDe (more on this in the next table). These formats also include older office formats (e.g., .doc, .ppt, .pptm, etc.), which must be converted to a more suitable format before uploading.

Additionally, uploading executable and potentially harmful file formats is currently not allowed.

Disallowed File Formats or Extensions

| Letters | File formats or extensions |
| --- | --- |
| A - E | ade, adp, ani, app, appcontent-ms, appref-ms, asp, aspx, asx, bas, bat, cab, cdxml, ceo, cer, chm, cmd, cnf, cnt, com, cpl, crt, csh, cur, der, diagcab, doc, docm, exe |
| F - K | fxp, gadget, grp, hlp, hpj, hta, ico, inf, ins, isp, its, jar, jnlp, job, js, jse, ksh |
| L - P | mad, maf, mag, mam, maq, mar, mas, mau, mav, maw, mcf, mda, mdb, mde, mdt, mdw, mdz, mht, mhtml, msc, msh, msh2, msh2xml, mshl, mshlxml, mshxml, msi, msp, mst, msu, one, ops, osd, pcd, pif, plg, ppt, pptm, prf, prg, printerexport, ps2, ps2xml, psc2, pscl, psdl, psdml, psl, pslxml, pssc, pst, pub, pyc, pyo, pyw, pyz, pyzw |
| Q - Z | reg, scf, scr, sct, settingcontent-ms, shb, shs, theme, tmp, udl, url, vb, vbe, vbp, vbs, vhd, vhdx, vsmacros, vss, vst, vsw, webpnp, ws, wsb, wsc, wsf, wsh, x11xls, xbap, xls, xls, xlsb, xlsm, xnk |

If one intends to submit research data in a format listed as disallowed in GUDe, this must be requested and agreed upon in advance via the contact form.
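Before uploading, file names can be screened locally against the disallowed extensions. The following is a minimal sketch; `DISALLOWED_EXCERPT` contains only a few entries copied from the table above, and the function name `find_disallowed` is hypothetical. A real check would need the complete, current list.

```python
from pathlib import PurePath

# Excerpt only: a handful of extensions copied from the table above.
DISALLOWED_EXCERPT = {"bat", "cmd", "doc", "exe", "jar", "js", "ppt", "pyc", "xls"}


def find_disallowed(filenames, disallowed=DISALLOWED_EXCERPT):
    """Return the file names whose final extension is in the disallowed set."""
    flagged = []
    for name in filenames:
        # Compare case-insensitively and without the leading dot.
        ext = PurePath(name).suffix.lower().lstrip(".")
        if ext in disallowed:
            flagged.append(name)
    return flagged
```

This sketch looks only at the final extension of each name; it does not inspect the contents of archives such as *.zip or *.tar.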

Where can I find instructions and specific guidance for converting office documents to PDF/A?

For converting office documents into the PDF/A format, the University Library provides a helpful guide that specifically addresses programs licensed at Goethe University.

In addition to this guide, the Saxon State and University Library Dresden (SLUB) offers additional instructions tailored to specific software.

We recommend not uploading individual files larger than 20 GB and not exceeding a total upload size of 100 GB per submission when submitting research data via the web interface.
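The recommended limits can be checked with a few lines of code before a submission is prepared. This is a minimal sketch: the function name `check_upload_sizes` is hypothetical, and it assumes decimal gigabytes (10^9 bytes), since the text does not say whether GB or GiB is meant.

```python
GB = 10**9  # assumption: decimal gigabytes; the FAQ does not specify GB vs. GiB
MAX_FILE_BYTES = 20 * GB    # recommended maximum per file
MAX_TOTAL_BYTES = 100 * GB  # recommended maximum per submission


def check_upload_sizes(sizes):
    """Take a mapping of file name -> size in bytes and return a list of warnings."""
    warnings = []
    for name, size in sizes.items():
        if size > MAX_FILE_BYTES:
            warnings.append(f"{name}: larger than 20 GB")
    if sum(sizes.values()) > MAX_TOTAL_BYTES:
        warnings.append("total upload larger than 100 GB")
    return warnings
```

The sizes can be collected locally, for example with `pathlib.Path(name).stat().st_size` for each file.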

If larger amounts of data need to be deposited in GUDe, an appointment for transferring the research data via an external hard drive can be arranged using the contact form. The following information must be provided:

  1. The number of files
  2. The average and maximum file size
  3. The distribution of file sizes
  4. How the files are currently structured, and how many submissions they should be divided into
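The first three items in the list above can be gathered automatically. The following is a minimal sketch, assuming the data sit in one local directory tree; the function name `summarize_files` and the power-of-ten size buckets are illustrative choices, not a format requested by GUDe.

```python
from pathlib import Path
from statistics import mean


def summarize_files(root):
    """Count the files below root and summarize their sizes in bytes."""
    sizes = [p.stat().st_size for p in Path(root).rglob("*") if p.is_file()]
    if not sizes:
        return {"count": 0}
    # Coarse distribution: number of files per power-of-ten size bucket.
    distribution = {}
    for size in sizes:
        label = f"< 10^{len(str(size))} bytes"
        distribution[label] = distribution.get(label, 0) + 1
    return {
        "count": len(sizes),
        "mean_bytes": mean(sizes),
        "max_bytes": max(sizes),
        "distribution": distribution,
    }
```

The fourth item, the current folder structure and the intended split into submissions, still has to be described by hand.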

Alternatively, the files can be uploaded independently via the REST API (some programming skills are required for this).

What is the maximum number of files a submission should have?

A submission should not consist of more than 100 individual files. For serializing files, it is recommended to use *.zip or, even better, *.tar files (without compression). Subfolders that are not located in an archive like *.zip or *.tar are generally not suitable for structuring in GUDe, as folders are not preserved when uploading data (i.e., without an archive format, all data are on the same structural level, even if they were originally organized in subfolders on the local system).
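An uncompressed *.tar archive that preserves the subfolder structure can be created with Python's standard `tarfile` module. This is a minimal sketch; the function name `make_plain_tar` is hypothetical, and the archive should be written outside the source folder so that it does not try to include itself.

```python
import tarfile
from pathlib import Path


def make_plain_tar(source_dir, archive_path):
    """Pack source_dir into an uncompressed .tar, keeping subfolder paths."""
    source = Path(source_dir)
    # Mode "w" writes a plain tar; "w:gz" or "w:bz2" would add compression.
    with tarfile.open(archive_path, mode="w") as tar:
        # arcname keeps only the folder name, not the full local path.
        tar.add(source, arcname=source.name)
    return archive_path
```

For example, `make_plain_tar("measurements", "measurements.tar")` would produce one file for upload whose internal paths start with `measurements/`.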

How should I structure and name my data?

It is recommended to avoid submitting duplicate or overlapping research data. If research data evolve, they can be submitted as a new version. Changes in the data description should be clearly indicated. Meaningful names for research data should be used, and special characters should be avoided.
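File names can be cleaned of special characters programmatically. The following is a minimal sketch, not a GUDe rule: the allowed character set (letters, digits, dot, underscore, hyphen) and the function name `safe_name` are assumptions chosen for illustration. Characters without an ASCII decomposition (for example "ß") are simply dropped, so the result should be reviewed by hand.

```python
import re
import unicodedata


def safe_name(name):
    """Return a file name restricted to ASCII letters, digits, '.', '_' and '-'."""
    # Decompose accented characters, then drop whatever is not ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Replace each run of other characters with a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_name)
    return cleaned.strip("_")
```

Cleaning removes problematic characters but does not make a name meaningful; descriptive elements such as date, method, or sample still have to be chosen by the researcher.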

Depending on the nature and scope of the research data, it may be advisable to divide them into multiple submissions (e.g., raw data and analysis data). Before submitting, decisions should be made about how data should be linked and then one should plan accordingly.

Professional communities and dedicated individuals continually create general and specialized (container) formats, such as reprozip, ro-crate, or TheELNFileFormat. These formats serve the purpose of structuring and standardizing research data. To ensure that the submitted work can be used effectively by the broadest possible audience, we recommend using unencrypted containers in which the data are stored following the recommended formats (refer to the table). Common serializations like *.tar or *.zip are advisable for this purpose. The discipline-specific (container) format should be documented, for example, in the Description section of a README file.

If two submissions on GUDe need to be mutually linked, there are two possible scenarios: neither submission has yet been submitted/accepted, or one of the submissions has already been submitted and accepted.

If neither submission has been submitted or accepted yet, the link can only be established after both submissions have been submitted and accepted. For this purpose, the Request a Correction function should be used individually for each submission.

If one submission has already been accepted before the second submission is made, the second submission can be linked to the already accepted submission during the submission process. However, to ensure that the already accepted submission also refers to the new submission, the Request a Correction function must be used.

Can I include domain-specific metadata with my research data in GUDe?

As an institutional repository used for securing and archiving a wide variety of research data, GUDe provides only a standardized schema (based on Dublin Core, OpenAIRE, and DataCite) for describing research data. We recommend depositing existing domain-specific metadata in the form of structured text files (e.g., *.json or *.yaml) and/or documenting them in a parallel README file.
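One possible way to deposit domain-specific metadata as a structured text file is sketched below in Python. The function name `write_metadata_sidecar` and any field names passed to it are hypothetical; the actual fields should follow the conventions of the respective discipline.

```python
import json
from pathlib import Path


def write_metadata_sidecar(metadata: dict, target: Path) -> str:
    """Write domain-specific metadata as a UTF-8 JSON file and return the JSON text."""
    # Sorted keys and indentation keep the file readable and stable across versions.
    text = json.dumps(metadata, indent=2, ensure_ascii=False, sort_keys=True)
    target.write_text(text + "\n", encoding="utf-8")
    return text
```

For example, `write_metadata_sidecar({"instrument": "example spectrometer", "wavelength_nm": [450, 532]}, Path("metadata.json"))` creates a file that can be uploaded alongside the research data and mentioned in the README file.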

What information should be included in a README file?

A README file contains information about the file(s) that are published or archived as research data in the Goethe University Data Repository (GUDe). It provides a clear and concise description of all relevant details regarding data collection, processing, and analysis. A README file is a plain text file – often written in a simple markup language like Markdown – and is located at the main level of a research dataset. For very large datasets, it may be beneficial to create multiple README files or embed documentation directly into the files.

This approach can be particularly advantageous for an institutional repository like GUDe since it offers only a general application profile and does not support domain-specific metadata schemas. Additionally, a README file can help researchers find research data again after use, as downloading data and recording the metadata at the same time (e.g., in reference management software) is not yet widely adopted.

Caution!

When writing README files, the principle of "as much as necessary, as little as possible" applies. One should only include what is useful and/or necessary for the correct interpretation, evaluation, and reuse of the dataset.

Note

Searching the internet for the terms 'readme research data' with a search engine of choice returns various examples of README files and further guidance.

Note

Furthermore, scientific conventions of the respective field should be used for taxonomic, geographic, and geological names and keywords. Whenever possible, terms from standardized taxonomies and vocabularies should be used.

Can I format text in free-text fields (e.g., Abstract)?

GUDe cannot process formatting inputs in LaTeX or HTML in the free-text fields. However, the use of Unicode is supported (e.g., bullets as well as superscripts and subscripts). If advanced formatting is desired, the corresponding free-text field should be additionally provided as a README file in the submission.
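Since Unicode is supported in free-text fields, superscripts and subscripts can be typed directly as Unicode characters. The following Python sketch shows one way to convert digits and signs accordingly; the helper names `superscript` and `subscript` are chosen for this example only.

```python
# Translation tables from ASCII digits and signs to their Unicode counterparts.
SUPERSCRIPTS = str.maketrans("0123456789+-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻")
SUBSCRIPTS = str.maketrans("0123456789+-", "₀₁₂₃₄₅₆₇₈₉₊₋")


def superscript(text: str) -> str:
    """Replace digits and +/- signs with Unicode superscript characters."""
    return text.translate(SUPERSCRIPTS)


def subscript(text: str) -> str:
    """Replace digits and +/- signs with Unicode subscript characters."""
    return text.translate(SUBSCRIPTS)


# Example abstract fragment: "CO₂ flux in g m⁻² d⁻¹"
abstract = "CO" + subscript("2") + " flux in g m" + superscript("-2") + " d" + superscript("-1")
```

Characters without a Unicode superscript or subscript form (for example most letters) are left unchanged by these helpers.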

When including links, they should not be shortened using URL shorteners (e.g., tinygu). Additionally, all links should ideally be provided without optional parameters in the URL.

Note

The general guideline here is to use the shortest functional version of the original link.
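As a rough illustration of this guideline, the Python sketch below removes common tracking parameters from a link while keeping all other parts of the URL. Which parameters count as optional is an assumption of this example (the `utm_` prefix and the names in `TRACKING_NAMES`); the resulting link should always be checked to confirm that it still works.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption for this sketch: these query parameters are never needed to reach the resource.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def shortest_functional_url(url: str) -> str:
    """Return the URL without known tracking parameters; other parameters are kept."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(TRACKING_PREFIXES) and name not in TRACKING_NAMES
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment))
```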

Is there an API for automated data submission?

GUDe provides various interfaces for automated querying and utilization of its content and metadata for different purposes.

Interface | Link | Purpose
REST API | HAL-Browser | Internal use for GUDe automation
OAI-PMH | Still in development | Harvesting metadata from GUDe (not available yet)

OAI Interface

Still in development!

If needed, one can contact GUDe Support: Contact Form

For harvesting metadata from all published submissions (under CC0 1.0 License), GUDe plans to provide an OAI-PMH interface. The use of this interface will not be restricted.

REST Interface

For automated use of GUDe, the publicly documented REST API is available to members and affiliates of Goethe University. Unfortunately, neither the University Computer Center (HRZ) nor the University Library can provide support for integrating GUDe into internal workflows.

Note

If one wants to test one's REST API script before actual implementation, GUDe Support may be contacted via the contact form to gain access to a development instance that can be used for testing purposes.

Before development begins, we request notification via our contact form of the nature and scope of the planned automated use of GUDe. Additionally, submitting a large amount of research data requires separate agreements regarding editorial aspects.

How do I generate an API token for my account?

The function to generate one's own API tokens can be found under Account > Account > API access token > Generate new token.
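How a generated token is then used depends on the REST API documentation of the instance. The Python sketch below only prepares a request and does not send it. It assumes, without confirmation from GUDe, that the token is passed as a Bearer token in the Authorization header; the base URL `https://repository.example.org/server/api` and the environment variable `GUDE_API_TOKEN` are placeholders invented for this example.

```python
import os
import urllib.request


def build_api_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request that carries the API token.

    Assumption: the token is accepted as a Bearer token in the Authorization
    header. This must be verified against the REST API documentation.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


# Read the token from the environment instead of hard-coding it in the script.
token = os.environ.get("GUDE_API_TOKEN", "")  # hypothetical variable name
request = build_api_request("https://repository.example.org/server/api", "/core/items", token)
```

Keeping the token out of the script itself avoids accidentally publishing it together with code or research data.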

Linking Submissions

In GUDe, contextualizing research data and linking them with projects, institutions, and individuals is a central function. By maintaining relevant entries, a semantic network is formed, contributing to the discoverability and creation of publication lists. In GUDe, corresponding entries are created by submitters and curated by the University Library's editorial team.

How are research data linked with a text publication (e.g., journal articles) or other resources?

To link research data with a text publication or other resources, one can add corresponding links during the submission of the research data under Related Resources > Add more. The following information must be provided:

  • Identifier type (e.g., DOI)
  • Corresponding identifier value (e.g., the DOI number)
  • Publication type (e.g., JournalArticle)
  • Linking type (e.g., IsSupplementTo)
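The four pieces of information listed above can be thought of as one small record per related resource. The following Python sketch models such a record and rejects empty fields; the class name `RelatedResource`, its field names, and the example DOI are illustrative assumptions and do not reflect GUDe's internal data model.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RelatedResource:
    """One related-resource entry as requested in the submission form."""

    identifier_type: str   # e.g., "DOI"
    identifier_value: str  # e.g., the DOI itself
    publication_type: str  # e.g., "JournalArticle"
    linking_type: str      # e.g., "IsSupplementTo"

    def __post_init__(self) -> None:
        # All four fields are mandatory, so empty or blank values are rejected.
        for name, value in asdict(self).items():
            if not value.strip():
                raise ValueError(f"{name} must not be empty")


# Placeholder DOI for illustration only.
article_link = RelatedResource("DOI", "10.1234/example", "JournalArticle", "IsSupplementTo")
```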

For linking with third-party projects, we recommend creating dataset entries with the essential project details before submitting the research data. For DFG projects, GEPRIS is available for research, and for EU projects, CORDIS can be used.

When submitting research data, profiles that have been created in GUDe will be offered for selection when entering authors or contributors. A successful link with a profile will be indicated by a green circle.

Linking Authors with ORCID

To link an author's name to an ORCID, the author in question can connect their profile to ORCID. This can be done through the function available under Account > Profile > ORCID Settings > Connect to ORCID ID. Manually inputting the ORCID in personal datasets is not possible; however, it can be manually entered for the specific submission when providing author information. It is strongly recommended to permanently link authors' accounts with their corresponding ORCID iD.

Note

Goethe University strongly recommends that all researchers register with ORCID (https://orcid.org) and include the ORCID iD in all name and affiliation details (cf. Affiliation Policy).

For indicating the affiliation of authors and their contributions (i.e., linking to an institutional entry), the Affiliation Policy of Goethe University should be adhered to. Authors and contributors affiliated with Goethe University should always indicate an organizational unit of level 3 (Institute/Department) or preferably level 4 (Working Group).

For external authors and contributors, affiliation with an organizational unit of level 1 can always be indicated.

What should I do if I want to specify more than one affiliation per author in a submission?

For submissions it is currently not possible to specify more than one affiliation per author due to technical reasons. If this is necessary, the primary affiliation in the metadata should be Goethe University. Additional affiliations can be noted in a separate README file if needed.

Note

While it is not possible for submissions to have more than one affiliation per author, it is possible to list multiple affiliations in the corresponding personal dataset. However, only permanent affiliations should be noted here, and affiliations specific to a submission should not be included.

Managing

How can I create a collection?

The creation of additional sections and/or collections for projects (including cross-regional projects) or interdisciplinary collaboration can be requested through the contact form.

How do I set up a new user group?

To create a user group for shared access to a non-public submission, it must be requested through the contact form.

Is versioning available?

Yes. In GUDe, it is possible to securely save or publish new versions of digital objects. Before a new metadata entry form opens, a brief description of the type and extent of the changes is required. The submission of the new version proceeds in the same steps as a new submission of research data (though the form is already pre-filled with the information from the previous version).

When should I create a new version?

A new version should be created when archived research data have undergone significant or final changes. In cases of frequent minor changes to research data, it is recommended to create a new version only when relevant milestones have been achieved.

Attention!

Each new version also receives its own DOI.

Note

If minor typographical errors are discovered in the dataset after publication, corrections can be requested without a new DOI assignment through the contact form.

When should I use the correction function?

The correction function should be used when minor errors (e.g., typos) are found in the metadata of a dataset.

Attention!

The correction function only allows corrections to metadata, not errors in the submitted dataset. For dataset corrections, a request must be made through the contact form.

Can I publish my unpublished submission retrospectively?

Yes. For unpublished research data, it is possible to create a new version of the submission and modify the access permissions to make the new version publicly accessible.

How do I register a Digital Object Identifier (DOI)?

A DOI is generated and assigned for each published dataset that did not previously have a DOI. It should be noted that this only applies to publicly accessible datasets.

Note

DOI reservation is not yet implemented in GUDe. Therefore, a DOI cannot be reserved for future publication for non-publicly accessible publications.

Note

A submission published with an embargo also receives a DOI.

Does a DOI refer to a file or a dataset?

A DOI always refers to an entire dataset.

How do I delete a dataset?

A published dataset cannot be independently deleted by the user, as governed by the publication agreement.

Exception!

If the dataset violates applicable laws, its deletion can be requested through the contact form. For more information, see How do I report a violation of data protection rules?

How do I correct metadata?

To correct typographical and spelling errors in metadata of digital objects that were not noticed during the editorial process, a correction function is available in the detail view. The correction is treated in the editorial process like a new submission (including editorial review). The correction function is limited to metadata; new research data cannot be uploaded, and access conditions cannot be changed (more on this in Can I publish my unpublished submission retrospectively?).

How do I correct datasets?

For typographical errors in project and organization datasets, a correction can be directly requested from support: contact form. For name changes, it is common practice in GUDe to create a new version of the dataset (see more in Is versioning available?) instead of requesting a correction, and to link it to the old version.

Changes to personal datasets can only be made by the respective user. Profile changes for a user will not update the metadata of previous submissions to reflect the latest information. This is to enhance citation and discoverability of submitted research data.

Publishing

How can I publish research data in a non-public collection (e.g., after the 10-year retention period for good scientific practice)?

For such cases, the contact form can be used.

Are the data automatically visible to the public?

Submitted data intended for publication become visible only after they have been reviewed and approved by the editorial team. If submitted data are published with an embargo, only the metadata of the dataset are visible after editorial approval until the embargo period expires, provided the metadata were not also published with an embargo.

Note

If an entire digital object is subject to an embargo, the system forwards the metadata of the submission to DataCite for immediate DOI assignment. However, neither the metadata nor the submitted research data are visible in GUDe during the embargo period.

Privately submitted data are not automatically visible.

Am I even allowed to publish my research data?

In principle, all research data for which the necessary rights are owned can be published on GUDe. This includes research data that have been independently produced and for which the rights have not already been transferred to a third party (e.g., through publication in other repositories or as per project financing conditions).

If there is uncertainty, we can be reached through the contact form.

Can data be published with an embargo?

It is possible to subject the submitted digital objects to an embargo before publication. If the digital object has already been published elsewhere and exclusive usage rights have been transferred, the possibility of parallel publication in GUDe must be assessed first.

Note

If an entire digital object is subject to an embargo, the system forwards the metadata of the submission to DataCite for immediate DOI assignment. However, neither the metadata nor the submitted research data are visible in GUDe during the embargo period.

How long will the data be retained?

The deposited digital objects will be stored for at least ten years. Beyond this period, no guarantee is provided, and Goethe University is entitled – but not obligated – to delete digital objects after the ten-year period has elapsed. However, the university strives to preserve the digital objects beyond the aforementioned period. In case of deletion, the metadata of the digital object will continue to be retained for the ongoing citability of the research data in GUDe.

Can I also publish my research data elsewhere?

Yes. Research data can generally be published elsewhere at one's own discretion. By agreeing to the publication agreements, Goethe University has been granted only a simple right of use.

Can I publish sensitive personal data?

No. All sensitive personal data (such as passwords or medical data containing personal information) should be removed from the respective dataset before submitting research data (if present).

Note

Non-critical voluntary information that authors provide about themselves (e.g., contact information and affiliations) is exempt from this regulation.

Can I submit password-protected files as research data?

No, submissions must not include password-protected files as research data.

Licensing

Why do I need to select a license for the publication of research data?

A license governs the rights and obligations for potential reuse by third parties. Therefore, at the end of the submission, research data can be provided with an appropriate license. Before licensing, the submitter should ensure that they hold the rights necessary to grant a license for all submitted data. For published digital objects, changing the license is no longer possible.

Which license is suitable for my data?

For successful reuse of research data, a legal agreement between the creators or rights holders and interested researchers (reusers) is necessary if the research data meet the creativity threshold required by copyright law. In GUDe, rights holders can enable reuse of published research data under Creative Commons licenses or restrict usage to the narrow limits of copyright law: 'All rights reserved'. CC licenses are standardized contractual agreements that allow good control over the granted exploitation possibilities and are available in the following variants:

License | Central License Terms
CC0 | Research data may be freely used without additional conditions or requirements.
CC-BY | Attribution of the creators
CC-BY-ND | Attribution of the creators & no derivatives allowed
CC-BY-NC | Attribution of the creators & non-commercial use
CC-BY-SA | Attribution of the creators & share-alike
CC-BY-NC-SA | Attribution of the creators & non-commercial use & share-alike
CC-BY-NC-ND | Attribution of the creators & non-commercial use & no derivatives allowed
All Rights Reserved | This object is protected by copyright and/or related rights. You are free to use the object in any way that is permitted by copyright law and/or related rights. For other uses, you need permission from the rights holder.

Attention!

The CC0 Public Domain license should be seen independently of the Good Scientific Practice rule, which continues to require correct attribution of the origin of reused data!

Note

In accordance with the FAIR principles and the Alliance of Scientific Organizations, we recommend assigning a CC0 or CC-BY license.

Note

More information about CC licenses is also available on the following page: https://www.ub.uni-frankfurt.de/publizieren/faq_publikationsserver_en.html#which_licences

For inquiries regarding copyright in academia, various information sources are available to you:

  • A comprehensive resource specifically created for academia is the guide provided by the Federal Ministry of Education and Research (BMBF). Available at Zenodo.

  • For legally binding advice, the Legal Department of Goethe University is at your disposal, including via email.

  • If you require specific consultation regarding the licensing of research data, you can contact the Open Access Team of the University Library via email.

Please note that the GUDe editorial team and the Research Data Team of the University Library cannot provide legal advice.

This can be reported through the contact form.

Access

How can I control access to my research data?

When submitting research data, access to the research data can be finely controlled. On the one hand, individual files within the submission can be hidden. On the other hand, the entire entry can be invisibly archived or only the metadata made available to the public. If no specific restrictions have been activated, the research data inherit the standard access permissions of the collection in which they are submitted. These standard access permissions are displayed in the submission form in the table directly above the access permission input field.

The accessibility of the submitted research data can be determined in three ways, namely through

  1. the properties of the collection in which the research data are submitted.
  2. the settings during submission:
    • globally for the entire digital object
    • individually for each uploaded file

A simple example is shown in the illustration below. In the table at the beginning of the form, both the metadata and the files of the digital object are indicated as public for the selected collection. If the chosen collection is a restricted project- or workgroup-specific collection, a list of users with access is provided here.

Access Conditions Item

To override the collection properties, one of the following three options can be selected in the Access condition type dialog:

  1. Public
  2. Private
  3. Embargo

Note

If an entire digital object is subject to an embargo, the system forwards the metadata of the submission to DataCite for immediate DOI assignment. However, neither the metadata nor the submitted research data are visible in GUDe during the embargo period.

By applying these settings, the metadata and files of the digital object adopt these properties. If only individual files need to be restricted, their properties can be set after uploading the files (more on this in Fig. Access for Individual Files).

Access Conditions File
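The interplay of the three levels described above (collection, entire digital object, individual file) can be pictured as a simple precedence rule. The Python sketch below is a simplified reading of this FAQ text and not GUDe's actual implementation; it assumes that a file-level setting takes precedence over an object-level setting, which in turn takes precedence over the collection default. The names `Access` and `effective_access` are invented for this example.

```python
from enum import Enum
from typing import Optional


class Access(Enum):
    PUBLIC = "public"
    PRIVATE = "private"
    EMBARGO = "embargo"


def effective_access(
    collection_default: Access,
    object_setting: Optional[Access] = None,
    file_setting: Optional[Access] = None,
) -> Access:
    """Resolve the access condition of a file under the assumed precedence.

    Assumed order: file setting, then object setting, then collection default.
    """
    if file_setting is not None:
        return file_setting
    if object_setting is not None:
        return object_setting
    return collection_default
```

Under this reading, a file without any explicit setting in a public collection stays public, while a single file marked as private remains hidden even if the rest of the digital object is public.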

Can I temporarily make restricted-access research data accessible for an external review?

For non-public datasets, it is currently not possible in GUDe to provide restricted-access research data to external parties (e.g., for an external review). This feature is planned to be implemented in GUDe as soon as possible.

How can I make restricted-access research data accessible to my external collaborators?

Access to restricted (unpublished) research data for external collaborators can currently only be set up by the support team.

In the first step, external collaborators must register in GUDe via Shibboleth or ORCID and accept the terms of use. Then, the ORCID iD or Shibboleth UID of the external collaborators and the data to be shared (or the collections or areas) must be communicated to us through the contact form.

Data Protection

How do I report a violation of data protection rules?

Violations of data protection rules can be reported through the contact form.


License Notice

All content licensed under: CC0 1.0


References