Invenio Blog

Follow news and updates on Invenio world

Zenodo/InvenioRDM participation at GREI Annual Meeting, Chicago

Sara Gonzales, Kristi Holmes, Martin Fenner Nov 19, 2024 InvenioRDM

Merchandise Mart, River North, Chicago, IL by w_lemay, CC BY-SA 2.0 via Wikimedia Commons

On September 19-20, 2024, members from the seven participating repositories in the Generalist Repository Ecosystem Initiative (GREI) met in Chicago for the GREI Year 3 Annual Meeting to celebrate the achievements of Year 3 of the National Institutes of Health (NIH)-funded, data sharing and open science-focused, four-year initiative. The event was held at Chicago’s historic Merchandise Mart at MATTER, a global healthcare startup incubator, community nexus, and corporate innovation accelerator.

The GREI program’s primary mission is to establish a common set of cohesive and consistent capabilities, services, metrics, and social infrastructure across generalist repositories. The program’s secondary mission is to raise general awareness and help researchers adopt FAIR principles to better share and reuse data. This is particularly important for the health research community to fulfill data sharing requirements per the NIH’s Policy on Data Management and Sharing. Generalist repositories are essential infrastructure to enable data sharing when discipline-specific or institutional repositories cannot be identified or do not exist.

The plans for the third year of GREI have been nearly fully implemented, and the group that gathered in Chicago last month reported on outcomes and completed planning exercises for priorities for Year 4. The Year 3 plan is published and available on GitHub, and is broken down into eight objectives. Below is an outline of how Zenodo representatives contributed to each objective, foundational work which often aligned with improvements being made for the InvenioRDM development and user communities:

Desirable Characteristics for All Data Repositories (for sharing scientific data)

  • Zenodo team members co-led the effort to update the Generalist Repository Comparison Chart, a tool to help researchers select the best generalist repository to meet their needs. The new version is projected for Spring 2025
  • Zenodo team members co-led the team which created the Generalist Repository Selection Flowchart, a tool designed to guide users through a series of considerations while selecting the best repository to share data

Consistent Metadata

  • Zenodo and InvenioRDM continue to adhere to the recommended GREI standard, which incorporates the DataCite metadata schema. In addition, InvenioRDM already has capability for further suggested metadata enhancements, such as incorporation of LCSH and MeSH subject headings and CRediT contributor role taxonomy roles

Search & Browse

  • Work is ongoing to explore a cross-GREI-repository search option. Meanwhile, InvenioRDM has already achieved a further recommendation: incorporation of ROR, the identification registry for research organizations

Analytics & Reporting

  • Zenodo team members are co-leading the effort to share the Make Data Count usage metrics with the wider community via DataCite

Use Cases

  • Zenodo team members co-led the task group responsible for updating the Use Case catalog with two new use cases: one outlining a scenario where a researcher utilizes a generalist repository in lieu of a local institutional repository, and one scenario where a researcher must deposit portions of a heterogenous dataset into several different repositories, one of which is a generalist repository

Connecting Digital Objects

  • Zenodo team members are working on the effort to map DataCite relationTypes to all types of non-data research objects, and to submit any needed updates to DataCite via API
  • InvenioRDM has also completed work on the FAIR signposting recommendation, thanks to Guillaume Viger’s work on signposting

QA/QC

  • Zenodo team members are contributing to a comprehensive review and report on approaches to handling personal or sensitive data, as well as a comprehensive review of how QA/QC on data is done at each GREI repository

Training & Community Engagement

  • Zenodo team members are actively involved in the multiple GREI webinars, conference proposals, and blog posts produced each project year

The GREI-Zenodo team members look forward to continuing our efforts through all the above projects, and new ones in Year 4, to provide an interoperable, robust repository option to US-based health researchers. Additional GREI milestones that can positively impact InvenioRDM development will be reported on in future telecons.

This work was supported by the National Institutes of Health (NIH) Office of Data Science Strategy/Office of the NIH Director pursuant to OTA-21-009, “Generalist Repository Ecosystem Initiative (GREI)” through Other Transactions Agreement (OTA) Number OT2DB000013.

InvenioRDM v12.0 released

Martin Fenner, Sara Gonzales, Guillaume Viger Aug 20, 2024 InvenioRDM

We are happy to announce the release of InvenioRDM v12.0! Released on August 1, 2024, it is the first InvenioRDM release since v11.0 was released on January 23, 2023, and it includes not only many bug fixes and small improvements but also several major new features listed below.

Since the v11.0 release several InvenioRDM partners have launched InvenioRDM instances in production, including Zenodo, which migrated to InvenioRDM in October 2023. A number of these partners have migrated already to InvenioRDM v12.0 or plan to do so in the coming months.

German translations for v12 will be coming in v12.1 targeted for release in October 2024. We hope that other language translations will follow suit. Work on InvenioRDM v13.0 has already started and you can track it here.

Try it

Want to try the new features in v12.0? Just head over to the demo site: https://inveniordm.web.cern.ch. If you want to install it, follow the installation instructions at https://inveniordm.docs.cern.ch/install/.

Release Notes

See the full release notes at https://inveniordm.docs.cern.ch/releases/versions/version-v12.0.0/ and the upgrade guide at https://inveniordm.docs.cern.ch/releases/upgrading/upgrade-v12.0/.

What’s new?

Usage statistics compliant with MakeDataCount and COUNTER

A major new feature in this release is the integration of invenio-stats, a powerful and flexible extension for measuring usage statistics of InvenioRDM records. Usage statistics are compliant with the MakeDataCount and COUNTER standards. Usage statistics are displayed in the record landing page and record search:

Record inclusion in multiple communities

A record can now belong to multiple communities. Including a record in multiple communities will let different curators change the files or metadata of the record.

Powerful and reliable record access

Giving and requesting access to records have seen a complete overhaul in this release. Record owners or curators can share them directly with other users or with groups, as well as control whether, with whom, and how access can be demanded.

Notifications

A notification system has been introduced. Users can now receive email notifications depending on their preferences when they are involved in certain activities.

Moderation of users and records

The administration panel now includes a "User Management" section to suspend, block and delete users, and undo all those actions. Records can also be deleted (with a grace period for appeal or undoing), which empowers administrators to enforce institutional policies and fight spam.

DOIs for concept records and no DOIs for restricted records

InvenioRDM now mints a concept DOI for every record by default, similar to what Zenodo has done for many years. Along with this update, restricted records will now stop minting a DOI upon publication thus keeping private records truly private.

Even better previewers

Audio and video previewers come by default. Tiling support for the International Image Interoperability Framework (IIIF) API standards is available. Text previewing has been made much more resilient.

External resources integration (e.g. GitHub)

The landing page for a record can now provide nicer visuals for configured related works. A configuration variable can be set to highlight some of the referred platforms.

Skippable community submission review

With InvenioRDM v12, you can adjust whether a record review is always required for your community, or if curators, managers, and owners can submit a record without the review process.

Banners

With the addition of invenio-banners, you can easily add and manage important alerts and messages on your InvenioRDM instance, such as system maintenance notifications or announcements.

Static pages - administration panel

Another notable addition is the integration of invenio-pages with the administration panel. It exposes a convenient UI to create and manage static pages.

Breaking changes

Make sure to read the Breaking Changes section in the release notes.

Limitations and known issues

  • Translations for v12 will be coming in v12.1 targeted for release in October 2024.
  • Sharing a secret link to a restricted record in a restricted community does not provide access to the record yet. Work on this is tracked here.

Requirements

InvenioRDM v12 requires:

  • Python 3.9, 3.11 or 3.12
  • Node.js 18+
  • PostgreSQL 12+
  • OpenSearch v2

Support for older versions of Elasticsearch/Opensearch, PostgreSQL and Node.js has been phased out.

Questions

If you have questions related to the InvenioRDM v12.0 release, don't hesitate to jump on Discord and ask us!

Credit

The development work of this impressive release wouldn't have been possible without the help of these great people:

  • CERN: Alex, Anna, Antonio, Carlin, Fatimah, Javier, Jenny, Karolina, Lars, Manuel, Nicola, Pablo G., Pablo P., Pablo T., Yash, Zacharias
  • Northwestern University: Guillaume
  • TU Graz: Christoph, David, Mojib
  • TU Wien: Max
  • Uni Bamberg: Christina
  • Uni Münster: Werner
  • Front Matter: Martin
  • KTH Royal Institute of Technology: Sam
  • Caltech: Tom

Conference Spotlight: InvenioRDM Workshop Day at Open Repositories 2024

Sara Gonzales Jul 19, 2024 InvenioRDM

The 19th annual Open Repositories conference took place June 3 - 6, 2024 in Göteborg, Sweden. Attendees benefitted from hundreds of presentations on innovations in the world of digital repositories, all focused around this year's conference themes of Community, Transparency, and Sustainability. In addition, ten workshops were presented on June 3rd, the InvenioRDM Workshop being one of only two to receive a full-day allotment.

At our third annual OR InvenioRDM Workshop, over a dozen of our community members collaborated to present and discuss the software, and to showcase features and customizations currently in place at our own institutions. These include:

In the afternoon, we offered two separate tracks. In the first we hosted self-organizing technical discussions, including a hands-on demonstration of Devbox presented by project partners from Uni Münster. We also encouraged participants to install InvenioRDM on their machines with help from technical experts, many of whom had already spoken in the earlier showcase session.

In the second track, we focused on community discussions with two additional presentations:

An informal survey of our 30+ participants afterward revealed that the workshop was generally very favorably received. Participants enjoyed the general introduction to the platform, as well as feature demonstrations and the opportunity to speak with members of the community. While some participants found installation challenging, others found deployment easy. Overall, participants found the experience to be valuable, with some encouraging us to offer the workshop again next year with minor tweaks.

Our active community participated in more than just the InvenioRDM workshop, however. Project partners also participated in:

At the end of the conference, it was announced that the next Open Repositories will be hosted in Chicago from June 15 - 18, 2025. We will aim to host an InvenioRDM workshop once again, and we hope to see you there!

InvenioRDM Partner Meeting / Developer Workshop: Action Items

Sarah Wiechers Apr 24, 2024 InvenioRDM

From March 18 - 22 the InvenioRDM partner community met in Münster, Germany, to discuss current challenges and plans for the future of the project. While many smaller questions and issues were addressed immediately on-site, e.g. during one of the “merge parties” or in work sessions resulting from and building upon previous discussions, many sessions ended with clear lists of planned actions for the coming weeks and months.

To showcase the kinds of actions we decided on, and to have something to measure our success by at the next developer workshop, we're sharing here our full list of action items. In the fall, approximately at the half-way-point between the last and the next workshop, the community will meet at a themed Telecon to evaluate this list and see what the status on these plans is.

In the following you can find a list of all sessions and the actions decided upon by the participants. Some of the plans mentioned were already set into action - a huge thank you goes to the motivated community members working on all these actions!

Community engagement and handling pull requests

  • document this process in invenio-docs (curator role, responsibilities, templates, schedule, reviewer guide)
  • make PR checklist more extensive (criticality, bug/feature, packages impacted, target release (old, current, new brach))
  • merge party this afternoon and on a regular basis
  • dream parties (form pitch to architecture proposal, mockups)
  • demo parties
  • public documentation of people to talk to for packages and features (e.g. list of module maintainers)
  • dedicated role for community contribution management
  • identify people involved from the community and their involvement (e.g. searchable expertise board)

Translations / i18n, i10n

  • test if yaml-files can have four letter codes
  • group tests Münster approach way of overwriting python strings, java-script
  • Zack and David try to implement the Münster approach into the official cookiecutter so its easier to use and has a global storage of frontend translations

Large file management

Invenio RDM Version 12

  • fix the blockers until end of April
  • document features
  • translate in May
  • going forward and release v12 before open repositories

Invenio CRIS

  • further call for interest in InvenioRDM visio conferences (presentation)
  • Created a GitHub Discussion to gather more use cases and interest from the community inveniosoftware/product-rdm#153
  • decide between specific in communities and more ambitious in a specific module based on traction
  • If invenio-communities is the way to go, try to put specific fields in community data model in a bucketed way
  • deepen code review

Customising Schema

  • Write a feature request for overlay config for required fields per instance / community.
  • Improve docs on Custom Fields - checklist on when they're useful, community examples.
  • Case studies / knowledge sharing of the 'hard way' customisations that have been achieved.

Deployment

  • do a Session on small one VM deployments
  • do a Session on Helm charts
  • update deployment documentation

Digital Preservation

  • Brush up Invenio-SIPStore (Max) Check out Invenio-Archivematica, or look at the AM Python client (Max) Check if an AM plugin for Invenio is needed (Panna?)

Workflow to keep internal roadmap updated

  1. GitHub Discussions implemented for feature request forum
  2. CERN will do quarterly cleaning & updates of roadmap items (Jan, Apr, Jul, and Oct) and will host talks about issues at corresponding Telecons

RDM Curation Workflow

  • open question is if/how this could go into the core or if it should remain an extra package maintained by TU Graz
  • ensure possibility to configure aspects in order to meet discussed requirements, use cases, and user stories
  • upload package to GitHub for future collaboration
  • after follow-up discussion, one idea is to extend the functionality of the requests in the core and build the curation on top of these changes
    • more details
    • Split approval and acceptance of requests
    • Requests can be extended with requirements/checks. These have to be fulfilled before a request can be accepted.
      • Can be specified per request type
      • Repository wide requirements are applied to all request types

Harvesting vocabularies

  • Start a coordinated sprint in May (CERN and Uni Münster).

Creating a place to collect Invenio modules etc.

  • Awesome Invenio - links to the actual GitHub Repository where everyone with a GitHub account can add their or other projects and create a PR for it. It's mainly curated by @egabancho for the time being.

Community workflow improvements

  • Pitch governance documentation update
  • Propose group structure (InvenioRDM Interest and Task Groups)
  • Propose new/additional way of subscribing to the list-serv
  • Pin and list welcome message and resource list to the welcome channel in Discord

Developing with Devbox

  • share repository with all used scripts with community

Repeatable installs

  • always copy package-lock.json from ../var/instance/assets/ to my-site/. Make sure that git ignores the creation/modification file dates if the file content didn't change
  • have an extra CLI param in invenio-cli that will:
    • copy the package-lock.json from my-site to ../var/instance/assets/
    • if the my-site/pipfile.lock is newer than my-site/package-lock.json, then overwrite or fail (some python deps might have changed)
    • run npm ci instead of npm install
  • the development workflow with linked editable (watched) modules should be tested, to ensure nothing breaks
  • when in NODE_ENV=production, the above should always happen. When in NODE_ENV=dev, then I can use a new param invenio-cli install --keep-deps
  • New issue tracked here: inveniosoftware/product-rdm#179

One VM Deployment

Preservation of workshop outcomes

  • Create a Google form for filling in session information (→ Steve)
  • Process information from the Google form entries (→ Markus)
  • Create a repo in the inveniosoftware namespace (→ someone from CERN)

Kubernetes Helm-charts requirements

Workgroup structure - Governance

  • please contact Lars if you know a good mailing list software! - in progress

InvenioRDM feature request form

Usage statistics

Evaluate proposed strategies for handling PRs

  • regular merge parties right after Telecons (cut Telecons to 30 minutes) (Sara)
  • train more people as maintainers with merge rights (in on-site meetings) (CERN)
  • figure out a clear labeling system (Karolina)

InvenioRDM Partner Meeting Summary, March 2024

Sara Gonzales Apr 23, 2024 InvenioRDM

The InvenioRDM partner community met in Münster, Germany from March 18 - 22, 2024, for our first in-person annual meeting in over four years. Forty-six attendees from over fifteen institutions spent the five days making connections, planning, and diligently working on all things InvenioRDM.

Organizer Sarah Wiechers, software developer and research data manager for the Service Center for Data Management, ULB Münster, proposed the use of Open Space Technology, a meeting organization methodology in which the agenda and topics discussed are voted upon and implemented by the meeting’s attendees on the meeting date. This methodology was highly effective, allowing all community members the opportunity to pitch ideas and vote on their favorites for discussion.

University of Münster’s InvenioRDM Community GitHub site contains our full output of discussions, decisions, and plans. Key topics addressed were the timely handling of pull requests, planning for the v12 release, improvements to community workflows, translations, large file management, deployment, digital preservation, vocabularies and fixtures, Kubernetes Helm-charts requirements, and much more. To help manage feature requests going forward, we have agreed to implement GitHub Discussions, a tool which will allow for providing context, asking questions, and upvoting most-wanted features. GitHub Discussions is now part of our workflow for keeping the Roadmap updated.

We were thrilled to be able to work with so many old and new friends in beautiful Münster, and a huge thanks goes to our local hosts at University of Münster for their hard work and hospitality. We hope that you will be able to join us at next year’s meeting, with Hamburg currently slated as the host city. In the meantime, please take advantage of our following updated community workflows, including:

  • Development-focused chats during half of each telecon
  • A new email list - coming soon
  • Newly established Interest Groups (long-standing) or Task Groups (deliverable-focused). For the current list of all Groups, see the new Onboarding page

Prism: The New Feinberg Repository for Global Dissemination of Research

Galter Health Sciences Library & Learning Center Nov 28, 2023 InvenioRDM

The Prism institutional repository has launched at Northwestern University. Prism preserves and makes available articles, conference presentations, preprints, datasets, and other items created by faculty, staff, and students. Prism helps openness, maximizes reproducibility, and enhances research connections within Feinberg School of Medicine and across the globe.

Prism replaces the former DigitalHub and includes many much-anticipated features, such as the ability to create metadata-only records for offsite datasets, set embargo dates for releasing content to the public, create and curate communities of practice, and share private links to view and edit with colleagues. These new features complement existing features such as the ability to assign Digital Object Identifiers that make records citable, indexing by Google to make research widely discoverable, and a responsive staff at Galter Library to answer questions and provide support.

Kristi Holmes, the Director of Galter Health Sciences Library, Associate Dean of Knowledge Management and Strategy, and professor of Preventive Medicine, played a leading role in the development of Prism. According to Holmes, “It is essential to have a robust institutional repository that can keep up with the latest technologies and trends. As models for open access and data sharing continue to evolve, it's clear that institutional repositories will play an increasingly critical role to make research Findable, Accessible, Interoperable, and Reusable (FAIR).” She continues, “What makes this project particularly special is the strong collaborative approach we’ve taken both at Northwestern and also in partnership with the larger open source community. I’m grateful to our team at Galter Library for their incredible work.”

“We are excited to announce the launch of Prism as the institutional repository for Feinberg School of Medicine” says Karen Gutzman, Head of Research Assessment and Communications at Galter Health Sciences Library and Learning Center. Prism builds on a strong research foundation first made possible in the DigitalHub repository. “One of the most exciting features of Prism is the ability to create communities on topics, projects, or events” says Gutzman. Communities include open access research from Feinberg on COVID-19, training presentations from the Biostatistics Collaboration Center, and the NUCATS Grant Repository, a centralized resource for grant writers and investigators internal to Northwestern provided by the NUCATS Institute. “Prism is an excellent home for the NUCATS Grants Repository, allowing us to easily share exemplar grant templates and other resources with investigators across the Northwestern University community,” says Dr. Richard D’Aquila, Director of NUCATS.

Prism is built on the InvenioRDM software, which also forms a strong and sustainable foundation for Zenodo. “With its user-friendly interface and advanced features, InvenioRDM is truly a game-changer in the world of repositories. This platform is designed to make research more accessible and open to the public, promoting innovation and collaboration within the academic community,” says Holmes. Over the past several years, CERN and Northwestern have collaborated as core co-developers of the software, partnering with the global Invenio Open Source Community to develop InvenioRDM as a turnkey, scalable, and top-of-the-class user experience software for repositories. The InvenioRDM software offers a reliable environment for science, empowering preservation, credit, discovery, and sharing while maintaining integrity in its responsiveness to the evolving needs of the research community, including data sharing policy compliance. Northwestern contributions to the open source project are led by Matt Carson, Senior Data Scientist and Head of the Galter Library Digital Systems Department, with developer Guillaume Viger leading technical work and Sara Gonzales, Senior Data Librarian at Galter Library, contributing to a wide range of efforts, including serving as the Community Manager.

Notably, Northwestern and CERN recently expanded this collaboration through an award from the NIH Office of Data Science Strategy/Office of the NIH Director pursuant to OTA-21-009, Generalist Repository Ecosystem Initiative (GREI), to Zenodo to help researchers improve discoverability of their data and lead to greater reproducibility and reuse of data. Through this award and others, the infrastructure reflects Desirable Characteristics of Data Repositories for Federally Funded Research and continues to evolve to meet research needs and support a vibrant data ecosystem.

Conference Spotlight: InvenioRDM Workshop Day at Open Repositories 2023

Sara Gonzales Jul 10, 2023 InvenioRDM

Last month InvenioRDM project partners convened in Stellenbosch and Cape Town, South Africa for the 18th International Open Repositories Conference (OR2023). We were thrilled to offer our second InvenioRDM workshop on the 12th of June (see our wrap-up of the first workshop: OR2022 workshop). During three eventful hours we spoke with librarians, data managers, administrators, and developers about the basics of joining the InvenioRDM project and standing up an instance. We also shared information on customizations and add-ons for those interested in the platform’s advanced capabilities.

Following Nicola Tarocco’s introduction and demo of key InvenioRDM functions and features, our sessions included:

  • Maximilian Moser’s presentation on customizations for his local InvenioRDM instance TU Wien Research Data, as well as a CLI command enabling very large file upload
  • Dan Granville’s presentation on IIIF in InvenioRDM, exemplified through his work with Data Futures GmbH
  • Guillaume Viger’s presentation on the launch of Northwestern University’s Prism instance, with a particular focus on successful migration of awkward data
  • Matt Carson’s presentation on InvenioRDM’s support of the FAIR principles, with additional information on the platform’s support for data policy initiatives
  • Maximilian and Guillaume’s presentation on the details of, and endless possibilities for, deployment of InvenioRDM instances
  • Zacharodimos Zacharias’ presentation on his and the CERN team’s iterative and productive efforts to achieve fast and efficient data migration as a test for Zenodo’s move to InvenioRDM later this year.

The workshop attendees were engaged, brought great questions, and gave us much food for thought. Our conversations with potential new partners were so extensive that we were not able to include all planned presentations! Be on the lookout for future blog posts containing this bonus content.

But wait, there’s more! InvenioRDM’s OR2023 representation included Nicola’s fantastic lightning talk on behalf of InvenioRDM in the annual Repository Rodeo. In addition, Maximilian Moser and David Eckhard of TU Graz presented their work on a communication protocol to connect machine-actionable DMPs with InvenioRDM, and Zacharodimos Zacharias presented on InvenioRDM’s mature integration with ROR (the Research Organization Registry).

All of the InvenioRDM Workshop Day presentations are now available for download from Zenodo. We hope to host another InvenioRDM workshop at the 19th Annual Open Repositories Conference June 3-6th, 2024 in Gothenburg, Sweden. If we’ve missed you at our previous two workshops, we hope to see you there!

Introducing the InvenioRDM GitHub Archiver (IGA)

Mike Hucka and Tom Morrell Jun 26, 2023 InvenioRDM

The InvenioRDM GitHub Archiver (IGA) is a new software tool created by the Caltech Library. InvenioRDM is the basis for many institutional repositories, such as CaltechDATA, that enable users to preserve software and data sets in a long-term archive. The metadata contained in the record of a deposit is critical to making the record widely discoverable by other people. However, creating detailed records and uploading assets can be a tedious and error-prone process if done manually. This is where our new tool comes in.

IGA creates metadata records and sends releases automatically from GitHub to an InvenioRDM-based repository server. It constructs a metadata record using information it gathers from the software release, the GitHub repository, the GitHub API, and various other APIs as needed. Here are some of IGA’s other notable features:

  • Automatic metadata extraction from GitHub releases, repositories, and codemeta.json and CITATION.cff files
  • Thorough coverage of InvenioRDM record metadata using painstaking procedures
  • Recognition of identifiers that appear in CodeMeta and CFF files, including ORCiD, ROR, DOI, arXiv, and PMCID
  • Automatic lookup of publication data in DOI.org, PubMed, Google Books, & other sources if needed
  • Automatic lookup of organization names in ROR (assuming ROR id’s are provided)
  • Automatic lookup of human names in ORCiD.org if needed (assuming ORCID id’s are provided)
  • Automatic splitting of human names into family and given names using ML-based methods if necessary
  • Support for record versioning
  • Support for InvenioRDM communities
  • Support for overriding the metadata record IGA creates, for complete control if you need it
  • Ability to use the GitHub API without a GitHub access token in many cases
  • Extensive use of logging so you can see what’s going on under the hood

Data and software archived in a repository need to be described thoroughly and richly cross-referenced in order to be widely discoverable by other people. Of particular interest to software developers is that a repository like CaltechDATA offers the means to preserve software projects in a long-term archive managed by their institution. IGA helps make the creation of metadata and InvenioRDM records for software and data managed in GitHub as easy as possible.

More information about the InvenioRDM GitHub Archiver is available at caltechlibrary.github.io/iga/.

How does InvenioRDM support proper resource curation and validation?

Matt Carson May 30, 2023 InvenioRDM

With mandates and directives such as the European Union’s Plan S, the recently-introduced National Institutes of Health Policy for Data Management and Sharing in the United States, and the White House Office of Science and Technology Policy Memo requiring free and immediate access to federally funded research in the U.S. by the end of 2025, increased emphasis is being placed on curation, validation, and preservation of research output. In addition, efforts such as the Generalist Repositories Ecosystem Initiative (GREI) aim to bring next-generation generalist repositories into alignment on key features required for discovery and reuse of research outputs. Zenodo has joined this initiative and will migrate to the InvenioRDM platform later this year, which will play a crucial role in helping to support best practices in research data management for this popular service.

From ensuring deposited records abide by FAIR principles, to granular access control, InvenioRDM supports resource curation and validation with a variety of functionalities. Below are some of the key features that make the platform a great option for sharing and preserving your work:

Deposits:

  • Deposited records check all FAIR principles (Findability, Accessibility, Interoperability, Reusability)

  • Configurable confirmation text before deposit to ensure agreement with security and privacy policies

  • Long-running record drafts for pre-publication edits

  • A wide variety of licensing options

  • Ability to set an embargo period for records or files with automated embargo removal at a specified date

  • Contextualization of records via secondary identifiers to publications, project sites, external data, and other associated research outputs

  • Citations in multiple formats provided for each record

  • Empowers communities to self-curate domain specific records

  • Ability to archive software via GitHub integration within records, allowing data to be linked to code [coming soon]

  • Usage metrics and analytics (COUNTER-compliant statistics gathering for record access and downloads) [coming soon]

Documentation:

  • Extensive support documentation

  • Possible addition of informative pages e.g., Terms of Use, Privacy Policy, Deposit Agreement.

Metadata and Discovery:

  • Supports industry standards for interoperability

  • Ability to broker metadata to search and discovery services

  • Hand curation by repository administrators for metadata quality assurance

  • Versioning of all deposits, with a per-version permanent identifier and a per-deposit (across all versions) concept permanent identifier

  • Configurable registration with external identifier systems e.g., DOI

Access, Security, and Quality Control:

  • Authentication configurable with institutional credentials or third-parties (e.g., ORCiD, GitHub)

  • Checksum for uploaded files ensure file integrity throughout each file's lifetime

  • Restricted access capability

  • Can be configured for secure storage

  • Automated periodic file audits

New version of CaltechDATA launches on InvenioRDM

Tom Morrell Sep 21, 2022 Caltech

Caltech Library is pleased to announce that CaltechDATA, our institutional data and software repository, launched a major upgrade on Wednesday September 21, 2022.

CaltechDATA has served as critical research infrastructure for campus since 2017, and it hosts over 20,000 records containing datasets and software for a wide variety of disciplines. With this launch, CaltechDATA now runs on the open-source InvenioRDM platform and brings many new features that Caltech researchers have requested:

  • Easier record creation with autocomplete for creators, affiliations, subjects, and funders
  • Automatic record versioning
  • Private share link for reviewers
  • Improved record views, with dynamic citations and an expanded file previewer

This version of CaltechDATA also introduces communities, which enable groups at Caltech to create their own record curation and approval processes. Researchers can collect records into a single browse and search interface. A curation pipeline allows records to be submitted by Caltech users, and then approved by a defined set of curators. We’ve pre-seeded a small number of initial communities, and look forward to seeing what researchers create.

InvenioRDM is a customizable open-source repository platform developed by CERN and twenty partner organizations, including Caltech Library. It is built on the twenty-year history of the Invenio repository platform, whose most-successful implementation is the Zenodo generalist repository hosted by CERN. InvenioRDM takes the features of Zenodo and makes them customizable for institutions. InvenioRDM will enable Caltech Library to more rapidly roll out new features and collaborate with other institutions to establish repository best practices.