Location

School of Law Seminar Room 3.09

Start Date

3 June 2026, 4:00 PM

End Date

3 June 2026, 4:30 PM

Description

The rise of open research information resources is transforming the way we track, analyse and study research systems. Increasingly, sources like OpenAIRE, OpenAlex, Crossref, DataCite, ORCID, ROR and others are being used as the basis for making decisions, designing interventions and understanding progress. This operates both at the small scale, where access to data and evidence is easier than it has ever been, and at the very large scale, in the analysis of whole systems. Modern open data sources provide access, including access to full copies of the data, but there has been less focus on providing this access in a way that allows complex querying and joining of whole data archives, for example to compare the coverage of research outputs in OpenAlex and OpenAIRE, or to analyse global information on clinical trials using affiliation data from OpenAlex and clinical trials information from PubMed. Another valuable possibility is the ability to incorporate local data enrichments from national or regional data sources to support local data needs, or to improve the overall pool of data.

Google BigQuery has emerged as a powerful tool for combining and working on these large datasets at scale. Multiple groups (including the InSysPo team at Campinas, SUB Göttingen and Sesame Open Science) have created versions of specific open datasets in the BigQuery system, which anyone can access to run their own analyses. Here, the ‘provider’ pays for storage, and the user covers the costs of processing.

For large-scale analyses that require entire data sources to be combinable and actionable at scale, this approach can add something valuable to the overall Open Research Information ecosystem. In this presentation, we will share examples of use cases and discuss key questions around the coordination and sustainability of this approach (including potential alternatives to Google BigQuery), and what would be needed to make it attractive to different stakeholders.

Jun 3rd, 4:00 PM Jun 3rd, 4:30 PM

Sharing the load: Building an open research information collective
