3rd edition WOOC

Workshop
on Open Citations and
Open Scholarly Metadata
2022

Watch it again on YouTube
Online event
2 hours and 40 minutes | 5 Oct 2022, 15:20-18:00 CEST

A nearly 3-hour online workshop, with invited speakers, for researchers, scholarly publishers, funders, policy makers, and open citation advocates interested in the creation, reuse, and improvement of open citation data and open scholarly metadata.

4 Sessions

Topics

Reforming research assessment with open citations and open scholarly metadata

Artificial intelligence to support the extraction and analysis of citations and metadata

Sustainability of open science infrastructures

Availability of open citations and open scholarly metadata in scholarly disciplines

Call for contributions

Authors are invited to submit a short abstract (500 words max.) that fits one of the four session topics. Authors of selected contributions will be asked to record a 5-minute video presentation, publish it on a platform of their choice (e.g. YouTube), and forward the URL to the organisers. The organisers will collect the video links and make them available both on the workshop website and in a dedicated online space for engaging with interested parties and encouraging a broader discussion.

Submission deadline: 11 September 2022 23:59 CEST

Notification deadline: 18 September 2022 23:59 CEST

Video preparation deadline: 30 September 2022 23:59 CEST

Invited speakers

Claudio Aspesi

Independent Consultant, Zurich, Switzerland

John Chodacki

Director, University of California Curation Center (UC3)

Ana-Maria Istrate

Research Scientist, Chan Zuckerberg Initiative

Jennifer Lin

Product Director, Indeed

Alberto Martín-Martín

Assistant professor, Facultad de Comunicación y Documentación, Universidad de Granada

Organizers

David Shotton

University of Oxford

OpenCitations

Silvio Peroni

University of Bologna

OpenCitations

Chiara Di Giambattista

Research fellow at the University of Bologna, Communications Director and Community Development Manager of OpenCitations

Ludo Waltman

Centre for Science and Technology Studies (CWTS), Leiden University

VOSviewer

Philipp Mayr

GESIS – Leibniz Institute for the Social Sciences

EXCITE

Giovanni Colavizza

University of Amsterdam

ScholarIndex

Matteo Romanello

Swiss Federal Institute of Technology in Lausanne

ScholarIndex

Program

Introduction

Silvio Peroni
15.20-15.30

Workshop introduction and acknowledgement of authors' contributions

  • Zeyd Boukhers (Fraunhofer FIT, Germany). Evaluation Scheme of FAIRness in Scholarly Data. [video]
  • Fabian Beck (University of Bamberg). Assessing Discussions of Related Work through Citation-based Recommendations and Network Visualization. [video]
  • David Pride and Petr Knoth (KMi, The Open University). Enriching open citation metadata with citation type classification. [video]
  • Muhammad Ahsan Shahid (GESIS). OUTCITE - An integrated platform for comparing reference extraction toolkits and provisioning their output. [video]
  • Eric Schares (Iowa State University). Exploring a University’s Cited Reference Patterns with OpenAlex. [video]
  • George Macgregor (University of Strathclyde). Enhancing discovery and enriching the scholarly graph with the Research Outputs Metadata Schema (Rioxx). [video]
  • Dominika Tkaczyk (Crossref). Things you wish you didn’t have to know about metadata matching. [video]
  • Leyla Jael Castro (ZB MED Information Centre for Life Sciences; NFDI4DataScience consortium), Zeyd Boukhers (NFDI4DataScience consortium), Olga Giraldo (ZB MED Information Centre for Life Sciences; NFDI4DataScience consortium), Adamantios Koumpis (Institute for Medical Informatics, University of Cologne; NFDI4DataScience consortium), Oya Beyan (Institute for Medical Informatics, University of Cologne; NFDI4DataScience consortium), Dietrich Rebholz-Schuhmann (ZB MED Information Centre for Life Sciences; Institute for Medical Informatics, University of Cologne). NFDI4DataScience registry for reproducible Data Science and Artificial Intelligence. [video]
  • Kim Fidomski (Fraunhofer Institute for Applied Information Technology FIT). It is time for a distributed architecture for scholarly data management. [video]
  • Bianca Kramer (Sesame Open Science), Hans de Jonge (Dutch Research Council - NWO). The availability and completeness of open funder metadata. [video]

Abstracts and slide presentations of the authors' contributions can be found on the WOOC2022 Zenodo Community.

Session I

Alberto Martín-Martín
15.30-15.50

Managing the scholarly record

Scholarly articles, books, reports, data, code... these and other scholarly outputs are sometimes collectively referred to as the "scholarly record". Although this term is not as widely used as "scholarly publication" or "scholarly literature", it is quite fitting, because this content and its associated metadata are not just "public" information assets; they are in fact the main documentary evidence of the scholarly activities that take place across the globe in different scholarly institutions. Furthermore, like any other organization that generates documents as a result of its own activity, the global scholarly enterprise needs suitable access to past records to support future research activities, as well as to evaluate and improve the scholarly ecosystem itself. Thus, effective handling of these records is essential for the adequate running of the system. In practice, however, elements of the so-called scholarly record are seldom treated as such. Instead, both content and metadata derived from scholarly activities are often commoditised, leading to barriers in accessing, contributing to, and/or reusing the scholarly record. In this talk I will argue that the principles of records management, a field which has seldom had ties to scholarly communication, can be useful in the current process of digital transformation that the scholarly ecosystem is undergoing. In fact, some initiatives, such as the FAIR principles and several declarations in favor of scholar-led publishing, could be seen as sharing many characteristics with this framework.

Session II

Ludo Waltman
15.50-16.10

Let’s move beyond citations!

Major progress has been made in achieving full openness of all citations. This is a hugely impressive milestone. However, it is not enough. We do not just need citations to be open, we need all publication metadata to be open. In this talk I will discuss why this is so important and will provide an up-to-date overview of the state of open metadata. I will reflect on different ways in which openness of metadata can be realized and will argue that we need authoritative sources of open metadata.

Claudio Aspesi
16.10-16.30

Fixing scholarly communications may require moving beyond citations

Citations have become a common metric to assess research. However, scholarly communications will continue to evolve, as the transition to digital distribution enables new tools and models and as academic leaders grapple with redefining approaches to academic reward and promotion. Citations may evolve (as data analytics allows increasingly sophisticated analysis) or may become irrelevant (for example, in the construct of the "record of versions", contributions to intellectual advancement may become much more important). It took 215 years from the printing of the Gutenberg Bible to the launch of the first scholarly journal, so it is important to be humble about the future, avoid prescriptive straitjackets, and encourage experimentation.

5 min break

Session III

Jennifer Lin and Ana-Maria Istrate
16.35-16.55

Essential frontiers: open data & software citations, an automated ML approach

Science is progressive, and every discovery, set of data, and publication builds on previous work. Today, it's impossible to put every new development in the context of what's gone before. Comprehensive open citations can enable both the attribution of scientific progress and the evaluation of research and its impacts. For citations to live up to their promise as a vehicle for the discovery, dissemination, and evaluation of all scholarly knowledge, the open citation frontier needs to expand beyond traditional bibliographic metadata into other essential scientific resources such as research data and software. We describe a new open corpus of dataset and software mentions in biomedical papers, created by applying machine learning to full-text biomedical literature. We share the process of extracting mentions and transforming them into citations, as well as the opportunities and challenges that come with disambiguating and linking the mentions in an open dataset of this size.

Philipp Mayr
16.55-17.15

Handling Non-source items in the project OUTCITE

In this invited talk, we will detail the computational processes we are developing in the project OUTCITE to reduce the number of non-source items (NCI) in our Social Science article collection. First, we present a detailed analysis of the composition of NCI in our corpus. Then, we describe approaches to match against our own and external bibliographic indices. Finally, we elaborate on novel resolving strategies involving the Bing Web Search API that we apply to NCI that are particularly hard to match.

Session IV

Chiara Di Giambattista
17.15-17.35

Where were we? The role of the community in the present and future of OpenCitations

OpenCitations is a community-based open infrastructure organization for open scholarship dedicated to the publication of open bibliographic and citation data by the use of Semantic Web (Linked Data) technologies. Since its foundation in 2010, OpenCitations has existed for the people who use its data for research purposes every day, and thanks to the support received from the global scholarly community. With its valid services and numerous ongoing activities, OpenCitations has quickly become a well-acknowledged infrastructure that fits well into the current Open Science environment. The collaboration with international networks and projects has made it possible for OpenCitations to expand its team and accomplish significant milestones. However, more activities are planned for the future, and the involvement of the community is crucial to this process. In this talk, I will provide an overview of OpenCitations’ recent developments and plans, and of how the community can embrace OpenCitations’ mission.

John Chodacki
17.35-17.55

Make Data Count: The State of Open Data Citations

David Shotton
17.55-18.00

Roundup discussion and conclusion