3rd edition WOOC

Online event
5 Oct 2022, 15-18 CEST

The 3rd Workshop
A 3-hour online workshop with invited speakers for researchers, scholarly publishers, funders, policy makers, and open citations advocates interested in the creation, reuse, and improvement of open citation data and open scholarly metadata.

4 Sessions


Reforming research assessment with open citations and open scholarly metadata

Artificial intelligence to support the extraction and analysis of citations and metadata

Sustainability of open science infrastructures

Availability of open citations and open scholarly metadata in scholarly disciplines

Call for contributions

Authors are invited to submit a short abstract (500 words max.) that fits one of the four session topics. Authors of selected contributions will be asked to record a 5-minute video presentation, publish it on a platform of their choice (e.g. YouTube), and forward the URL to the organisers. The organisers will collect the video links and make them available both on the workshop website and in a dedicated online space for engaging with interested parties and encouraging a broader discussion.

Submit your abstract

Submission deadline: 11 September 2022 23:59 CEST

Notification deadline: 18 September 2022 23:59 CEST

Video preparation deadline: 30 September 2022 23:59 CEST

Invited speakers

Claudio Aspesi

Independent Consultant, Zurich, Switzerland

John Chodacki

Director, University of California Curation Center (UC3)

Ana-Maria Istrate

Research Scientist, Chan Zuckerberg Initiative

Jennifer Lin

Product Director, Indeed

Alberto Martín-Martín

Assistant professor, Facultad de Comunicación y Documentación, Universidad de Granada

Victoria Moody

Director of Research and innovation sector strategy, Jisc

Co-Investigator and Deputy Director, UK Data Service


David Shotton

University of Oxford


Silvio Peroni

University of Bologna


Chiara Di Giambattista

Research fellow at the University of Bologna, Communications Director and Community Development Manager of OpenCitations

Ludo Waltman

Centre for Science and Technology Studies (CWTS), Leiden University


Philipp Mayr

GESIS – Leibniz-Institute for the Social Sciences


Giovanni Colavizza

University of Amsterdam


Matteo Romanello

Swiss Federal Institute of Technology in Lausanne




Workshop Introduction and authors' contribution acknowledgement

Session I

Lift, shift and re-use: Repurposing open access in new ways: The possibility (and probable necessity) of ensuring the capability and sustainability of research infrastructure assets through the enhanced deployment of open science approaches, creating new pathways for established approaches and extending their impact

At a time of increasing pressure on researchers and resources, and of a need to reduce the bureaucracy that draws attention away from excellent, impactful research, what are some of the emerging challenges and opportunities for open science in ensuring the capability and sustainability of research infrastructure assets?

What principles might be of benefit and which approaches will need to be advanced from established open access pathways and implemented in new ways to optimise and balance a complex landscape in a time of accelerating technology development and rapid growth in commercial supplier partnerships? Building on excellent practice across the international research landscape, what might be the optimal strategic approach and portfolio of policies, tools and services needed to support this optimisation?

Focusing on the foundational research assets of software, hardware, code, machines, management data and policy, this session will offer an infrastructure perspective of the opportunities and imperatives for replicating established routes to open science in new ways. It will outline an ambition for ‘dimensionally ethical’ research, drawing on excellent practice in train and offering a few perspectives to reflect on for further consideration. 

Managing the scholarly record

Scholarly articles, books, reports, data, code... these and other scholarly outputs are sometimes collectively referred to as the "scholarly record". Although this term is not as widely used as "scholarly publication" or "scholarly literature", it is quite fitting, because this content and its associated metadata are not just "public" information assets; they are in fact the main documentary evidence of the scholarly activities that take place across the globe in different scholarly institutions. Furthermore, like any other organization that generates documents as a result of its own activity, the global scholarly enterprise needs suitable access to past records to support future research activities, as well as to evaluate and improve the scholarly ecosystem itself. Thus, effective handling of these records is essential to the adequate running of the system. In practice, however, elements of the so-called scholarly record are seldom treated as such. Instead, both content and metadata derived from scholarly activities are often commoditised, leading to barriers in accessing, contributing to, and/or reusing the scholarly record. In this talk I will argue that the principles of records management, a field which has seldom had ties to scholarly communication, can be useful in the current process of digital transformation that the scholarly ecosystem is undergoing. In fact, some initiatives, such as the FAIR principles and several declarations in favor of scholar-led publishing, could be seen as sharing many characteristics with this framework.

Session II

Let’s move beyond citations!

Major progress has been made in achieving full openness of all citations. This is a hugely impressive milestone. However, it is not enough. We do not just need citations to be open, we need all publication metadata to be open. In this talk I will discuss why this is so important and will provide an up-to-date overview of the state of open metadata. I will reflect on different ways in which openness of metadata can be realized and will argue that we need authoritative sources of open metadata.

Fixing scholarly communications may require moving beyond citations

Citations have become a common metric to assess research. However, scholarly communications will continue to evolve, as the transition to digital distribution enables new tools and models and as academic leaders grapple with redefining approaches to academic reward and promotion. Citations may evolve (as data analytics allows increasingly sophisticated analysis) or may become irrelevant (for example, in the construct of the "record of versions", contributions to intellectual advancement may become much more important). It took 215 years from the printing of the Gutenberg Bible to the launch of the first scholarly journal, so it is important to be humble about the future, avoid prescriptive straitjackets and encourage experimentation.

5 min break

Session III

Essential frontiers: open data & software citations, an automated ML approach

Science is progressive, and every discovery, set of data, and publication builds on previous work. Today, it's impossible to put every new development in the context of what's gone before. Comprehensive open citations can enable both the attribution of scientific progress and the evaluation of research and its impacts. For citations to live up to their promise as a vehicle for the discovery, dissemination, and evaluation of all scholarly knowledge, the open citation frontier needs to expand beyond traditional bibliographic metadata into other essential scientific resources such as research data and software. We describe a new open corpus of dataset and software mentions in biomedical papers, created by applying machine learning to full-text biomedical literature. We share the process of extracting mentions and transforming them into citations, as well as the opportunities and challenges that come with disambiguating and linking the mentions in an open dataset of this size.

Handling Non-source items in the project OUTCITE

In this invited talk, we will detail the computational processes we are developing in the OUTCITE project to reduce the number of non-source items (NCI) in our Social Science article collection. First, we present a detailed analysis of the composition of NCI in our corpus. Then, we describe approaches to matching them against our own and external bibliographic indices. Finally, we elaborate on novel resolving strategies involving the Bing Web Search API, which we apply to NCI that are particularly hard to match.

Session IV

Where were we? The role of the community in the present and future of OpenCitations

OpenCitations is a community-based open infrastructure organization for open scholarship dedicated to the publication of open bibliographic and citation data through the use of Semantic Web (Linked Data) technologies. Since its foundation in 2010, OpenCitations has existed for the people who use its data for research purposes every day, and thanks to the support received from the global scholarly community. With its services and numerous ongoing activities, OpenCitations has quickly become a well-acknowledged infrastructure that fits well into the current Open Science environment. Collaboration with international networks and projects has made it possible for OpenCitations to expand its team and accomplish significant milestones. However, more activities are planned for the future, and the involvement of the community is crucial to this process. In this talk, I will provide an overview of OpenCitations' recent developments and plans, and of how the community can embrace OpenCitations' mission.

Roundup discussion and conclusion