CCC | Metadata Challenges
[Interactive diagram: the research lifecycle, starting from "I have a research idea..." and moving through Idea Development, Preparation, Proposal Submission, Research & Authoring, Publication, Preservation, and Re-use & Measurement. It shows the steps taken by the researcher, funder, publisher, and institution, and the points at which data is exchanged.]

Service providers

CRIS System providers:
Pure (Elsevier), Converis (Clarivate), DSpace-CRIS

PIDs:
ORCID, ROR, Ringgold, ISNI, RAiD

Challenges

Underutilization of ORCID

Some institutions don’t require researchers to use ORCID; records can be outdated if authors don’t consistently update; ORCID may not be accessible to authors in some geographies.

Challenges

Poor Search Experience, Missed Discovery Opportunities

It’s hard to find content when it isn’t tagged properly with high-quality metadata. Additionally, researchers can have trouble authenticating to literature in the absence of standard IDs.

Impact

If authors can’t be identified with a standard ID, they may not be able to authenticate to content, get credited appropriately for their work, secure OA funding, or complete downstream processes without unnecessary manual effort. Costly manual effort is also required of publishers, institutions, and funders to disambiguate authors retrospectively.
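
One inexpensive safeguard at the point of capture is to check that an entered ORCID iD is at least well-formed before it is stored. The sketch below assumes the published ORCID structure (16 characters in four hyphenated groups, with an ISO 7064 MOD 11-2 check character); the function names are illustrative and not part of any ORCID library. It only checks the form of the identifier, not whether the record exists or belongs to the author.

```python
import re

# Four hyphen-separated groups; the final character may be a digit or "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(value: str) -> bool:
    """Return True if value looks like an ORCID iD and its check character matches."""
    candidate = value.strip().upper()
    # Accept the full URL form as well as the bare identifier.
    candidate = re.sub(r"^HTTPS?://ORCID\.ORG/", "", candidate)
    if not ORCID_PATTERN.match(candidate):
        return False
    digits = candidate.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]
```

A submission form could call `is_well_formed_orcid` on the author's input and prompt for a correction when it returns `False`, rather than letting a mistyped identifier travel downstream.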

Service providers

Grant management systems:
Altum, GrantHub, Oracle PeopleSoft

PIDs:
ORCID, ROR, Ringgold, ISNI, RAiD, FundRef, grant IDs

Impact

Without disambiguated grant and funder details, grants may not be effectively utilized in later publication stages, leaving OA funding unclaimed and shifting coverage to research institutions. In an ecosystem that values a sustainable OA shift, this impacts everyone.

Impact

Hindered conflict of interest management among peer reviewers threatens research integrity, and low-quality data results in low accuracy of later-stage funding identification, tracking, and analysis of research output.

Impact

Lack of registered grant DOIs makes it difficult and costly to link funding to particular research outputs, resulting in missed OA opportunities as well as incomplete analysis to inform future funding investments.

Challenges

Inconsistent Metadata Capture

Variability across grant application process/systems results in possible loss of metadata necessary to determine OA funding entitlements at a later stage, e.g., institutional affiliations.

Challenges

Lack of Systems Interoperability

Researchers depend on a variety of systems to do their work, but the systems don’t always work together. Missing integrations between the systems that researchers use (e.g., CRIS, grant management, curriculum management systems, etc.) often results in gaps in metadata and PID capture.

Challenges

Legacy System Limitations

Low adoption of standardized PIDs (FundRef, RAiD, Ringgold, ISNI, ROR) due to limitations of legacy systems and/or lack of awareness.

Challenges

Low-quality Data

Free text fields are great for gathering feedback; they’re not designed to capture granular data like an organizational identifier. Researchers often confuse proposal numbers with grant IDs later in the publication process—they need structure to improve the accuracy of data capture.
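
As a rough illustration of what "structure" could mean here, the sketch below replaces a single free-text funding field with a small record that keeps the funder identifier, grant ID, and proposal number apart. It assumes funder identifiers are Funder Registry DOIs under the `10.13039` prefix; the class name, field names, and the rule that a proposal number must differ from the grant ID are hypothetical choices for the example, not an established schema.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: funder identifiers are Funder Registry DOIs (prefix 10.13039).
FUNDER_ID_PATTERN = re.compile(r"^10\.13039/\d+$")


@dataclass(frozen=True)
class FundingRecord:
    """Structured funding metadata captured instead of one free-text field."""

    funder_id: str                          # persistent funder identifier
    grant_id: str                           # identifier of the awarded grant
    proposal_number: Optional[str] = None   # kept separate from the grant ID

    def __post_init__(self) -> None:
        if not FUNDER_ID_PATTERN.match(self.funder_id):
            raise ValueError(f"funder_id is not a funder DOI: {self.funder_id!r}")
        if not self.grant_id.strip():
            raise ValueError("grant_id must not be empty")
        # Illustrative policy: flag the common mix-up of proposal number and grant ID.
        if self.proposal_number is not None and self.proposal_number == self.grant_id:
            raise ValueError("proposal_number and grant_id must be distinct values")
```

With a record like this, a funder typed as a name rather than selected as an identifier is rejected at entry time instead of surfacing later as an unmatched grant.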

Challenges

Metadata Gaps

At some points the data set may be cut down deliberately for confidentiality purposes, and metadata may also be lost unintentionally during the review and funding management process.

Service providers

Applicable standards/formats:
JATS, JSON, etc.

Reference manager and collaboration software:
Mendeley, F1000Workspace, Zotero, EndNote, CiteIt

Electronic lab notebooks:
Hivebench, OneNote

Collaboration networks:
ResearchGate, Academia.edu

Discovery tools:
Google Scholar/Google, ProQuest Summon, OCLC WorldCat, Dimensions, Symplectic Discovery, VIVO

Pre-print servers:
arXiv, bioRxiv, Authorea, AfricArxiv

Open Access/Institutional repositories:
Figshare, Dryad, DIGITAL.CSIC, e-IEO, Zenodo, DataCite, CLOCKSS, protocols.io

Challenges

Researcher Inequities & Research Barriers

Valid research coming from under-represented researchers is hard to find due to lack of metadata, including DOIs.

Search and discovery are difficult due to inconsistency in identifying the user and enabling appropriate access to research.

Authors from under-represented areas may not have equitable access to search and discovery services or equitable opportunities for publication.

Challenges

No Unique ID for Conferences

Though emerging PIDs will include output from conferences, no unique conference identifier exists, making tracking progress hard for funders and institutions.

Challenges

Difficulty in Organizing References

Integrating with different citation tools using different PIDs makes managing references hard, and sometimes URLs are saved in place of DOIs.
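
When a URL has been saved in place of a DOI, the DOI can often still be recovered from it. The sketch below is one way to do that, assuming the commonly used pattern of a `10.` prefix with a 4 to 9 digit registrant code followed by a slash and a suffix. The function name is illustrative, and `10.1000/xyz123` in the usage note is a placeholder rather than a real article.

```python
import re
from typing import Optional

# Matches the DOI itself wherever it appears in a URL, "doi:" string, or citation.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_doi(reference: str) -> Optional[str]:
    """Return the bare, lowercased DOI found in reference, or None if there is none."""
    match = DOI_PATTERN.search(reference)
    if match is None:
        return None
    # Drop punctuation that often trails a DOI in running text.
    # DOIs are case-insensitive, so lowercasing gives one canonical form.
    return match.group(0).rstrip(".,;").lower()
```

For example, `extract_doi("https://doi.org/10.1000/XYZ123")` and `extract_doi("doi:10.1000/xyz123.")` both return `"10.1000/xyz123"`, while a plain landing-page URL with no DOI in it returns `None` and can be flagged for manual lookup.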

Challenges

Poor Connections Across Research Outputs

Free-text fields prevail over PID pick-lists, and PIDs are applied inconsistently across research outputs, e.g., data sets, equipment, setting(s), samples, software.

Challenges

Marketing Misses the Mark

Researchers receive solicitations from publishers, but some are irrelevant. Better metadata associated with the researcher, their co-authors, the topic, and the funding source(s) can help target the right offer to the right researcher, saving publishers time and effort and strengthening relationships with authors.

Challenges

Risk of OA non-compliance

Metadata lost upstream makes managing funding compliance onerous.

Impact

Global inequities hinder scientific progress.

Impact

Researchers cannot easily find, verify, and reuse the data and artifacts underlying research, which makes it difficult to accurately interpret, cite, and reproduce research findings.

Impact

Lack of available information about both corresponding author and all co-authors leads to manual input to identify funder and institutional mandates at best and missed funding requirements at worst.

Service providers

Submission/Peer Review/Production:
ScholarOne, Editorial Manager/ProduXion Manager, Kriyadocs, River Valley, eJournalPress, BenchPress

Contributor roles:
CRediT

DOI registration:
Crossref, DataCite

Peer review tools:
PLOS Peer Review Center, StatReviewer, Publons

Plagiarism technology:
iThenticate, Turnitin

OA funding management:
CCC RightsLink, OA Switchboard, ChronosHub, Oable, SciPris

Digital content development & hosting:
HighWire, Silverchair, Ingenta, Research4Life, Atypon Literatum

Open Access Publishing:
F1000Research, eLife

PIDs:
DOI, ORCID, Ringgold, ISNI, ROR

Challenges

Over-reliance on Authors

Authors often don’t know their affiliation ID, or they enter a secondary affiliation instead of their primary one; heavy reliance on manual data entry is error-prone.

Challenges

Missed Funding Opportunities

Under-utilization of metadata validation services.

If the researcher has submitted before, outdated information from their existing profile can be pulled into the submission.

Inconsistency between journal policies and metadata procedures.

Lack of funding information captured at submission and validated at acceptance.

Demand for increased interoperability between PIDs.

Impact

Without granular, accurate organizational affiliation identifiers for a manuscript, coupled with incomplete funding details, authors may miss the opportunity to get OA funding upon acceptance or miss the chance to opt into OA due to affordability concerns. OA initiatives/deals driven by institutions and funders may lack uptake as a result. Publishers are also unable to automate processes that reduce the cost of business model transformation. Manual effort is required to retrospectively cover the publication with proper funding sources, driving up the cost of publishing. No one benefits in this scenario.

Challenges

Difficulty Flagging Conflicts of Interest

When affiliation information and other metadata are not consistently captured or validated within the submission system, it is difficult to identify conflicts of interest, monitor compliance with sanctions, etc.

Challenges

Inconsistent & Incomplete Metadata

There is an accepted metadata standard for transferring manuscripts, but the quality of data is inconsistent and elements are dropped.

Challenges

Over-Reliance on Authors

Publishers and service providers are limited in providing systematic support to authors to comply with various funding mandates (e.g., attribution, license type) when funding IDs are missing from the publication workflow.

Impact on License Type

Affiliations are not static and can change at different points in the process, e.g., the affiliation at the time the research was conducted versus the researcher’s current affiliation. This can affect whether the author retains publication rights and which license governs reuse.
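
One way to make this time dependence explicit is to store affiliations as dated periods and look up the one in force on a given date, such as the date the research was conducted or the date of acceptance. The sketch below is a minimal illustration; the class, the function, and the `org:A` / `org:B` identifiers in the usage note are hypothetical placeholders rather than real PIDs.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class AffiliationPeriod:
    org_id: str                 # organizational identifier (placeholder values here)
    start: date                 # first day of the affiliation
    end: Optional[date] = None  # last day, or None if the affiliation is ongoing


def affiliation_on(history: List[AffiliationPeriod], when: date) -> Optional[str]:
    """Return the org_id of the first period covering `when`, or None if none does."""
    for period in history:
        if period.start <= when and (period.end is None or when <= period.end):
            return period.org_id
    return None
```

Given a history of `AffiliationPeriod("org:A", date(2018, 1, 1), date(2021, 6, 30))` followed by `AffiliationPeriod("org:B", date(2021, 7, 1))`, a lookup for 15 March 2020 returns `"org:A"` and a lookup for 1 January 2022 returns `"org:B"`, so the two questions (who employed the author during the research, and who employs them now) get separate answers.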

Impact

Publishers and institutions take on the time and expense of manually finding the papers that should have matched to an agreement and collaborating on a resolution.

Challenges

Missed Funding Opportunities & Costly Billing Complications

If institution affiliation manually input by the author does not use a standardized name or PID (e.g., abbreviations, nicknames), this can interfere with matching to the correct OA funding source.

Using email address for affiliation identification can impact funding entitlements, especially if the email account is old, the researcher has multiple affiliations, or a personal account is used.

Funder and grant IDs are frequently missing from metadata, impacting funding entitlements.
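
The name-matching problem described above can be illustrated with a small lookup that normalizes author-entered names before comparing them against known variants. This is only a sketch: the alias table, the institution names, and the `org:0001`-style identifiers are invented for the example, and a production system would more likely resolve names against an organizational registry than a hand-built dictionary.

```python
import re
from typing import Optional

# Hypothetical alias table: normalized name variants mapped to one organization ID.
ALIASES = {
    "example state university": "org:0001",
    "example state univ": "org:0001",
    "esu": "org:0001",
    "institute of sample studies": "org:0002",
}


def normalize_name(raw: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    lowered = raw.lower()
    no_punct = re.sub(r"[^\w\s]", "", lowered)
    return re.sub(r"\s+", " ", no_punct).strip()


def match_affiliation(raw: str) -> Optional[str]:
    """Return the organization ID for a known name variant, or None if unmatched."""
    return ALIASES.get(normalize_name(raw))
```

Here `match_affiliation("  Example State Univ. ")` and `match_affiliation("E.S.U.")` both resolve to `"org:0001"`, while an unrecognized name returns `None` and can be routed to manual review instead of silently missing an OA agreement.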

Impact

Poor affiliation disambiguation causes authors and institutions to pay one-off APCs that are otherwise eligible for pre-paid deals or discounts, adding unnecessary overhead to billing and reconciliation.

Challenges

Complications in Honoring OA Deals

Complex institutional deals (e.g., carveouts) require granular metadata to accurately determine affiliation information for funding eligibility.

Challenges

Inconsistent Metadata Capture

Publishers may lack the awareness or resources to upgrade their JATS XML, so data elements are often dropped at the journal level during production due to incompatibility.

Grant IDs can change between submission and publication. If an extension is granted during the peer-review process, authors don’t always remember to update this information for both articles and data sets.

Limited adoption of standardized protocols and metadata during the production process results in inconsistent metadata and manual data enrichment.

Challenges

Unnecessary Manual Intervention

Publishers are sometimes manually entering PIDs prior to registering DOIs for a more complete publication record.

Impact

Funder/grant affiliation is essential to later editorial and production workflows to support compliance, and when missing, puts an administrative burden on the author that diverts attention away from core research.

Impact

This is a laborious practice with high economic and opportunity costs that could be reduced with earlier, automated PID assertion and/or validation.

Service providers

Open access / institutional repositories:
Figshare, Dryad, DIGITAL.CSIC, e-IEO, Zenodo, DataCite, CLOCKSS, protocols.io

Challenges

High Opportunity / Operational Costs

Institutions are spending time manually curating data for archiving and reporting.

Challenges

Research Data Sharing Presents Barriers to Open Science

Many datasets don’t have DOIs, making them difficult to find, access, and reproduce.

Service providers

Article level metrics:
Altmetrics, Panorama

Licensing services:
Creative Commons, CCC MarketPlace, CCC RightsLink, Publishers' Licensing Services

Challenges

Problematic Research Impact Measurement

Difficult to track research/researcher impact due to lack of adoption of metadata standards.

Impact

Researcher rewards and recognition decisions, or future opportunities for funding, may be based on incomplete or inaccurate data, affecting reputation and career advancement.

Challenges

Problematic Deal Modeling

Lack of consistent affiliation and funding data makes modeling, implementing, and tracking future agreements hard for institutions and publishers.

Data is not standardized across publisher platforms, creating unnecessary manual work to gather and normalize data for analysis.

Impact

The transition to modern models of OA publication is onerous and error-prone, prolonging a mixed-model landscape and delaying the availability of open outputs that advance science.

Challenges

Problematic Research Impact Measurement

Difficult to track funder impact due to lack of adoption of metadata standards.

Impact

Incomplete data makes it challenging to inform future funding investments and to accurately report activities to the public.

Challenges

Problematic Research Impact Measurement

Difficult to track research impact due to lack of adoption of metadata standards.

Challenges

Problematic Deal Modeling

Lack of consistent affiliation and funding data makes modeling future agreements difficult for publishers and institutions.

Impact

The transition to OA is delayed, putting some publishers at risk of losing authors to funding mandates and losing revenue that is necessary to sustain operations.