Our Database – FAQ

Where do we get our articles from? Why can't I find a paper? Why are references sometimes missing? Answers to common questions.

Written by Digl Dixon
Updated over a week ago

Where do our articles come from?

We operate using "open access metadata"

  • "Metadata" is everything about an article that isn't its contents:

    • Title

    • Publication date

    • Abstract

    • References

  • "Open Access" simply means that this metadata has been made publicly accessible by the publisher, normally through a data repository like Semantic Scholar or Crossref.

This means that we don't just index Open Access articles – there are lots of closed-access articles which have open access metadata.

Litmaps Coverage of the Scholarly Literature

We ingest data from multiple third-party data providers, listed below:

While we have a substantial dataset (270 million articles and climbing!), it's important to note that there will still be absences from our catalogue: some articles simply don't have their metadata open for access by tools like ours. Even if such an article shows up in a Google search, we may not have official access to its metadata.

Do you modify the data in any way?

No. If you notice our article information is different to what you're used to, this may be because of versioning (see below).

Why is my article missing references/citations?

Sometimes articles in Litmaps display a lower number of references and/or citations than an article is expected to have. This can be for several reasons:

  1. The first, and most probable, reason is that those references/citations don't have open access metadata. We're legally only able to index open access metadata, so we can't represent any citations/references that don't have this supplied. We're actively working on ways to better represent un-indexed references and citations within Litmaps.

  2. Some newer articles may have a delay in their citations and references appearing.

  3. Citations of older articles by very recent articles may similarly have a delay before appearing.

  4. We may be displaying a version of the article which has poor citation information: see How do you handle article versioning, below. While we try our best to select the highest-quality data source, sometimes errors can slip in.
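The first reason above can be pictured with a small sketch. It is only an illustration: the function name, the data shapes, and the DOIs are made up for this example and say nothing about how Litmaps stores or counts references internally.

```python
# Hypothetical illustration: these names are not Litmaps' real data model.
def visible_references(cited_dois, indexed_dois):
    """Return only the cited works that are present in the index."""
    return [doi for doi in cited_dois if doi in indexed_dois]


# An article whose reference list cites four works (made-up DOIs)...
cited = ["10.1000/a", "10.1000/b", "10.1000/c", "10.1000/d"]
# ...of which only three have open access metadata in the index.
indexed = {"10.1000/a", "10.1000/b", "10.1000/d"}

shown = visible_references(cited, indexed)
print(f"{len(shown)} of {len(cited)} references displayed")
# prints "3 of 4 references displayed"
```

The reference to "10.1000/c" still exists in the published article; it simply cannot be shown because no open record for it is available.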

Why can I see an article listed twice?

The world of academic databases is messy... very messy! Often articles are listed with incorrect identifiers, or identical papers can be listed with different identifiers.

We run a deduplication process when ingesting articles from our providers. This does a very good job of identifying duplicate records. However, especially when combined with versioning complications (see below), there can be small windows where "duplicate" articles slip into our database.

If you find duplicate articles, please leave us a Support message mentioning the articles in question. It's useful for us to be able to investigate the cases where duplicate articles appear, and we may be able to refine our deduplication process with your help.
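To give a feel for why duplicates are hard to catch, here is a minimal sketch of one common deduplication idea: comparing records by a normalized title and year instead of by identifier. This is not a description of our actual process; the `fingerprint` and `deduplicate` names, the record fields, and the identifiers are all assumptions invented for the example.

```python
import re


def fingerprint(record):
    """Build a comparison key from title and year, ignoring case and punctuation."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return (title, record["year"])


def deduplicate(records):
    """Keep the first record seen for each fingerprint."""
    unique = {}
    for record in records:
        unique.setdefault(fingerprint(record), record)
    return list(unique.values())


# Made-up records: the first two are the same paper under different identifiers.
records = [
    {"id": "doi:10.1000/xyz", "title": "Mapping the Literature: A Survey", "year": 2021},
    {"id": "arxiv:0000.00000", "title": "Mapping the literature - a survey", "year": 2021},
    {"id": "doi:10.1000/abc", "title": "An Unrelated Paper", "year": 2019},
]

print([r["id"] for r in deduplicate(records)])
# prints ['doi:10.1000/xyz', 'doi:10.1000/abc']
```

A key like this also shows where such a scheme can fail: a pre-print whose title or year differs from the published version would produce a different fingerprint and slip through as a separate record.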

How do you handle article versioning?

This is a complex issue. For those unfamiliar with the publication process: articles are often released in several versions, starting with a pre-print before release, followed by the formal publication, and often followed by ongoing revisions.

When multiple versions of an article are found, we've decided to represent only the most recent one. We'll also make sure to choose a version that has a minimum standard of metadata: some newer versions of articles get entered with missing fields, in which case we'd rather represent an older, higher-quality version.
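The selection rule described above could be sketched roughly as follows. Treat it as an illustration under stated assumptions: the `Version` class, the choice of required fields, and the fallback when no version meets the minimum are all hypothetical, since the exact standard we apply is not spelled out here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed minimum standard for this sketch; the real criteria may differ.
REQUIRED_FIELDS = ("title", "abstract", "references")


@dataclass
class Version:
    label: str
    published: date
    title: Optional[str] = None
    abstract: Optional[str] = None
    references: Optional[list] = None


def meets_minimum(version):
    """True when every required field is present and non-empty."""
    return all(getattr(version, name) for name in REQUIRED_FIELDS)


def choose_version(versions):
    """Pick the newest version that meets the minimum standard.

    Falling back to the newest version overall when none qualifies
    is an assumption made for this example.
    """
    qualified = [v for v in versions if meets_minimum(v)]
    return max(qualified or versions, key=lambda v: v.published)


versions = [
    Version("pre-print", date(2022, 1, 10), "A Title", "An abstract.", ["ref-1"]),
    Version("published", date(2022, 6, 1), "A Title", "An abstract.", ["ref-1", "ref-2"]),
    Version("revision", date(2023, 2, 15), "A Title", None, ["ref-1", "ref-2"]),
]

print(choose_version(versions).label)
# prints "published"
```

Here the newest record (the revision) is passed over because its abstract is missing, so the older but complete published version is the one represented.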

We would like to be able to represent all versions of an article within the Litmaps UI at some point in the future.

How can I check if you have my article?

The best way to find an article in Litmaps is currently to search using its DOI.

Otherwise, you can manually check within our data providers.
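DOIs are often copied in different forms (as a full URL, with a `doi:` prefix, or in mixed case), so it can help to tidy one up before searching. The sketch below normalizes a DOI and builds a lookup URL for Crossref, mentioned earlier as an example repository. The helper names are made up for this example, the DOI is a placeholder, and the Crossref URL shape (`api.crossref.org/works/<DOI>`) is an assumption to verify against Crossref's own documentation.

```python
from urllib.parse import quote

# Common ways a DOI is written before the identifier itself.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw):
    """Strip a leading resolver URL or 'doi:' prefix and lowercase the result.

    DOIs are case-insensitive, so lowercasing does not change their meaning.
    """
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.strip().lower()


def crossref_lookup_url(raw):
    """Build a URL for checking a DOI against Crossref (assumed URL shape)."""
    return "https://api.crossref.org/works/" + quote(normalize_doi(raw), safe="/")


# Placeholder DOI, not a real article.
print(normalize_doi("https://doi.org/10.1000/XYZ123"))
# prints "10.1000/xyz123"
print(crossref_lookup_url("doi:10.1000/xyz123"))
# prints "https://api.crossref.org/works/10.1000/xyz123"
```

Opening the resulting URL in a browser should show whether that provider has a record for the DOI; a missing record there is one likely reason an article does not appear in Litmaps.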

Note that some of these providers are themselves collations of other databases! Commonly requested (and supported) datasets include:
