PMC Open Access Subset
The PMC Open Access Subset includes millions of journal articles and preprints that are made available under license terms that allow reuse. Not all articles in PMC are available for text mining or other reuse; many are under copyright. Articles in the PMC Open Access Subset are made available under Creative Commons or similar licenses that allow more liberal redistribution and reuse than a traditionally copyrighted work. The PMC Open Access Subset is one part of the PMC Article Datasets.
- Not all articles in PMC are available for text mining and other reuse.
- The PMC Cloud Service, PMC OAI-PMH Service, PMC FTP Service, E-Utilities and BioC API are the only services that may be used for automated retrieval of PMC content. Systematic retrieval (or bulk retrieval) of articles through any other automated process is prohibited.
- Users of this dataset are directly and solely responsible for compliance with copyright restrictions and are expected to adhere to the terms and conditions defined by the copyright holder (see the PMC Copyright Notice).
Files for the PMC Open Access Subset are available for automated retrieval in several types of packages:
- Individual article packages on the PMC FTP Service include the full text and metadata in XML, the article PDF (if available), and the media files and supplementary materials for the article.
- Bulk packages on the PMC FTP Service include XML or plain text files for hundreds of thousands of articles per package.
- Individual XML or plain text files are available for retrieval in a number of ways, including the PMC Cloud Service, the PMC FTP Service, the PMC OAI-PMH Service, E-Utilities and the BioC API Service.
Details about the files and directory structure are available on the FTP Service page and the Cloud Service page.
Find all Open Access Subset articles in:
- PMC with this search filter: open access[filter]
- PubMed with this search filter: pubmed pmc open access[filter]
Learn about additional search filters that restrict results to certain license types.
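The search filters above can also be used programmatically through E-Utilities, one of the services permitted for automated retrieval. The following is a minimal sketch that only builds an ESearch request URL and does not send it. The function name `build_oa_search_url`, the `extra_terms` argument, and the default `retmax` value are illustrative assumptions, not part of this page.

```python
from urllib.parse import urlencode

# E-Utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Search filters quoted in the text above, keyed by Entrez database name.
OA_FILTERS = {
    "pmc": "open access[filter]",
    "pubmed": "pubmed pmc open access[filter]",
}


def build_oa_search_url(db: str, extra_terms: str = "", retmax: int = 20) -> str:
    """Return an ESearch URL restricted to Open Access Subset articles.

    `extra_terms` (hypothetical helper argument) is an optional query that is
    combined with the open access filter using AND.
    """
    if db not in OA_FILTERS:
        raise ValueError(f"unsupported database: {db!r}")
    term = OA_FILTERS[db]
    if extra_terms:
        term = f"({extra_terms}) AND {term}"
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-encodes the brackets and spaces in the filter.
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_oa_search_url("pmc", "zika virus"))
```

Keeping the filter strings in one dictionary makes it harder to pair the PubMed filter with the PMC database by mistake. Any script that actually sends these requests should follow the E-Utilities usage guidelines.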
The PMC Open Access Subset articles and related metadata are available for retrieval via:
- Cloud Service
- FTP Service
- PMC OAI-PMH Service
- PMC OA Web Service API
- E-Utilities
- BioC API Service
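As one illustration of the services listed above, the sketch below builds a lookup URL for the PMC OA Web Service API without sending a request. The endpoint URL and the `id` and `format` parameter names are assumptions based on how the service has commonly been documented; they are not stated on this page and should be checked against the service's own documentation. The function name `build_oa_lookup_url` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint for the PMC OA Web Service API (verify before use).
OA_SERVICE_URL = "https://www.ncbi.nlm.nih.gov/pmc/utils/oa/oa.fcgi"


def build_oa_lookup_url(pmcid: str, file_format: Optional[str] = None) -> str:
    """Return a lookup URL for a single article, identified by its PMCID.

    `file_format` is optional; the values the service accepts are not
    listed on this page, so the value is passed through unchanged.
    """
    # A PMCID is the prefix "PMC" followed by digits, e.g. "PMC1234567".
    if not (pmcid.startswith("PMC") and pmcid[3:].isdigit()):
        raise ValueError(f"not a PMCID: {pmcid!r}")
    params = {"id": pmcid}
    if file_format:
        params["format"] = file_format
    return f"{OA_SERVICE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Example PMCID chosen for illustration only.
    print(build_oa_lookup_url("PMC1234567"))
```

Validating the identifier before building the URL avoids sending malformed requests to the service.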
Articles in the PMC Open Access Subset are divided into three groups based on license terms:
- Commercial Use Allowed - CC0, CC BY, CC BY-SA, CC BY-ND licenses
- Non-Commercial Use Only - CC BY-NC, CC BY-NC-SA, CC BY-NC-ND licenses
- Other - no machine-readable Creative Commons license, no license, or a custom license. NOTE: Distribution of articles in this group is limited on the Cloud Service. See the section on Public Health Emergency COVID-19 Initiative Articles for more information.
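The three license groups listed above can be expressed as a small lookup, which may help when sorting article metadata by permitted use. This is a minimal sketch: the function name `license_group` and the whitespace and case normalization are illustrative assumptions, and real license strings in PMC metadata may be formatted differently.

```python
from typing import Optional

# License codes for each group, taken from the list above.
COMMERCIAL = {"CC0", "CC BY", "CC BY-SA", "CC BY-ND"}
NON_COMMERCIAL = {"CC BY-NC", "CC BY-NC-SA", "CC BY-NC-ND"}


def license_group(license_code: Optional[str]) -> str:
    """Map a license code to one of the three Open Access Subset groups."""
    if not license_code:
        # No license at all falls into the "Other" group.
        return "Other"
    # Normalize case and collapse repeated whitespace (an assumption about
    # how input might vary, not a documented PMC format).
    normalized = " ".join(license_code.upper().split())
    if normalized in COMMERCIAL:
        return "Commercial Use Allowed"
    if normalized in NON_COMMERCIAL:
        return "Non-Commercial Use Only"
    # Custom or unrecognized licenses also fall into "Other".
    return "Other"


if __name__ == "__main__":
    for code in ("CC BY", "cc by-nc-nd", None, "Custom publisher license"):
        print(code, "->", license_group(code))
```

Anything that is not an exact match for a listed Creative Commons code is treated as "Other", which mirrors the conservative grouping described above.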
Public Health Emergency COVID-19 Initiative Articles
- On the FTP Service, articles collected under the Public Health Emergency (PHE) COVID-19 Initiative with timebound license statements, which allow secondary analysis and reuse for the duration of the global pandemic, are included in "Other".
- On the Cloud Service, distribution of articles in the "Other" group is limited to articles with a PHE COVID-19 Initiative timebound license statement; these are available in the phe_timebound directory. Other articles in this group that lack standardized licenses are NOT available.
To retrieve the complete PMC Open Access Subset, you must retrieve packages from all of these groupings.
How to Cite
- PMC Open Access Subset [Internet]. Bethesda (MD): National Library of Medicine. 2003 - [cited YEAR MONTH DAY]. Available from https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/.