Digital Data Management, Curation, and Archiving

Agency for Healthcare Research and Quality (AHRQ)

On February 9, 2015, the AHRQ released its plan for establishing a public access policy in compliance with the OSTP Public Access memo. The policy covers AHRQ-funded scientific publications and digital research data. Some specific objectives of the policy include:

  • Facilitation of easy public search of, analysis of, and access to peer-reviewed scholarly publications.
  • Full, free public access to publication metadata.
  • Preservation and archiving of publications and metadata.
  • Integration of multiple data sources into new data sets.

Similar to NASA, the AHRQ will require authors to submit peer-reviewed publications to PubMed Central, and it expects to implement its plan by October 2015. The AHRQ anticipates contracting with a commercial repository to manage and publish datasets. The commercial repository will make published data available to the public free of charge.

ASPR

The Office of the Assistant Secretary for Preparedness and Response (ASPR) Public Access Plan notes that the Office is in the process of establishing a grant portfolio and that research it funds wholly or in part will be subject to the requirements of the February 2013 OSTP memo. As such:

  • Peer reviewed publications will be made accessible via PubMed Central
  • Researchers receiving ASPR grants will be required to develop and adhere to data management plans which specify the strategies to be used for data sharing and preservation
  • Data underlying published research findings will be made freely available in public repositories at the time of publication

Centers for Disease Control and Prevention

Per the CDC Public Access Plan, also released in February 2015, access to peer reviewed publications has previously been available via the CDC Stacks digital repository but will be supplemented through dual hosting of publications in PubMed Central. Additionally, funded researchers are expected to make supporting data freely available at the time of an article's publication and to include plans for sharing data in submitted data management plans. Other details include:

  • Development of a data management plan requirement for submitted proposals
  • Creation of a DMP catalog and registry
  • Elaboration of specific guidance regarding what types of data to share

Department of Defense Public Access Policy

In a July 2014 Memorandum, the Department of Defense directed associated Components to make publicly available all "unclassified, unlimited peer-reviewed scholarly publications and digitally formatted scientific data arising from research and programs funded wholly or in part by DoD." In compliance with the Office of Science and Technology Policy memorandum of February 22, 2013, the DoD Public Access Policy requires that:

  • Authors will make their peer-reviewed journal manuscripts freely available to the public within 12 months of publication.
  • Investigators will ensure public access to unclassified, publicly releasable data, samples, and other materials within a reasonable time and at no more than incremental cost.
  • Digital data will be publicly accessible to search, retrieve, and analyze.
  • DoD-funded research will be governed by a Data Management Plan.

From the Department of Defense Memorandum, July 09, 2014.

Department of Energy

From the Plan:

The Office of Science affirms that the following principles related to the management of digital research data directly support fulfillment of its mission.

  • Effective data management has the potential to increase the pace of scientific discovery and promote more efficient and effective use of government funding and resources. Data management planning should be an integral part of research planning.
  • Sharing and preserving data are central to protecting the integrity of science by facilitating validation of results and to advancing science by broadening the value of research data to disciplines other than the originating one and to society at large. To the greatest extent and with the fewest constraints possible, and consistent with the requirements and other principles of this Statement, data sharing should make digital research data available to and useful for the scientific community, industry, and the public. 
  • Not all data need to be shared or preserved. The costs and benefits of doing so should be considered in data management planning.

Department of Transportation

The DOT Public Access Plan describes requirements for three types of research output: publications, data, and research project records. Key points include:

  • All publications meeting scope criteria must be submitted to the National Transportation Library.
  • A maximum 12 month embargo for publications.
  • All DOT-funded research proposals must include a data management plan describing preservation and storage location information.
  • Digital data not subject to certain exclusion criteria must be stored and publicly accessible for search, retrieval, and analysis.
  • Digital data sets subject to the plan will be inventoried in the DOT Public Data Listing.

Food and Drug Administration

The FDA released its Plan to Increase Access to Results of FDA-Funded Scientific Research in February, 2015. Similar to other agencies with recent announcements, the Plan applies to published articles as well as data, with an FDA specific implementation of NIH's PubMed Central to serve as the publication repository. With regard to data and to the extent possible, the FDA will:

  • Create or modify policies to require that data management plans be developed and adhered to for sponsored research
  • Promote best practices in data management and preservation
  • Support researchers in depositing data in appropriate, publicly accessible repositories

In addition to the Public Access Plan, the FDA has also launched its OpenFDA initiative and a corresponding API to facilitate discovery of and access to FDA funded datasets.

National Aeronautics and Space Administration

On February 11, 2015, NASA released its plan for meeting the requirements of the OSTP Public Access memorandum. Noting its longstanding practice of promoting full and open sharing of data, the NASA plan will further extend open access to all NASA sponsored scientific research, including scholarly publications and digital scientific data.

As of August, 2016, implemented aspects of the plan include:

  • An online research access point has been created for NASA-Funded Research Results.
  • The PMC based PubSpace portal has been established. Note that NASA-funded authors are required to deposit copies of peer-reviewed scientific publications and associated data into PubSpace:
    • NASA is using PMC to permanently preserve and provide easy public access to the peer-reviewed papers resulting from NASA-funded research. Beginning with research funded in 2016, all NASA-funded authors and co-authors (both civil servant and non-civil servant) will be required to deposit copies of their peer-reviewed scientific publications and associated data into NASA’s publication repository called NASA PubSpace.  This EXCLUDES patents, publications that contain material governed by personal privacy, export control, proprietary restrictions, or national security law or regulations. NASA PubSpace is part of PubMed Central (PMC) which is managed by the NIH. (http://www.nasa.gov/open/researchaccess/pubspace)
  • The NASA Data Portal now serves as a catalog of datasets generated by NASA-funded research. Primarily a metadata catalog, the portal is not a data repository although the referenced datasets are available to the public free of charge.

National Institutes of Health

In February, 2015, the NIH released its Plan for Increasing Access to Scientific Publications and Digital Scientific Data from NIH Funded Scientific Research. Whereas public access to scholarly publications resulting from NIH funded research is already provided through PubMed Central per the Institutes' 2008 Public Access Policy, the updated plan provides the following elaboration with regard to digital scientific data:

  • Further development of policies to make data underlying publications freely available in public repositories at the time of initial publication.
  • Development of a data management plan requirement for all grants, cooperative agreements, contracts, or intramural funds.
  • Increased public access to data through the use of established public repositories and the promotion of standards supporting discoverability and interoperability.

National Institute of Standards and Technology

Released in April 2015, the NIST Public Access Plan affirms NIST's intention to

make freely available to the public, in publicly accessible repositories, all peer-reviewed scholarly publications and associated data arising from unclassified research and programs funded wholly or in part by NIST. Subject to the same conditions and constraints listed above, NIST will also promote the deposit of scientific data arising from unclassified research and programs, funded wholly or in part by NIST, to make it available free of charge unless otherwise excepted, in publicly accessible databases.

Aspects of the plan include:

  • All proposals that will generate scientific data will be required to adhere to a Data Management Plan describing how data will be managed and shared.
  • Public discovery and download of peer-reviewed publications and associated data free of charge no later than 12 months following publication.
  • Partnership with NIH to use a branded instance of PubMed Central as the repository platform for sharing peer reviewed publications.
  • Development of an Enterprise Data Inventory and Common Access Platform to facilitate data sharing.

NIST Public Access Plan

National Oceanic and Atmospheric Administration (NOAA)

NOAA released its Plan for Increasing Public Access to Research Results in February, 2015. Significant points include:

  • Intramural and extramural researchers will be required to submit final, pre-publication manuscripts to the NOAA Institutional Repository. The repository will be developed using a technology stack similar to PubMed Central. A 12 month embargo will be allowed, and researchers may petition for a longer embargo.
  • Broadening of the DMP requirement to include any NOAA program that produces environmental data.
  • Creation of a data inventory/catalog and minimum requirement to use the Federal government's common core metadata schema.
  • Development of processes and requirements for linking publications and data.

More info is available from the links below.

National Science Foundation

On March 18th, 2015, the NSF released its Public Access Plan, per the requirements of the OSTP Public Access memorandum. Effective January 2016, the plan applies to peer-reviewed publications and data resulting from NSF-funded research, and requires research products to be available for free download, reading, and access after no more than a 12 month embargo. Additionally, research products will be managed to "ensure long term preservation" and provided with unique, persistent identifiers. Additional details from the NSF Public Access to Results of NSF-Funded Research stipulate that the version of record or final accepted manuscript of a publication must:

  • Be deposited in a public access compliant repository designated by NSF (see https://par.nsf.gov/);
  • Be available for download, reading and analysis free of charge no later than 12 months after initial publication;
  • Possess a minimum set of machine-readable metadata elements in a metadata record to be made available free of charge upon initial publication;
  • Be managed to ensure long-term preservation; and
  • Be reported in annual and final reports during the period of the award with a persistent identifier that provides links to the full text of the publication as well as other metadata elements.

NSF Public Access Plan: http://www.nsf.gov/pubs/2015/nsf15052/nsf15052.pdf

US Department of Agriculture Open Government Resources

On November 7th, 2014, the USDA released its Implementation Plan to Increase Public Access to Results of USDA-funded Scientific Research. Consistent with the policies and requirements of the OSTP Public Access memo of February 2013, the plan describes how policies, processes, systems, and outreach will be developed to increase public access to scholarly publications and digital scientific data.

The USDA has additionally updated its Open Government Plan and the supporting Open Government Initiative website.

US Geological Survey

From the USGS Public Access Plan website: https://www2.usgs.gov/quality_integrity/open_access/

Beginning October 1, 2016, the USGS will require journal article manuscripts of record (also referred to as final accepted manuscripts) which are considered as outside publications and USGS series publication information products be handled as follows:

  • For journal publications all accepted manuscripts must be deposited in the internal USGS Information Product Data System (IPDS) repository which serves as a dark archive.
  • Journal articles are released from the outside publisher or the USGS IPDS dark archive and made available free-of-charge from the USGS Publications Warehouse no later than 12 months after publication.
  • The USGS will provide a mechanism, via the USGS Publications Warehouse, for accepting petitions for changes to the 12-month period and a decision for a reduced embargo period for a specific journal publication will be made based on need.
  • Maintenance of the scholarly publication in a manner that ensures long-term preservation.
  • A persistent identifier that provides a link to the full text of the scholarly publication as well as metadata elements, and this link will be reported in annual and final reports during the period of the award for researchers extramural to the USGS.

[...] Beginning Oct. 1, 2016, the USGS will require digital research data collected with USGS funds meet the following requirements:

  • Scientific data that are used to support the conclusions in scholarly publications will be made available free-of-charge for public access simultaneously with or prior to the release of an associated scholarly publication, unless the agency determines that a demonstrated circumstance restricts the data from being made publicly available, for example in cases where access must be restricted because of security, privacy, confidentiality, or other constraints.
  • Final project scientific data approved for release are made available free-of-charge at the end of the project, unless the agency determines that a demonstrated circumstance restricts the data from being made publicly available, for example in cases where access must be restricted because of security, privacy, confidentiality, or other constraints.
  • Scientific data follow the requirements of the data management plan (DMP) which is part of the approved project plan. The DMP details information such as acquisition method, quality assurance, security, disposition, and if applicable, circumstance restricting public access (IM OSQI 2015-01).
  • Metadata (using USGS endorsed metadata standards) must accompany the scientific data and also this metadata record must be submitted to the USGS Science Data Catalog.

The selected points are not comprehensive. For more information, please contact RDS or see the USGS website or public access plan linked below.

Background on Public Access Plans

On February 22, 2013, the White House Office of Science and Technology Policy issued the memorandum, Increasing Access to the Results of Federally Funded Scientific Research. The memo requires Federal agencies with annual research and development budgets in excess of $100 million to develop plans supporting increased public access to the results of Federally funded research, including peer reviewed publications and data. The policies referenced here represent the formal responses and plans from affected agencies.

The original memo and additional relevant policies are available via the link below.