Digital Data Management, Curation, and Archiving

NSF General Purpose DMP

Roles and Responsibilities

Explain how the responsibilities regarding the management of your data will be delegated. This should include time allocations, project management of technical aspects, training requirements, and contributions of non-project staff - individuals should be named where possible. Remember that those responsible for long-term decisions about your data will likely be the custodians of the repository/archive you choose to store your data. While the costs associated with your research (and the results of your research) must be specified in the Budget Justification portion of the proposal, you may want to reiterate who will be responsible for funding the management of your data. Consider these questions:

  • Outline the staff/organizational roles and responsibilities for implementing this data management plan.
  • Who will be responsible for data management and for monitoring the data management plan?
  • How will adherence to this data management plan be checked or demonstrated?
  • What process is in place for transferring responsibility for the data?
  • Who will have responsibility over time for decisions about the data once the original personnel are no longer available?

Expected Data

Give a short description of what 'data' will mean in your research - explain what the contents of each dataset will be, including size and amount if known. It would also help if you can identify your methods for collecting data. Consider these questions:

  • What data will be generated in the research?
  • What data types will you be creating or capturing?
  • How will you capture or create the data?
  • If you will be using existing data, state that fact and include where you got it. What is the relationship between the data you are collecting and the existing data?
  • What data will be preserved and shared?

Period of Data Retention

This section will allow you to account for any delay in the accessibility of your data after your research is done. Consider any reasons why you would not make the data immediately available - for instance, maybe you have political, commercial or patent concerns that will require you to postpone access to the data you produce. Consider these questions:

  • How long will the original data collector/creator/principal investigator retain the right to use the data before opening it up to wider use?
  • Explain details of any embargo periods for political/commercial/patent or publisher reasons.

Data Format and Dissemination

This portion of the DMP asks you to combine an explanation of the format of your data and how that format will allow for fast and easy access to the data. One of the main thrusts of the DMP requirement is the NSF's intention to encourage data sharing among researchers - consider this when answering the questions below. Think about how you can not only make your data available to researchers "on-demand," but also how you can more proactively make your data accessible without a specific request. In this section you are also asked to account for issues of privacy, confidentiality and ownership that may arise from the dissemination of your data. Think about what you have done to comply with your obligations in your Institutional Review Board Protocol. Consider these questions:

  • Which file formats will you use for your data, and why?
  • What transformations (to more shareable formats) will be necessary to prepare data for preservation / data sharing?
  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage).
  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?
  • What other types of information should be shared regarding the data, e.g. the way it was generated, analytical and procedural information?
  • What metadata/ documentation will be submitted alongside the data or created on deposit/ transformation in order to make the data reusable?
  • How and when will you make the data available? (Include resources needed to make the data available: equipment, systems, expertise, etc.)
  • What is the process for gaining access to the data?
  • Will any permission restrictions need to be placed on the data?
  • Are there ethical and privacy issues? If so, how will these be resolved?
  • What have you done to comply with your obligations in your IRB Protocol?
  • Who will hold the intellectual property rights to the data and how might this affect data access?
  • What and who are the intended or foreseeable uses/users of the data?

Data Storage and Preservation of Access

This portion of the Data Management Plan asks the researcher to provide a long-term strategy for archiving and preserving the data from the research described in the proposal. Consider these questions:

  • What is the long-term strategy for maintaining, curating and archiving the data?
  • Which archive/repository/database have you identified as a place to deposit data?
  • What procedures does your intended long-term data storage facility have in place for preservation and backup?
  • How long will/should data be kept beyond the life of the project?

UNM's Suggested Answer Text:

The data will be archived for a minimum of 10 years at the University of New Mexico (UNM) Libraries' LoboVault institutional and data repository. After this time, the data will be appraised per established collection and archival management policies for transfer to an external repository, long term archiving in LoboVault, or alternative disposition. LoboVault is an Open Archives Initiative (OAI) compliant repository built using the DSpace repository application, which enables Dublin Core metadata and data set objects to be shared and harvested by other archival systems through the OAI-PMH protocol.

LoboVault is a designated long term digital archives resource maintained by the University of New Mexico Libraries. In addition to the use of Dublin Core for descriptive metadata, the archive provides daily file integrity and format verification and will additionally create and maintain technical and administrative metadata using the widely adopted Metadata Encoding and Transmission Standard (METS) and Preservation Metadata Implementation Strategies (PREMIS) metadata standards. These additional metadata include digital file signatures and checksums for bitwise integrity validation and chain of custody documentation. Primary responsibility for curating and preparing the data for archiving rests on the Data Librarians at the University of New Mexico Libraries.
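The checksum-based integrity validation mentioned above can also be applied by researchers before deposit. The following is a minimal Python sketch, not a description of how LoboVault itself works: the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's path (relative to data_dir) to its checksum."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir, manifest):
    """List files that are missing or whose checksum no longer matches."""
    current = build_manifest(data_dir)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)
```

A manifest built when the data are finalized can be stored alongside the files; running changed_files later against the same directory reports any file that has been altered or lost since then.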

Additional Possible Data Management Requirements

You are asked to explain in this section how you plan on satisfying any additional, program-specific data management requirements, if any exist. If there are none, you may leave this section blank.

NSF Directorate Specific Guidelines

Biological Sciences Directorate (BIO)

The current version of DMP guidance for the Directorate for Biological Sciences is available at http://www.nsf.gov/bio/pubs/BIODMP_Guidance.pdf. The recommended organization is provided below, with guidance.

Sections in bold are from the NSF BIO DMP Guidance referenced above. Additional guidance and suggestions are drawn from the NSF BIO data management plan template in the online DMP Tool (https://dmptool.org/): https://dmptool.org/requirements_templates/201/basic.docx

Data and Other Materials Produced

Describe the types of data, physical samples or collections, software, curriculum materials, and other materials to be produced in the course of the project. (For collaborative proposals, the DMP must cover all the various data types being collected by each collaborator.)

Describe the data to be collected (actual observations) during your research, including amount (if known). Name the type of data, the instrument or collection approach, and how the data will be sampled. If actual data are interpreted, note the interpretation. Describe any quality control measures. Also describe the final derivative products (datasets and software or computer code) and the analysis used, including analytical software packages that are required for replication, etc. Describe both data (digital and analog) and physical materials (samples and collections) gathered or generated during the time of the award. Consider these questions:

  • What data will be generated in the research?
  • What data types will you be creating or capturing?
  • How will you capture or create the data? (This should cover content selection, instrumentation, technologies and approaches chosen, methods for naming, versioning, meeting user needs, etc, and should be sensitive to the location in which data capture is taking place.)
  • If you will be using existing data, state that fact and include where you got it. What is the relationship between the data you are collecting and the existing data?

Data Format and Metadata Standards

Describe the standards to be used for all the data types anticipated, including data or file format and metadata.

Describe the format of your data; think about what details (metadata) someone else would need to be able to use these files. Describe the structural standards that you will apply in making data and metadata available.

  • Which file formats will you use for your data and why?
  • What naming conventions will you use?
  • How will you organize your directories?
  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)
  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?
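A naming convention is easier to follow when file names are generated rather than typed by hand. The sketch below shows one possible approach in Python; the pattern project_description_YYYYMMDD_vNN.ext and the function name make_filename are illustrative assumptions, not a convention prescribed by NSF or UNM.

```python
import re
from datetime import date


def make_filename(project, description, collected, version, extension):
    """Build a name following project_description_YYYYMMDD_vNN.ext."""

    def slug(text):
        # Lowercase, then collapse any run of other characters into "-"
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return "{}_{}_{}_v{:02d}.{}".format(
        slug(project),
        slug(description),
        collected.strftime("%Y%m%d"),  # ISO-style dates sort chronologically
        version,
        extension.lstrip(".").lower(),
    )


# Hypothetical project and dataset names, for illustration only
print(make_filename("SoilFlux", "Site A CO2", date(2024, 3, 15), 2, ".CSV"))
# prints soilflux_site-a-co2_20240315_v02.csv
```

Whatever pattern you choose, state it in the plan so that others can interpret the file names without asking.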

Roles and Responsibilities

Describe the roles and responsibilities of all parties with respect to the management of the data (including contingency plans for the departure of key personnel from the project).

Explain how the responsibilities regarding the management of your data will be delegated. This should include time allocations, project management of technical aspects, training requirements, and contributions of non-project staff - individuals should be named where possible. Remember that those responsible for long-term decisions about your data will likely be the custodians of the repository/archive you choose to store your data. While the costs associated with your research (and the results of your research) must be specified in the Budget Justification portion of the proposal, you may want to reiterate who will be responsible for funding the management of your data. Consider these questions:

  • Outline the staff/organizational roles and responsibilities for implementing this data management plan.
  • Who will be responsible for data management and for monitoring the data management plan?
  • How will adherence to this data management plan be checked or demonstrated?
  • What process is in place for transferring responsibility for the data?
  • Who will have responsibility over time for decisions about the data once the original personnel are no longer available?

Data Dissemination Methods

Describe the dissemination methods that will be used to make data and metadata available to others during the period of the award, and any modifications or additional technical information regarding data access after the grant ends.

Describe how and where you will make these data and metadata available to the community. Remember BIO is committed to timely and rapid data distribution; make sure you address how soon your data will be available. Indicate what data will be made available and preserved. Will data be accessible on a web page, by email request, via open-access repository, etc.? Consider these questions:

  • What data will be made available from the study and preserved for the long term?
  • What transformations (to more shareable formats) will be necessary to prepare data for preservation / data sharing?
  • What metadata/ documentation will be submitted alongside the data or created on deposit/ transformation in order to make the data reusable?
  • How and when will you make the data available? (Include resources needed to make the data available: equipment, systems, expertise, etc.)
  • What is the process for gaining access to the data?
  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?
  • What other types of information should be shared regarding the data, e.g. the way it was generated, analytical and procedural information?
  • How will these additional details be captured or documented?
  • How long will the original data collector/creator/principal investigator retain the right to use the data before opening it up to wider use?
  • Explain details of any embargo periods for political/commercial/patent or publisher reasons.

Policies for Sharing, Public Access, and Re-use

Describe the PI's policies for data sharing, public access and re-use, including re-distribution by others and the production of derivatives. Where appropriate, include provisions for protection of privacy, confidentiality, security, intellectual property rights and other rights.

Describe the policies under which these data will be made available. It is very important that you specify how you will share your data with non-group members after the project is completed; this is the reason a DMP is required. If the data is of a sensitive nature - privacy or ecological endangerment concerns, for instance - and public access is inappropriate, address here the means by which granular control and access will be achieved (e.g. formal consent agreements, anonymized data, only available within a secure network, etc.). Consider these questions:

  • Will any permission restrictions need to be placed on the data?
  • Are there ethical and privacy issues? If so, how will these be resolved?
  • What have you done to comply with your obligations in your IRB Protocol?
  • Who will hold the intellectual property rights to the data and how might this affect data access?
  • What and who are the intended or foreseeable uses/users of the data?
  • Do you plan on publishing findings which rely on the data? If so, do your prospective publishers place any restrictions on other avenues of publication?

Archiving, Storage and Preservation

Where relevant, describe plans for archiving data, samples, software, and other research products, and for on-going access to these products through their lifecycle of usefulness to research and education. Consider which data (or research products) will be deposited for long-term access and where. (What physical and/or cyber resources and facilities (including third party resources) will be used to store and preserve the data after the grant ends?)

Describe your long-term strategy for storing, archiving and preserving the data you will generate or use. Consider the following:

  • What is the long-term strategy for maintaining, curating and archiving the data?
  • Which archive/repository/database have you identified as a place to deposit data?
  • What procedures does your intended long-term data storage facility have in place for preservation and backup?
  • How long will/should data be kept beyond the life of the project?
  • What data will be preserved for the long-term?
  • On what basis will data be selected for long-term preservation?
  • What metadata/documentation will be submitted alongside the data or created on deposit/transformation in order to make the data reusable?
  • What related information will be deposited?

UNM's Suggested Answer Text:

The data will be archived for a minimum of 10 years at the University of New Mexico (UNM) Libraries' LoboVault institutional and data repository. After this time, the data will be appraised per established collection and archival management policies for transfer to an external repository, long term archiving in LoboVault, or alternative disposition. LoboVault is an Open Archives Initiative (OAI) compliant repository built using the DSpace repository application, which enables Dublin Core metadata and data set objects to be shared and harvested by other archival systems through the OAI-PMH protocol.

LoboVault is a designated long term digital archives resource maintained by the University of New Mexico Libraries. In addition to the use of Dublin Core for descriptive metadata, the archive provides daily file integrity and format verification and will additionally create and maintain technical and administrative metadata using the widely adopted Metadata Encoding and Transmission Standard (METS) and Preservation Metadata Implementation Strategies (PREMIS) metadata standards. These additional metadata include digital file signatures and checksums for bitwise integrity validation and chain of custody documentation. Primary responsibility for curating and preparing the data for archiving rests on the Data Librarians at the University of New Mexico Libraries.
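To illustrate what harvesting through OAI-PMH involves, the sketch below builds a ListRecords request for Dublin Core (oai_dc) records in Python. The verb and parameter names come from the OAI-PMH protocol; the base URL is a hypothetical placeholder, not LoboVault's actual endpoint, and the function name is an assumption for this example.

```python
from urllib.parse import urlencode

# Hypothetical endpoint for illustration; not LoboVault's real address
BASE_URL = "https://repository.example.edu/oai/request"


def list_records_url(base_url, set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request for oai_dc metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec  # restrict the harvest to one set
    if from_date:
        params["from"] = from_date  # e.g. "2024-01-01" for incremental harvests
    return base_url + "?" + urlencode(params)


print(list_records_url(BASE_URL, from_date="2024-01-01"))
```

A harvester would send this request over HTTP and receive an XML response containing the Dublin Core records; that step is omitted here so the sketch runs without network access.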

Computer and Information Sciences and Engineering (CISE)

The current version of DMP guidance for the Computer & Information Sciences & Engineering directorate is available at http://www.nsf.gov/cise/cise_dmp.jsp. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Roles and Responsibilities

Describe who will own the data. This person should be the one who bears final responsibility for the data and adhering to the data management plan. Also, describe the personnel who will collect and manage the data.

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. Also, if you can, describe the amount of data in terms of file numbers and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.
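If pilot or existing data are already on disk, file numbers and disk space can be estimated with a short script. The following Python sketch is one way to do this; the function name summarize and the grouping by file extension are assumptions chosen for illustration.

```python
from collections import defaultdict
from pathlib import Path


def summarize(data_dir):
    """Return {extension: (file_count, total_bytes)} for files under data_dir."""
    totals = defaultdict(lambda: [0, 0])
    for path in Path(data_dir).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)"
            entry = totals[path.suffix.lower() or "(none)"]
            entry[0] += 1
            entry[1] += path.stat().st_size
    return {ext: tuple(vals) for ext, vals in sorted(totals.items())}
```

Calling summarize on a project directory gives per-format counts and sizes that can be scaled up to the expected volume for the full study.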

Policies for Access and Sharing

The main reason a Data Management Plan is required is to make you think about how you will prepare (manage) your data for sharing and to describe how you will actively share your data with non-group members after the project is completed. Since publicly funded research is paid for with tax dollars, most funding agencies expect your data to be made easily available to others. If you are going to embargo or withhold all or part of your data, it is important that you explain why. Reasons might include publishing agreements, commercial or patent interests, and private or sensitive data, among others.

NOTE: You should be as open as possible with the products of your research. NSF and most other funding agencies use this as part of the evaluation of your proposal. Grant proposals and DMPs which seem to limit access to the products resulting from the grant do get reviewed poorly in this area.

You should describe:

  • When your data will be available
  • How you will make your data available
    • Email request (avoid this; research has shown that only 20% of data described as available this way is actually retrievable)
    • Domain Specific Repository (e.g. GenBank)
    • Institutional Repository
    • Journal
  • Embargo Periods
    • A reasonable delay to publish is acceptable
    • Delay for patents, although provisional patents can be obtained through the STC at UNM
    • Who will hold the intellectual property rights on the data
    • Any issues with private, sensitive or secret data

Data Storage and Preservation

Use this section to describe both your short-term and long-term strategies for storing, archiving and preserving your data. During the study, how will you store your data and prevent corruption and loss? Describe any backup systems and versioning you use. If your data is web accessible, how will you protect it from malicious deletion or corruption? At the conclusion of your study, where will your data be stored? Describe, if known, how the data will be curated and maintained.

Additional Possible Data Management Requirements

You are asked to explain in this section how you plan on satisfying any additional, program-specific data management requirements, if any exist. If there are none, you may leave this section blank.

Education and Human Resources Directorate (EHR)

The current version of DMP guidance for the Education & Human Resources Directorate is available at http://www.nsf.gov/bfa/dias/policy/dmpdocs/ehr.pdf. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. Also, if you can, describe the amount of data in terms of file numbers and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.

Period of Retention

Describe how long the data will be preserved. If you are depositing it in the UNM Repository, the data will be kept for a minimum of ten years.

Data Format and Dissemination

This portion of the DMP asks you to combine an explanation of the format of your data and how that format will allow for fast and easy access to the data. One of the main thrusts of the DMP requirement is the NSF's intention to encourage data sharing among researchers - consider this when answering the questions below. Think about how you can not only make your data available to researchers "on-demand," but also how you can more proactively make your data accessible without a specific request. In this section you are also asked to account for issues of privacy, confidentiality and ownership that may arise from the dissemination of your data. Think about what you have done to comply with your obligations in your Institutional Review Board Protocol. Consider these questions:

  • Which file formats will you use for your data, and why?
  • What transformations (to more shareable formats) will be necessary to prepare data for preservation / data sharing?
  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage).
  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?
  • What other types of information should be shared regarding the data, e.g. the way it was generated, analytical and procedural information?
  • What metadata/ documentation will be submitted alongside the data or created on deposit/ transformation in order to make the data reusable?
  • How and when will you make the data available? (Include resources needed to make the data available: equipment, systems, expertise, etc.)
  • What is the process for gaining access to the data?
  • Will any permission restrictions need to be placed on the data?
  • Are there ethical and privacy issues? If so, how will these be resolved?
  • What have you done to comply with your obligations in your IRB Protocol?
  • Who will hold the intellectual property rights to the data and how might this affect data access?
  • What and who are the intended or foreseeable uses/users of the data?

Data Storage and Preservation

Use this section to describe both your short-term and long-term strategies for storing, archiving and preserving your data. During the study, how will you store your data and prevent corruption and loss? Describe any backup systems and versioning you use. If your data is web accessible, how will you protect it from malicious deletion or corruption? At the conclusion of your study, where will your data be stored? Describe, if known, how the data will be curated and maintained.

Additional Possible Data Management Requirements

You are asked to explain in this section how you plan on satisfying any additional, program-specific data management requirements, if any exist. If there are none, you may leave this section blank.

Engineering Directorate (ENG)

The current version of DMP guidance for the Engineering Directorate is available at http://nsf.gov/eng/general/ENG_DMP_Policy.pdf. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Roles and Responsibilities

Explain how the responsibilities regarding the management of your data will be delegated. This should include time allocations, project management of technical aspects, training requirements, and contributions of non-project staff - individuals should be named where possible. Remember that those responsible for long-term decisions about your data will likely be the custodians of the repository/archive you choose to store your data. While the costs associated with your research (and the results of your research) must be specified in the Budget Justification portion of the proposal, you may want to reiterate who will be responsible for funding the management of your data. Consider these questions:

  • Outline the staff/organizational roles and responsibilities for implementing this data management plan.
  • Who will be responsible for data management and for monitoring the data management plan?
  • How will adherence to this data management plan be checked or demonstrated?
  • What process is in place for transferring responsibility for the data?
  • Who will have responsibility over time for decisions about the data once the original personnel are no longer available?

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. Also, if you can, describe the amount of data in terms of file numbers and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.

Period of Retention

Describe how long the data will be preserved. If you are depositing it in the UNM Repository, the data will be kept for a minimum of ten years.

Data Formats and Metadata

Describe the format of your data; think about what details (metadata) someone else would need to be able to use these files. Describe the structural standards that you will apply in making data and metadata available.

  • Which file formats will you use for your data and why?
  • What naming conventions will you use?
  • How will you organize your directories?
  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)
  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?
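As one concrete form that descriptive metadata can take, the sketch below builds a simple Dublin Core record in Python using the standard library. The Dublin Core element namespace is real; the wrapper element name "metadata", the function name, and the sample values are assumptions for illustration, and a repository may expect a different wrapper or schema.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a Dublin Core XML record from (element, value) pairs."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for element, value in fields:
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, element))
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values are invented for this example
record = dublin_core_record([
    ("title", "Soil CO2 flux measurements, Site A"),
    ("creator", "Doe, Jane"),
    ("date", "2024-03-15"),
    ("format", "text/csv"),
])
print(record)
```

Pairs are used instead of a dictionary because Dublin Core elements may repeat, for example several creator entries for one dataset.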

Data Dissemination and Policies for Sharing

Describe the policies under which these data will be made available. Describe how and when the data will become available. Will data be accessible on a web page, by email request, via open-access repository, etc.?

If the data is of a sensitive nature - privacy or ecological endangerment concerns, for instance - and public access is inappropriate, address here the means by which granular control and access will be achieved.

  • How and when will you make the data available? Will you be embargoing your data? If so, explain why (e.g. to publish, for economic interests such as patents).
  • Will any permission restrictions need to be placed on the data? What have you done to comply with your obligations in your IRB Protocol?
  • Who will hold the intellectual property rights to the data and how might this affect data access? Describe how your data will be licensed (e.g. Public Domain, Creative Commons 3.0).
  • Do you plan on publishing findings which rely on the data? If so, do your prospective publishers place any restrictions on other avenues of publication?

Data Storage and Preservation

Use this section to describe both your short-term and long-term strategies for storing, archiving and preserving your data. During the study, how will you store your data and prevent corruption and loss? Describe any backup systems and versioning you use. If your data is web accessible, how will you protect it from malicious deletion or corruption? At the conclusion of your study, where will your data be stored? Describe, if known, how the data will be curated and maintained.

Geosciences Directorate: Ocean Sciences Division (OCE)

The Directorate for Geosciences provides general information about DMP requirements at http://www.nsf.gov/geo/geo-data-policies/index.jsp.

The current version of DMP guidance for the Ocean Sciences division is available at http://www.nsf.gov/geo/geo-data-policies/oce/index.jsp. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Types of Data

Give a short description of the data, including amount (if known) and content. If possible give a rough estimate of the number of files. Data types could include text, spreadsheets, images, 3D models, software, audio files, video files, reports, surveys, patient records, etc. Also record what file formats you expect to use. If you are using existing or derived data, describe that here.

Data and Metadata Standards

Describe the format of your data and how it will be "documented". Think about what details (metadata) someone else would need to be able to use these files. For example, you may need a "readme file" to explain variables, structure of the files, etc.

  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)
  • Some common types of information to collect are (far from exhaustive list)
    • Instrument settings
    • Variable meanings (data dictionary)
    • Protocols
    • Edit and ownership history of data (Provenance)
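A data dictionary like the one listed above can be started automatically from a data file's header row. The Python sketch below generates a blank template with one row per variable; the template columns (description, units, allowed_values) and the sample column names are assumptions for illustration, not a required layout.

```python
import csv
import io


def data_dictionary_template(csv_text):
    """Return a data dictionary skeleton with one row per column in csv_text."""
    header = next(csv.reader(io.StringIO(csv_text)))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["variable", "description", "units", "allowed_values"])
    for name in header:
        # Descriptive fields are left blank for the researcher to fill in
        writer.writerow([name.strip(), "", "", ""])
    return out.getvalue()


# Hypothetical data file contents
print(data_dictionary_template("site_id,temp_c,depth_m\nA1,12.5,3\n"))
```

The filled-in template can then be saved next to the data as part of the readme documentation.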

There are a large number of metadata standards. You may already know the best standards in your field; however, there may be additional or newer standards that you are not aware of. It is our job as data librarians to keep up to date and help you choose the best standards.

Policies for Access and Sharing

The main reason a Data Management Plan is required is to make you think about how you will prepare (manage) your data for sharing and to describe how you will actively share your data with non-group members after the project is completed. Since publicly funded research is paid for with tax dollars, most funding agencies expect your data to be made easily available to others. If you are going to embargo or withhold all or part of your data, it is important that you explain why. Reasons might include publishing agreements, commercial or patent interests, and private or sensitive data, among others.

NOTE: You should be as open as possible with the products of your research. NSF and most other funding agencies use this as part of the evaluation of your proposal. Grant proposals and DMPs which seem to limit access to the products resulting from the grant do get reviewed poorly in this area.

You should describe:

  • When your data will be available
  • How you will make your data available
    • Email request (avoid this, as research has shown only 20% of data described as available this way is actually retrievable)
    • Domain-Specific Repository (e.g. GenBank)
    • Institutional Repository
    • Journal
  • Embargo Periods
    • A reasonable delay to publish is acceptable
    • Delay for patents, although provisional patents can be obtained through the STC at UNM
    • Who will hold the intellectual property rights on the data
    • Any issues with private, sensitive or secret data

Policies for Re-use and Redistribution

Who will be allowed to use your data, how will they be allowed to use it, and will they be allowed to disseminate it? If you plan to restrict access, use, or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions. A good strategy is to apply a license to your data. A license can guide others in using your data under the conditions you choose. Several licensing schemes and online tools can help you create a license for your data, such as Creative Commons, which is used by websites such as Flickr.

Archiving and Preservation

Describe your long-term plans for storing your data and making it available. You should include information about:

  • Where your data will be stored.
  • How your data will be saved.
  • What additional metadata will be saved with the data beyond that mentioned above.
  • If known, what backup and preservation measures are in place.

Geosciences Directorate: Earth Sciences Division (EAR)

The Directorate for Geosciences provides general information about DMP requirements at http://www.nsf.gov/geo/geo-data-policies/index.jsp.

The current version of DMP guidance for the Earth Sciences division is available at http://www.nsf.gov/geo/geo-data-policies/ear/index.jsp. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. If you can, also describe the amount of data in terms of number of files and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.

Data and Metadata Standards

Describe the format of your data and how it will be "documented". Think about what details (metadata) someone else would need to be able to use these files. For example, you may need a "readme file" to explain variables, structure of the files, etc.

  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)
  • Some common types of information to collect (a far from exhaustive list):
    • Instrument settings
    • Variable meanings (data dictionary)
    • Protocols
    • Edit and ownership history of data (Provenance)

There are a large number of metadata standards. You may already know the best standards in your field; however, there may be additional or newer standards you are not aware of. It is our job as data librarians to keep up to date and help you choose the best standards.

Policies for Access and Sharing

The main reason a Data Management Plan is required is to make you think about how you will prepare (manage) your data for sharing and to describe how you will actively share it with non-group members after the project is completed. Because publicly funded research is paid for with tax dollars, most funding agencies expect your data to be made easily available to others. If you are going to embargo or withhold all or part of your data, it is important that you explain why. Reasons might include publishing agreements, commercial or patent interests, or private or sensitive data, among others.

NOTE: You should be as open as possible with the products of your research. NSF and most other funding agencies use this as part of the evaluation of your proposal. Grant proposals and DMPs which seem to limit access to the products resulting from the grant do get reviewed poorly in this area.

You should describe:

  • When your data will be available
  • How you will make your data available
    • Email request (avoid this, as research has shown only 20% of data described as available this way is actually retrievable)
    • Domain-Specific Repository (e.g. GenBank)
    • Institutional Repository
    • Journal
  • Embargo Periods
    • A reasonable delay to publish is acceptable
    • Delay for patents, although provisional patents can be obtained through the STC at UNM
    • Who will hold the intellectual property rights on the data
    • Any issues with private, sensitive or secret data

Policies for Re-use and Redistribution

Who will be allowed to use your data, how will they be allowed to use it, and will they be allowed to disseminate it? If you plan to restrict access, use, or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions. A good strategy is to apply a license to your data. A license can guide others in using your data under the conditions you choose. Several licensing schemes and online tools can help you create a license for your data, such as Creative Commons, which is used by websites such as Flickr.

Data Storage and Preservation

Use this section to describe both your short-term and long-term strategies for storing, archiving, and preserving your data. During the study, how will you store your data and prevent corruption and loss? Describe any backup systems and versioning you use. If your data is web accessible, how will you protect it from malicious deletion or corruption? At the conclusion of your study, where will your data be stored? Describe, if known, how the data will be curated and maintained.
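
One common way to detect corruption or unintended changes during a study is to record a checksum for every file and compare against it later. The Python sketch below shows the idea using SHA-256 from the standard library. The function names (`sha256_of`, `build_manifest`, `changed_files`) and the assumption that the data lives in a single local directory are illustrative only; they are not part of any NSF requirement or repository workflow.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> Dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Return manifest entries that are now missing or have different contents."""
    problems = []
    for rel_path, expected in manifest.items():
        target = data_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

A manifest built when the data is first stored can be saved alongside the backups and re-checked on a schedule. Note that this check reports only files listed in the manifest; newly added files would need a separate comparison.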

Geosciences Directorate: Atmospheric and Geospace Sciences Division (AGS)

The Directorate for Geosciences provides general information about DMP requirements at http://www.nsf.gov/geo/geo-data-policies/index.jsp.

The current version of DMP guidance for the Atmospheric and Geospace Sciences division is available at http://www.nsf.gov/geo/geo-data-policies/ags/index.jsp. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. If you can, also describe the amount of data in terms of number of files and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.

Data Formats and Metadata

Describe the format of your data; think about what details (metadata) someone else would need to be able to use these files. Describe the structural standards that you will apply in making data and metadata available.

  • Which file formats will you use for your data and why?

  • What naming conventions will you use?

  • How will you organize your directories?

  • What form will the metadata describing/documenting your data take?

  • How will you create or capture these details?

  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)

  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?
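
To make the naming-convention question above concrete, here is a minimal Python sketch that builds and checks file names against one possible convention. The pattern `<project>_<YYYYMMDD>_<description>_v<NN>.<ext>` and the names `build_filename` and `follows_convention` are assumptions chosen for illustration, not a recommended or required scheme.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<YYYYMMDD>_<description>_v<NN>.<ext>
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<date>\d{8})_"
    r"(?P<description>[a-z0-9-]+)_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def follows_convention(filename: str) -> bool:
    """Return True if the file name matches the convention above."""
    return FILENAME_PATTERN.match(filename) is not None


def build_filename(project: str, collected: date, description: str,
                   version: int, ext: str) -> str:
    """Assemble a file name and reject it if it breaks the convention."""
    name = f"{project}_{collected:%Y%m%d}_{description}_v{version:02d}.{ext}"
    if not follows_convention(name):
        raise ValueError(f"{name!r} does not follow the naming convention")
    return name


# Hypothetical project and description values.
example_name = build_filename("soilflux", date(2024, 3, 5), "site-a-raw", 1, "csv")
print(example_name)
```

Whatever convention you choose, stating it in the plan and checking it automatically helps keep directories consistent across collaborators.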

Data Dissemination and Policies for Sharing

Describe the policies under which these data will be made available. Describe how and when the data will become available. Will data be accessible on a web page, by email request, via open-access repository, etc.?

If the data is of a sensitive nature - privacy or ecological endangerment concerns, for instance - and public access is inappropriate, address here the means by which granular control and access will be achieved.

  • How and when will you make the data available? Will you be embargoing your data? If so, explain why (e.g. to publish, or for economic interests such as patents).
  • Will any permission restrictions need to be placed on the data? What have you done to comply with your obligations in your IRB Protocol?
  • Who will hold the intellectual property rights to the data and how might this affect data access? Describe how your data will be licensed (e.g. Public Domain, Creative Commons 3.0).
  • Do you plan on publishing findings which rely on the data? If so, do your prospective publishers place any restrictions on other avenues of publication?

Policies for Re-use and Redistribution

Who will be allowed to use your data, how will they be allowed to use it, and will they be allowed to disseminate it? If you plan to restrict access, use, or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions. A good strategy is to apply a license to your data. A license can guide others in using your data under the conditions you choose. Several licensing schemes and online tools can help you create a license for your data, such as Creative Commons, which is used by websites such as Flickr.

Archiving and Preservation

Describe your long-term plans for storing your data and making it available. You should include information about:

  • Where your data will be stored.
  • How your data will be saved.
  • What additional metadata will be saved with the data beyond that mentioned above.
  • If known, what backup and preservation measures are in place.

Costs of Implementing the DMP

If implementing the DMP will incur additional costs to the project, this fact should be mentioned in the appropriate section of the plan (for example, the cost of setting up and maintaining a web site). Details of the costs must be included in the budget justification in the budget section of the proposal.

Mathematical and Physical Sciences Directorate: Division of Astronomical Sciences (AST)

The current version of DMP guidance for the Division of Astronomical Sciences is available at http://www.nsf.gov/bfa/dias/policy/dmpdocs/ast.pdf. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. If you can, also describe the amount of data in terms of number of files and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.

Data Formats and Metadata

Describe the format of your data; think about what details (metadata) someone else would need to be able to use these files. Describe the structural standards that you will apply in making data and metadata available.

  • Which file formats will you use for your data and why?
  • What naming conventions will you use?
  • How will you organize your directories?
  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)
  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?

Policies for Access and Sharing

The main reason a Data Management Plan is required is to make you think about how you will prepare (manage) your data for sharing and to describe how you will actively share it with non-group members after the project is completed. Because publicly funded research is paid for with tax dollars, most funding agencies expect your data to be made easily available to others. If you are going to embargo or withhold all or part of your data, it is important that you explain why. Reasons might include publishing agreements, commercial or patent interests, or private or sensitive data, among others.

NOTE: You should be as open as possible with the products of your research. NSF and most other funding agencies use this as part of the evaluation of your proposal. Grant proposals and DMPs which seem to limit access to the products resulting from the grant do get reviewed poorly in this area.

You should describe:

  • When your data will be available
  • How you will make your data available
    • Email request (avoid this, as research has shown only 20% of data described as available this way is actually retrievable)
    • Domain-Specific Repository (e.g. GenBank)
    • Institutional Repository
    • Journal
  • Embargo Periods
    • A reasonable delay to publish is acceptable
    • Delay for patents, although provisional patents can be obtained through the STC at UNM
    • Who will hold the intellectual property rights on the data
    • Any issues with private, sensitive or secret data

Policies for Re-use and Redistribution

Who will be allowed to use your data, how will they be allowed to use it, and will they be allowed to disseminate it? If you plan to restrict access, use, or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions. A good strategy is to apply a license to your data. A license can guide others in using your data under the conditions you choose. Several licensing schemes and online tools can help you create a license for your data, such as Creative Commons, which is used by websites such as Flickr.

Data Storage and Preservation

Use this section to describe both your short-term and long-term strategies for storing, archiving, and preserving your data. During the study, how will you store your data and prevent corruption and loss? Describe any backup systems and versioning you use. If your data is web accessible, how will you protect it from malicious deletion or corruption? At the conclusion of your study, where will your data be stored? Describe, if known, how the data will be curated and maintained.

Mathematical and Physical Sciences Directorate: Division of Chemistry (CHE)

The current version of DMP guidance for the Division of Chemistry is available at http://www.nsf.gov/bfa/dias/policy/dmpdocs/che.pdf. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. If you can, also describe the amount of data in terms of number of files and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.

Data Formats and Metadata

Describe the format of your data; think about what details (metadata) someone else would need to be able to use these files. Describe the structural standards that you will apply in making data and metadata available.

  • Which file formats will you use for your data and why?
  • What naming conventions will you use?
  • How will you organize your directories?
  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)
  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?

Policies for Access and Sharing

The main reason a Data Management Plan is required is to make you think about how you will prepare (manage) your data for sharing and to describe how you will actively share it with non-group members after the project is completed. Because publicly funded research is paid for with tax dollars, most funding agencies expect your data to be made easily available to others. If you are going to embargo or withhold all or part of your data, it is important that you explain why. Reasons might include publishing agreements, commercial or patent interests, or private or sensitive data, among others.

NOTE: You should be as open as possible with the products of your research. NSF and most other funding agencies use this as part of the evaluation of your proposal. Grant proposals and DMPs which seem to limit access to the products resulting from the grant do get reviewed poorly in this area.

You should describe:

  • When your data will be available
  • How you will make your data available
    • Email request (avoid this, as research has shown only 20% of data described as available this way is actually retrievable)
    • Domain-Specific Repository (e.g. GenBank)
    • Institutional Repository
    • Journal
  • Embargo Periods
    • A reasonable delay to publish is acceptable
    • Delay for patents, although provisional patents can be obtained through the STC at UNM
    • Who will hold the intellectual property rights on the data
    • Any issues with private, sensitive or secret data

Policies for Re-use and Redistribution

Who will be allowed to use your data, how will they be allowed to use it, and will they be allowed to disseminate it? If you plan to restrict access, use, or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions. A good strategy is to apply a license to your data. A license can guide others in using your data under the conditions you choose. Several licensing schemes and online tools can help you create a license for your data, such as Creative Commons, which is used by websites such as Flickr.

Data Storage and Preservation

Use this section to describe both your short-term and long-term strategies for storing, archiving, and preserving your data. During the study, how will you store your data and prevent corruption and loss? Describe any backup systems and versioning you use. If your data is web accessible, how will you protect it from malicious deletion or corruption? At the conclusion of your study, where will your data be stored? Describe, if known, how the data will be curated and maintained.

Mathematical and Physical Sciences Directorate: Division of Materials Research (DMR)

The current version of DMP guidance for the Division of Materials Research is available at http://www.nsf.gov/bfa/dias/policy/dmpdocs/dmr.pdf. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. If you can, also describe the amount of data in terms of number of files and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.

Data Formats and Metadata

Describe the format of your data; think about what details (metadata) someone else would need to be able to use these files. Describe the structural standards that you will apply in making data and metadata available.

  • Which file formats will you use for your data and why?
  • What naming conventions will you use?
  • How will you organize your directories?
  • What form will the metadata describing/documenting your data take?
  • How will you create or capture these details?
  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)
  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?

Policies for Access and Sharing

The main reason a Data Management Plan is required is to make you think about how you will prepare (manage) your data for sharing and to describe how you will actively share it with non-group members after the project is completed. Because publicly funded research is paid for with tax dollars, most funding agencies expect your data to be made easily available to others. If you are going to embargo or withhold all or part of your data, it is important that you explain why. Reasons might include publishing agreements, commercial or patent interests, or private or sensitive data, among others.

NOTE: You should be as open as possible with the products of your research. NSF and most other funding agencies use this as part of the evaluation of your proposal. Grant proposals and DMPs which seem to limit access to the products resulting from the grant do get reviewed poorly in this area.

You should describe:

  • When your data will be available
  • How you will make your data available
    • Email request (avoid this, as research has shown only 20% of data described as available this way is actually retrievable)
    • Domain-Specific Repository (e.g. GenBank)
    • Institutional Repository
    • Journal
  • Embargo Periods
    • A reasonable delay to publish is acceptable
    • Delay for patents, although provisional patents can be obtained through the STC at UNM
    • Who will hold the intellectual property rights on the data
    • Any issues with private, sensitive or secret data

Policies for Re-use and Redistribution

Who will be allowed to use your data, how will they be allowed to use it, and will they be allowed to disseminate it? If you plan to restrict access, use, or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions. A good strategy is to apply a license to your data. A license can guide others in using your data under the conditions you choose. Several licensing schemes and online tools can help you create a license for your data, such as Creative Commons, which is used by websites such as Flickr.

Data Storage and Preservation

Use this section to describe both your short-term and long-term strategies for storing, archiving, and preserving your data. During the study, how will you store your data and prevent corruption and loss? Describe any backup systems and versioning you use. If your data is web accessible, how will you protect it from malicious deletion or corruption? At the conclusion of your study, where will your data be stored? Describe, if known, how the data will be curated and maintained.

Mathematical and Physical Sciences Directorate: Division of Physics (PHY)

The current version of DMP guidance for the Division of Physics is available at http://www.nsf.gov/bfa/dias/policy/dmpdocs/phy.pdf. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. If you can, also describe the amount of data in terms of number of files and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.

Data and Metadata Standards

Describe the format of your data and how it will be "documented". Think about what details (metadata) someone else would need to be able to use these files. For example, you may need a "readme file" to explain variables, structure of the files, etc.

  • What form will the metadata describing/documenting your data take?

  • How will you create or capture these details?

  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)

  • Some common types of information to collect (a far from exhaustive list):

    • Instrument settings

    • Variable meanings (data dictionary)

    • Protocols

    • Edit and ownership history of data (Provenance)

There are a large number of metadata standards. You may already know the best standards in your field; however, there may be additional or newer standards you are not aware of. It is our job as data librarians to keep up to date and help you choose the best standards.

Policies for Access and Sharing

The main reason a Data Management Plan is required is to make you think about how you will prepare (manage) your data for sharing and to describe how you will actively share it with non-group members after the project is completed. Because publicly funded research is paid for with tax dollars, most funding agencies expect your data to be made easily available to others. If you are going to embargo or withhold all or part of your data, it is important that you explain why. Reasons might include publishing agreements, commercial or patent interests, or private or sensitive data, among others.

NOTE: You should be as open as possible with the products of your research. NSF and most other funding agencies use this as part of the evaluation of your proposal. Grant proposals and DMPs which seem to limit access to the products resulting from the grant do get reviewed poorly in this area.

You should describe:

  • When your data will be available
  • How you will make your data available
    • Email request (avoid this, as research has shown only 20% of data described as available this way is actually retrievable)
    • Domain-Specific Repository (e.g. GenBank)
    • Institutional Repository
    • Journal
  • Embargo Periods
    • A reasonable delay to publish is acceptable
    • Delay for patents, although provisional patents can be obtained through the STC at UNM
    • Who will hold the intellectual property rights on the data
    • Any issues with private, sensitive or secret data

Policies for Re-use and Redistribution

Who will be allowed to use your data, how will they be allowed to use it, and will they be allowed to disseminate it? If you plan to restrict access, use, or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions. A good strategy is to apply a license to your data. A license can guide others in using your data under the conditions you choose. Several licensing schemes and online tools can help you create a license for your data, such as Creative Commons, which is used by websites such as Flickr.

Archiving and Preservation

Describe your long-term plans for storing your data and making it available. You should include information about:

  • Where your data will be stored.
  • How your data will be saved.
  • What additional metadata will be saved with the data beyond that mentioned above.
  • If known, what backup and preservation measures are in place.

Social, Behavioral and Economic Sciences Directorate (SBE)

The current version of DMP guidance for the Directorate for Social, Behavioral and Economic Sciences is available at http://www.nsf.gov/sbe/SBE_DataMgmtPlanPolicy.pdf. The recommended organization is provided below, with guidance. For more detailed assistance, please feel free to contact Research Data Services at .

Roles and Responsibilities

Explain how the responsibilities regarding the management of your data will be delegated. This should include time allocations, project management of technical aspects, training requirements, and contributions of non-project staff - individuals should be named where possible. Remember that those responsible for long-term decisions about your data will likely be the custodians of the repository/archive you choose to store your data. While the costs associated with your research (and the results of your research) must be specified in the Budget Justification portion of the proposal, you may want to reiterate who will be responsible for funding the management of your data. Consider these questions:

  • Outline the staff/organizational roles and responsibilities for implementing this data management plan.
  • Who will be responsible for data management and for monitoring the data management plan?
  • How will adherence to this data management plan be checked or demonstrated?
  • What process is in place for transferring responsibility for the data?
  • Who will have responsibility over time for decisions about the data once the original personnel are no longer available?

Products of Research

Describe all the types of data you will be collecting, e.g. experimental or observational data. There are a variety of ways to describe data, so try to be brief but descriptive. If you can, also describe the amount of data in terms of number of files and disk space.

Describe how you will be collecting your data and any instruments you will be using. If you are using existing data, describe where it came from and any terms or licensing conditions associated with the data.

Period of Retention

Describe how long the data will be preserved. If you are depositing it in the UNM Repository, the data will be kept for a minimum of ten years.

Data Formats and Metadata

Describe the format of your data; think about what details (metadata) someone else would need to be able to use these files. Describe the structural standards that you will apply in making data and metadata available.

  • Which file formats will you use for your data and why?

  • What naming conventions will you use?

  • How will you organize your directories?

  • What form will the metadata describing/documenting your data take?

  • How will you create or capture these details?

  • Which metadata standards will you use and why have you chosen them? (e.g. accepted domain-local standards, widespread usage)

  • What contextual details (metadata) are needed to make the data you capture or collect meaningful?

Data Storage and Preservation

Use this section to describe both your short-term and long-term strategies for storing, archiving, and preserving your data. During the study, how will you store your data and prevent corruption and loss? Describe any backup systems and versioning you use. If your data is web accessible, how will you protect it from malicious deletion or corruption? At the conclusion of your study, where will your data be stored? Describe, if known, how the data will be curated and maintained.

Additional Possible Data Management Requirements

In this section, explain how you plan to satisfy any additional, program-specific data management requirements. If none exist, you may leave this section blank.

What If You Are Not Generating Data?

The Data Management Plan is required for all proposals submitted to the NSF. Proposals submitted without a DMP will be returned for resubmission or rejected unread.

For investigators submitting proposals which will not generate or acquire data (as defined by the sponsoring directorate), the DMP is still required. Within the DMP, PIs may simply state that the project is not anticipated to generate data or samples that require management and/or sharing.

Post-Award Monitoring

  • Annual reports must include information about progress made in data management and sharing of research products.
  • Final project reports should document:
    • Data produced during the award period.
    • Data that will be retained after the award expires.
    • Dissemination plans and verification that the data will be available for sharing.
    • Data and metadata formats available to others.
    • Where available data has been deposited for public access.