DMP FAQ Roses

This Frequently Asked Questions (FAQ) page for ROSES about Data Management Plans (DMPs) has been updated for ROSES-2021.

Unless otherwise stated, the default continuing from ROSES-2020 is that the data management plan must be placed in a 2-page section in the proposal PDF immediately following the references and citations for the Scientific/Technical/Management (S/T/M) portion of the proposal and does not count against the page limit for the S/T/M Section. Any exceptions that don't follow the default will say so explicitly. Generally, these are either:

A)  The programs for which the nature of the work is inextricably linked to the handling of data, so the DMP is part of the page-limited S/T/M section of the proposal (e.g., B.7 Space Weather, B.12 Heliophysics Data Environment Emphasis, C.4 Planetary Data Archiving, Restoration, and Tools, D.2 Astrophysics Data Analysis, D.13 Astrophysics USPI (when it's solicited), D.14 Theoretical and Computational Astrophysics Networks (when it's solicited), and E.3 The Exoplanets Research Program) OR

B)  Programs that fund building hardware, for which no data management plan is requested at all since it is not appropriate given the nature of the work solicited (see #1, below).

Those proposing to the Planetary Science Division (anything in Appendix C or F.4 Habitable Worlds) are strongly encouraged to use the Planetary Science Division DMP template that may be downloaded here.

Those proposing to the Heliophysics Division (anything in Appendix B) are strongly encouraged to use the Heliophysics Division DMP template that may be downloaded here.

 

Data Management Plan FAQ for ROSES

Introduction

The Office of Science and Technology Policy (OSTP) at the White House issued a memo to the executive branch agencies telling us to make more accessible the peer-reviewed publications and data from the research that we fund. NASA published a response in the form of the NASA Plan discussed in #14 and #15, below.

In a future FAQ we will lay out the upcoming requirement that peer-reviewed accepted manuscripts be publicly accessible (12 months after publication). This FAQ is about the data management plan (DMP) most ROSES proposals must include on submission and what might be in that plan.

First of all, be reassured that we are not going to force you to reveal your precious proprietary data prior to publication. No personal, proprietary or ITAR data is included. The ROSES mandatory minimum is a plan for how you will make available the data that directly underlie the results and findings in your peer-reviewed publications, like data in the charts and figures in your papers. Individual ROSES program elements may require more than this. More on what is and is not included, and example data management plans, appear below.

1. Who needs to provide a Data Management Plan (DMP)?

Almost all proposals to ROSES must include a DMP or an explanation of why one is not needed. The few exceptions to this rule are programs that just fund building hardware (e.g., IIP, ACT, InVEST, PICASSO, MatISSE, DALI) and those will say so explicitly. However, even if a DMP is not required as a part of a proposal, the information needed to validate the scientific conclusions of peer-reviewed publications resulting from an award, including data underlying figures, maps, and tables, must be made available at the time of publication, publicly and electronically in a place where it can be found and it is likely to persist, e.g., in the supplemental material of the article, a community-endorsed repository, a NASA repository such as data.nasa.gov or a repository supported by a division, or a combination of different resources as would be most appropriate to the data being shared.

2. I really am not going to generate any data, so I should not have to do this

In some cases, like instrument and technology development programs, we will not request a data management plan at all. However, for mixed calls where some of the proposals would generate data, we will request a DMP. We recognize that some proposers to those programs may not be generating any data. That’s OK; you can just say so. For example, all proposals to APRA require a DMP, but technology development efforts will not generate scientific data, so those proposers may simply note, where the DMP section of the proposal would go, that a DMP is not required because the proposed projects are in the Detector Development or Supporting Technology category. But remember, if you publish a paper with a figure or a table, you will have to release at least a little bit of data.

3. Sigh, OK, yes I will generate a little bit of data. What’s the minimum I can get away with?

At a minimum, the DMP for ROSES must explain how you will release the data needed to reproduce figures, tables and other representations in publications, at the time of publication. Providing this data via supplementary materials with the journal is one really easy way to do this, and it has the advantage that the data and the figures are linked together in perpetuity without any ongoing effort on your part. In addition, NASA will require that you also upload this data along with the manuscript version of your paper into the NIH PubMed Central (PMC) archive (the NASA interface into PMC is called "PubSpace"). Please see #17, below.

4. I’m not ready to release this dataset, it took lots of work, that was the hard part, and if I release it now I will get scooped.

Let's say that your paper is based on a big dataset that you generated and this is just the first of a series of articles. In that case we certainly won't require you to release the entire dataset along with the first publication. On the other hand, you do have to release the data that was represented in that paper, and we expect you to release the full dataset at the end of the award. That gives you a few years to publish your papers but means that you can’t take the data to the grave with you.

5. What’s included in these plans?

If the program element, the corresponding research overview (A.1, B.1, C.1…), or the Division DMP template gives specific instructions, do what it says. Otherwise, the DMP should outline the what, where, when, and who for the data that will be created by the award in adequate detail (see also the example DMPs below), i.e., the ideal DMP comprises these elements:

What data types, volumes, formats, and (where relevant) standards,

Where you intend to make these data available,

When you will make these data available, and

Who will do this archiving and what experience they have with this kind of data, archive, etc.

6. How will the plans be collected?

Most proposers must provide a data management plan (DMP) along with their submission, typically in a 2-page section in the proposal PDF immediately following the references and citations for the S/T/M portion of the proposal. However, a few Program Elements require a discussion of data management in the main page-limited S/T/M section of the proposal. Examples include (but are not limited to) A.8 GEDI Science Team, B.7 Space Weather Science Applications, B.12 Heliophysics Data Environment Emphasis, C.4 Planetary Data Archiving, Restoration, and Tools, D.2 Astrophysics Data Analysis, D.13 Astrophysics USPI, D.14 Theoretical and Computational Astrophysics Networks, and E.3 The Exoplanets Research Program. Any program element that requires this will say so explicitly. Some elements, like A.9 Physical Oceanography and A.14 Ocean Surface Topography Science Team, require a separate Software Development Plan.

7. I’m still not totally sure what I need to archive. I have a thousand questions

While the questions and answers that appear below, including example DMPs, may help, we cannot cover everything in any one document. Entire workshops have been held to discuss archiving, formats and standards for just one subdiscipline of science. Rather than trying to address each case, we can only recommend that proposers use their best judgment and the standards of their community in deciding what should be archived to allow others in their community to really understand what they have done. To give a personal example: to provide what my peers would have needed to use my data, I would have archived not just the x,y points needed to plot the (final) spectrum in the figure in a paper, but also the ones that went into it, e.g., the blank, the raw spectrum, any baseline correction, and spectra of any contaminant that was subtracted out. In some cases the data is meaningless without some metadata, so include that too, as appropriate.
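As a rough illustration of the personal example above, the following Python sketch bundles the intermediate series behind a plotted spectrum, together with a small metadata header, into one CSV text that could go into supplementary material. Everything here is an assumption made for illustration: the function name `figure_data_bundle`, the column names, the metadata keys, and the placeholder numbers are hypothetical, and no ROSES element or archive prescribes this layout.

```python
import csv
import io
import json


def figure_data_bundle(metadata, columns, rows):
    """Return CSV text with a '#'-prefixed JSON metadata line, then the table.

    Hypothetical helper: the layout is an illustration, not a required format.
    """
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match column count")
    buf = io.StringIO()
    # Metadata travels with the numbers so the file is meaningful on its own.
    buf.write("# " + json.dumps(metadata, sort_keys=True) + "\n")
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder values only; they are not real measurements.
example = figure_data_bundle(
    metadata={"figure": "hypothetical Figure 1", "x_units": "nm", "y_units": "absorbance"},
    columns=["wavelength", "blank", "raw", "baseline", "contaminant", "final"],
    rows=[
        [500.0, 0.01, 0.52, 0.02, 0.05, 0.44],
        [501.0, 0.01, 0.55, 0.02, 0.05, 0.47],
    ],
)
print(example)
```

Keeping the blank, raw, baseline, and contaminant columns beside the final values is the point of the sketch: a reader can see how the plotted curve was derived, not just its end result.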

8. I don't want my proposal rejected because I didn't do this right. You need to give more details.

We don't want your proposal rejected for this reason either. Each of the Research Overviews (A.1, B.1, C.1…) now has a special section that describes the content of DMPs, what is considered "data" for the purpose of the DMP, and even expectations regarding software. Also, some divisions and program elements provide templates for the data management plan. The template for the program elements in Appendix B (Heliophysics) may be found here and the template for the program elements in Appendix C (Planetary Science) may be found here.

9. Who’s gonna pay for this?

In most cases I expect that either it will be de minimis (e.g., see #3 above) and not cost any extra, or it will be something that researchers were already doing as part of their projects previously. However, if this has you thinking that you will devote more effort to data archiving and so more funds are required for data management activities (like for large datasets that will go into a NASA archive) these should be covered in the normal budget and budget justification sections of the proposal and simply referred to in the DMP.

Now that you mention it, I have been, but this is going to take more time and cost NASA more money. OK?

Of course, ROSES allows proposers to include costs for data management and access in proposal budgets. However, no extra funding has been provided to the research program to support this.

10. May I propose anything I want for my data, may I just post it on my web page?

We want to make as much data as freely available as possible, so put it in an official NASA (approved) archive if you can; see #16, below. Sometimes you will not be able to put it in one of our official archives because, e.g., the archive doesn’t think it’s appropriate. Still, it's best if the data is in a place where it is likely to be found and to persist, so if your university, department, branch or lab has an official archive of some kind it should go there, rather than just on your personal page, which might be more ephemeral. The appendices and individual program elements of ROSES may specify preferred archives for special things, e.g., GitHub for code. Please read the individual program elements carefully.

11. How do I report on this and what if I can’t do what I originally promised to do in my DMP?

We think that data archiving is just part of the normal research process, so you would give a status in your progress reports. We understand that for various reasons things may not turn out exactly as you planned through no fault of yours, e.g., the journal or archive or university changed their rules. Thus, if you make a good faith effort and get the information out one way or another, it's going to be fine, even if you don't do exactly what you originally intended. However, if you don't make an effort and/or flagrantly flaunt your defiance of these requirements, I will remind you that funded researchers, research institutions, and NASA centers are responsible for ensuring and demonstrating compliance with the DMPs approved as part of their awards. Remember, this is a directive from the White House and if you are really bad the President will call your dean and shame you. Just kidding, but awardees who do not fulfill the intent of their DMPs may have continuing funds withheld and this may be considered in the evaluation of future proposals, which may be even worse.

12. I plan to submit via grants.gov so where do I put the DMP?

The same place where you would if you submitted via NSPIRES: In most cases the data management plan must be placed in a 2-page section in the proposal PDF immediately following the references and citations for the Scientific/Technical/Management (S/T/M) portion of the proposal. Any elements that are exceptions to this rule, for example, those that require the DMP be in the main page-limited S/T/M section of the proposal PDF, will say so explicitly.

13. What about code?

More than one of you has pointed out that some datasets are far less meaningful in the absence of associated code and/or that making code accessible is the logical extension of the NASA approach to increasing access. What to require for code is an interesting and complicated question that was not addressed in the NASA plan. In the past SMD was not always consistent about that. Starting in ROSES-2020 we have a consistent default approach to software across ROSES, and that appears in the Research Overviews (A.1, B.1, C.1…). By default, ROSES still does not require that code be made public (though individual program elements may still supersede the default and do so). The ROSES default is that "Software, whether a stand-alone program, an enhancement to existing code, or a module that interfaces with existing codes, created as part of a ROSES award, should be made publicly available when it is practical and feasible to do so, and when there is scientific utility in doing so. Stand-alone code that is not straightforward to implement, or whose utility is significantly outweighed by the costs to share it, is not expected to be made available." When it is made available "SMD expects that the source code, with associated documentation sufficient to enable use of the code, will be made publicly available as Open Source Software (OSS) under an appropriately permissive license (e.g., Apache-2, BSD-3-Clause, GPL). This includes all software developed with SMD funding used in the production of data products, as well as software developed to discover, access, visualize, and transform NASA data." For definitions of OSS and examples of the kinds of software envisioned (Analysis software, Libraries, and Frameworks) please see the Research Overview (A.1, B.1, C.1…) for the program element to which you plan to send a proposal. Please note that some elements, such as A.9 Physical Oceanography and A.14 Ocean Surface Topography Science Team, require a separate Software Development Plan.

14. In Section 4 of the NASA Plan it says that "DMPs must provide a plan for making research data … accessible at the time of publication or within a reasonable time period[4] after publication…" and footnote 4 says "This time period will defined in the final Data Access plan." What is this "reasonable" time period?

For ROSES we insist that the minimum data set (e.g., in the figures, see #3 above) must be released at time of publication. Additional and sometimes large data sets may be released later, and you may propose what you consider a "reasonable time period" for that in your DMP, but certainly no later than the end of the award.

15. In Section 7.1 of the NASA Plan there is a footnote about an "official approval process (signature process) for data release". What is that?

Obviously, it’s not practical to suggest that NASA Program Managers (or, frankly, anyone at NASA) review and approve the many small data sets released with the ~10,000 papers a year produced by NASA-funded researchers. Indeed, we currently don't review or approve grantee papers at all, let alone some supplementary data published along with their paper. The footnote refers to the large data sets that go into the official NASA archives. We delegate that review to those responsible for running those archives, and they each have a process for review.

16. I am working on a DMP and want to propose to put my data into an official NASA archive but I have never done this before and don't know where to start. Can you please point me in the right direction?

If you are proposing to Earth Science (Appendix A of ROSES) then you will probably be using one of the DAACs, which can be found via https://earthdata.nasa.gov/ (each DAAC has its own point of contact).

If you are proposing to Heliophysics (Appendix B of ROSES) then you will probably be using one of the following archives:
Space Physics Data Facility https://spdf.gsfc.nasa.gov/submitting_data.html (POC Robert M. Candey: robert.m.candey@nasa.gov)
Solar Data Analysis Center https://docs.virtualsolar.org/wiki/NewDataProvider (POC Jack Ireland: jack.ireland-1@nasa.gov)

For information about metadata, the relevant HP standard is the SPASE Data Model (see http://www.spase-group.org), which is used to populate a 'git' registry whose main public face is the Heliophysics Data Portal (https://heliophysicsdata.gsfc.nasa.gov). The required elements of the Data Model are the 'header' information that includes the Resource Type, Measurement Type, people, access URL(s), duration information and the like. More detailed descriptions, including the parameters (variables) in the data files, are desirable but not required. Our SPASE group is prepared to help anyone with writing SPASE descriptions, and we hope to soon have an online tool for doing this.

If you are proposing to Planetary Science (Appendix C of ROSES) then you will probably be using:
The Planetary Data System https://pds.nasa.gov/ (each node has its own point of contact).

If you are proposing to Astrophysics (Appendix D of ROSES) then you may be using one of the archives at http://science.nasa.gov/astrophysics/astrophysics-data-centers/

If you are proposing to Biological or Physical Sciences (BPS, Appendix E of ROSES), the BPS ROSES program element will specify which database researchers should address in their DMPs, but there are a few possibilities:

Space Biology has the GeneLab (https://genelab.nasa.gov) database for omics-related experimental results and the Life Sciences Data Archive, shared with the Human Research Program (https://lsda.jsc.nasa.gov), for experimental samples and other traditional results. Physical Sciences has the Physical Sciences Informatics database (https://www.nasa.gov/PSI).

17. I hear that my award terms and conditions require archiving of publications. Will, e.g., PubSpace take my supplementary data and does that satisfy the requirement for making data available? If so, how much data will they take?

Yes, all NASA award terms and conditions require that the accepted manuscript versions of peer-reviewed publications (hereinafter "manuscripts") that result from ROSES awards be archived. The archive is currently "PubSpace", NASA's part of the PubMed Central (PMC) repository. This applies only to peer-reviewed manuscripts. Patents and publications that contain material governed by personal privacy, export control, proprietary restrictions, or national security law or regulations are not covered.

Although PubSpace is a repository for manuscripts, not data, they are set up to accept data that are part of the publication record, i.e., data that was made available with an article as supporting information or supplementary data in the journal. PubSpace accepts up to 2GB of data per manuscript. This means that when the manuscript becomes freely available 12 months after publication, so too will the supplementary data. Otherwise, these would remain perpetually behind a paywall. So, when you upload your manuscript to PubSpace, add the data needed to reproduce figures, tables and other representations in it, the kind of data that you put into supplementary material in the journal. Our friends at PMC provided this link to a page with additional information: https://www.ncbi.nlm.nih.gov/pmc/about/guidelines/#supplementary. There are rumors that the repository for NASA-funded manuscripts may change in the future but, whatever the repository, there will be space for data along with the manuscripts.
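Before uploading, it may help to check whether your supplementary files fit under the 2GB per-manuscript figure quoted above. The following Python sketch is one way to do that locally. The names `PUBSPACE_LIMIT_BYTES`, `total_size_bytes`, and `fits_in_pubspace` are hypothetical, and reading "2GB" as 2 x 10^9 bytes is a conservative assumption, not a documented PubSpace definition.

```python
from pathlib import Path

# Assumption: "2GB" is read conservatively as 2 * 10**9 bytes.
PUBSPACE_LIMIT_BYTES = 2 * 10**9


def total_size_bytes(paths):
    """Sum the on-disk size, in bytes, of the given files."""
    return sum(Path(p).stat().st_size for p in paths)


def fits_in_pubspace(paths, limit=PUBSPACE_LIMIT_BYTES):
    """Return True if the files together are no larger than the limit."""
    return total_size_bytes(paths) <= limit
```

For example, `fits_in_pubspace(["figure1_data.csv", "table2_data.csv"])` (hypothetical file names) would return `False` for a bundle that is too large, which is a hint that the data belongs in one of the larger archives instead.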

One may also add other data in this way, but this is not the place for large data sets. The first choice for larger datasets remains NASA approved archives such as the Earth science DAACs and the Planetary Data System. But if you have data that readers of the publication should see, and you can't put them in an approved archive then it is acceptable to add them here.

Example Data Management Plans

A. DMP cannot or need not be provided (because it's ITAR, etc.)

This is a development effort for flight technology that will not generate any data that I can release, so I can’t write a DMP. The data that we will generate will be ITAR.

or

The D.3 APRA program element says that proposals in the Detector Development or Supporting Technology category that will not generate data do not need to provide a DMP.

or

Just explain why your project is not going to generate data.

B. Minimal DMP (publications only):

The proposed project will generate limited data (describe your data here) and this data will be shared at the time of publication via supplementary material associated with publications.

C. A real DMP:

The template for the program elements in Appendix B (Heliophysics) may be found here and the template for the program elements in Appendix C (Planetary Science) may be found here.

In addition to promising to release the required minimal data in supplementary materials along with publications, specifically address in order each of the elements of a DMP that are listed in #5, above, e.g., this data set/higher order data product or whatever, [enter data types, volumes, formats, and (where relevant) standards, here] will be uploaded into [enter name of NASA archive e.g., PDS here] by [enter dates here]. Don’t forget to attribute the archiving as a task, e.g., "As described in section xx of the proposal, Dr. Summer Smith will be doing the anatomical data archiving; she has experience with the Anatomy Park Archive, as shown by xx. We have consulted with the Archive POC regarding formatting etc. Our intern, Max Glootie, will have responsibility for archiving the code behind the citizen science app in the NASA GitHub. We have included adequate time in the summary table of work effort to achieve these archiving tasks."

D. I'm in my own category:

In addition to releasing the required minimal data in supplementary materials along with publications, I have this data set/higher order data product or whatever, [enter data types, volumes, formats, and (where relevant) standards, here]. However, the point of contact (POC) for the NASA archive or the POC for this program element says NASA doesn't want it, or it's obviously not appropriate for this or that reason. I think it's important and I want everyone to see how wonderful my data set/higher order data product has become, so I am posting it on [x, y, z, my own web page, whatever].


This was last updated on April 22, 2021.

Please direct questions or corrections on this page to SARA@nasa.gov