DMP FAQ Roses

This Frequently Asked Questions (FAQ) page for ROSES about Data Management Plans (DMPs) has been updated for ROSES-2020. The three major changes are:

First, whereas in past years most DMPs were collected as plain text in a mandatory text box on the NSPIRES cover page, in ROSES-2020 this will no longer be true. Unless otherwise stated, the new default is that the data management plan must be placed in a 2-page section in the proposal PDF immediately following the references and citations for the Scientific/Technical/Management (S/T/M) portion of the proposal and does not count against the page limit for the S/T/M Section. This is new for most of ROSES but has been the default approach for Appendix C (Planetary Science). The exceptions that don't follow the default will say so explicitly; they are the programs for which the nature of the work is inextricably linked to the handling of data, so the DMP is part of the page-limited S/T/M section of the proposal. Examples include (at the time of release of ROSES): A.8 GEDI Science Team, B.7 Space Weather Science Applications, B.12 Heliophysics Data Environment Emphasis, C.4 Planetary Data Archiving, Restoration, and Tools, D.2 Astrophysics Data Analysis, D.13 Astrophysics USPI, D.14 Theoretical and Computational Astrophysics Networks, and E.3 The Exoplanets Research Program.

Second, the sufficiency of the DMP will be evaluated by the peer review panel as part of the proposal’s intrinsic merit and thus may have a bearing on whether or not the proposal is selected. This is new for most elements but not all. The DMP was always part of merit for program elements like C.4 PDART.

Third, there is now a consistent default approach to software across ROSES; see the answer to question #13, below.

Those proposing to a Planetary Science Division call for proposals (anything in Appendix C or E.4 Habitable Worlds) are strongly encouraged to use the Planetary Science Division DMP template that may be downloaded here.

Those proposing to a Heliophysics Division call for proposals (anything in Appendix B) are strongly encouraged to use the Heliophysics Division DMP template that may be downloaded here.

 

Data Management Plan FAQ for ROSES

Introduction

The Office of Science and Technology Policy (OSTP) at the White House issued a memo to the executive branch agencies telling us to make more accessible the peer-reviewed publications and data from the research that we fund. NASA published a response in the form of the NASA Plan.

In a future FAQ we will lay out the upcoming requirement that peer-reviewed accepted manuscripts be publicly accessible (12 months after publication). This FAQ is about the data management plan (DMP) that most ROSES proposals must include on submission and what might be in that plan.

First of all, be reassured that we are not going to force you to reveal your precious proprietary data prior to publication. No personal, proprietary or ITAR data is included. The ROSES mandatory minimum is a plan for how you will make available the data that directly underlie the results and findings in your peer‐reviewed publications, like data in the charts and figures in your papers. Individual ROSES program elements may require more than this. More on what is and is not included, along with example data management plans, appears below.

1. Who needs to provide a Data Management Plan (DMP)?

Almost all proposals to ROSES must include a DMP or an explanation of why one is not needed. Only instrument development programs (e.g., ESTO calls in Appendix A such as A.48 Advanced Component Technology, C.12 PICASSO, D.8 Strategic Astrophysics Technology) are excluded entirely from the DMP requirement. Those few Program Elements that don't require a DMP will say so explicitly in the text. Unless the program element explicitly says otherwise, you must present a DMP or a statement as to why one is not needed at the time of submission. Most program elements now require that the DMP be included in the proposal PDF in a 2-page section immediately following the references and citations; see the answer to Question 6, below.

2. I really am not going to generate any data, so I should not have to do this

In some cases, like instrument and technology development programs, we will not request a data management plan at all. However, for mixed calls where some of the proposals would generate data, we will request a DMP. We recognize that some proposers to those programs may not be generating any data. That’s OK, you can just say so. For example, all proposals to APRA require a DMP, but technology development efforts will not generate scientific data and may simply note, where the DMP section of the proposal would go, that a DMP is not required because the proposed projects are in the Detector Development or Supporting Technology category. But remember, if you publish a paper with a figure or a table you will have to release at least a little bit of data.

3. Sigh, OK, yes I will generate a little bit of data. What’s the minimum I can get away with?

At a minimum the DMP for ROSES must explain how you will release the data needed to reproduce figures, tables and other representations in publications, at the time of publication. Providing this data via supplementary materials with the journal is one really easy way to do this and it has the advantage that the data and the figures are linked together in perpetuity without any ongoing effort on your part. In addition, NASA will require that you also upload this data along with the manuscript version of your paper into the NIH PubMed Central (PMC) archive (the NASA interface into PMC is called NASA PubSpace). Please see #17, below.

4. I’m not ready to release this dataset, it took lots of work, that was the hard part, and if I release it now I will get scooped.

Let's say that your paper is based on a big dataset that you generated and this is just the first of a series of articles. In that case we certainly won't require you to release the entire dataset along with the first publication. On the other hand, you do have to release the data that was represented in that paper, and we expect you to release the full dataset at the end of the award. That gives you a few years to publish your papers but means that you can’t take the data to the grave with you.

5. What’s included in these plans?

The DMP should outline the what, where, when, and who for the data that will be created by the award in adequate detail (see also the example DMPs below). That is, the ideal DMP comprises these elements:

What data types, volumes, formats, and (where relevant) standards,

Where do you intend to make these data available,

When will you make these data available, and

Who will do this archiving and what experience they have with this kind of data, archive, etc.

6. How will the plans be collected?

Most proposers must provide a data management plan (DMP) along with their submission, typically in a 2-page section in the proposal PDF immediately following the references and citations for the S/T/M portion of the proposal. However, a few Program Elements require a discussion of data management in the main page-limited S/T/M section of the proposal. Examples include (but are not limited to) A.8 GEDI Science Team, B.7 Space Weather Science Applications, B.12 Heliophysics Data Environment Emphasis, C.4 Planetary Data Archiving, Restoration, and Tools, D.2 Astrophysics Data Analysis, D.13 Astrophysics USPI, D.14 Theoretical and Computational Astrophysics Networks, and E.3 The Exoplanets Research Program. Any program element that requires this will say so explicitly. Some elements, like A.9 Physical Oceanography and A.14 Ocean Surface Topography Science Team, require a separate Software Development Plan.

7. I’m still not totally sure what I need to archive. I have a thousand questions

While the questions and answers that appear below, including example DMPs, may help, we cannot cover everything in any one document. Entire workshops have been held to discuss archiving, formats and standards for just one subdiscipline of science. Rather than trying to address each case, we can only recommend that proposers use their best judgment and the standards of their community in deciding what should be archived to allow others in your community to really understand what you have done. I will give a personal example: to provide what my peers would have needed to use my data, I would have archived not just the x,y points needed to plot the (final) spectrum in the figure in a paper, but also the ones that went into it, e.g., the blank, the raw spectrum, any baseline correction, and spectra of any contaminant that was subtracted out. In some cases the data is meaningless without some metadata, so include that too, as appropriate.

8. I don't want my proposal rejected because I didn't do this right. You need to give more details.

We don't want your proposal rejected for this reason either. Each of the Research Overviews (A.1, B.1, C.1…) now has a special section that presents the content of DMPs, what is considered "data" for the purpose of the DMP, and even expectations regarding software. Also, some divisions and program elements provide templates for the data management plan. The template for the program elements in Appendix B (Heliophysics) may be found here and the template for the program elements in Appendix C (Planetary Science) may be found here.

9. Who’s gonna pay for this?

In most cases I expect that either it will be de minimis (e.g., see #3 above) and not cost any extra, or it will be something that researchers were already doing as part of their projects previously. However, if this has you thinking that you will devote more effort to data archiving and so more funds are required for data management activities (like for large datasets that will go into a NASA archive), these should be covered in the normal budget and budget justification sections of the proposal and simply referred to in the DMP.

Now that you mention it, I have been, but this is going to take more time and cost NASA more money. OK?

Of course, ROSES allows proposers to include costs for data management and access in proposal budgets. However, no extra funding has been provided to the research program to support this.

10. Can I propose anything I want for my data, may I just post it on my web page?

We want to make as much data as possible freely available, so put it in an official NASA (approved) archive if you can; see #16, below. Sometimes you will not be able to put it in one of our official archives because, e.g., the archive doesn’t think it’s appropriate. Still, it's best if the data is in a place where it is likely to be found and to persist, so if your university, department, branch or lab has an official archive of some kind it should go there, rather than just on your personal page, which might be more ephemeral. The appendices and individual program elements of ROSES may specify preferred archives for special things, e.g., GitHub for code. Please read the individual program elements carefully.

11. How do I report on this and what if I can’t do what I originally promised to do in my DMP?

We think that data archiving is just part of the normal research process, so you would give a status in your progress reports. We understand that for various reasons things may not turn out exactly as you planned through no fault of yours, e.g., the journal or archive or university changed their rules. Thus, if you make a good faith effort and get the information out one way or another it's going to be fine, even if you don't do exactly what you originally intended. However, if you don't make an effort and/or flagrantly flaunt your defiance of these requirements, I will remind you that funded researchers, research institutions, and NASA centers are responsible for ensuring and demonstrating compliance with the DMPs approved as part of their awards. Remember, this is a directive from the White House and if you are really bad the President will call your dean and shame you. Just kidding, but awardees who do not fulfill the intent of their DMPs may have continuing funds withheld and this may be considered in the evaluation of future proposals, which may be even worse.

12. I plan to submit via grants.gov so where do I put the DMP?

The same place where you would if you submitted via NSPIRES: In most cases the data management plan must be placed in a 2-page section in the proposal PDF immediately following the references and citations for the Scientific/Technical/Management (S/T/M) portion of the proposal. Any elements that are exceptions to this rule, for example, those that require the DMP be in the main page-limited S/T/M section of the proposal PDF, will say so explicitly.

13. What about code?

More than one of you has pointed out that some datasets are far less meaningful in the absence of associated code and/or that making code accessible is the logical extension of the NASA approach to increasing access. What to require for code is an interesting and complicated question that was not addressed in the NASA plan. In the past SMD was not always consistent about that. Starting in ROSES-2020 we have a consistent default approach to software across ROSES, and it appears in the Research Overviews (A.1, B.1, C.1…). By default, ROSES still does not require that code be made public (though individual program elements may supersede the default and do so). The ROSES default is that "Software, whether a stand-alone program, an enhancement to existing code, or a module that interfaces with existing codes, created as part of a ROSES award, should be made publicly available when it is practical and feasible to do so, and when there is scientific utility in doing so. Stand-alone code that is not straightforward to implement, or whose utility is significantly outweighed by the costs to share it, is not expected to be made available." When it is made available, "SMD expects that the source code, with associated documentation sufficient to enable use of the code, will be made publicly available as Open Source Software (OSS) under an appropriately permissive license (e.g., Apache-2, BSD-3-Clause, GPL). This includes all software developed with SMD funding used in the production of data products, as well as software developed to discover, access, visualize, and transform NASA data." For definitions of OSS and examples of the kinds of software envisioned (Analysis software, Libraries, and Frameworks) please see the Research Overview (A.1, B.1, C.1…) for the program element to which you plan to send a proposal. Please note that some elements, such as A.9 Physical Oceanography and A.14 Ocean Surface Topography Science Team, require a separate Software Development Plan.

14. In Section 4 of the NASA Plan it says that "DMPs must provide a plan for making research data … accessible at the time of publication or within a reasonable time period[4] after publication…" and footnote 4 says "This time period will defined in the final Data Access plan." What is this "reasonable" time period?

For ROSES we insist that the minimum data set (e.g., in the figures, see #3 above) must be released at time of publication. Additional and sometimes large data sets may be released later, and you may propose what you consider a "reasonable time period" for that in your DMP, but certainly no later than the end of the award.

15. In Section 7.1 of the NASA Plan there is a footnote about an "official approval process (signature process) for data release". What is that?

Obviously, it’s not practical to suggest that NASA Program Managers (or, frankly, anyone at NASA) review and approve the many small data sets released with the ~10,000 papers a year produced by NASA-funded researchers. Indeed, we currently don't review or approve grantee papers at all, let alone some supplementary data published along with their paper. This refers to the large data sets that go into the official NASA archives; we delegate that review to those responsible for running those archives and they each have a process for review.

16. I am working on a DMP and want to propose to put my data into an official NASA archive but I have never done this before and don't know where to start. Can you please point me in the right direction?

If you are proposing to Earth Science (Appendix A of ROSES) then you will probably be using one of the DAACs, which can be found via https://earthdata.nasa.gov/. Each DAAC has its own point of contact.

If you are proposing to Heliophysics (Appendix B of ROSES) then you will probably be using one of the following archives:
Space Physics Data Facility https://spdf.gsfc.nasa.gov/submitting_data.html (POC Robert M. Candey: robert.m.candey@nasa.gov)
Solar Data Analysis Center https://docs.virtualsolar.org/wiki/NewDataProvider (POC Jack Ireland: jack.ireland-1@nasa.gov)

For information about metadata, the relevant HP standard is the SPASE Data Model (see http://www.spase-group.org), which is used to populate a 'git' registry whose main public face is the Heliophysics Data Portal (https://heliophysicsdata.gsfc.nasa.gov). The required elements of the Data Model are the 'header' information that includes the Resource Type, Measurement Type, people, access URL(s), duration information and the like. More detailed descriptions, including the parameters (variables) in the data files, are desirable but not required. Our SPASE group is prepared to help anyone with writing SPASE descriptions, and we hope to soon have an online tool for doing this.

If you are proposing to Planetary Science (Appendix C of ROSES) then you will probably be using:
The Planetary Data System https://pds.nasa.gov/. Each node has its own point of contact.

If you are proposing to Astrophysics (Appendix D of ROSES) then you may be using one of the archives at http://science.nasa.gov/astrophysics/astrophysics-data-centers/

17. I hear that there is a new requirement in my grant terms and conditions requiring that publications be archived into NIH PubMed Central. Will they take my supplementary data and does that satisfy the requirement for making data available? How much data will they take?

That’s right, awards deriving from ROSES-2017 (in fact all NASA awards this year) include terms and conditions requiring that as-accepted manuscript versions of peer-reviewed publications (hereinafter "manuscripts") that result from ROSES awards be uploaded into NASA’s part of the PubMed Central (PMC) repository, called NASA PubSpace. This applies only to peer-reviewed manuscripts. Patents and publications that contain material governed by personal privacy, export control, proprietary restrictions, or national security law or regulations are not covered.

Although PMC is a repository for manuscripts, they are set up to accept data that are part of the publication record, that is, data that was made available with an article as supporting information or supplementary data in the journal. PMC accepts up to 2GB of data per manuscript. This is super: it means that when the manuscript becomes freely available via PMC 12 months after publication, so too will the data become freely available. Otherwise these would remain perpetually behind a paywall. So, when you upload your manuscript to NASA PubSpace, add the data needed to reproduce figures, tables and other representations in it, the same data that you put into supplementary material in the journal. You may also add other data, but please note that, for various reasons, the PMC archive is not the best place for large data sets and/or datasets linked to more than one paper. The first choice for large datasets remains NASA approved archives such as the Earth science DAACs and the Planetary Data System. But if you have data that readers of the publication should see and you can’t put them in an approved archive then it is acceptable to add them here.

In relation to Supplementary Data, our friends at PMC provided this link to a page with additional information: https://www.ncbi.nlm.nih.gov/pmc/about/guidelines/#supplementary

Example Data Management Plans

A. DMP cannot or need not be provided (because it's ITAR etc.)

This is a development effort for flight technology that will not generate any data that I can release, so I can’t write a DMP. The data that we will generate will be ITAR.

or

The D.3 APRA program element says that proposals in the Detector Development or Supporting Technology category that will not generate data do not need to provide a DMP.

or

Just explain why your project is not going to generate data.

B. Minimal DMP (publications only):

The proposed project will generate limited data (describe your data here) and this data will be shared at the time of publication via supplementary material associated with publications.

C. A real DMP:

The template for the program elements in Appendix B (Heliophysics) may be found here and the template for the program elements in Appendix C (Planetary Science) may be found here.

In addition to promising to release the required minimal data in supplementary materials along with publications, specifically address in order each of the elements of the DMP that are listed in #5, above, e.g., this data set/higher order data product or whatever, [enter data types, volumes, formats, and (where relevant) standards, here] will be uploaded into [enter name of NASA archive e.g., PDS here] by [enter dates here]. Don’t forget to attribute the archiving as a task, e.g., "As described in section xx of the proposal, Dr. Summer Smith will be doing the anatomical data archiving; she has experience with the Anatomy Park Archive, as shown by xx. We have consulted with the Archive POC regarding formatting etc. Our intern, Max Glootie, will have responsibility for archiving the code behind the citizen science app in the NASA GitHub. We have included adequate time in the summary table of work effort to achieve these archiving tasks."

D. I'm in my own category:

In addition to releasing the required minimal data in supplementary materials along with publications, I have this data set/higher order data product or whatever, [enter data types, volumes, formats, and (where relevant) standards, here]. However, the point of contact (POC) for the NASA archive or the POC for this program element says NASA doesn't want it, or it's obviously not appropriate for this or that reason. I think it's important and I want everyone to see how wonderful my data set/higher order data product has become, so I am posting it on [x, y, z, my own web page whatever].

This was last updated on February 14, 2020

Please direct questions or corrections on this page to SARA@nasa.gov