DMP FAQ Roses
This is a Frequently Asked Questions page for ROSES about Data Management Plans (DMPs). This FAQ covers proposals to Astrophysics (Appendix D), Earth Science (Appendix A) and Heliophysics (Appendix B). If you are planning to respond to a Planetary Science Division call for proposals (anything in Appendix C or E.4, Habitable Worlds), then don't read this FAQ; it is superseded by the instructions in C.1, the Planetary Science Division Overview. Planetary Science Division templates for DMPs may be found here.
If you are planning to submit a proposal to any of the other divisions (Astrophysics, Earth Science, Heliophysics) or to E.3, the cross-division Exoplanets Research Program, then this FAQ is still good for you.
Data Management Plan FAQ for ROSES
The Office of Science and Technology Policy (OSTP) at the White House issued a memo to the executive branch agencies telling us to make more accessible the peer-reviewed publications and data from the research that we fund. NASA published a response in the form of the
In a future FAQ we will lay out the upcoming requirement that peer-reviewed accepted manuscripts be publicly accessible (12 months after publication). This FAQ is about the data management plan (DMP) that most ROSES proposals must include on submission and what might be in that plan.
First of all, be reassured that we are not going to force you to reveal your precious proprietary data prior to publication. No personal, proprietary or ITAR data is included. The ROSES mandatory minimum is a plan for how you will make available the data that directly underlie the results and findings in your peer‐reviewed publications, like the data in the charts and figures in your papers. Some individual ROSES program elements require more than this. More on what is and is not included, along with example data management plans, appears below.
1. Who needs to provide a Data Management Plan (DMP)?
Almost all proposals to ROSES must include a DMP or an explanation of why one is not needed. Only instrument development programs (e.g., ESTO calls in Appendix A such as A.48 Advanced Component Technology, as well as C.12 PICASSO and D.8 Strategic Astrophysics Technology) are excluded entirely from the DMP requirement. Those few program elements that don't require a DMP will say so explicitly in the text. Unless the program element explicitly says otherwise, you must present a DMP, or a statement as to why one is not needed, at the time of submission. Most program elements will collect the DMP in a text box on the NSPIRES cover page; see #6, below.
2. I really am not going to generate any data, so I should not have to do this
In some cases, like instrument and technology development programs, we will not request a data management plan at all; the text box won't even appear on the ROSES cover pages. However, for those calls where some of the proposals would generate data, we must include the mandatory text box on the cover pages for those who need it. We recognize that some proposers to those programs may not be generating any data. Those of you who are in that category may simply state that you will not be generating any data (or none that can be released, e.g., because it's ITAR) and briefly explain why. However, chances are that you are going to publish a paper with a figure or a table, and so you will have to release at least a little bit of data.
3. Sigh, OK, yes I will generate a little bit of data. What’s the minimum I can get away with?
At a minimum, the DMP for ROSES must explain how you will release the data needed to reproduce figures, tables and other representations in publications, at the time of publication. Providing this data via supplementary materials with the journal is one really easy way to do this, and it has the advantage that the data and the figures are linked together in perpetuity without any ongoing effort on your part. In addition, NASA will require that you also upload this data along with the manuscript version of your paper into the NIH PubMed Central (PMC) archive (the NASA interface into PMC is called NASA PubSpace). Please see #17, below.
4. I’m not ready to release this dataset, it took lots of work, that was the hard part, and if I release it now I will get scooped.
Let's say that your paper is based on a big dataset that you generated and this is just the first of a series of articles. In that case we certainly won't require you to release the entire dataset along with the first publication. On the other hand, you do have to release the data that were represented in that paper, and we expect you to release the full dataset at the end of the award. That gives you a few years to publish your papers but means that you can’t take the data to the grave with you.
5. What’s included in these plans?
The DMP should outline the what, where, when, and who for the data that will be created by the award in adequate detail (see also the example DMPs below), i.e., the ideal DMP comprises these elements:
What data types, volumes, formats, and (where relevant) standards,
Where do you intend to make these data available,
When will you make these data available, and
Who will do this archiving, and what experience they have with this kind of data, archive, etc.
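As a rough illustration of the four elements above, here is a minimal Python sketch that holds a draft plan and reports which elements are still blank. The class name, field names and sample text are hypothetical, invented only for this example; nothing here is a NASA or NSPIRES format.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DataManagementPlan:
    """Hypothetical container for the what/where/when/who elements."""
    what: str   # data types, volumes, formats, standards
    where: str  # archive or other place the data will be available
    when: str   # timing of the release
    who: str    # person doing the archiving and their experience

    def missing_elements(self) -> List[str]:
        """Return the names of the elements that were left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def as_text(self) -> str:
        """Render the plan as plain text, one labelled line per element."""
        return "\n".join(
            f"{f.name.capitalize()}: {getattr(self, f.name).strip()}"
            for f in fields(self)
        )


# Sample values are made up for illustration only.
plan = DataManagementPlan(
    what="Calibrated spectra as CSV files, about 50 MB in total",
    where="Journal supplementary material",
    when="At the time of publication",
    who="",
)
print(plan.missing_elements())
print(plan.as_text())
```

A check like this only tells you that something was written for each element, not whether it is adequate; that judgment is still yours.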
6. How will the plans be collected?
Most proposers must provide a data management plan (DMP) along with their submission, typically in a mandatory plain text box on the NSPIRES cover pages, just like the one used to collect the Executive Summary. The NSPIRES system will not permit a proposal to be submitted unless text is entered into that box. In most cases the NSPIRES system will limit the DMP to at most 8000 characters. A few program elements ask for a discussion of data management in the main technical part of the proposal. For example, all program elements in Appendix C provide instructions that supersede and/or amplify the minimal DMP described in this FAQ. Likewise, program elements A.4 Terrestrial Ecology, A.7 Carbon Monitoring System, B.7 Heliophysics Data Environment Enhancements, C.7 Planetary Data Archiving, Restoration, and Tools, and D.2 Astrophysics Data Analysis (for astrophysical databases or the development of new data products or analysis tools) require it in the proposal. Otherwise, just fill out the text box on the cover page. Proposers to Appendix C should read Section 3.5 of the overview, C.1.
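If you draft your DMP in a text editor first, a quick check against the 8000 character limit mentioned above might look like the sketch below. The function name is hypothetical, and exactly how NSPIRES counts characters (or whether it treats a whitespace-only entry as blank) is an assumption here, so leave yourself some margin.

```python
DMP_CHAR_LIMIT = 8000  # limit quoted in this FAQ for most program elements


def check_dmp_text(text, limit=DMP_CHAR_LIMIT):
    """Return a list of problems found in a draft DMP (empty list if none)."""
    problems = []
    # Assumption: a box holding only whitespace is treated like an empty box.
    if not text.strip():
        problems.append("empty: text must be entered in the DMP box")
    # Assumption: the limit counts characters the way Python's len() does.
    if len(text) > limit:
        problems.append(f"too long: {len(text)} characters, limit is {limit}")
    return problems


draft = "The proposed project will generate limited data..."
print(check_dmp_text(draft))
```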
7. I’m still not totally sure what I need to archive. I have a thousand questions
While the questions and answers that appear below, including example DMPs, may help, we cannot cover everything in any one document. Entire workshops have been held to discuss archiving, formats and standards for just one subdiscipline of science. Rather than trying to address each case, we can only recommend that proposers use their best judgment and the standards of their community in deciding what should be archived to allow others in their community to really understand what they have done. I will give a personal example from my own life: to provide what my peers would have needed to use my data, I would have archived not just the x,y points needed to plot the (final) spectrum in the figure in a paper, but also the ones that went into it, e.g., the blank, the raw spectrum, any baseline correction, and spectra of any contaminant that was subtracted out. In some cases the data is meaningless without some metadata, so include that too, as appropriate.
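To make the spectrum example a little more concrete, here is a minimal Python sketch that writes the final spectrum together with the series that went into it, plus a few lines of metadata, as one CSV text. The function name, column names, metadata keys, sample numbers and the simple subtraction used to form the final spectrum are all assumptions for illustration; use whatever processing and format your community expects.

```python
import csv
import io


def spectrum_to_csv(metadata, wavelengths, blank, raw, baseline, contaminant):
    """Return CSV text holding the final spectrum and every series behind it."""
    out = io.StringIO()
    # Metadata goes first, as '#' comment lines, so the file explains itself.
    for key, value in metadata.items():
        out.write(f"# {key}: {value}\n")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["wavelength", "blank", "raw", "baseline", "contaminant", "final"])
    for w, b, r, base, c in zip(wavelengths, blank, raw, baseline, contaminant):
        final = r - b - base - c  # assumed correction; substitute your own
        writer.writerow([w, b, r, base, c, final])
    return out.getvalue()


# Made-up numbers, for illustration only.
text = spectrum_to_csv(
    {"instrument": "hypothetical spectrometer", "units": "micrometers, absorbance"},
    wavelengths=[1.0, 2.0],
    blank=[1, 1],
    raw=[10, 12],
    baseline=[2, 2],
    contaminant=[3, 4],
)
print(text)
```

Keeping the intermediate series in the same file as the final one means a reader can redo or change the corrections without having to ask you for anything.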
8. I don't want my proposal rejected because I didn't do this right. You need to give more details and/or answer my questions about this at 11 pm the night that my proposal is due.
Except for the few calls (like A.4, A.7, B.7, C.7, or D.2) that explicitly say the DMP is part of the grade, the evaluation of the DMP will not affect the proposal rating or the likelihood of selection. NASA reserves the right to require the revision of the DMP prior to selection or funding.
9. Who’s gonna pay for this?
In most cases I expect that either it will be de minimis (e.g., see #3 above) and not cost any extra, or it will be something that researchers were already doing as part of their projects previously. However, if this has you thinking that you will devote more effort to data archiving and so more funds are required for data management activities (like for large datasets that will go into a NASA archive) these should be covered in the normal budget and budget justification sections of the proposal and simply referred to in the DMP.
Now that you mention it, I have been, but this is going to take more time and cost NASA more money. OK?
Of course, ROSES allows proposers to include costs for data management and access in proposal budgets. However, no extra funding has been provided to the research program to support this.
10. Can I propose anything I want for my data, may I just post it on my web page?
We want to make as much data as freely available as possible, so put it in an official (NASA-approved) archive if you can; see #16, below. Sometimes you will not be able to put it in one of our official archives because, e.g., the archive doesn’t think it’s appropriate. Still, it's best if the data is in a place where it is likely to be found and to persist, so if your university, department, branch or lab has an official archive of some kind, it should go there rather than just on your personal page, which might be more ephemeral. The appendices and individual program elements of ROSES may specify preferred archives for special things, e.g., GitHub for code. Please read the individual program elements carefully.
11. How do I report on this and what if I can’t do what I originally promised to do in my DMP?
We think that data archiving is just part of the normal research process, so you would give a status in your progress reports. We understand that for various reasons things may not turn out exactly as you planned through no fault of yours, e.g., the journal or archive or university changed their rules. Thus, if you make a good faith effort and get the information out one way or another, it's going to be fine, even if you don't do exactly what you originally intended. However, if you don't make an effort and/or flagrantly flaunt your defiance of these requirements, I will remind you that funded researchers, research institutions, and NASA centers are responsible for ensuring and demonstrating compliance with the DMPs approved as part of their awards. Remember, this is a directive from the White House, and if you are really bad the President will call your dean and shame you. Just kidding, but awardees who do not fulfill the intent of their DMPs may have continuing funds withheld, and this may be considered in the evaluation of future proposals, which may be even worse.
12. I plan to submit via Grants.gov, so where do I put the DMP?
The DMP question is answered in a Grants.gov application by completing and attaching the relevant PSD.pdf form provided in the opportunity instructions document. If you are going to submit via Grants.gov I strongly urge you to also read SARA FAQ 17.
13. You didn't mention code, what about that?
More than one of you has pointed out that some datasets are far less meaningful in the absence of associated code and/or that making code accessible is the logical extension of the NASA approach to increasing access. What to require for code is an interesting and complicated question not addressed in the NASA plan. Some of our existing solicitations explicitly require archiving new code or otherwise making it available. See, for example, Program Element C.7 Planetary Data Archiving, Restoration, and Tools, which requires archiving source code at NASA’s GitHub, and subsection 3.2.3 (Open source software) of Program Element A.45 Computational Modeling Algorithms and Cyberinfrastructure, which specified that the software developed under those awards must be designated and distributed to the public as open source software via Apache License 2.0. However, we are not prepared at this time to make these kinds of requirements generally to ROSES. We fear that, as with proprietary data, there are cases where forcing an organization to release its code would be an unreasonable burden. For now, as with other areas that we have not specified (see #7, above), we leave it up to proposers to use their best judgment and suggest in their DMP what is appropriate given the standards of their community.
14. In Section 4 of the NASA Plan it says that "DMPs must provide a plan for making research data … accessible at the time of publication or within a reasonable time period[4] after publication…" and footnote 4 says "This time period will [be] defined in the final Data Access plan." What is this "reasonable" time period?
For ROSES we insist that the minimum data set (e.g., in the figures, see #3 above) must be released at the time of publication. Additional and sometimes large data sets may be released later, and you may propose what you consider a "reasonable time period" for that in your DMP, but certainly no later than the end of the award.
15. In Section 7.1 of the NASA Plan there is a footnote about an "official approval process (signature process) for data release". What is that?
Obviously, it’s not practical to suggest that NASA Program Managers (or, frankly, anyone at NASA) review and approve the many small data sets released with the ~10,000 papers a year produced by NASA funded researchers. Indeed, we currently don't review or approve grantee papers at all, let alone any supplementary data published along with their papers. This refers to the large data sets that go into the official NASA archives; we delegate that review to those responsible for running those archives, and they each have a process for review.
Example Data Management Plans
A. DMP cannot or need not be provided (because it's ITAR, etc.)
This is a development effort for flight technology that will not generate any data that I can release, so I can’t write a DMP. The data that we will generate will be ITAR.
The D.3 APRA program element says that proposals in the Detector Development or Supporting Technology category that will not generate data do not need to provide a DMP.
Just explain why your project is not going to generate data.
B. Minimal DMP (publications only):
The proposed project will generate limited data (describe your data here), and this data will be shared at the time of publication via supplementary material associated with publications.
C. A real DMP:
In addition to releasing the required minimal data in supplementary materials along with publications, this data set/higher order data product or whatever, [enter data types, volumes, formats, and (where relevant) standards, here] will be uploaded into [enter name of NASA archive e.g., PDS here] by the end of the award. I have experience with this archive, as shown by A, B, & C and I have consulted with the Archive POC regarding formatting etc. I have included in my budget adequate time to achieve this task.
D. I'm in my own category:
In addition to releasing the required minimal data in supplementary materials along with publications, I have this data set/higher order data product or whatever, [enter data types, volumes, formats, and (where relevant) standards, here]. However, the point of contact (POC) for the NASA archive or the POC for this program element says NASA doesn't want it, or it's obviously not appropriate for this or that reason. I think it's important and I want everyone to see how wonderful my data set/higher order data product has become, so I am posting it on [x, y, z, my own web page whatever].
16. I am working on a DMP and want to propose to put my data into an official NASA archive but I have never done this before and don't know where to start. Can you please point me in the right direction?
If you are proposing to Earth Science (Appendix A of ROSES) then you will probably be using one of the DAACs, which can be found via https://earthdata.nasa.gov/. Each DAAC has its own point of contact.
If you are proposing to Heliophysics (Appendix B of ROSES) then you will probably be using one of the following archives:
Space Physics Data Facility http://spdf.gsfc.nasa.gov/ (POC Bob McGuire: Robert.E.McGuire.firstname.lastname@example.org)
Solar Data Analysis Center http://umbra.nascom.nasa.gov/ (POC Joe Gurman: email@example.com)
For information about metadata, the relevant HP standard is the SPASE Data Model (see http://www.spase-group.org), which is used to populate a 'git' registry whose main public face is the Heliophysics Data Portal (https://heliophysicsdata.gsfc.nasa.gov). The required elements of the Data Model are the 'header' information that includes the Resource Type, Measurement Type, people, access URL(s), duration information and the like. More detailed descriptions, including the parameters (variables) in the data files, are desirable but not required. Our SPASE group is prepared to help anyone with writing SPASE descriptions, and we hope to soon have an online tool for doing this.
If you are proposing to Planetary Science (Appendix C of ROSES) then you will probably be using:
The Planetary Data System https://pds.nasa.gov/. Each node has its own point of contact.
If you are proposing to Astrophysics (Appendix D of ROSES) then you may be using one of the archives at http://science.nasa.gov/astrophysics/astrophysics-data-centers/
17. I hear that there is a new requirement in my grant terms and conditions requiring that publications be archived into NIH PubMed Central. Will they take my supplementary data, and does that satisfy the requirement for making data available? How much data will they take?
That’s right, awards deriving from ROSES-2017 (in fact all NASA awards this year) include terms and conditions requiring that as-accepted manuscript versions of peer-reviewed publications (hereinafter "manuscripts") that result from ROSES awards be uploaded into NASA’s part of the PubMed Central (PMC) repository, called NASA PubSpace. This applies only to peer-reviewed manuscripts. Patents, and publications that contain material governed by personal privacy, export control, proprietary restrictions, or national security law or regulations, are not covered.
Although PMC is a repository for manuscripts, they are set up to accept data that are part of the publication record, i.e., data that were made available with an article as supporting information or supplementary data in the journal. PMC accepts up to 2GB of data per manuscript. This is super; it means that when the manuscript becomes freely available via PMC 12 months after publication, so too will the data. Otherwise these would remain perpetually behind a paywall. So, when you upload your manuscript to NASA PubSpace, add the data needed to reproduce figures, tables and other representations in it, the same data that you put into supplementary material in the journal. You may also add other data, but please note that, for various reasons, the PMC archive is not the best place for large data sets and/or datasets linked to more than one paper. The first choice for large datasets remains NASA approved archives such as the Earth science DAACs and the Planetary Data System. But if you have data that readers of the publication should see and you can’t put them in an approved archive, then it is acceptable to add them here.
In relation to supplementary data, our friends at PMC provided this link to a page with additional information: https://www.ncbi.nlm.nih.gov/pmc/about/guidelines/#supplementary
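Before uploading, you may want to check that the folder of supplementary files for a manuscript stays within the 2GB per manuscript figure quoted above. The sketch below is one simple way to do that in Python. The function names are hypothetical, and reading "2GB" as 2 x 10^9 bytes is an assumption; check the PMC guidelines for the exact definition.

```python
from pathlib import Path

# Assumption: "2GB" is read here as 2 * 10**9 bytes.
PMC_LIMIT_BYTES = 2 * 10**9


def supplementary_size(folder):
    """Return the total size in bytes of all files under the given folder."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def fits_pmc_limit(folder, limit=PMC_LIMIT_BYTES):
    """Return True if the folder's files together stay within the limit."""
    return supplementary_size(folder) <= limit


# Example call, with a hypothetical folder name:
# print(fits_pmc_limit("supplementary_data"))
```

If the total comes out near or over the limit, that is a good sign the data set belongs in one of the approved archives rather than in PMC.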
This was last updated on May 25, 2017