Progress in Responding to PDE IRB Recommendations
- Develop the Planetary Data Ecosystem
- Address Data Preservation Needs
- Address Barriers to Data Use and Development
- Other Recommendations
Develop the Planetary Data Ecosystem
Rec #: Recommendation
Progress and Current Action
R04: NASA should ensure that a sustained, community-led coordinating organization for the PDE exists that mirrors the other Planetary Assessment or Analysis Groups (AGs), reports to the Planetary Science Advisory Committee, and meets regularly. (Non-consensus)
NASA PSD selected a PDE Chief Scientist, Moses Milazzo, in December 2021. The PDE Chief Scientist provides an independent link between the larger PDE community, the Planetary Data System (PDS), and NASA Headquarters, and also refines and represents the PDE to NASA. PSD is also working to develop a PDE workshop series. Once the series is under way, we will use it as an opportunity to investigate forming a larger community-led group; the final workshop in the series could be used to create such a group. The workshop topics are being drafted by the PDE working group. Once ready, the topics will be presented to LPI to determine whether and how LPI could host the workshop series.
R01: NASA should proceed with developing the concept of the Planetary Data Ecosystem so that the usability and archival needs of the entire planetary sciences community—all people, professional or amateur, who produce, provide, and/or use data—are better met.
R02: NASA should lead work to refine the full scope of the Planetary Data Ecosystem and build community consensus around the Ecosystem. NASA should continue to refine the short definition as well as the detailed list that answers the question: “What is the PDE?” that clearly differentiates it from the PDS.
The Planetary Data webpage was launched on science.nasa.gov in December 2021. The content for this webpage was seeded by the PDE IRB report and has been updated based on input from planetary community members. The webpage can serve as a portal to all the identified elements of the Ecosystem. As part of NASA's web modernization efforts, the content of this webpage may migrate to a new location and be improved into a more user-friendly portal to the Ecosystem. PSD is also working to develop a PDE workshop series, which is slated to include topics in this area: “What is the PDE?” and “PDE User Community Needs/Browsing through the PDE?”. The workshop topics are being drafted by the PDE working group. Once ready, the topics will be presented to LPI to determine whether and how LPI could host the workshop series. Once established, a PDE community-led group is expected to continue to provide information on the state of the PDE.
R03: NASA should ensure that the responsibilities, accountabilities, governance, and service levels for those elements of the Ecosystem that are funded by the NASA Planetary Science Division are clearly defined.
PSD is working to group elements of the PDE within a managed portfolio. This will assist in completing regular reviews that address appropriate governance and service levels. PSD will recruit a Planetary Data Officer who will assist in gathering information on Ecosystem elements and in conducting regular reviews of the PDE elements and how they fit together. The Planetary Data webpage and PDE workshop series will also provide information and context for this work. Once established, a PDE community-led group is expected to continue to provide information on the state of the PDE.
R14: Consideration should be given to how to make clear the differing responsibilities and expectations of the data preservation mission from the distribution of usable data. Consistent with Recommendation 2 for the broader Ecosystem, the prioritized goals and scope of PDS need to be carefully and explicitly defined by NASA, with input from the Ecosystem and broader community, and clearly articulated to all members of the community. Mandates above and beyond the agreed-upon scope must be negotiated and accompanied by commensurate funding. NASA should fund PDS nodes at levels appropriate to the full scope of work defined by the selected proposals as well as any accumulated duties.
The first priority of the PDS is to be the long-term archive of digital data products returned from NASA's planetary missions, and from other kinds of flight and ground-based data acquisitions, including laboratory experiments. Building on that, the second priority of the PDS is to ensure the usefulness and usability of those data for the planetary science community. The PDS Discipline and Support Nodes completed their Programmatic Reviews in 2021 and 2022, respectively. The outcome of each review was to move forward, and the PDS Nodes have been funded at their requested levels. Included in the description of Node responsibilities is the data scope of each Node, which is consistent with the description of each Node on the PDS webpage. The centralization of the PDS web presence, slated to take place by the end of 2024, will improve the discoverability and understanding of its role in the Ecosystem. Additionally, the new PDE webpage helps make the other components of the Ecosystem more widely known, thereby clarifying the specific role of the PDS within the PDE. Lastly, the new SPD-41 (SMD Information Policy) articulates NASA's expectations for data preservation and distribution. PSD's Information Policy, which will build upon SPD-41, is in development and will be made public with ROSES-2023.
Address Data Preservation Needs
R31: NASA should establish an archive for planetary radar data either within the PDS Small Bodies Node or separately. This archive should facilitate preservation and usability of data at all processing levels by preservation of data processing procedures (or software). Because of the unique situation of Arecibo Observatory, time is of the essence to preserve the data and prevent irretrievable loss.
PSD currently supports radar data analysis, publication, and archiving of Arecibo data at the PDS SBN. Meetings between the PDS SBN and the Arecibo, JPL, and Goldstone radar groups to coordinate formats and processes among their substantial radar data archiving efforts are ongoing and remain a priority. SBN is actively working with these radar observers to prepare and submit their data to the PDS and to expand SBN holdings of ground-based radar observations. Additionally, PSD has supported the creation of a Radio Science sub-Node of the PDS, established in collaboration with the Planetary Radar and Radio Sciences Group (PRRSG) at JPL, which provides a Planetary Radar Advisory Role to the PDS. No further actions are planned in response to this recommendation, as the remaining component, the archival of software, is part of a larger Ecosystem discussion.
R33: NASA should establish a requirement for the preservation of mission-supported laboratory analyses of returned sample material that makes the information accessible to the planetary science community. Time is of the essence to establish these requirements, as NASA will receive the largest sample return since Apollo in approximately two years.
R34: NASA should require data preservation with appropriate metadata in an approved archive or repository for data produced by laboratory analysis of returned samples supported by ROSES Data Analysis Programs (DAP).
PSD is actively working on the preservation of mission-supported laboratory analyses of returned sample material for the OSIRIS-REx mission. PSD has been working with the Astromaterials Data System (Astromat) via the Johnson Space Center Astromaterials Acquisition & Curation Office to address this need. Astromat is currently completing a special study to lay out the implementation plan for appropriate archiving of data from OSIRIS-REx and other sample return missions. Astromat is actively working with the PDS to determine interoperability. Two fundamental principles are that mission-supported laboratory analyses of returned samples must be archived at a high quality (e.g., PDS4 compliant) and that users should be able to access all data archived for a particular sample in one place (i.e., interoperability).
R28: NASA should establish a carefully crafted strategy to identify and prioritize the data preservation needs of the planetary science community that are not currently being addressed.
The PDE IRB report identified an initial list of data preservation needs. One response to those identified needs has been to support the development of a modeling annex at the PDS Atmospheres Node. Additionally, SMD released an RFI to comment on SPD-41. RFI responses could include information “about what support, services, training, funding, or further guidance is needed to support the successful implementation of the existing or proposed information policy.” This has been one recent avenue for community needs to be identified. Lastly, the PDE webpage includes a section for Community-Identified Data Needs and states that AGs can provide this information to NASA HQ. The AGs were specifically asked for this information in Fall 2021. The PDE working group will continue to craft strategies to identify and prioritize the needs of the planetary science community, which may include connections to the revamped PDAR(T) in ROSES-2023.
R29: NASA should consider ways of archiving outside of the PDS that are amenable to creating FAIR and standards-based archives of these growing data sets.
The PSD PDE working group completed an internal, informal survey of non-PDS archives supported by PSD and found that most are not FAIR compliant. This informal survey also found that archives supporting model output and other large or complex derived data products are rare. Many PIs currently use supplemental material in publications and institutional archives to host data and code. The core of this recommendation extends beyond PSD into SMD. As such, PSD will continue to work within the SMD Open Source Science Initiative to make progress in this area. Addressing the requirements of the new SMD Scientific Information Policy (SPD-41) will support FAIR principles for PSD data. Currently, non-PDS archives supported by PSD are included on the PDE webpage. The PDE working group will continue to gather information on which archives and repositories the planetary community currently uses and whether they meet the standards outlined in SPD-41 and the forthcoming Planetary Information Policy.
Address Barriers to Data Use and Development
R21: NASA should treat mission data archival as a systems engineering concern by including early funding for mission data acquisition, processing, and archiving of data and foundational data products (including cartographic products, data acquisition contextual information, coordinate system standards, etc.) so that they are planned well in advance of data acquisition.
While the existing process for mission data archival meets NASA’s requirements, PSD is keenly aware that regular internal assessment of a mission’s Data Management Plan is necessary as our understanding of complex instruments and mission goals advances. The PDE working group will continue to consider how best to address this recommendation.
R23: NASA should provide regular, accessible, and effective training programs for researchers, data producers, mission specialists, and others who need to archive with the PDS. This should not just be provided by the PDS: entities with experience delivering to the PDS should also be involved. There should also be training for peer-review of data archives. We also recommend that this training and documentation address data preparation from the perspective of reusability and interoperability, such as the Earth Science Data Systems Working Group (ESDSWG) Data Product Development Guide (DPDG) for Data Producers.
R64: NASA should seek to expand opportunities for intermediate to advanced technical training in topics related to accessing, using, and processing planetary data.
In June 2022, PSD selected a proposal (PI: David Williams, ASU) to offer a series of Planetary Data Training Workshops starting in late 2022 or early 2023. Workshops will focus on planetary data management, planetary Geographic Information Systems (GIS) training (ArcGIS, open source GIS, and JMARS), ISIS3 for image processing, and SOCET SET and Ames Stereo Pipeline for digital elevation model (DEM) production. Additionally, we have created space on the PDE webpage to highlight training opportunities. PSD expects to continue supporting training opportunities.
R50: NASA should develop outreach to user communities within the Planetary Data Ecosystem, assess user needs, and develop focused educational and documentation materials that meet highest-priority needs.
R52: Relevant elements of the Ecosystem should support the delivery of higher-level and analysis-ready data products in well-documented and broadly used protocols and formats, even where those formats might not be appropriate for primary data. This should include broadening support across the Ecosystem for a wider variety of data and information formats, such as engineering data; data models; sound and imaging data; and physical collections attached to planetary missions.
R05: NASA should expand intra- and inter-agency efforts to ensure that best practices, lessons learned, and appropriate technologies are shared and implemented across Planetary Data Ecosystem elements.
R11: The Planetary Data Ecosystem should regularly (on a one- to two-year time scale) assess the Findability, Accessibility, Interoperability, and Reusability (FAIR) of data across each PDE element for machine-actionable access to data. This assessment should be used to establish the priorities for Ecosystem management and advisory groups.
The PSD PDE working group is identifying possible actions to be taken at NASA HQ in this area. A few relevant actions include developing a PDE workshop series (R50), supporting investigation of a PDS Engineering Data Node at JPL (R52), working with the SMD Open Source Science Initiative to identify core services and leverage assets across SMD (R05), piloting examples of cloud-ready and analysis-ready data sets (R52), and investigating appropriate review cycles and criteria for PDE elements (R11).
Other Recommendations
R06: NASA should encourage collaboration around cybersecurity policies, practices, and infrastructure to preserve the integrity and availability of data and systems across the PDE.
R07: NASA should maintain active leadership in the International Planetary Data Alliance (IPDA).
R08: NASA should encourage the development of and participation in other cross-disciplinary organizations of data producers, data managers, and data users by PDE participants.
R09: NASA should seek CoreTrustSeal certification, and thereby WDS membership, for the PDS data nodes. NASA should encourage CoreTrustSeal certification for other PDE elements that serve as data repositories.
R10: NASA should prioritize the reuse of data and metadata standards, data format conversion tools, and Application Programming Interfaces (APIs) across other organizations rather than inventing new ones.
R12: As NASA considers future evolution of the PDS, it should consider the positive aspects of what the PDS has accomplished within the context of the Planetary Data Ecosystem—as well as the context of history—and work to preserve the continuing positive outcomes from the PDS, including: maintaining or increasing funding levels as appropriate; working with Mission teams to continue to improve communications about and efficiency of data archival; continuing collaboration with domestic and international partners; and continuing to improve bundle creation and validation software.
R13: NASA should ensure that PDS has adequate expertise and funding to maintain current standards and to support ongoing improvements, including funding of peer-review of data submissions.
R15: All data dictionaries and information models for the PDS and for other archival elements need peer-review and contextual review (i.e., do these data dictionaries link well with other and existing data dictionaries while avoiding unnecessary redundancy?).
R16: Create a shared, common taxonomy, controlled vocabulary, high level data dictionary, and/or glossary of terms across the Planetary Data Ecosystem. This will substantially advance the machine-actionability of Ecosystem data, and specifically improve interoperability and reusability as described in the FAIR data principles.
R17: NASA should consider a more open and centralized Management Council for PDS governance that includes greater emphasis on systemwide governance in regard to structure, standards, and related processes. A major goal should be to increase the efficacy of decision-making and multi-way communication with Ecosystem stakeholders.
R18: The makeup and distribution of nodes should be examined more closely to ensure that the PDS contains the appropriate and relevant node elements and subject matter expertise, that unnecessary duplication of effort and data do not occur, and that appropriate flexibility regarding scope and content is built into policy.
R19: Mission teams should not be re-formatting NASA-produced data for archiving; this should be internal to NASA. It would make more sense for NASA radio-science experts to decide on a single, existing archival standard format for spacecraft tracking and ancillary data and to directly archive these data without relying on mission intervention.
R20: NASA should review its contract agreements to ensure that mission instrument data archiving and future access and usability is an obligation that is appropriately considered and funded. As part of the agreement entered into with NASA, mission and instrument teams should be expected (and funded) to develop level one requirements that include raw, calibrated, higher-level, and foundational data product planning, execution, processing, delivery, and archival.
R22: NASA should consider a series of investigations or workshops to better understand the full costing of archival for various personas: mission archival managers, telemetry managers; instrument archival managers, R&A data producers; etc. The results of these workshops should be made publicly available and should be included with Data Management Plan templates.
R24: (Non-consensus) Several members of the review panel strongly recommend that the PDS move forward with a lightweight user registration system. Other members have concerns and strongly recommend a cautious approach so that any registration system implemented does not create additional barriers to access to, acquisition of, or usability of planetary data.
R25: NASA should continue to execute against the PDS cloud computing strategy, including selective refactoring of current systems to enable cloud migration, such as the adoption of containerization and further work to establish well-defined and well-documented application programming interfaces.
R26: The PDS should continue discussions and collaborations with other NASA elements, including EOSDIS and OCIO, to leverage the work done in these organizations and ensure that Planetary Science Division needs are appropriately considered in establishing NASA standards and practices.
R27: NASA should consider the impact of cloud computing adoption on organizational efficiency and the development of a broader planetary ecosystem, above and beyond the technical capabilities that public cloud computing brings to addressing data provider and data user needs.
R30: The Science Mission Directorate should elevate support for information and data science issues to parity with other areas in order to systematically address NASA's unmet data preservation needs.
R32: NASA should support the establishment of public archives of analysis-ready data from observational facilities for which such archives do not already exist.
R35: NASA should adopt or develop a standard set of metadata and links to ensure that contextual data are adequately tied to returned and gathered samples. With Mars 2020 gathering and caching samples for later return to Earth, time is of the essence.
R36: NASA should assess the current state of planetary analog repositories and develop the requirements for the establishment of a permanent planetary analog repository or archive.
R37: NASA should establish a primary archive or repository for mission telemetry streams that is accessible to the planetary science community to the extent permitted by regulatory limitations.
R38: NASA should establish a requirement for the preservation of mission operations and planning information that makes the information accessible to the planetary science community to the extent permitted by regulatory limitations.
R39: NASA should evaluate and develop a plan for historical information preservation with the aim of making these data available to the public to the extent possible.
R40: NASA should establish requirements that specify the archive(s) or repositories of record for higher-level data products, with the ultimate goals of systematic collection and reuse of these high-level data products.
R41: NASA should establish guidelines for preserving high-level data sets of interest that are not appropriate to PDS archiving. Designate data repositories that comply with FAIR (Findable, Accessible, Interoperable, Reusable) data principles.
R42: NASA should develop a comprehensive software preservation and archiving strategy that ensures discoverable, accessible, and usable software tools. The curation of the collection of NASA-funded software products through a designed software node within PDS, a centrally managed catalog, or with another approach will ensure the successful implementation of NASA open-source software policies.
R43: NASA should develop requirements for the maintenance of mission data processing pipelines so that non-team members can produce data identical to the output from instrument team processing pipelines.
R44: NASA should develop a plan for the preservation of models and modeling data beginning with requirements for how models and modeling data should be preserved and linked to other Ecosystem elements.
R45: NASA should provide and advertise a better point of entry (or several well-connected portals) to its data, suitable for the broadest range of users looking for planetary data.
R46: The user search experience needs to be improved across the Planetary Data Ecosystem. PDE elements should partner with a user experience (UX) expert to understand the principles and guidelines for UX.
R47: NASA should support and encourage expanded use of DOI-like identifiers for data, thereby connecting data at various levels of processing to assist users in locating the best version of a data set for their needs.
R48: NASA should continue to support non-planetary data archives and encourage cross-communication between planetary and non-planetary metadata developers.
R49: NASA should fund the development of more analysis-ready data (ARD) products derived from the lower-level products created by NASA missions.
R51: NASA should continue to foster the development of tools which translate from common planetary formats and standards into broadly used protocols, formats, and standards to enable the adoption of tools and methods in use by other science communities.
R53: NASA and the PDE should ensure that data linkage mechanisms and types are clearly documented with examples.
R54: NASA should find ways that the Ecosystem could include developer advocacy, particularly for the core PDS application program interfaces (APIs).
R55: NASA should expand public participation in scientific research and other crowdsourcing methods as one strategy for providing data labeling essential to ML/AI/AA.
R56: As they proceed with developing the Planetary Data Ecosystem, NASA should ensure that any Ecosystem assessment group considers the needs of current and potential ML/AI/AA users as part of their work.
R57: NASA should also consider how the relatively nascent planetary ML/AI/AA user community might not be well-aligned with traditional missions, funding opportunities, and user groups and the impact that might have on potential respondents to funding calls.
R58: NASA should increase the level of funding available to explicitly support software development, either via the existing ROSES programs or via the creation of new programs, and clarify its policies for evaluating funding proposals that do not include major components of hypothesis-based science.
R59: NASA should establish a mechanism to support the preservation, support, and maintenance of software tools past the expiration of the grants under which they were developed.
R60: NASA should consider providing options for funding software tools, with proposal requirements and a total budget on the scale and scope of typical Guest Investigator or early-career programs.
R61: NASA should consider, on a case-by-case basis, whether commercial contract bid mechanisms or grant proposal mechanisms would be more appropriate for efficiently filling certain critical Planetary Data Ecosystem software tool needs.
R62: Recognizing that “Software is data, but data is not software” (NASEM 2018 page 2), and in keeping with NASA’s open data policies, NASA should ensure that software developed by or for the Planetary Data Ecosystem is as open as possible and only as closed as necessary.
R63: The Planetary Science Division should adopt a single, coherent, open source software policy that applies across all its activities. Ideally, this policy should be a consistent Science Mission Directorate policy. Given that portions of the Ecosystem are outside of NASA’s direct control, a single policy across the entire Ecosystem is likely not practical. However, it is appropriate for NASA to use its influence to achieve a high level of software policy consistency across the Ecosystem.
R65: NASA should encourage continuing education of members of the planetary science community by making it clear that such costs are allowable on grants for all job categories.
A specific response from NASA addressing ALL recommendations, including current status, anticipated timeline to address (if applicable), and potential future plans, will be completed in Summer/Fall 2022 and posted on the PDE webpage.