Data and Computing Architecture Study

Overview

The NASA Science Mission Directorate’s (SMD) Data and Computing Architecture Study final report was released on August 5, 2024. Now is a critical time to ensure that SMD’s data and computing systems work effectively to provide services to users. NASA's 150+ current and upcoming missions studying Earth, the Moon, the Sun, the atmosphere, and beyond generate substantial data, with over 100 petabytes currently available freely and openly, projected to exceed 500 petabytes by the end of the decade.

The Data and Computing Architecture Study was chartered in FY22 to determine whether a coordinated cloud and on-premises computing infrastructure could meet the data and computing needs of SMD, enable efficiencies, and support SMD’s transition to open science.

The study had two goals:

  1. Identify scientific data and computing capabilities and architectures that:
    • Enable open science
    • Improve access to advanced computing and analytic services
    • Allow open scientific collaboration
    • Manage cybersecurity risks
    • Balance cost.
  2. Identify opportunities that will ensure long-term evolution and sustainability of an SMD-wide data and computing infrastructure to enable open-source science.

A strategic evaluation of SMD’s data and computing infrastructure was necessary to improve coordination of NASA’s scientific data and computing capabilities. The scope of this study included the data and computing systems supporting SMD-sponsored scientific workloads:

  • Scientific modeling and simulation
  • Data processing (L0-L4)
  • Data analytics
  • Artificial intelligence and machine learning

Frequently Asked Questions

Answers to common questions about the Data and Computing Architecture Study.

What inputs were solicited for the Data and Computing Architecture Study?

The study solicited input from a broad and diverse set of NASA teams, industry partners, open science experts, and stakeholders across the science mission, research, data, and computing systems community. The study was conducted openly and included public workshops and broad outreach efforts.

Input gathering for the study included:

  • 17 open workshops with industry, research institutions, and government agencies
  • A Request for Information (RFI) open January – February 2023
  • Technical interchange meetings with SMD divisions covering cloud and High-End Computing (HEC) architecture.

Recent SMD studies on HEC and cloud architecture were also considered. The study design combined broad outreach (e.g., the RFI and open workshops) with smaller, highly targeted “deep dives” with expert individuals from SMD divisions.

Four panes reveal different snippets of information about the Data and Computing Architecture Study. The top left pane says, “Open workshops: Participants in open workshops included individuals from NASA, other government agencies, and the general public; 279 individuals participated across the 17 open workshops held for the study. Average workshop attendance was 45 participants.” The bottom left pane, titled “RFI: 70 Responses,” displays two pie charts. The left-hand pie chart, colored in blue and titled “All respondents,” shows nearly half of responses were from a private company, a bit more than a quarter were from the government, slightly more than one-eighth were from a university, and slightly under one-eighth were from a nonprofit. The right-hand pie chart, colored in orange, is titled “government respondents,” and shows that three-quarters of government respondents were from NASA centers, about one-sixth were JPL respondents, and the rest were other categories of government respondent. The top right pane, titled “Technical Interchange Meetings (TIMs)”, reads, “The study team conducted TIMs to discuss discipline/Division-specific needs with SMD Divisions. Participants included Division representatives from HQ and Division archives.” A bulleted list reads, “Biological and Physical Sciences Division (TIM), Earth Science Division (write-in), Heliophysics Division (TIM), Planetary Science Division (TIM), and Astrophysics Division (TIM).” The bottom right pane is titled “Previous Studies, Literature Review, and Inventory of Existing Resources,” and reads, “Existing SMD resources were inventoried and catalogued; relevant literature was reviewed for applicability to a hybrid cloud architecture. Results of recent applicable studies were taken into account as inputs, including the HEC Needs Assessment (2020) and the Cloud Consolidation Study (2022).”

What were the findings and recommendations of the Data and Computing Architecture Study?

A notional service layer architecture was developed following the synthesis of study inputs; findings and recommendations addressed all five layers of this notional architecture.

Colored boxes stacked on top of each other show a reference scientific data and computing architecture developed in response to study inputs. The first three boxes are labeled as “Services provided by Divisions to support unique needs of Division user communities.” The first of these boxes reads, “User interface services for person and machine-to-machine access and use of analysis services.” The second box reads, “Analysis services that are accessed by the user interface services to support Open-Source Science.” The third box reads, “Data access and management services that support the analysis services.” The Data Access and Management Services box also overlaps with the second category of label: “Core Data and Computing Services deployed in collaboration with Divisions.” There are two more boxes under this label as well: “IT support services for developing and running the user interface, analysis, data, and support services,” and “Cloud and Infrastructure services with interoperable access to other agency infrastructure services (e.g. computing, networking, communication, etc.) to host/provision the user interface, analysis, data, and IT support services.”

The study concluded with five findings, resulting in recommendations that address these reference architecture and programmatic aspects:

  • Open-Source Science Infrastructure
    Addresses the need to efficiently access/combine data from multiple repositories, leverage modern scientific analysis and collaboration tools, and easily utilize data in cloud environments and high-end computing facilities.
  • Infrastructure Common Services
    Addresses specific “baseline” collaboration tools, cloud services, and data services that would be useful to all SMD divisions.
  • Improving Efficiency and Access to Computing
    Addresses the need for developers, operators, and users to have flexibility, ease of entry, and surge capacity within data and computing systems.
  • Managing Cybersecurity Risks (Programmatic)
    Addresses implementation of cybersecurity across SMD’s data and computing resources.
  • Long-term Sustainability (Programmatic)
    Addresses the organizational framework required to most effectively capitalize on existing and future scientific data and computing resources, including the management of cloud resources.

How is SMD responding to the recommendations of the Data and Computing Architecture Study?

The Office of the Chief Science Data Officer is supporting the development of the Core Data and Computing Services program element. This program element flows from U.S., NASA, and SMD priorities/strategies; the development of the program will also be guided by the study results to ensure that it meets the SMD divisions’ and users’ needs.

The Core Data and Computing Services program element will provide a foundation for a layered architecture over which SMD science divisions can seamlessly and efficiently integrate their discipline-specific services. Central to the development of the Core Data and Computing Services program element will be the integration of new and existing data and computing capabilities into a modular and secure architecture that is reusable by all of the SMD divisions.

If you have any questions about the Data and Computing Architecture Study, please contact Andrew Mitchell.
