As extreme weather events increase around the world due to climate change, the need for further research into our warming planet has grown as well. For NASA, climate research involves not only conducting studies of these events, but also empowering outside researchers to do the same. The artificial intelligence (AI) efforts spearheaded by the agency offer a powerful tool to accomplish both goals.
In 2023, NASA teamed up with IBM Research to create an AI geospatial foundation model. Trained on vast amounts of NASA’s widely used Harmonized Landsat and Sentinel-2 (HLS) data, the model provides a base for a variety of AI-powered studies to tackle environmental challenges. In keeping with open science principles, the model is freely available for anyone to access.
Foundation models serve as a baseline from which scientists can develop a diverse set of applications, enabling powerful and efficient solutions. “Foundation models only know what things are represented in the data,” explained Manil Maskey, the data science lead at NASA’s Office of the Chief Science Data Officer (OCSDO). “It’s like a Swiss Army Knife—it can be used for multiple different things.”
Once a foundation model is created, it can be further trained, or fine-tuned, on a small amount of data to perform a specific task. To date, NASA’s Interagency Implementation and Advanced Concepts Team (IMPACT), along with collaborators, has demonstrated the geospatial foundation model’s capabilities by fine-tuning it to detect burn scars, delineate flood water, and classify crops and other land use categories.
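As a rough illustration of the fine-tuning idea described above, the Python sketch below keeps a "pretrained" feature extractor frozen and trains only a small classifier on a handful of labeled examples. It is a toy under stated assumptions, not the NASA and IBM model or its code: `frozen_encoder`, `fine_tune_head`, `predict`, the two-band pixel format, and all sample values are hypothetical and invented for this example.

```python
import math

# Hypothetical stand-in for a pretrained backbone. Its logic is fixed and is
# never updated during fine-tuning. A real geospatial foundation model would be
# a large neural network; this toy maps two band values to three features.
def frozen_encoder(pixel):
    red, nir = pixel
    index = (nir - red) / (nir + red + 1e-9)  # vegetation-style band ratio
    brightness = (nir + red) / 2.0
    return [index, brightness, 1.0]  # trailing 1.0 acts as a bias term

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune_head(samples, labels, epochs=500, lr=0.5):
    """Train only a small logistic-regression head on the frozen features."""
    features = [frozen_encoder(s) for s in samples]
    weights = [0.0] * len(features[0])
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            # Gradient step for log loss; only the head's weights change.
            weights = [w + lr * (y - pred) * xi for w, xi in zip(weights, x)]
    return weights

def predict(weights, pixel):
    score = sum(w * xi for w, xi in zip(weights, frozen_encoder(pixel)))
    return 1 if sigmoid(score) >= 0.5 else 0

# Tiny synthetic labeled set (made-up values): 1 = burned, 0 = unburned.
train_pixels = [(0.30, 0.12), (0.28, 0.10), (0.25, 0.15),
                (0.08, 0.45), (0.06, 0.50), (0.10, 0.40)]
train_labels = [1, 1, 1, 0, 0, 0]

head = fine_tune_head(train_pixels, train_labels)

# Classify two new (also made-up) pixels with the fine-tuned head.
print(predict(head, (0.27, 0.11)))
print(predict(head, (0.07, 0.48)))
```

The point of the design is that the expensive part, the encoder, is built once and reused, while each new task needs only a few labeled examples and a lightweight training step.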
Because of the computational resources required to create the initial foundation model, a partnership was necessary for success. In this case, NASA brought the data and scientific knowledge, while IBM brought the computing power and AI algorithm optimization expertise. The team’s shared commitment to making their research accessible through open science principles ensures that their model can be useful to as many researchers as possible.
“To build a foundation model at scale, we realized early on that it’s not feasible for one institution to build it,” Maskey said. “Everything we have done on our foundation models has been open to the public, all the way from pre-training data, code, best practices, model weights, fine-tuning training data, and publications. There’s transparency, so researchers can trace why certain things were used in terms of data or model architecture.”
Following on from the success of their geospatial foundation model, NASA and IBM Research are continuing their partnership to create a new, similar model for weather and climate studies. They are collaborating with Oak Ridge National Laboratory (ORNL), NVIDIA, and several universities to bring this model to life.
This time, the main dataset will be the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2), a huge collection of atmospheric reanalysis data that spans from 1980 to the present day. Like the geospatial foundation model, the weather and climate model is being developed with an open science approach, and will be available to the public in the near future.
Covering all aspects of Earth science would take several foundation models trained on different types of datasets. However, Maskey believes those future models might someday be combined into one comprehensive model, leading to a “digital twin” of Earth that would provide unparalleled analysis and predictions for all kinds of climate and environmental events.
Whatever innovations the future holds, NASA and IBM’s geospatial and climate foundation models are expected to open new avenues in Earth science. Powerful AI tools will enhance researchers’ work, and the team’s dedication to open science widens the possibilities for discovery by allowing anyone to put those tools into practice, paving the way for research that helps better care for the planet.
For more information about open science at NASA, visit:
https://science.nasa.gov/open-science/
By Lauren Leese
Web Content Strategist for the Office of the Chief Science Data Officer