The National Science Foundation (NSF) on Thursday announced the launch of its new Integrated Data Systems and Services (NSF IDSS) program to build national-scale data systems, along with the selection of 10 datasets for integration into the National Artificial Intelligence Research Resource (NAIRR) Pilot.

Both efforts, NSF said, work toward the goal of the White House’s AI Action Plan, released earlier this month.

The NSF IDSS program, the agency said, “will fund the development and operation of powerful national-scale systems and associated services that allow researchers across the country to access, use and share scientific data — accelerating innovation and strengthening American competitiveness in AI and other sectors.”

IDSS, NSF said, fills an existing gap in programs that support operational national-scale data systems.

“IDSS fills this gap by enabling the deployment of high-impact platforms that serve research and education communities and interoperate with other federal science and data infrastructure efforts,” the agency said.

“A robust data infrastructure is also critical to the success of the NSF-led NAIRR Pilot, a key initiative expanding access to AI research resources,” the agency continued.

“As AI transforms sectors from health care and agriculture to energy and national defense, researchers face the challenge of accessing and integrating vast data to power advanced AI systems,” NSF said.

“Awarded systems and services through the IDSS program will be integrated into the NAIRR and other NSF-managed programs, such as the NSF Advanced Cyberinfrastructure Coordination Ecosystem: Services and Support program, to be made easily discoverable and accessible to the nation’s research and education communities,” the agency said, adding, “These systems will connect data with computing, instruments and software, making AI development, data analysis and scientific discovery faster, more reliable and more reproducible.”

In a parallel effort, NSF announced the selection of 10 datasets for integration into the NAIRR Pilot.

The NAIRR serves as a shared national infrastructure to support the AI research community and power responsible AI use. NSF launched the NAIRR Pilot in January 2024.

The 10 datasets include:

AI4Shipwrecks (University of Michigan)

Turbulence Database (Johns Hopkins University)

Cell Painting Gallery (Broad Institute)

FathomNet (Monterey Bay Aquarium Research Institute)

PatchDB (George Mason University)

Phase-Field Fracture Simulation (Johns Hopkins University)

SecureChain (Purdue University)

Microbiome Preterm Birth DREAM Challenge Dataset (The March of Dimes Repository for Preterm Birth Research at the March of Dimes Prematurity Research Center at the University of California, San Francisco)

Industry Documents Library (University of California, San Francisco)

OpenTopography (University of California San Diego, Arizona State University and the EarthScope Consortium)

“The datasets cover a range of domains, including lidar-based terrain mapping, microbiome data and software supply chain graphs, and several of them will offer integration with NAIRR Pilot partner platforms,” NSF said, adding that “in the coming weeks, many of the datasets will become more deeply embedded in the NAIRR Pilot.”