
A crowd gathered for TPC25’s Hackathon/Tutorial plenary session in San Jose. (Source: TPC25)
Monday marked day one of TPC25, the Trillion Parameter Consortium’s all-hands conference and exhibition in San Jose, Calif. More than 130 participants gathered for a morning plenary session to kick off the conference’s hackathon/tutorial focused on building agentic AI systems for science. Before the keynote began, an introduction from Pacific Northwest National Laboratory Chief Data Scientist Neeraj Kumar set the scene for the July 28-31 conference, reminding the standing-room-only crowd that the Trillion Parameter Consortium formed just two years ago with a single objective: democratizing access to the most powerful AI models to accelerate scientific discovery.
TPC now has nearly 1,500 members across universities, national laboratories, and industry, which is evidence, Kumar said, that the consortium’s vision of open access is now taking hold in everyday research. He pointed to recent milestones, from agentic systems that are drafting hypotheses to generative models that are speeding up materials discovery. “These are not incremental improvements, but are a paradigm shift in how we conduct science,” he said. “These breakthroughs are just the beginning. The real transformation happens when we put these tools into the hands of scientists, engineers, all of you researchers who understand the deep questions in your field that can harness AI to find answers.”
Giving today’s keynote was Vivek Natarajan, a research lead at Google DeepMind, who presented the lab’s “AI co‑scientist,” a multi-agent AI system and collaborative tool built on the Gemini 2.0 model. Early work with medical language models revealed a bottleneck: generating a single useful hypothesis required thousands of random samples. AI co‑scientist tackles that inefficiency by running a network of Gemini‑based agents that share four core tasks: generate, review, rank, and improve. This mirrors the way a human group might brainstorm, critique, and refine ideas, but works in seconds instead of days.
Natarajan explained how the system’s architecture moves the slow work of reading papers, testing ideas, and ranking options onto AI agents while keeping every decision visible for scientists to review. One agent reads recent papers and proposes extensions, another stages text‑based debates to test assumptions, and a reflection agent checks claims against primary data or tool outputs like AlphaFold. A ranking agent scores every idea on novelty, testability, potential impact, and safety, then tracks results tournament‑style so stronger concepts rise to the top. An evolution agent revisits promising ideas, pulling in new literature or targeted data to sharpen them further. The full trace of each step is saved, a choice Natarajan argued can improve reproducibility by giving peer reviewers a transparent chain of reasoning and tool use.
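To make the "tournament-style" ranking step concrete, here is a minimal Python sketch. It assumes an Elo-style rating update, which the talk as reported here did not specify; the `Hypothesis` class, the starting rating of 1200, the `k` factor, and the `judge` callback are all hypothetical names and values chosen for illustration, not DeepMind's implementation.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str
    rating: float = 1200.0  # hypothetical starting rating


def expected_score(r_a: float, r_b: float) -> float:
    """Elo-style probability that the idea rated r_a beats the idea rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def record_match(winner: Hypothesis, loser: Hypothesis, k: float = 32.0) -> None:
    """Shift rating points from loser to winner; upsets move more points."""
    delta = k * (1.0 - expected_score(winner.rating, loser.rating))
    winner.rating += delta
    loser.rating -= delta


def run_tournament(
    hypotheses: List[Hypothesis],
    judge: Callable[[Hypothesis, Hypothesis], bool],
    rounds: int = 1,
) -> List[Hypothesis]:
    """Compare every pair; judge(a, b) returns True when a is the stronger idea.

    In a real system the judge would be a ranking agent weighing novelty,
    testability, impact, and safety. Here it is a caller-supplied stand-in.
    """
    for _ in range(rounds):
        for a, b in combinations(hypotheses, 2):
            if judge(a, b):
                record_match(a, b)
            else:
                record_match(b, a)
    return sorted(hypotheses, key=lambda h: h.rating, reverse=True)
```

A caller would pass in a list of `Hypothesis` objects and a judge function, then read the returned list from strongest to weakest. Because every match only moves points between two ideas, the total rating stays constant, so an idea can only climb by winning comparisons.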

The design of DeepMind’s AI co-scientist. Slide courtesy of Vivek Natarajan.
An audience member asked Natarajan for details on how the system scores ideas for “novelty.” He said each hypothesis is broken down into core ideas and assumptions, then checked against existing literature to confirm the elements are truly new. Creative combinations of known results earn some credit but less than concepts with no published precedent. Because fresh ideas often carry untested assumptions, the system balances novelty against testability and correctness, displaying scores across all axes so researchers can weigh different aspects before choosing which paths to pursue.
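The novelty answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three precedent categories, the credit values (0.0, 0.5, 1.0), and the function names are hypothetical, chosen to reflect the description that new combinations of known results earn some credit but less than ideas with no published precedent. The literature check itself is not modeled; each element's status is taken as given.

```python
from enum import Enum
from typing import Dict, List


class Precedent(Enum):
    KNOWN = "known"                      # element already published
    NEW_COMBINATION = "new_combination"  # known results combined in a new way
    NO_PRECEDENT = "no_precedent"        # nothing comparable in the literature


# Hypothetical credit values: combinations earn partial credit only.
CREDIT: Dict[Precedent, float] = {
    Precedent.KNOWN: 0.0,
    Precedent.NEW_COMBINATION: 0.5,
    Precedent.NO_PRECEDENT: 1.0,
}


def novelty_score(elements: List[Precedent]) -> float:
    """Average credit over a hypothesis's core ideas and assumptions."""
    if not elements:
        return 0.0
    return sum(CREDIT[e] for e in elements) / len(elements)


def score_card(
    elements: List[Precedent], testability: float, correctness: float
) -> Dict[str, float]:
    """Keep each axis separate so a researcher can weigh the trade-offs."""
    return {
        "novelty": novelty_score(elements),
        "testability": testability,
        "correctness": correctness,
    }
```

Returning a per-axis score card rather than a single blended number mirrors the point made in the answer: a highly novel idea may rest on untested assumptions, so the axes are shown side by side instead of being collapsed.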
Case studies showed the architecture at work. In acute myeloid leukemia, AI co‑scientist proposed three repurposed drugs: one showed strong tumor inhibition in cell tests and two did not, illustrating the gap between a drug’s promise and reality. A partnership with Stanford used the system to identify a Vorinostat‑based therapy for liver fibrosis that cut chromatin damage by 91% in human organoids. In another striking case study, the system closely reproduced an unpublished antimicrobial‑resistance breakthrough from Imperial College London in two days, prompting the lead researcher to ask Google if it had accessed his private files.
Natarajan noted that every AI co-scientist agent currently runs on the same Gemini model, yet future versions could mix specialized models for chemistry, genomics, or physics. Access remains limited through a trusted tester program because each run is compute‑intensive, but he encouraged attendees to propose pilot projects.

Left to right: Vivek Natarajan of Google DeepMind, Charlie Catlett of Argonne, and Neeraj Kumar of PNNL. (Source: TPC25)
The plenary closed with practical guidance for the hackathon/tutorial session. Kumar reminded newcomers that TPC’s working sessions are built around three goals: grow an open community, launch projects no single lab could tackle alone, and train the next generation of AI researchers. Organizers promise complete transparency on code, data, and evaluation methods, along with the expectation that every participant share results back with the group.
For the hackathon, teams will meet in six sessions ranging from basics to final demos. Mentors from Argonne, Berkeley Lab, and industry sponsors will circulate to help with agent design, debugging, and access to pooled Nvidia and Cerebras compute credits. Running in parallel, the tutorial path offers a structured dive into agentic AI for science with topics like fine‑tuning, retrieval‑augmented generation, and grounding model answers in scientific literature. Organizers stressed that participants may move between tracks, but the goal remains the same: to leave San Jose with a prototype, new collaborators, and a clearer sense of how agent systems can advance everyday research.
The four‑day program continues with keynotes, lightning talks, and breakout sessions through Thursday. We’ll be bringing you daily highlights and deeper dives into the discoveries that emerge at TPC25.