Early Experiments in Accelerating Science with GPT-5

Science shapes the medicines that keep us healthy, the energy that powers our cities, the technologies that drive our economies, and our understanding of the universe. It’s the foundation of human progress. If we can shorten parts of the discovery cycle, the ripple effects reach almost every part of society. Because scientific results are visible, testable, and shared, progress in one place can benefit many others. OpenAI for Science advances our mission – ensuring that artificial general intelligence benefits all of humanity – by helping accelerate that progress.

That pace is currently a real constraint. Even when a promising idea exists, turning it into a tested result, a working technology, or a standard of care can be painfully slow. In a recent survey we ran, about 3 in 5 people in the U.S. said they’re worried about how long it takes for scientific and medical breakthroughs to reach them. Seventy-three percent said we need new and innovative ways to accelerate discoveries, and sixty-nine percent said scientific leadership is a top national priority.

Today, we’re introducing a paper – “Early science acceleration experiments with GPT-5” – co-authored by researchers at OpenAI together with collaborators at universities and national laboratories, including Vanderbilt University, UC Berkeley, Columbia University, the University of Oxford, Lawrence Livermore National Laboratory, The Jackson Laboratory, and others. The paper presents early case studies showing how GPT-5 has supported scientific work across math, physics, biology, and computer science – where the model generated helpful ideas, rediscovered known results, produced candidate proofs, or conducted broad literature review – and it documents limitations just as directly. We’re releasing this collection to give the scientific community an early, qualitative view of what these systems can and cannot do in real research settings.

What Is OpenAI for Science?

OpenAI for Science is our program for empowering scientists and mathematicians across disciplines by pairing frontier AI models with the right tools, workflows, and collaborations to accelerate discovery. We’re collaborating with researchers in academia, industry, and national labs, studying concrete use cases, and feeding what we learn back into how we design and deploy these systems.

Our goal is to help researchers explore more ideas, test hypotheses faster, and unlock discoveries that would otherwise take years. That can mean finding a plausible proof path, proposing a small number of mechanisms from a confusing dataset, or navigating a body of literature across fields and languages without relying solely on keyword search.

We’re building this for scientists. In practice, that means fitting AI systems into real workflows – literature review, proof generation, modeling, simulation, and wet-lab planning. It also means being explicit about where these tools are actually useful and where they are not, and letting empirical evidence shape what we do next.

Our approach combines two complementary beliefs. Specialized scientific tools, such as simulation engines, protein databases, and computer algebra systems, are essential for efficiency and precision. At the same time, scaling foundation models continues to unlock new reasoning abilities: connecting ideas across fields, sketching proofs, proposing mechanisms, and navigating large literatures conceptually rather than by keyword. Where specialized tools exist, we want to use them; where general reasoning is required, we build models designed to handle it. Both paths reinforce each other.

How Scientists Are Working with GPT-5 Today

Today, the most interesting progress comes from human-AI teams. Scientists frame the question, choose the right tools, and ultimately judge the validity of any output. GPT-5 brings broad background knowledge, fast exploration of many approaches, and the ability to keep track of long, technical arguments. Progress usually looks like a dialogue: the model suggests ideas, the researcher critiques and redirects, the model tries again, and this loop continues until the pieces fit – or until the scientist decides the path is a dead end.

Using GPT-5 this way is a skill. Researchers have to learn how to pose problems, when to push back, how to break a question into steps, and how much “thinking time” to give the model. They also have to decide which parts of a result can safely lean on the model and which parts must be independently re-derived or experimentally validated. We’re sharing a short guide for researchers on how to work with systems like GPT-5 in practice.

The Current State of GPT-5 in Scientific Work

GPT-5 is beginning to assist researchers in ways that appear to shorten certain parts of their workflow, and we’re now seeing early, cautious indications of the model contributing to new scientific insights. To be clear, this is not a system that solves scientific problems on its own. It is a tool that, in the hands of experts, can help them reach correct results more quickly.

One emerging capability is high-level conceptual literature search across disciplines and languages. Instead of relying only on keyword matching, GPT-5 can often identify deeper conceptual connections between ideas. In practice, it has surfaced relevant work from obscure theses and non-English literature, giving researchers insights they might not otherwise have found.

Researchers are also using GPT-5 to support mathematical discovery. In several cases, mathematicians who were confident a theorem was true – and had planned to hand a key lemma to a postdoc – found that GPT-5 produced a correct proof outline in minutes. These are results humans could eventually have reached, but the model shortens the cycle, compressing work that might have taken days or weeks into under an hour.

These capabilities are not universal or fully reliable. GPT-5 is strongest today in domains with clear formal structure and fast feedback – mathematics, theoretical computer science, and parts of physics. In more empirical domains like biology, it can propose mechanisms and experiments, but those proposals only matter once they survive contact with real data.

To be clear: GPT-5 is not proving field-defining theorems independently, nor is it substituting for scientific creativity, judgment, or intuition. But we are clearly past the stage where the model only summarizes existing knowledge. We now see early “green shoots” of contribution, and the pace at which these capabilities are emerging suggests the potential for meaningful acceleration in the near future – when used carefully and with expert oversight.

The rest of this post shares some of the first places where GPT-5 has helped move real problems forward in math, physics, and biology, and what those collaborations are teaching us about where these systems help, where they fail, and how to use them responsibly.

Limitations

These case studies are curated examples intended to illustrate the narrow but real ways GPT-5 can currently help shorten parts of a research workflow. They are not a systematic sample of scientific work, and they do not capture the full range of failure modes researchers may encounter in practice.

Right now, GPT-5’s strongest contributions appear when an expert is guiding the model, and critiquing and independently validating the model’s outputs. Working with scientists in the field for several years – including through our partnerships with the National Labs – we’re tracking recurring areas for improvement. Researchers report that the model can sometimes hallucinate citations, mechanisms, or proofs that appear superficially plausible; it can be sensitive to scaffolding, warm-up problems, and prompt formulation; it may fail to recognize circular reasoning or unproductive directions; and it may miss domain-specific subtleties that an expert would identify immediately. These are active areas of research, and we are working with collaborators to better measure, report, and mitigate these issues as we refine future versions.

What’s Next

Taken together, these case studies offer early indications that GPT-5 can assist with new kinds of scientific work. It is not running projects on its own, but in the hands of experts it can help sharpen theorems, rediscover and extend hidden structures, surface connections across fields, and propose mechanisms and experiments that scientists can then test and validate. We are past the point where the model only summarizes what already exists; we are seeing the first narrow but real signs of acceleration.

At the same time, everything here depends on close collaboration with scientists. In every example, researchers framed the question, challenged weak ideas, checked every line of a proof, and tested biological and physical claims against data. That is the model we want to keep: human intuition, values, and judgment setting the agenda, with GPT-5 providing speed, breadth, and the ability to explore many lines of thought in parallel. Working with these systems is a skill that the community is still developing, which is why we are putting as much emphasis on sharing good practices as we are on sharing results.

OpenAI for Science is our way of making that collaboration systematic. We will keep working with universities, labs, and research groups around the world to run more of these experiments, publish what we find, and build tools that fit naturally into how scientists already work. Ongoing feedback is how we intend to continue improving tools like GPT-5 for the scientific community.

Looking ahead, our expectation – based on current evidence – is that the model’s reasoning improves when given more time and compute. If GPT-5 can help address research questions in 20 minutes, imagine what becomes possible when it can think for hours, days, or weeks on a single problem. That trajectory – combined with world-class researchers using these systems – is what gives us measured confidence that these tools could contribute to a step-change in scientific productivity over time.
