
Artificial intelligence (AI) is spreading like wildfire through much of the developed world, including in research. This emerging field of 'AI for Science' (AI4Science) opens many opportunities for using AI in research, such as generating hypotheses, designing and conducting experiments, analysing data and making inferences, and writing grant proposals, scientific papers, and summaries.
Visions for an AI-powered scientific future include AI scientists who could autonomously drive Nobel-worthy discoveries. Just like any disruptive technology, AI raises many questions that need to be tackled rapidly to ensure that AI's prospective benefits in research and beyond outweigh its potential drawbacks.
AI cuts across sciences
Finding the needle in the haystack
One area in which AI excels is accelerating the discovery process by enabling rapid exploration of vast solution spaces.
Take materials discovery, for example. It has typically involved laborious and time-consuming synthesis work by chemists and materials scientists to find materials with optimal properties. But the combinatorial space of possible compounds is vast, making the search akin to hunting for a needle in a haystack. There is a strong push for rapid materials discovery for the next-generation technologies (for example, batteries and solar cells) needed for the energy transition.
Enter AI systems such as Google DeepMind's GNoME (graph neural networks for materials discovery). Such AI systems are now able to rapidly identify vast numbers of potential compounds and predict which would be stable. AI-run robotics-assisted labs such as A-Lab can autonomously synthesize materials predicted from the Materials Project and GNoME. In just 17 days, A-Lab was able to synthesize 41 new compounds from a set of 58 targets.
Similar needle-in-a-haystack problems exist across the sciences, including protein design. Proteins are vital for developing next-generation vaccines, therapeutics, and biomaterials. However, this requires identifying proteins with the right function, which is determined by their molecular composition and structure. Similarly to materials discovery, the solution space is vast, and until recently, scientists have needed to screen many proteins to identify one that works as desired.
Now AI tools such as AlphaFold and RoseTTAFold predict protein structures from amino-acid sequences using neural networks trained on known protein sequences and structures. Even newer tools, such as RFdiffusion, generate structures that have an increasingly high likelihood to bind to the desired target, which radically reduces the time to develop working protein designs.

Accelerating clinical sciences
Such AI-powered acceleration occurs across the sciences, including the medical and health sciences. For example, AI is now being explored as a way to accelerate clinical trials. It currently takes about a decade and more than a billion dollars to bring a drug to market. Half of that time and money is spent on clinical trials, and only a fraction of the drugs that enter phase-1 trials are approved. There are therefore efforts to use AI to drastically cut the time and money needed to take drugs to market. Prospective uses of AI include designing trials, writing protocols, recruiting patients, and analysing data.
AI also has many prospective uses in image analysis across the sciences, including the medical and health sciences. For example, an AI tool was developed to trace metastatic cancers to their source. The tool even outperforms pathologists at identifying the origins of metastatic cancer cells circulating in the body, including those from common cancers such as those of the lung, ovary, breast, and stomach. Early and accurate identification of this kind can be the difference between life and death for patients: in a retrospective assessment of a pool of participants four years after treatment, the researchers found that those who had received treatment for the type of cancer predicted by the model were more likely to have survived.
Boosting researchers' productivity
Many AI tools are also rapidly being developed to support scientists' research. AI-powered search engines function as digital assistants. They can provide yes-no answers to research questions, tidy up bibliographies, suggest new papers, and generate summaries.
And, of course, the advent of large language models (LLMs) such as ChatGPT has already had a big impact on scientists' productivity. LLMs are now used to conduct literature reviews and brainstorm research ideas, and many researchers use them to write code, manuscripts, grant applications, and summaries.
Multi-faceted use
AI4Science and the ways in which researchers incorporate AI in their scientific workflows are multi-faceted and develop rapidly. As Wang et al. write in this review, "They [AI algorithms] are becoming indispensable tools for researchers by optimizing parameters and functions[4], automating procedures to collect, visualize, and process data[5], exploring vast spaces of candidate hypotheses to form theories[6], and generating hypotheses and estimating their uncertainty to suggest relevant experiments[7]."
The review authors give examples of how AI use is proliferating across disciplines, enabling the integration and analysis of massive data sets, the refinement of measurements and data, and guidance in designing experiments and workflows for autonomous discovery. They also describe how AI systems could play a valuable role in interpreting scientific datasets and extracting relationships and knowledge from the scientific literature in a generalised manner. This was recently illustrated by the demonstration that unsupervised language models can capture complex scientific concepts, such as the periodic table, and predict applications of functional materials years before their discovery, suggesting that latent knowledge about future discoveries may be embedded in past publications.
Harnessing data
But Wang et al. point to many challenges as well. For example, training data for AI algorithms must be annotated reliably, which requires time-consuming and resource-intensive experimentation and simulation. Algorithmic approaches such as pseudo-labeling and label propagation have been suggested to automate the annotation of vast unlabelled datasets based on only a small set of accurate annotations. In pseudo-labeling, surrogate models trained on manually labelled data annotate unlabelled samples. In label propagation, labels are diffused to unlabelled samples via similarity graphs constructed from so-called feature embeddings. In addition to these strategies, active learning could be used to identify the most informative data points for humans to label, or the most important experiments to run.
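As a rough illustration of the pseudo-labeling idea, the sketch below trains a deliberately simple surrogate (a nearest-centroid classifier) on a handful of labelled points and then labels only those unlabelled points it is confident about. The function names, the margin-based confidence proxy, and the threshold are illustrative assumptions, not a reference implementation; label propagation would proceed analogously over a similarity graph.

```python
import math

def nearest_centroid_fit(X, y):
    # Surrogate model: one centroid per class of the manually labelled data.
    centroids = {}
    for label in set(y):
        pts = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [sum(coord) / len(pts) for coord in zip(*pts)]
    return centroids

def predict_with_confidence(centroids, x):
    # Predict the nearest class; use the distance margin to the runner-up
    # class as a crude confidence proxy (assumes at least two classes).
    dists = {label: math.dist(centroid, x) for label, centroid in centroids.items()}
    best = min(dists, key=dists.get)
    runner_up = min(d for label, d in dists.items() if label != best)
    return best, runner_up - dists[best]

def pseudo_label(X_labelled, y_labelled, X_unlabelled, margin_threshold=1.0):
    # Keep only pseudo-labels whose confidence margin clears the threshold.
    centroids = nearest_centroid_fit(X_labelled, y_labelled)
    confident = []
    for x in X_unlabelled:
        label, margin = predict_with_confidence(centroids, x)
        if margin >= margin_threshold:
            confident.append((x, label))
    return confident
```

In practice the surrogate would be a proper model and the confidence measure calibrated; the point is only that ambiguous points (small margin) are held back rather than given noisy labels.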

Another requirement for training data to maximize deep-learning performance is that it needs to be large, high-quality, and diverse. However, data is often incomplete. Here, too, there are algorithmic solutions: synthetic data points can be generated through automatic data augmentation and deep generative models. For example, generative adversarial networks have proved useful across many domains for improving scientific images because of their ability to synthesise realistic images.
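A minimal sketch of the data-augmentation idea for numeric measurement vectors: generate jittered copies that inherit the original label, enlarging a small dataset. The function names and the Gaussian-noise scheme are illustrative assumptions; real pipelines use domain-appropriate transformations (rotations and crops for images, for instance) or trained generative models rather than plain noise.

```python
import random

def augment(sample, n_copies, noise_scale, rng):
    # Perturb each numeric feature with small Gaussian noise.
    return [[v + rng.gauss(0.0, noise_scale) for v in sample]
            for _ in range(n_copies)]

def augment_dataset(X, y, n_copies=5, noise_scale=0.05, seed=0):
    # Append jittered copies of every sample; copies inherit the label.
    rng = random.Random(seed)
    X_aug, y_aug = list(X), list(y)
    for sample, label in zip(X, y):
        for copy in augment(sample, n_copies, noise_scale, rng):
            X_aug.append(copy)
            y_aug.append(label)
    return X_aug, y_aug
```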
AI techniques are also useful for refining data, for example, to significantly increase resolution (through deep convolutional methods) and decrease noise (through so-called denoising autoencoders) in measurements.
And Wang et al. point out that deep learning can extract meaningful representations of scientific data at various levels of abstraction and optimize them to guide research. Three such strategies are geometric deep learning (based on graph neural networks that model relational patterns in the data), self-supervised learning (a technique that enables models to learn the general features of a dataset without relying on explicit labels), and language modelling (particularly useful for biological sequences).
Generating and testing hypotheses
As Wang et al. argue, AI could also help generate testable hypotheses, but also here challenges need to be overcome. First, the space of possible hypotheses is vast and potentially infinite. One strategy to reduce this vastness is incorporating physical laws and principles into AI.
Second, the review authors point out that AI could generate far more hypotheses than would be feasible to test given resource constraints, or than would be worth pursuing from scientific, commercial, or resource-optimization points of view. One way to optimize the search is to implement AI policies that estimate the reward of each candidate search and, using evolutionary algorithms, prioritize the searches with the highest estimated reward.
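This reward-prioritized search can be caricatured as a simple evolutionary loop: score candidates with a surrogate reward function, keep the top fraction, and mutate them to propose the next round. Everything here (the function names, the elitist selection scheme, the Gaussian mutations) is an illustrative assumption rather than any specific published method.

```python
import random

def evolutionary_search(score, init_population, n_generations=50,
                        keep_fraction=0.2, mutation_scale=0.1, seed=0):
    # score: surrogate reward model mapping a candidate (list of floats)
    # to an estimated reward. Each generation keeps the highest-reward
    # candidates (elitism) and explores mutated variants of them.
    rng = random.Random(seed)
    population = [list(c) for c in init_population]
    for _ in range(n_generations):
        ranked = sorted(population, key=score, reverse=True)
        survivors = ranked[:max(1, int(len(ranked) * keep_fraction))]
        population = list(survivors)
        while len(population) < len(ranked):
            parent = rng.choice(survivors)
            child = [v + rng.gauss(0.0, mutation_scale) for v in parent]
            population.append(child)
    return max(population, key=score)
```

Because survivors are carried over unchanged, the best candidate found so far is never lost while the mutations keep exploring nearby hypotheses.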
To aid drug discovery—where high-throughput screening is used to assess thousands to millions of molecules—algorithms could prioritize which molecules to investigate experimentally. However, doing so requires so-called ground-truth data, which may be unavailable for many molecules.
Therefore, weakly supervised learning approaches could be used to train these models, whereby noisy, limited, or imprecise supervision is used as a training signal (instead of costly experiments, calculations, or human labeling). AI-selected candidates could then be sent to medium- or low-throughput experiments for continual refinement using experimental feedback.
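One common weak-supervision pattern is to combine several cheap, noisy heuristics ("labeling functions") by majority vote, as popularized by systems such as Snorkel. The sketch below is a bare-bones illustration; the molecule representation, thresholds, and function names are invented for the example.

```python
from collections import Counter

def weak_label(samples, labeling_functions):
    # Each labeling function returns a class label or None (abstain).
    # Samples with no votes stay unlabeled (None); otherwise take the
    # majority vote across the non-abstaining functions.
    labels = []
    for sample in samples:
        votes = [lf(sample) for lf in labeling_functions]
        votes = [v for v in votes if v is not None]
        labels.append(Counter(votes).most_common(1)[0][0] if votes else None)
    return labels
```

Real systems additionally model each function's accuracy and correlations instead of taking a flat vote, but the shape of the training signal is the same: cheap heuristics stand in for ground-truth labels.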
Protein design has greatly benefited from these developments. AlphaFold2 is powerful for predicting protein structures with atomic accuracy, even for proteins unlike those in the training data set. In addition to such forward-predictive problems, AI is increasingly explored for solving inverse problems.
Overall, Wang et al. argue that AI could make testing of hypotheses more efficient by aiding in designing and planning experiments, optimizing resource use, reducing unnecessary investigations, and even replacing costly experiments with simulations by providing key parameters.
Unresolved challenges
The flip side of AI4Science is the emergence of many difficult questions that need urgent answers. One question raised by Wang et al. relates to AI models' ability to generalize. This is currently a potential weak point of AI models: neural networks trained on data from a specific regime may discover patterns that do not generalize to a different regime with a different underlying distribution. Humans are better at generalizing from data; one hypothesis is that this is because we use causal rather than statistical models when we make inferences. Hence, the review authors argue that implementing causality in AI will be key.
Another step in the AI-development roadmap pointed out by Wang et al. includes developing strategies for accounting for multimodal input to the training set. The authors argue this could prospectively be implemented in neural networks by exploiting their modularity - distinct neural modules could transform diverse data modalities into universal vector representations.
An overarching goal should also be to explain how AI models work. Many models currently operate as black boxes, which poses challenges for their use and risks decreasing trust in AI.
AI developments also need to extend beyond algorithms. As the authors point out, AI can only transform research if AI expertise is incorporated into research teams. Teams will need to include AI specialists and software and hardware engineers, and to adopt new forms of collaboration. Resources will likely need to be shared, because the computational and data requirements of AI models are massive.
Educational programs would need to be revamped to train scientists in designing, implementing, and applying laboratory automation and AI in scientific research. They should also teach scientists when the use of AI is appropriate and how to use AI ethically. Achieving this requires tackling many thorny ethics, privacy, security, and intellectual property issues.
AI scientists driving tomorrow's discoveries?
The coming years will see stakeholders in the research ecosystem navigate uncharted territory. There are many visions and interests for AI developments in research. Hiroaki Kitano articulated one of these in his perspective Nobel Turing Challenge: creating the engine for scientific discovery. His Nobel Turing Challenge "...aims to develop a highly autonomous AI system that can perform top-level science, indistinguishable from the quality of that performed by the best human scientists, where some of the discoveries may be worthy of Nobel Prize level recognition and beyond."
Rise of the AI Scientist
According to Kitano, the key to this challenge would be to develop 'AI Scientists': systems composed of software and hardware modules that dynamically interact to generate hypotheses, learn from data and from interactions with humans and other parts of the system, reason, and autonomously make decisions.
To achieve this, he proposes that AI Scientists should not only be capable of autonomously conducting scientific research to generate discoveries at scale, but also make strategic choices about which topics to pursue and communicate, through publications and other means, the value, methods, and reasoning behind each discovery, along with its applications and social implications.
He envisions a future in which AI Scientists will be (almost) comparable to top-level human scientists, and wonders whether they would be distinguishable from top-level human scientists in the Feigenbaum Test (a variation of the Turing Test), or whether they will exhibit patterns of scientific discovery that differ from those of human scientists.

Toward unbiased exploratory discovery
Kitano posits that AI Scientists would be good at exploring a vast hypothesis space and that this would therefore enable moving away from what he calls the traditional approach to doing research. In this approach, scientists aim to maximize the probability that the discovery they make will be significant under certain criteria; he dubs this hunt for significant discoveries a value-driven approach.
The alternative approach he proposes would maximize the probability of discovery at any level of significance instead of maximizing value, a process he dubs unbiased exploration of hypotheses. He views this transition as a "logical evolution of the modality of science where a vast hypothesis space is searched in an unbiased manner rather than depending on human intuition."
With an increased level of autonomy, AI Scientists would make decisions on which topics and hypotheses to pursue. Kitano suggests that such strategies could be based on 1) goal-oriented approaches of defining very high-level goals and finding multiple paths to best achieve such goals, or 2) a bottom-up approach of exploring hypothesis search space based on discoveries already made by a specific AI Scientist.
He further states that "The real value of AI Scientist is its capability to explore hypothesis space magnitude more efficiently into seemingly low-value domains with the expectation that may eventually lead to major outcomes", something that he claims would be infeasible for human scientists. He suggests two roles for AI Scientists: 'AI Scientist as a Problem Solver' (where the problem-solving would be aligned with the values of stakeholders) and 'AI Scientist as an Explorer' (which would "boldly explore hypothesis space nobody has gone before"). The common denominator of these roles, and what Kitano views as distinguishing the AI Scientist's discovery process from the traditional one, is exhaustive hypothesis generation and verification.
For implementation, Kitano envisions two possible characteristics for an AI Scientist architecture: 1) a multiplexed agent system, in which multiple instances of specialized AI agents are created to explore hypothesis spaces organically whilst communicating to merge discoveries for further exploration; and 2) a human-in-the-loop system, in which humans enter as domain experts or in commanding and monitoring roles. The AI Scientists may exist at the institutional, academic-community, or national levels and communicate with each other in networks formed based on data ownership, privacy, and intellectual property considerations.
Uncharted territory
Like many other AI-related visions, Kitano's AI-Scientist vision focuses predominantly on opportunities, which is understandable. But many thorny questions arise in the rapidly evolving AI4Science.
Take, for instance, Kitano's claim that the transition from value-based to exploration-based discovery is a "logical evolution of the modality of science where a vast hypothesis space is searched in an unbiased manner rather than depending on human intuition." It is not clear why this transition is logical, nor that it would be desirable: after all, why should we ignore human intuition? Why is the purportedly unbiased exploration approach desirable as a replacement for the current approach? To what extent can these approaches generate mechanistic insights and causal principles rather than only correlation-driven predictions? And why would it be desirable to move away from value-driven research in the first place?
These are only a few questions that would need addressing in the rapidly developing AI4Science realm. Other pressing questions relate to issues such as: intellectual property and attribution; evaluation of scientists' research output; inequalities arising in the research ecosystem because of the major investments needed; biases in AI algorithms and training data; reproducibility issues; dual-use and security risks; energy demands of AI systems; fraud and misinformation that could be accelerated with AI; the role of scientists and the interplay between different stakeholders in the research ecosystem as AI permeates it.
Tackling these questions requires researchers from all disciplines (science, technology, social science, humanities), funders, publishers, policy-makers, legal experts, tech companies, and key non-governmental organizations that represent everyone that will be affected by these exponential developments, to come together to discuss the various visions that exist and set up common frameworks, standards, and guidelines.
I aim to look into some of these developments in future blog posts.
Thank you for reading! Share within your networks. And do let us know your comments, questions, or suggestions for topics you wish us to cover in future posts.