Google DeepMind – Artifex.News

How will AlphaFold 3 change life sciences research?

admin — Thu, 20 Jun 2024 00:00:00 +0000

Proteins are among the most important molecules of life, with almost every biological function from birth to death regulated by them in some way. Each protein is a string of smaller building blocks called amino acids, and this sequence contains all the information needed to turn the protein from a linear chain into a folded, functional 3D structure.

The steps a protein takes to go from its straight form to its final form are too many to count and too hard to follow, leaving the question of how every protein folds — the famous protein-folding problem — unanswered. “If you want to understand the molecular basis of how cells work, how organisms work, how life works, you need to understand how proteins get their shape,” Frank Uhlmann, a biochemist at the Francis Crick Institute in London, said.

Answers ex machina

Things changed when Google DeepMind’s protein-structure prediction software AlphaFold burst onto the scene in 2020. They changed more drastically in 2021 with the highly improved AlphaFold 2. AlphaFold uses machine learning and artificial intelligence (AI) to accurately predict protein structures from an amino acid sequence, seemingly solving the protein-folding problem without learning any of the deeper physical principles that drive this biological process.

“If the protein folding problem was set to us by God to teach us how to learn molecular interactions from first principles, we cheated,” Derek Lowe, author of the Science column “In the pipeline” and long-time pharmaceutical researcher, told The Hindu. “We haven’t learned a tremendous amount more about that. We have figured out how they usually do it, even if we don’t know why.”

“It’s startling how it works as well as it does.”

Now, in a Nature paper published in May 2024, scientists at DeepMind led by John Jumper introduced AlphaFold 3, building on its predecessors with even more transformative capabilities. AlphaFold 3 can predict protein-protein interactions as well as the structures of other molecules like DNA and RNA, along with the interactions of proteins with all these other compounds.

Democratising research

“AlphaFold 2 predicted the structure of proteins with revolutionary levels of accuracy,” Josh Abramson, a research engineer at DeepMind and lead author of the new paper, told The Hindu in an email.

“AlphaFold 3 is even more accurate for proteins, but can also predict the structure of DNA, RNA, and all the other molecular components that make up biology. The interaction of all these biomolecules is what makes up the processes of life, so it is important to be able to predict the structure of these interactions.”

Apart from being able to give us a lot more insight into biological processes, the new AlphaFold is also more usable by scientists who aren’t experts in machine learning. Dr. Uhlmann, who has been using AlphaFold 3 to study how proteins and DNA interact in chromosomes, said, “You don’t need to know anything about coding, now literally everybody can do it. All you need is a Google account, you can upload protein sequences in the DeepMind server, and 10 minutes later you get your results. That completely democratises structure prediction research.”

From noise to signal

The original AlphaFold was trained on the thousands of sequences and protein structures present in the Protein Data Bank, a giant repository where scientists submit experimentally determined protein structures. “It is completely ignoring all the fundamental physics and thermodynamics, it’s modelling based on learning what real structures tend to look like, taking advantage of tendencies of protein structures that are too subtle for humans to realise,” Dr. Lowe said.

Unlike its predecessors, AlphaFold 3 uses a diffusion model, which is what image-generating software also uses. The model works by first training on protein structures, adding noise to the data, and then trying to de-noise it. This way, the model becomes able to work its way back from a noisy structure to a real protein structure. This architecture also helps AlphaFold 3 handle a much larger input dataset.
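The noising half of that recipe can be illustrated with a toy sketch. This is not AlphaFold 3's code: it assumes a made-up four-atom structure and plain Gaussian noise, and the names `add_noise` and `rmsd` are illustrative only. It simply shows that a structure drifts further from the original as the noise level rises, which is the corruption a diffusion model learns to reverse.

```python
import math
import random

# Hypothetical toy "structure": 3D coordinates of four atoms (arbitrary units).
STRUCTURE = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0), (3.5, 1.4, 0.3)]


def add_noise(coords, sigma, rng):
    """Return a copy of coords with Gaussian noise of std-dev sigma added."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in atom) for atom in coords]


def rmsd(a, b):
    """Root-mean-square deviation between two equally sized coordinate lists."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))


rng = random.Random(0)
for sigma in (0.0, 0.5, 5.0):
    noisy = add_noise(STRUCTURE, sigma, rng)
    print(f"sigma={sigma}: RMSD from original = {rmsd(STRUCTURE, noisy):.2f}")
```

A real diffusion model would additionally train a neural network to predict the clean coordinates from the noisy ones; that part is omitted here.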

A reliability problem

AlphaFold 3’s accuracy at predicting protein-protein interactions is also very high, but the same cannot be said of its reliability for interactions between small molecules and proteins. Proteins use a language of 20 amino acids whereas small molecule ligands “have a much larger vocabulary”, according to Dr. Lowe.

Greater variations in the dataset and the use of diffusion techniques can lead to the model coming up with answers that look plausible but aren’t real. Adding more training data can help circumvent this problem, but not entirely get rid of it.

Nevertheless, AlphaFold 3 predicts protein structures and interactions better than other models right now. Academics and companies can potentially use it to find drug candidates that can bind to proteins and help cure diseases. In fact, DeepMind’s spin-off company Isomorphic Labs is using AlphaFold 3 for this very purpose: drug discovery. However, this option isn’t open to everyone yet.

A peek under the hood

Additionally, even though scientists are free to use the AlphaFold server to upload their protein sequences, many researchers are irked at not being able to access the model’s full code. This means they can’t play around with its nuts and bolts and modify it for specific use-cases.

An important implication of this lack of access is that it’s currently impossible to use AlphaFold 3 to find structures of proteins bound to drug candidates. Researchers expressed their disappointment in an open letter signed by more than 600 people to date. According to the text, the restriction “does not align with the principles of scientific progress, which rely on the ability of the community to evaluate, use, and build upon existing work.” Different groups have also begun a race to crack the model’s code and make open-source versions.

Responding to the backlash, DeepMind scientists have reversed their initial stance of not releasing the full code and now say they will do so within six months.

We love the excitement & results from the community on AlphaFold 3 and are doubling the AF Server daily job limit to 20. Happy to also share that we’re working on releasing the AF3 model (incl weights) for academic use, which doesn’t depend on our research infra, within 6 months.

— Pushmeet Kohli (@pushmeet) May 13, 2024

The journey begins

For now, we need to wait and watch how DeepMind decides to let eager scientists look under the hood and examine AlphaFold 3 more closely, to appreciate its full power. But until then, the model remains one of the best AI-based protein structure prediction models out there, now with the ability to predict interactions with other kinds of biological structures as well.

At the same time, both Dr. Lowe and Dr. Uhlmann wanted to be clear that even if AlphaFold 3 makes very good predictions, it shouldn’t be treated as an “infallible oracle”. Instead, it offers a good starting point from which scientists can obtain some answers, which they can then build on with further experiments and expert analysis.

“It’s a prediction, you can’t take it for granted,” Dr. Uhlmann said. “It’s not solving your question, but it’s a new and exciting discovery tool that helps you build and test new hypotheses.”

Rohini Subrahmanyam is a freelance journalist with a PhD in biology from the National Centre for Biological Sciences, Bengaluru.


Google DeepMind unveils next generation of drug discovery AI model

admin — Thu, 09 May 2024 02:29:26 +0000


Google DeepMind has unveiled the third major version of its “AlphaFold” artificial intelligence model, designed to help scientists design drugs and target disease more effectively.

In 2020, the company made a significant advance in molecular biology by using AI to successfully predict the behaviour of microscopic proteins.

With the latest incarnation of AlphaFold, researchers at DeepMind and sister company Isomorphic Labs – both overseen by cofounder Demis Hassabis – have mapped the behaviour of all of life’s molecules, including human DNA.

The interactions of proteins – from enzymes crucial to the human metabolism, to the antibodies that fight infectious diseases – with other molecules are key to drug discovery and development.


DeepMind said the findings, published in research journal Nature on Wednesday, would reduce the time and money needed to develop potentially life-changing treatments.

“With these new capabilities, we can design a molecule that will bind to a specific place on a protein, and we can predict how strongly it will bind,” Hassabis said in a press briefing on Tuesday.

“It’s a critical step if you want to design drugs and compounds that will help with disease.”

The company also announced the release of the “AlphaFold server”, a free online tool that scientists can use to test their hypotheses before running real-world tests.

Since 2021, AlphaFold’s predictions have been freely accessible to non-commercial researchers as part of a database containing more than 200 million protein structures, and have been cited thousands of times in others’ work.

DeepMind said the new server required less computing knowledge, allowing researchers to run tests with just a few clicks of a button.

John Jumper, a senior research scientist at DeepMind, said: “It’s going to be really important how much easier the AlphaFold server makes it for biologists – who are experts in biology, not computer science – to test larger, more complex cases.”

Dr Nicole Wheeler, an expert in microbiology at the University of Birmingham, said AlphaFold 3 could significantly speed up the drug discovery pipeline, as “physically producing and testing biological designs is a big bottleneck in biotechnology at the moment”.


AlphaGeometry and the threat of AI’s takeover of mathematics | Explained

admin — Wed, 13 Mar 2024 10:00:00 +0000

A few weeks ago, an animated discussion unfolded in a WhatsApp group whose members are mathematicians interested in the Indian Mathematical Olympiad. The spark was a Nature paper that announced a Google DeepMind artificial intelligence (AI) named AlphaGeometry had achieved a milestone: it could solve geometry problems at the level of the International Mathematical Olympiad, nearly matching the prowess of gold medallists.

The news evoked a mix of awe, fear, and wonder among us, especially in light of how AI tools like ChatGPT have started to reshape education. Some mathematicians wondered if the advent of AlphaGeometry signals the start of AI’s ascendancy in mathematics.

Is this truly the beginning of an AI takeover in mathematics? To answer this question, let’s take a look at the inner workings of AlphaGeometry.

How does mathematical logic work?

The Nature paper was coauthored by two computer scientists at New York University and two DeepMind researchers. AlphaGeometry is one of DeepMind’s array of AI systems – perhaps the most popular of which is AlphaZero, a deep-learning algorithm that excels at playing chess. Programs like these are part of researchers’ efforts to work up a ladder of complexity, building tools that can perform more complex tasks more reliably.

The AlphaGeometry team has published supplementary information describing the proofs generated by AlphaGeometry for some geometry problems, showcasing its ability to create hundreds of logical steps in proof construction.

Let’s start with a simple example from school mathematics. Suppose we know that for any number a, a + 0 = a. From this, together with the distributive law, we can prove that a × 0 = 0 for any number a. How? Since a + 0 = a for any number a, we have 0 + 0 = 0. Thus a × 0 can be written as a × (0 + 0), which by the distributive law is the same as a × 0 + a × 0. So we have the equality a × 0 = (a × 0) + (a × 0). Cancelling a × 0 on both sides of the equation, we can conclude that a × 0 = 0.
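Written out as a chain of equalities, the argument reads as follows. Note that besides the stated hypothesis it also relies on the distributive law and on cancellation, both of which are assumed here.

```latex
\begin{align*}
a \times 0 &= a \times (0 + 0) && \text{because } 0 + 0 = 0 \\
           &= (a \times 0) + (a \times 0) && \text{by the distributive law} \\
(a \times 0) + 0 &= (a \times 0) + (a \times 0) && \text{because } x + 0 = x \text{ with } x = a \times 0 \\
0 &= a \times 0 && \text{cancelling } a \times 0 \text{ on both sides}
\end{align*}
```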

Here, the entire proof is simply derived from the hypothesis using the rules of logic. Many computer programs can execute such a process but AlphaGeometry stands apart because of its ‘Deductive Database’ – a method that significantly reduces the number of steps in a proof.

What is ‘Deductive Database’?

Suppose we are given a statement A, and we want to deduce the statement Z. The program can spit out all possible next steps – let’s call them B – that can be deduced from A using the rules of logic. Then it will spit out all possible next steps C that can be deduced from B, and so on. If there are only finitely many steps possible, then it should reach the conclusion Z at some point. But once it reaches Z, it will perform a ‘traceback’ process to find the proof that takes the minimum number of steps.
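The deduce-then-traceback idea can be sketched in a few lines. This is a simplified illustration, not AlphaGeometry's actual engine: it assumes every rule has a single premise, and the statement names and the `RULES` table are hypothetical. A breadth-first search generates consequences level by level and records how each statement was derived, so that following those records back from Z yields a proof with the fewest steps.

```python
from collections import deque

# Hypothetical one-premise deduction rules: each statement maps to the
# statements that follow from it in a single logical step.
RULES = {
    "A": ["B1", "B2"],
    "B1": ["C1"],
    "B2": ["Z"],
    "C1": ["Z"],
}


def shortest_proof(start, goal, rules):
    """Breadth-first forward deduction followed by a traceback.

    Returns the shortest chain of statements from start to goal,
    or None if the goal cannot be reached.
    """
    parent = {start: None}  # remembers how each statement was first derived
    queue = deque([start])
    while queue:
        statement = queue.popleft()
        if statement == goal:
            # Traceback: follow parent links from the goal back to the start.
            chain = []
            while statement is not None:
                chain.append(statement)
                statement = parent[statement]
            return chain[::-1]
        for consequence in rules.get(statement, []):
            if consequence not in parent:
                parent[consequence] = statement
                queue.append(consequence)
    return None


print(shortest_proof("A", "Z", RULES))  # ['A', 'B2', 'Z']
```

Here the longer route A, B1, C1, Z also exists, but the traceback returns the two-step proof through B2.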

So much for arithmetic and logic; geometry requires something more. In geometry, we use algebraic relations between different kinds of measures to find new relations. For example, we will have used simple techniques in school geometry called ‘angle chasing’, ‘ratio chasing’ and ‘distance chasing’.

To illustrate the meaning of these ideas, let us take an example from school geometry. Let a, b, and c be three lines on a plane. If we know the angle between a and b and the angle between b and c, we can immediately determine the angle between a and c (see figure 1). This is an example of ‘angle chasing’. Similarly, AlphaGeometry can quickly discover all possible algebraic relationships between some given quantities using its ‘Algebraic Rules’ program.

Figure 1. (Photo credit: Special arrangement)
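The angle-chasing step described above amounts to adding angles modulo 180 degrees. The sketch below assumes each line is described by a direction in degrees and that angles between lines are directed (measured from the first line to the second); the function names and the sample directions are illustrative, not taken from AlphaGeometry.

```python
def directed_angle(theta_from, theta_to):
    """Directed angle between two lines given their directions, modulo 180."""
    return (theta_to - theta_from) % 180


def chase_angle(angle_ab, angle_bc):
    """Angle from line a to line c, deduced from the angles a-to-b and b-to-c."""
    return (angle_ab + angle_bc) % 180


# Hypothetical directions (in degrees) of three lines a, b and c.
a, b, c = 10, 130, 50

angle_ab = directed_angle(a, b)  # 120
angle_bc = directed_angle(b, c)  # 100
print(chase_angle(angle_ab, angle_bc))  # 40
print(directed_angle(a, c))             # 40, the same value measured directly
```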

When it combines its ‘Deductive Database’ and ‘Algebraic Rules’ programs, AlphaGeometry can write complete proofs for most school-level geometry problems.

For example, let A, B, C, and D be any four points on a plane (see figure 2). Suppose by angle chasing we know that the angle between the lines AB and BD is equal to the angle between the lines AC and CD. Then ‘Deductive Database’ can immediately figure out that all four points lie on a circle, while ‘Algebraic Rules’ can determine that the angle between the lines BC and CA is equal to the angle between the lines BD and DA.

Figure 2. (Photo credit: Special arrangement)
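The geometric fact behind this example can be checked numerically. AlphaGeometry reasons symbolically, so the following is only a sanity check under stated assumptions: the four points are placed at arbitrarily chosen positions on the unit circle, angles between lines are treated as directed angles modulo 180 degrees, and all function names are illustrative.

```python
import math


def line_direction(p, q):
    """Direction of the line through p and q, in degrees modulo 180."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 180


def angle_between(line1, line2):
    """Directed angle from line1 to line2, in degrees modulo 180."""
    return (line_direction(*line2) - line_direction(*line1)) % 180


def equal_mod_180(x, y, tol=1e-9):
    """Compare two angles modulo 180, allowing for floating-point error."""
    d = (x - y) % 180
    return min(d, 180 - d) < tol


def on_unit_circle(deg):
    """Point on the unit circle at the given polar angle in degrees."""
    t = math.radians(deg)
    return (math.cos(t), math.sin(t))


# Hypothetical positions: four points at arbitrary angles on the unit circle.
A, B, C, D = (on_unit_circle(t) for t in (10, 100, 160, 300))

# Angle between AB and BD equals the angle between AC and CD.
print(equal_mod_180(angle_between((A, B), (B, D)), angle_between((A, C), (C, D))))  # True
# Angle between BC and CA equals the angle between BD and DA.
print(equal_mod_180(angle_between((B, C), (C, A)), angle_between((B, D), (D, A))))  # True
```

Moving one of the points off the circle breaks the first equality, which is why the equal-angle condition can be used to detect that four points are concyclic.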

What are auxiliary constructions?

The combination of these two programs makes AlphaGeometry a very powerful tool. The AlphaGeometry team could solve 14 of the 30 geometry problems in the International Mathematical Olympiad in this way.

This achievement also reveals that a significant amount of the difficulty in these problems lay not in the ingenuity required to solve them but in the ability to deduce the largest number of relations, and computers are better at this than humans.

Fortunately, this ability is not sufficient to prove all problems in geometry, but AlphaGeometry seems to have summited this peak as well.

Mathematics is really a creative field because mathematicians often come up with clever constructions to solve a problem. Their name for such a construction is an auxiliary construction. Auxiliary constructions are part of neither what is ‘given’ to us nor what we want to prove, and they illustrate what makes automatic theorem proving difficult. There are infinitely many ways to build constructions, and human intelligence is required to judge which one to choose for a given problem and how to use it.

There is a classic example: some 2,000 years ago, Euclid proved that there are infinitely many prime numbers. His proof goes as follows: suppose there are only finitely many prime numbers, say p₁, p₂, …, pₙ. Take the product of all these primes and add 1 to the product. Let’s call this new number p. That is, p = p₁p₂ … pₙ + 1. The question now is whether p is a prime.

If p is a prime, then since p is bigger than all the other primes, we have a new prime. However, this shouldn’t be possible because we assumed originally that there is only a finite number of primes. If p is not a prime, then one of the primes in our list must divide it; since that prime also divides the product of all the primes, it must divide the difference between p and the product, which is 1, and that is absurd. In sum, assuming there is only a finite number of primes leads us to absurdity, which means there have to be infinitely many primes.
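Euclid's construction is easy to try out on a concrete list. The sketch below takes the first six primes as a stand-in for the supposedly complete list (an arbitrary choice for illustration), builds p, and shows that every listed prime leaves remainder 1, so any prime factor of p must be missing from the list. The function names are illustrative.

```python
from math import prod


def smallest_prime_factor(n):
    """Return the smallest prime factor of n (for n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def euclid_construction(primes):
    """Return p = (product of the given primes) + 1."""
    return prod(primes) + 1


primes = [2, 3, 5, 7, 11, 13]
p = euclid_construction(primes)
print(p)                         # 30031
print([p % q for q in primes])   # [1, 1, 1, 1, 1, 1]
print(smallest_prime_factor(p))  # 59, a prime that is not in the list
```

In this case p itself is not prime (30031 = 59 × 509), which corresponds to the second branch of the argument: its prime factors are new primes.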

The auxiliary construction in this proof is constructing the number p. There are no particular restrictions for how we can come up with different constructions, and thus different ways to solve the problem. They simply require experience and deep insight.

What is the significance of AlphaGeometry?

Invariably, most geometry proofs require auxiliary constructions. Large language models like GPT-4, which is behind ChatGPT, can be taught to come up with possible constructions. One can train them to use rule-sets from different fields to build auxiliary constructions and use them to write proofs. However, there is no guarantee that the new constructions they devise will be able to lead to new proofs.

But when the AlphaGeometry team combined a language model of its own, trained on synthetic geometry proofs, with ‘Deductive Database’ and ‘Algebraic Rules’, the program could produce auxiliary constructions for geometry problems, with no prior human demonstration. This is a new development in the field, and in this sense, AlphaGeometry seems like a big step towards AI’s takeover of mathematics, which has thus far been a very human enterprise.
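The division of labour described here, a language model that proposes constructions and a symbolic engine that checks whether the goal then follows, can be sketched as a simple loop. Everything in this example is a hypothetical stand-in: the facts and rules are toy strings, and the fixed `PROPOSALS` list plays the role of the language model's suggestions. It is a sketch of the general propose-and-verify pattern, not AlphaGeometry's implementation.

```python
def deduce_closure(facts, rules):
    """Symbolic engine: apply rules (premises -> conclusion) until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def solve(facts, goal, rules, proposals):
    """Try pure deduction first; if stuck, add proposed constructions one at a time.

    Returns the list of constructions that were added before the goal was
    reached, or None if the goal was never reached.
    """
    known = set(facts)
    used = []
    for candidate in [None, *proposals]:
        if candidate is not None:
            known.add(candidate)
            used.append(candidate)
        known = deduce_closure(known, rules)
        if goal in known:
            return used
    return None


# Hypothetical toy problem: the goal is reachable only after adding "midpoint M".
RULES = [
    (frozenset({"given"}), "fact1"),
    (frozenset({"fact1", "midpoint M"}), "goal"),
]
PROPOSALS = ["circle through A", "midpoint M"]  # stand-in for a model's suggestions

print(solve({"given"}, "goal", RULES, PROPOSALS))  # ['circle through A', 'midpoint M']
```

The first proposal does not help, so the loop keeps it and tries the next one; once the useful construction is added, the deduction engine reaches the goal on its own.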

In all, AlphaGeometry could solve 11 more Olympiad geometry problems, bringing its tally to 25 out of 30 problems. It is also commendable that AlphaGeometry can write human-readable proofs and can draw diagrams to explain a proof. Once it did so, the team asked a coach of the U.S. Mathematical Olympiad to evaluate the proofs and grade them. The result: AlphaGeometry performed better than an average silver medallist.

The architecture developed for AlphaGeometry may not have been able to solve the other Olympiad problems, but the techniques developed for it are directly useful for solving problems from other areas of mathematics. The success of this project will certainly lead to the development of AI programs that can efficiently do mathematics at least at the school level.

Mohan R. is a mathematician at Azim Premji University, Bengaluru.
