ML and the Future of Simulation
A roundtable at SPC.

Simulation is essential for making progress in nearly every physical domain: drug discovery, aerospace engineering, weather forecasting, manufacturing, and more. And it’s changing, quickly.
We’re joined by two SPC members, Rui and Daanish, and three friends, Victor, Peetak, and Gabriel, to dive into how they see machine learning playing a more central role in the future of simulation.
First, a quick round of introductions.
Peetak: I completed my undergraduate studies in India before moving to the U.S. for my PhD in what was then an emerging field — scientific machine learning. The goal was to leverage advances in ML from domains like NLP and computer vision to tackle scientific challenges, particularly in modeling complex nonlinear dynamical systems such as fluid flows and combustion chemistry.
During my PhD, I focused on applying machine learning to fluid dynamics, developing surrogate models that significantly accelerated simulations while reducing computational costs. After completing my doctorate, I moved to the Bay Area and joined Xerox PARC, where I helped build the AI Climate team with funding from DARPA and NASA. Our work centered on creating AI-driven climate emulators for long-term projections, aiding our understanding of how climate evolves under anthropogenic influences.
In late 2022, I became part of the founding team at Excarta, an AI-powered weather intelligence startup that secured multiple rounds of funding. Our mission was to bring cutting-edge AI weather forecasting into real-world applications, bridging the gap between research and production. Afterward, I transitioned to Equilibrium Energy, an energy trading firm, and I’m now at Squarepoint, a global multi-strategy hedge fund.
Gabriel: I’ve been in the biophysics field for about eight years now. I started back home in Brazil, planning to do a pharmacy degree because, when I was a kid, I thought pharmacists created new drugs. That turned out not to be the case – but it was a reasonable assumption! During my degree, I got exposed to computational chemistry and eventually computational biophysics, and I really liked it.
I realized I needed a PhD if I was going to work on that, so I did a master’s (in Brazil, you typically need a master’s before a PhD). During my master’s, I got a chance to go to Yale for a summer. I really liked the research there and made connections. They asked, “Why don’t you do your PhD in the U.S.?” and I thought, “Sure.” Brazil wasn’t doing very well politically or economically, so it made sense.
I ended up doing my PhD at Brown University under Professor Brenda Rubenstein. Her lab mostly focuses on density functional theory and quantum mechanics. I was the odd one out doing biophysics, but it worked out. After that, I realized the field was changing quickly with generative methods, and I heard about Genesis Therapeutics. They recruited me, and I joined as a senior computational chemist about four months ago.
Rui: I went to Stanford for my undergrad; I was a systems guy there for a while. I graduated in about three years, then worked for a company called Rockset for a bit. They later got bought by OpenAI. I wrote a large part of their V1 compiler for the database.
Anyway, that made me realize AI was the clear future, so I went back and worked for Andrew Ng in the machine learning group at Stanford while doing my master’s. I focused on control algorithms for partially observed data spaces — basically, early agent-like systems using RL over large-scale scientific state spaces.
I started a company out of that lab called Expedock, which did decently well. It was an AI backend for many supply chain businesses, handling unstructured data in emails, spreadsheets, documents. We served everything from Fortune 500s down to mom-and-pop shops, with hundreds of customers by the time I left.
Over the past few months, I’ve gotten more into the engineering sciences field. I did a lot of complex systems engineering in college, and I find hardware and product engineering fascinating. I think all these systems are being underexplored in favor of large language models in Silicon Valley. There’s a lot you can do in the quantitative world that might be even more impactful than language-based models.
Victor: I spent a long time at Hugging Face, where I was one of the earliest employees. We were a five-person company when I joined, and I left at the beginning of summer 2024 to do something else. In my last three years there, I mainly trained and open-sourced large language models (state-of-the-art LLMs, VLMs, etc.). We focused on infrastructure, data curation, RLHF, and everything people are talking about now. I left to join another startup in computer vision/agents, where I led model training and evaluation. I left that in November and have been exploring new ideas since.
One product space I’ve been looking at is engineering tools: the software people use to build physical products. It feels a bit like coming home: I went to engineering school back in the day, and a lot of my friends are building real stuff — tunnels, or working on restoring the Notre Dame Cathedral — things in the physical world, not just software. I find that pretty cool, and that’s my rabbit hole into this space.
Daanish: My career has been about the interplay of physics and computation. At Stanford, I was in Electrical Engineering, specifically in semiconductor devices and nanophotonics. I began to feel that the precepts of computer science (in the tradition of Wolfram) had the potential to revitalize how we approach physical reality. Around the same time, the deep learning revolution started, and we began to see from early computer vision work how deep models were able to evolve suitable representations to solve a given problem — some of them humanly interpretable, others completely alien. More practically, the devices community also began to use ML and inverse design to create nanophotonic structures, dramatically simplifying the design process. While both fields were nascent, professionally I worked on applications that used the useful state of the art in vision and language models.
My study of NN training dynamics focused on the phase when a model evolves the right representations to solve its training task, and I was drawn to domains where pretraining was still relevant. Last year, I started the Deep Learning for Science forum at SPC, where we surveyed a range of engineering and scientific domains in which ML and automated design would have the greatest impact. Simultaneously, I started helping a few clients with computational physics modeling and simulation. Now I’m building tools that leverage neural advances for engineering design automation; we’ve found initial traction in photonics, as the precise molding of light continues to become extremely valuable across the semiconductor, energy, and consumer electronics industries.
Gopal: Let's get into it. Gabriel, could you walk us through a bit more of the past, present, and future of modeling within the biophysical domain?
Gabriel: This is topical because of the recent Nobel Prize for David Baker and the folks behind AlphaFold. When I joined the field, it was already mature: people were running molecular dynamics (MD), using homology modeling and docking. All these methods are physics-based. The idea is to translate the laws of nature into computational systems, then run enough calculations to figure out a given behavior.
One big bet that maybe didn’t fully pan out was DE Shaw Research’s work on building supercomputers for MD simulations. They’re basically solving Newton’s laws of motion (approximating and integrating them) for hundreds of thousands of atoms at each time step, which is only a few femtoseconds. If you want to simulate drug binding or unbinding, that might happen on the scale of a second. That’s trillions of calculations. Even on next-gen GPUs it might take months; on their Anton supercomputer it might take days or a week.
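To make that scale concrete, here is a quick back-of-the-envelope calculation. The numbers are representative assumptions for illustration, not figures from DE Shaw or Anton.

```python
# Rough arithmetic behind the scale Gabriel describes (representative values only):
timestep_s = 2e-15        # an MD time step of a few femtoseconds
event_s    = 1.0          # a binding/unbinding event on the order of a second
atoms      = 100_000      # atoms in a typical solvated protein system

steps = event_s / timestep_s          # ~5e14 time steps to cover one event
per_atom_updates = steps * atoms      # ~5e19 per-atom force/position updates
print(f"{steps:.1e} steps, {per_atom_updates:.1e} per-atom updates")
```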
Even so, we’re still not at a point where we can effectively and efficiently model drug-target binding. There are many reasons for that. Our ways of translating natural laws are limited, our understanding is limited, and our computational systems are limited. That said, these methods still work. Many drugs were discovered with them — but there’s something missing.
That’s where Genesis and a lot of other people in the field are coming in, using a lot of statistics. The cool thing is these physics-based methods generate very high-quality data, which you can feed into statistical learning models. You can create a virtuous cycle: high-quality data improves the model, which in turn helps you get closer to modeling reality. The interface between AI and physics is where I think we’ll see real progress.
Daanish: It's fascinating how you need to stay at a level of abstraction relevant to the problem. What aspects of structural biology are essential to drug development?
Gabriel: When you first join the field, people might say every receptor is a lock, every drug is a key, and you just need to find the right fit. But that’s oversimplified. Ideally, if someone comes to me with a drug discovery project and wants computational biophysics support, I'd like to see a solved structure — maybe via cryo-EM, crystallography, or something similar — and a few examples of previously defined chemical matter. But that’s still scratching the surface. Methods like AlphaFold, RosettaFold, and others can model huge complexes we can’t fully tackle with present physics-based methods.
Daanish: Isn’t a lot of biological modeling very empirically driven? Biologists are used to working at a much higher level of abstraction. For example, after AlphaFold we learned more about intrinsically disordered proteins that fold in certain contexts, which complicates our simplified pictures; biological systems remain vast and largely unknown.
Gabriel: That’s true — it’s partly empirical because there’s so much complexity in the cell. But there’s also emergent behavior in physics that we don’t fully understand, like phase separation.
People have poured a lot of money and effort into improving pure physics methods, but breakthroughs like AlphaFold use mostly statistics. It does rely on decades of research on how crystallography works, but still, no one expected coevolution to be so crucial. It’s hard to predict where the next revolutionary leap in methods will come from, but combining AI and physics seems to be the best bet.
Gopal: One recurrent thread I see people excited about is the idea of “physical foundation models,” or systems really adept at describing physical interactions in a more fundamental way.
I’m curious: do you think it’s possible or even useful to talk about a “foundation physics model?” Is that a goal worth pursuing, or is it misguided?
Peetak: That’s always been the promised land in science — having a foundational understanding, and thereby a foundational representation, of whatever you’re trying to model. Depending on who you ask, though, you’ll get ten different definitions of “foundation models,” especially in AI. While AI has been used in weather forecasting, it was previously mostly applied as a post-processing/correction step rather than at the forecasting stage itself.
In terms of how the revolution in AI weather forecasting came about, we know AI basically needs three things to flourish: compute, algorithms, and data. Luckily, the field of meteorology has seen public and governmental investment over many decades because of its importance to life and property. Due to the public nature of these investments, the data and models have been released openly over time. One such group in Europe, ECMWF, released a dataset called ERA5, which is one of our most accurate representations of global weather over the past 40 years. That was the tipping point: once you have a very clean, comprehensive dataset, everyone comes together to build algorithms and benchmarks around it.
Since weather behaves as a Markovian process (the future weather state depends on the current and past weather), forecasting can be framed as a next word/scene prediction problem, just like an LLM. Similar to LLMs, we saw that when you throw more compute and data at these models, they learn better. Early models used transformers, operator models, graph models, and so on. Now everyone’s trying to build bigger or better models to produce more accurate forecasts, squeezing out any possible value from the data.
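To illustrate that framing, here is a minimal sketch of an autoregressive weather rollout. The tiny convolutional model, grid size, and variable names are purely illustrative assumptions, not any production forecasting architecture.

```python
import torch

class WeatherStepper(torch.nn.Module):
    """Maps the current atmospheric state to the state one step (e.g., 6 hours) ahead."""
    def __init__(self, channels=4):
        super().__init__()
        self.net = torch.nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, state):            # state: (batch, channels, lat, lon)
        return state + self.net(state)   # predict the increment, residual-style

model = WeatherStepper()
state = torch.randn(1, 4, 64, 64)        # e.g., an ERA5-like snapshot of a few fields
rollout = []
for _ in range(20):                      # roll the model forward, like next-token prediction
    state = model(state)
    rollout.append(state)
```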
But the real value, and the bottleneck, is what happens just outside that very structured dataset: observations from radar stations, weather balloons, and so on. How do we incorporate that data to make even more accurate and actionable forecasts? Right now, the biggest focus in the field is shifting from a curated dataset that mostly works for “general weather” to unstructured data for extreme weather.
And extreme weather is really what matters, especially with climate change. Storms intensify more quickly than physical models predict, creating havoc for communities. So we need better, more timely forecasting. AI can leverage unstructured data that previous generations of first-principles models haven’t used after initialization. The reason AI did well with the weather problem at first was the availability of this excellent curated dataset and its benchmarks. Now the next evolution will come from using unstructured data, and many research groups and startups are looking at this exciting space.
On that note, Gabriel, I’m curious about your experience in biology. Is the biggest bottleneck data, lack of understanding of physical mechanisms, or something else? What’s the key to creating a “foundational model” there?
Gabriel: Great question. I don’t know how weather measurement methods evolved historically, but, in biology, the secret is that our techniques aren’t exceptionally better now than they were in, say, 2010. We still rely on crystallography, which often involves freezing your protein of interest. There’s cryo-EM, which is better in some ways but has resolution issues.
The largest bottleneck is data quality. Are we even measuring what we think we’re measuring? The cheapest, highest-throughput drug-binding tests (like IC50) are indirect. Is the cell dying? Is there a readout at a certain wavelength? A temperature change? We assume these things represent “binding,” and that’s usually reasonable, but these methods have major resolution caveats. They don’t necessarily differentiate the dynamics of binding.
So when we build methods for predicting receptor-ligand complexes using IC50 or SPR results, we’re guiding the model with indirect data. We get models that do well at predicting experiments, but experiments themselves are an abstraction from reality. It’s similar to AlphaFold predicting crystal structures, which are incredibly useful but aren’t the whole story of biology.
The Protein Data Bank was a perfect storm: well-curated, highly structured data that was ideal for AI. But that’s not the case for most other biological data. Many useful models have been developed from these indirect measurements, but my intuition is that none will be as impactful as AlphaFold was, unless we improve data quality further.
Victor: There’s something beautiful in the weather prediction space: we’ve built infrastructure for decades, and much of that data is publicly available. It levels the playing field. But in biology and many engineering sciences, we lack the data volume that allows for anything like a foundation model. To your point, Gabriel, a lot of the bottleneck is the data availability.
Gopal: Given your work in open-source, Victor, how do you imagine that community creating progress in this domain? Which of those three inputs — compute, algorithms, data — are most ripe?
Victor: One mechanism people often overlook is benchmarks. If you look back 10–15 years, ImageNet was a huge driving force for deep learning research. It was just cats and dogs, but that competition rallied the community, and progress soared. A similar open benchmark for engineering sciences — say, thermodynamics prediction — could be a huge catalyst. Something like an XPrize for these domains.
Rui: Actually, there is the Johns Hopkins Turbulence Dataset, which isn’t exactly a competition with a prize, but it’s close to what Victor’s describing. It’s not as big as the entire internet, but it’s on the order of 100 terabytes of isotropic turbulence simulations.
We haven’t fully solved any of the three pillars — data, compute, or algorithms — in these scientific domains. With text, we had the internet, and that was enough. Transformers work well, and we have enough compute. But if you look at turbulence, you can have 100 terabytes for just one type of simulation. Meanwhile, GPT might have trained on 500 terabytes total.
Synthetic data wasn’t so big in language, but it might help here. You can generate more data by running a supercomputer. On the other hand, these state spaces can be extremely large and chaotic. The architecture may need to catch up.
Peetak: With turbulence, those 100 terabytes might be from just a handful of simulations because they come from direct numerical simulation (basically, solving Navier-Stokes on a very fine grid, which is unimaginably expensive given the numerical limits and the poor scaling that comes with so many degrees of freedom). That’s our “ground truth,” but it’s so expensive we can’t do it for every scenario.
One area that could be a game-changer is incorporating conservation laws directly into the architecture so the model respects physics. ML models can produce outputs that violate physical laws, so you need inductive biases in the architecture. There’s a long way to go, but that’s the next revolution in bridging AI and physics.
As an example, Google has its NeuralGCM, which uses large-scale equations for weather forecasting on a coarse grid, then adds smaller-scale learned parameterizations using ML. That approach yields more accurate results, runs faster, and stays stable over longer rollouts. So that’s a glimpse at how physics and AI might mesh going forward.
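One minimal sketch of what “baking a conservation law into the architecture” can look like, under the simplifying assumption of a single conserved quantity whose spatial integral should not change from step to step. The wrapper and names are illustrative, not taken from NeuralGCM or any other published model.

```python
import torch
import torch.nn as nn

class ConservingHead(nn.Module):
    """Wraps any backbone so the predicted field keeps the same global integral
    (e.g., total mass) as the input field: a hard inductive bias in the
    architecture rather than a soft penalty in the loss."""
    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone

    def forward(self, state):                       # state: (batch, 1, H, W)
        raw = self.backbone(state)
        # Shift the prediction so its spatial mean matches the input's,
        # enforcing conservation of the integrated quantity exactly.
        shift = state.mean(dim=(-2, -1), keepdim=True) - raw.mean(dim=(-2, -1), keepdim=True)
        return raw + shift

model = ConservingHead(nn.Conv2d(1, 1, kernel_size=3, padding=1))
x = torch.rand(2, 1, 32, 32)
y = model(x)          # y and x now have identical spatial means, by construction
```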
Rui: Did AlphaFold have inductive bias? Because there's a world where you just have a big enough model, enough data, and you don’t need explicit inductive bias. Maybe the “bitter lesson” is we shouldn’t over-engineer these constraints.
Victor: I might be on the extreme side of the “bitter lesson.” I’ve trained a lot of transformers in NLP and vision, and the simplest scaling approach often wins. When the transformer started taking off, there were two solid years of people modifying it — tweaking softmax, messing with positional encoding — but then Google did a big empirical comparison and found the original transformer was still the best baseline.
So all that research on smaller or more specialized approaches didn’t beat the straightforward approach that scales really well. Not sure how that translates to engineering science, but a strong baseline might be: just do the thing that scales well, with a recipe we know works.
Gopal: Imagine a future many, many years from now where Ansys, COMSOL, and similar companies have been toppled by new startups. If we work backward from that scenario, what needs to happen in the near-to-medium term?
Rui: I don’t think Ansys dies in 10 years. Maybe they get bought and shut down by a bigger company. Talking to people in the field, the customer experience for that software is terrible — it’s expensive, has a massive rollout cost, it’s on your local computer, etc. But there’s a reason it’s built that way.
The reason no one has killed it with a better cloud experience or something is that fidelity and usefulness matter a lot, especially for big, regulated industries. Think about launching a rocket. No one is going to trust a big transformer model with that. There’s also the regulatory aspect: if you launch a satellite on SpaceX, you have to show them your heat transfer data from an Ansys result or something similar.
Plus, there’s the sunk-cost fallacy. A place like Lockheed or Boeing has trained thousands of engineers on Ansys over decades. Any new tool would have to be phenomenally better for them to switch. So if you really want to disrupt that space in 10 years, you’d need something magical that leaps over regulatory hurdles and is so good that companies will rip and replace an entrenched workflow.
Peetak: In the early days of Excarta, we’d approach people with, “We have an AI-based weather forecast,” and it felt like we were a novelty act, intriguing but not yet taken seriously. We could show metrics, but trust was so low compared to physics-based models. Eventually, end users tried our products, we showed our value in those top 99th-percentile scenarios, and that built trust. It also helped that the broader meteorology community embraced this new, game-changing technology.
Now it’s commonplace to see AI-based weather forecasting products. Google phones use MetNet-based AI weather forecasting technology, and so does Microsoft with Bing. During the last hurricane season, for the first time to my knowledge, AI-based model trajectories (for hurricane tracks) were shown alongside the European and American physics models. Some fields will grow faster than others.
Gopal: Victor, I’d love for you to expand on something you wrote in the chat here – “the cost of a rocket exploding is the forcing function.” If trust is the main hurdle, and that’s why you have strong regulation, how do you build a product for a domain with such a high bar for trust?
Victor: You never want full automation. Nobody is going to pay for a system that runs end-to-end with no human in the loop. In software, you can just push to the App Store and see what happens, and it usually doesn’t kill anyone. But if you launch a train and it explodes, or a rocket fails, that’s a different world. So for these critical systems, you need humans checking, validating, verifying. That’s built into the product.
Rui: Maybe I’m wrong, but for weather, you got something more accurate, and that replaced the older methods. Could you do the same with AI for a rocket? Would you trust an AI model for that?
Peetak: I’d push back slightly: physics-based weather models are still quite accurate, but they’re slow and expensive to run. Historically, new versions got released every two or three years, after big government agencies and scientific organizations decided on updates. The AI-based weather forecasting paradigm has chewed away at the traditional advantages of capital, and we’ve seen new models released every few weeks, similar to what we see with LLMs and DeepSeek.
That doesn’t mean the physics-based models are gone. In fact, many AI-based models get initialized with a physics-based simulation. We run a quick numerical simulation to approximate the current state of the atmosphere, then the AI model uses that state to predict future states. Some groups are trying to go purely observational, but the early results are not that promising yet, because the initial state is unreliable and these systems are highly sensitive to initial states.
Gopal: It seems like we’re triangulating on an accuracy-speed-trust trade-off. Classical physical simulations may be slow but extremely accurate. Novel AI-based methods might be faster but have potential accuracy or trust gaps. Do you subscribe to that view? And second, for photonics, in particular, what’s relevant in this conversation?
Daanish: The accuracy-speed-trust simplex definitely exists, but I think we’re overestimating how much existing hardware companies are able to push the physical envelope. Very few places do truly cutting-edge physical engineering; there’s the occasional SpaceX or Boom. But most “engineering” is recombining parts out of stable, well-understood tech. Simulation software is not, by itself, a load-bearing component for safety.
I can imagine a product like a Figma for physics: something faster, but with toggled precision. It could help you figure out different levels of modeling or how to approach a problem from multiple angles. Today’s engineering design tools are not co-pilots for the imagination; they just help you staple together existing IP. If deep models deliver meaningful speedups (say, 10,000x) in physical simulation domains, that would open up entirely new product categories.
For photonics, which I work on, we’re talking about electromagnetic phenomena / light. The semiconductor industry wants to integrate light more deeply, so you have new photonics companies making single parts. But who’ll manufacture them? You can’t use a part in a product until there’s a scalable production process for it. In addition, there are tons of fabrication imperfections, so as a designer, it’s impossible to hold every tolerance in your head. You need a big “napkin” to visualize these multidimensional constraints. And then an agent could iterate through the design space, finding dead ends or promising paths that you’re not able to consider. That might not require perfect accuracy. Sometimes an order-of-magnitude estimate is enough for a first sketch.
Second, we have a very small number of professional engineers and computational physicists (< 80,000), and each job requires hyper specialization. We need a lot more “brain flops” with general ability to model the physical world to deliver on the pressing physical world problems we are facing in energy, climate and computation.
Rui: That’s almost a user experience question. You can imagine AI unlocking that experience, but maybe you could do the same with classical solvers. There aren’t that many potential customers, so how do you find a big enough market to justify building something like that?
Peetak: One thing we saw when we worked with some car manufacturers during my PhD — using AI for CFD — was this concept called surrogate-based optimization. They wanted a cheap surrogate model to explore the design space quickly, then run the big, expensive simulations only on a few promising cases. That unlocked a lot of value. In the CFD world, at least, people definitely benefit from that kind of approach.
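Here is a toy sketch of that surrogate-based optimization loop, under assumptions of my own: the Gaussian-process surrogate and the stand-in objective below are illustrative, not what any manufacturer actually ran. The pattern is to fit a cheap surrogate on a handful of expensive runs, screen a large candidate pool with it, and send only the shortlist back to the full solver.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_cfd(design):                 # stand-in for a full CFD evaluation
    return float(np.sin(3 * design).sum())

rng = np.random.default_rng(0)
seed_designs = rng.uniform(-1, 1, size=(20, 4))        # small initial design of experiments
seed_scores = np.array([expensive_cfd(d) for d in seed_designs])

surrogate = GaussianProcessRegressor().fit(seed_designs, seed_scores)

candidates = rng.uniform(-1, 1, size=(10_000, 4))      # cheap to screen with the surrogate
predicted = surrogate.predict(candidates)
shortlist = candidates[np.argsort(predicted)[:5]]      # most promising few designs

verified = [expensive_cfd(d) for d in shortlist]       # only these get the expensive run
```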
Rui: Surrogate models are useful because they’re faster, right? You’re doing something in constant time that would normally be exponential or whatever it takes to solve Navier-Stokes or other equations. But I wonder whether hardware will catch up first or AI will catch up first. GPUs are already pretty good. People run 150 million-cell meshes in under three minutes. Does it make sense to put an AI model in there? Or do we just keep using GPU-based solvers? Maybe it becomes obsolete as we invest billions in better chips to do linear algebra so fast that AI isn’t needed for these particular problems. I’m not sure.
Peetak: That’s a great point. In CFD, hardware improvements have really driven progress for decades, more than changes on the physics side. Better CPU chips used to make CFD dramatically faster. I’m curious what needs to change in GPUs (currently built for linear algebra) to handle more scientific computation. You still have a lot of matrix inversions and such in physics-based solvers. Any thoughts?
Rui: My quick take is that the hard part is usually the software layer, not the hardware. It’s straightforward (though not trivial) to build a chip that does linear algebra really well. The bigger issue is the kernel layer: doing it efficiently, especially for very large problems where you can’t fit everything in GPU memory, makes you network-bound. That needs algorithmic improvements and better software libraries. Then, sure, maybe specialized hardware (like a TPU for scientific computing) could help, but you’re talking about hundreds of millions in chip design costs. NVIDIA did it for graphics, but I’m not sure how big this market is.
Daanish: If only it were cheaper to design and build chips. I tried doing the math. GPU acceleration helps, but not enough to change the equation fundamentally.
It’s tricky because it’s not like language data, which is easy to gather. Language is shaped by millennia of human intelligence, so extracting patterns is somewhat straightforward. Extracting the right level of abstraction from a wind tunnel’s raw data is harder. That’s where architecture and inductive biases come in, e.g., ensuring conservation of energy, or selecting the right modeling layer for a given problem.
Victor: I’d slightly disagree and maybe pose another question. We already train diffusion models and now video models that rely on fancy transformers plus a U-Net. They don’t have huge inductive biases beyond what we do in NLP or computer vision. If RGB values are basically how humans see the world, you can make a case that it’s similar to how a physical sensor might see the world, so maybe you don’t need as many explicit constraints.
Daanish: We’re generating so much more video data. It captures those patterns in the visual world, which most life with eyes shares.
Victor: I get that language is a more condensed form of intelligence, but pixels are still just raw signals.
Rui: There’s another human aspect: even if faster scientific simulation is nice, the time it takes to build a rocket or a chip isn’t purely about simulation. Manufacturing dominates. Faster sims are great, people will pay for it, but if it reduces a 60-minute sim to 10 minutes, how much does that really change the bottom line?
If you reduce computational complexity, maybe you can do a lot more. CFD matters because running a wind tunnel test can be prohibitively expensive. If you can do something a billion times cheaper with a neural net than with a supercomputer, you could red-team it. Test every possible condition, parameterize everything super fast, do validation quickly.
There’s also this subfield of physics-based ML where you estimate real-world conditions more easily. Someone on an F1 team told me they have sensors everywhere on the car, and they want fluid flow calculations in real time during the race.
That’s impossible with classical methods, but neural nets with physics-based biases plus sensor data can do it accurately.
So while it’s not necessarily about speeding time-to-market for a product, there are many real-world use cases (uncertainty quantification, validation, optimization) where having a faster surrogate is a huge advantage.
Peetak: Absolutely. Software can make things more efficient or add novel value. In robotics, for instance, NVIDIA has the Isaac SDK for simulation. You don’t want to build a new environment for every robot; you want to simulate scenarios for self-driving, etc. Companies like Parallel Domain create virtual worlds to train your self-driving stack before you ever deploy a car.
In the energy industry, you might be trading against other players in a complex game. AI agents built on top of simulations, physics, and known patterns can help you figure out how to trade better. There’s real value in combining ML with simulation.
Rui: It’s like the early days of computing: once chips got 10 billion times cheaper, you could do word processing. I think there’s some analogy in scientific computing that we haven’t fully nailed, but hopefully we’ll get there as simulation gets more affordable.
Daanish: And then there’s inverse design. We already do that for small devices: specify the properties you need, and the geometry is generated automatically. That’s used in chip fabrication and other small-scale work. As manufacturing evolves, we might do more of that. “Here are the properties I want,” and the structure just gets printed.
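A toy version of that inverse-design loop, assuming a differentiable simulator; the simulate function below is a made-up stand-in, not a real electromagnetic solver. You specify the target response, then let gradient descent search the geometry parameters.

```python
import torch

def simulate(geometry):                          # stand-in for a differentiable EM solver
    return torch.tanh(geometry @ torch.linspace(-1.0, 1.0, geometry.shape[-1]))

target_response = torch.tensor(0.8)              # "here are the properties I want"
geometry = torch.randn(16, requires_grad=True)   # design/geometry parameters
optimizer = torch.optim.Adam([geometry], lr=0.05)

for step in range(500):                          # gradient-based search over the design space
    optimizer.zero_grad()
    loss = (simulate(geometry) - target_response).pow(2)
    loss.backward()
    optimizer.step()
```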
Gopal: There’s a concept we talk about internally at SPC, around waves, surfboards, and surfers. Waves are big exogenous tailwinds, surfboards are the products built to ride those waves, and surfers are the people making it happen. So which waves are you all interested in? What other open questions keep cropping up?
Rui: When designing a physical system, you have the computerized simulation (solving differential equations) and the human reasoning side (“what do these results mean for my design?”). We’re automating both in different ways. Billions are going into large language models for reasoning, and GPU-based solvers and neural operators for the actual physics.
If you’re building a company, you have to decide which parts to automate, where the human is still necessary, and how these two progress together. That’s interesting for the future of scientific modeling.
Peetak: And in scientific progress, accuracy is everything. You can’t tolerate hallucinations like with ChatGPT. So we need robust metrics. For example, in turbulence modeling, you might look at the spectral behavior of velocity fields. Once the community figures out these domain-specific metrics for model evaluation, that’ll build trust and make these tools scalable. Hallucinations might be cute in NLP, but they’re unacceptable in a physical model.
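As one concrete example of such a domain-specific metric, here is a sketch under simplifying assumptions (a 2D periodic velocity field on a uniform grid; names and placeholder fields are illustrative): compare the kinetic-energy spectrum of a model’s output against the reference field, scale by scale, rather than just the pointwise error.

```python
import numpy as np

def energy_spectrum(u, v):
    """Radially binned kinetic-energy spectrum of a 2D periodic velocity field."""
    n = u.shape[0]
    ke_hat = 0.5 * (np.abs(np.fft.fft2(u))**2 + np.abs(np.fft.fft2(v))**2)
    kx, ky = np.meshgrid(np.fft.fftfreq(n, d=1.0 / n), np.fft.fftfreq(n, d=1.0 / n))
    k = np.rint(np.sqrt(kx**2 + ky**2)).astype(int)
    return np.bincount(k.ravel(), weights=ke_hat.ravel())   # E(k) per wavenumber bin

# Trust check: predicted and reference spectra should agree across scales.
u_pred, v_pred = np.random.randn(64, 64), np.random.randn(64, 64)   # placeholder fields
u_true, v_true = np.random.randn(64, 64), np.random.randn(64, 64)
spectral_error = np.abs(energy_spectrum(u_pred, v_pred) - energy_spectrum(u_true, v_true))
```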
Victor: Another thought experiment: we’re building the next generation of software or intelligence for product development. What about the next generation of manufacturing? Do we just make existing products faster, or do we fundamentally change what we’re building? Maybe mass personalization. In 15 years, how will these building blocks (CFD, CAD, etc.) transform the physical world?
Daanish: It’d be great if hardware building blocks updated as seamlessly as software packages. Imagine automatically improving every year. Right now, that’s not how hardware works.
Rui: We probably need better automation in physical manufacturing. Simulation is definitely key, but so are robotics and assembly lines. Maybe we get huge speedups on both sides, like faster manufacturing plus new materials or designs from advanced models, and it becomes a virtuous cycle.
Peetak: From my experience applying it in scientific domains, like climate change, AI isn’t a panacea. It’s a tool, and you have to be very careful how you use it. There’s so much hype that people want to throw AI at everything, but scientific domains need careful validation, metrics, and evaluations. As we’ve said, there are great opportunities, but they need proper rigor.