Learning Advance: Robotics-LLM Guided Hypotheses Generation for the Discovery of Chemical Knowledge

Tianzhixi Yin, Ruozhu Feng, Jie Bao, Peiyuan Gao, Yangang Liang, Heather Job, Alán Aspuru‐Guzik, Wei Wang

Year
2025
Citations
4

Abstract

We present a framework, which we call "Learning Advance", for generating and validating hypotheses in the discovery of chemical knowledge, applied here to optimizing solubility in amphiphile/water systems. The workflow begins with an initial hypothesis: that incorporating common hydrotropic additives, such as sugars or urea, raises solubility limits. To test this hypothesis, we use grid search and Latin hypercube sampling to design experimental combinations of additive weight percentages. High-throughput robotic systems automate the experiments, and a YOLO-based image analysis workflow determines the degree of solubilization. Experimental data are transformed into a chemical feature space to train a Gaussian Process Regression (GPR) model, which drives a Bayesian optimization (BO) algorithm for identifying optimal additive combinations. When BO plateaus, the "Learning Advance" approach uses all accumulated data for AI analysis: we extract correlations between the target property and the chemical features, enabling LLM tools to generate a new hypothesis from the observed data. This hypothesis is then validated experimentally, creating a continuous cycle of discovery. The framework demonstrates how integrating BO with AI-driven hypothesis generation enables progress beyond conventional optimization limits, establishing a promising approach for advancing scientific knowledge discovery in materials science and chemistry.
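The experimental-design step mentioned in the abstract can be illustrated with a minimal sketch of Latin hypercube sampling over additive weight percentages. This is not the authors' implementation: the function name `latin_hypercube`, the two additive names, and the 0 to 10 wt% ranges are assumptions chosen for illustration only. The sketch uses only the Python standard library.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that each dimension is split into
    n_samples equal-width strata and every stratum is hit exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of the unit interval.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(unit)
        # Rescale from [0, 1) to the requested range.
        columns.append([low + u * (high - low) for u in unit])
    # Transpose per-dimension columns into per-experiment rows.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical additive ranges in weight percent (assumed, not from the paper).
BOUNDS = {"sugar_wt_pct": (0.0, 10.0), "urea_wt_pct": (0.0, 10.0)}

design = latin_hypercube(8, list(BOUNDS.values()), seed=42)
for point in design:
    print(dict(zip(BOUNDS, (round(v, 2) for v in point))))
```

Each printed row is one candidate experiment. Compared with a plain grid, this design covers every stratum of each additive's range with only eight runs, which is why the two approaches are often used together for an initial screen.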

Keywords

Artificial intelligence, Robotics, Computer science, Knowledge extraction, Data science, Knowledge management, Robot
