Talk
With a focus on healthcare applications where accuracy is non-negotiable, this talk highlights the challenges of building AI agents that query complex biological and scientific data to answer sophisticated questions, and delivers practical insights for doing so. Drawing from our experience developing Owkin-K Navigator, a free-to-use AI co-pilot for biological research, I'll share hard-won lessons about combining natural language processing with SQL querying and vector database retrieval to navigate large biomedical knowledge sources, while preventing hallucinations and ensuring proper source attribution. This session is ideal for data scientists, ML engineers, and anyone interested in applying Python and the LLM ecosystem to the healthcare domain.
Basic familiarity with Python and LLM concepts will be helpful but is not required.
The growth of scientific healthcare literature and publicly available biomedical databases has created many opportunities for researchers, but also significant challenges. While large amounts of biological data are now freely available, finding and connecting relevant information across disparate sources remains time-consuming and complex. LLM-powered tools offer promising solutions, but implementing them in healthcare, where accuracy can impact patient outcomes, requires specialised approaches and careful design.
This talk will share practical lessons and technical strategies for addressing hallucinations, complex domain-specific terminology, and source citation.
The presentation will be structured into three main sections:
The challenge of scientific data retrieval (5 mins)
Technical architecture for LLM-powered scientific search (15 mins)
Lessons learned and future directions (5 mins)
Throughout the talk, I'll provide concrete examples of how these technologies can be applied to real research questions in a production environment, demonstrating the practical value of AI agents in accelerating scientific discovery.
Intended audience: This talk is designed for data scientists, ML and software engineers, bioinformaticians, and researchers interested in leveraging AI for scientific data retrieval and analysis.
While examples will focus on biological data, the principles and techniques discussed are applicable across scientific domains. Basic familiarity with Python and AI concepts will be helpful but is not required.
Senior Machine Learning Engineer
I have worked in the healthcare industry for more than 10 years and am currently a senior machine learning engineer at Owkin. Committed to open source and open science principles, I aspire to leverage Python and data science for social good, focusing on health, inclusion, and projects that make a meaningful difference in people's lives.