Knowledge Graphs in Drug Discovery part 9

June 12, 2024 03:00 PM Europe/London

The webinar conference will last approximately 2.5 hours. If you register for the event, you will receive the recording when it is ready via email even if you are unable to attend the live event. Please follow Biorelate on LinkedIn for more webinars and data science news.

Talks will include:

Using Retrieval Augmented Generation Approaches To Gather Information for Drug Discovery
Jon Hill (Boehringer Ingelheim)
Large language models have captured the public imagination, but how can they be useful for work in drug discovery? What about the risks of introducing false information? This talk will provide an overview of pragmatic approaches to using LLMs in early research, including the management of hallucination. These models are increasingly accessible to non-AI experts and, for users who are aware of their limitations, can improve the speed and comprehensiveness of the information that you bring to your research.

Multi-modal Knowledge Graphs for Precision Oncology
Miguel Gonçalves (AstraZeneca)
In this talk, I will go over the work we have been doing with Knowledge Graphs to uncover clinical insights in specific indications by incorporating multi-modal data. I will describe our flexible KG approach and provide examples of how this is making an impact at AZ.

Ask ARCH: LLM Question Answering over Large-Scale Knowledge Graphs
Jon Stevens (AbbVie, Inc.)

Knowledge graphs provide a vehicle for grounding LLM answers in harmonized structured data, reducing hallucinations and allowing easy fact-checking. In turn, LLMs provide a natural way for end users to query knowledge graph data, without requiring a query language or deep understanding of database structure. We present our integration of AbbVie's 30-million-node R&D knowledge graph, the ARCH Graph, with GPT-based LLMs to create a scientific question-answering system. The ARCH Graph is a Neo4j graph that harmonizes and connects molecules, drugs, genes, health conditions, and other entities from a variety of data sources, allowing scientists to make connections between disparate data points. However, querying the graph can be challenging for end users without a natural language interface. The new Ask ARCH Graph provides such an interface, allowing users to ask questions in natural language (e.g., "What genetic markers are associated with acute myeloid leukemia?") and receive natural language answers ("Some genetic markers associated with acute myeloid leukemia include PICALM (ENSG00000073921), CEBPA (ENSG00000245848), ...") along with the underlying data and the Cypher query used to retrieve it. To achieve this, the system uses a combination of vector search, Cypher query generation and validation, and LLM-based summarization of the Cypher output. The process of accurately retrieving information from a large-scale knowledge graph is more complex and less researched than simpler RAG methods on document corpora. We discuss the evolution of our approach and evaluate its accuracy and performance. The integration of LLMs with knowledge graphs helps reduce hallucinations, improve reliability in specialized domains, enhance reasoning with context, and enable dynamic and interactive knowledge discovery.
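
For readers unfamiliar with this kind of pipeline, the three stages named in the abstract (vector search, Cypher generation and validation, summarization) can be illustrated with a minimal Python sketch. This is not AbbVie's implementation, which the abstract does not detail. Everything here is an assumption made for illustration: the toy schema (Gene, Condition, ASSOCIATED_WITH), the bag-of-words stand-in for an embedding model, and the function names. The LLM and the graph database are passed in as plain callables so the sketch runs without any external service.

```python
import math
import re
from collections import Counter

# Hypothetical toy schema; a real graph would expose many more labels and relationships.
ALLOWED_LABELS = {"Gene", "Condition"}
ALLOWED_RELS = {"ASSOCIATED_WITH"}
WRITE_CLAUSES = re.compile(r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP)\b", re.IGNORECASE)


def embed(text):
    """Toy bag-of-words vector standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def link_entity(question, entity_names):
    """Vector search step: pick the entity name most similar to the question."""
    q = embed(question)
    return max(entity_names, key=lambda name: cosine(q, embed(name)))


def validate_cypher(query):
    """Validation step: reject write clauses and anything outside the toy schema."""
    if WRITE_CLAUSES.search(query):
        return False
    labels = set(re.findall(r"\(\w*:(\w+)", query))
    rels = set(re.findall(r"\[\w*:(\w+)", query))
    return labels <= ALLOWED_LABELS and rels <= ALLOWED_RELS


def answer(question, entity_names, generate_cypher, run_query, summarize):
    """Chain the stages; the three callables are placeholders for an LLM and a graph client."""
    entity = link_entity(question, entity_names)
    query = generate_cypher(question, entity)
    if not validate_cypher(query):
        raise ValueError(f"Generated query failed validation: {query}")
    rows = run_query(query, {"name": entity})
    # Return the data and the query alongside the summary so the answer can be checked.
    return {"answer": summarize(question, rows), "rows": rows, "cypher": query}


if __name__ == "__main__":
    # Stub callables with canned outputs, used only to show how the stages connect.
    cypher = "MATCH (g:Gene)-[:ASSOCIATED_WITH]->(c:Condition {name: $name}) RETURN g.symbol AS symbol"
    result = answer(
        "What genetic markers are associated with acute myeloid leukemia?",
        ["type 2 diabetes", "acute myeloid leukemia"],
        generate_cypher=lambda question, entity: cypher,
        run_query=lambda query, params: [{"symbol": "CEBPA"}],
        summarize=lambda question, rows: "Markers: " + ", ".join(r["symbol"] for r in rows),
    )
    print(result)
```

Returning the rows and the Cypher text together with the summary mirrors the abstract's point that users receive the underlying data and query for fact-checking. In practice the stubs would be replaced by calls to an LLM and a graph database client.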

At this free-to-attend virtual conference for the biopharma data professional community, speakers from across biopharma research give 30-minute presentations on knowledge graphs, NLP, and related topics of interest to the data science, bioinformatics, computational biology, and wider biopharma communities. The event closes with a roundtable Q&A session with all of the speakers.

Our previous conferences have brought in an average of over 150 attendees each and have featured speakers from organisations including Roche, AstraZeneca, Boehringer Ingelheim, Bayer, Novo Nordisk, and NASA, as well as many other leaders in biopharma research. Most recent webinar recordings can be accessed here and some recordings are also available on YouTube.

Jon Hill

Principal Scientist, Boehringer Ingelheim

Jon Hill is a Senior Principal Scientist at Boehringer Ingelheim with over twenty years of experience, currently focusing on discovery and validation of new therapeutic concepts for liver disease. Although his core role relies on transcriptomics and similar technology, he has found unstructured data, such as text, to be an invaluable complement.

Miguel Gonçalves

Senior Biomedical Informatics Scientist - Oncology, AstraZeneca

Miguel Gonçalves is a Senior Biomedical Informatics Scientist at AstraZeneca, currently focusing on the implementation of multi-modal Knowledge Graphs for patient stratification and biomarker discovery. He has experience in both early-stage and late-stage pharmaceutical R&D in multiple therapeutic areas. Miguel has a background in Biomedical Engineering and completed a PhD in Biophysics at UCL, focusing on the tumour microenvironment using magnetic resonance imaging.

Jon Stevens

AI Language Capability Lead, Information Research, AbbVie

Jon Stevens is part of AbbVie's RAIDERS team and a founding member of a new team dedicated to bringing generative AI solutions to the enterprise. He received his PhD in Linguistics in 2013 from the University of Pennsylvania, after which he worked as a researcher in computational linguistics and cognitive science for five years before joining AbbVie as an NLP Engineer in 2018. When he is not harnessing the power of language models, Jon enjoys playing video games with his family and noodling on the banjo.
