Jon Stevens (AbbVie, Inc.) - Ask ARCH: LLM Question Answering over Large-Scale Knowledge Graphs

June 14, 2024

Knowledge graphs provide a vehicle for grounding LLM answers in harmonized structured data, reducing hallucinations and allowing easy fact-checking. In turn, LLMs provide a natural way for end users to query knowledge graph data without requiring a query language or a deep understanding of database structure. We present our integration of AbbVie's 30-million-node R&D knowledge graph, the ARCH Graph, with GPT-based LLMs to create a scientific question-answering system. The ARCH Graph is a Neo4j graph that harmonizes and connects molecules, drugs, genes, health conditions, and other entities from a variety of data sources, allowing scientists to make connections between disparate data points. However, querying the graph can be challenging for end users without a natural language interface. The new Ask ARCH Graph provides such an interface, allowing users to ask questions in natural language (e.g., "What genetic markers are associated with acute myeloid leukemia?") and receive natural language answers ("Some genetic markers associated with acute myeloid leukemia include PICALM (ENSG00000073921), CEBPA (ENSG00000245848), ...") along with the underlying data and the Cypher query used to retrieve it. To achieve this, the system combines vector search, Cypher query generation and validation, and LLM-based summarization of the Cypher output. Accurately retrieving information from a large-scale knowledge graph is more complex and less studied than retrieval-augmented generation (RAG) over document corpora. We discuss the evolution of our approach and evaluate its accuracy and performance. The integration of LLMs with knowledge graphs helps reduce hallucinations, improve reliability in specialized domains, enhance reasoning with context, and enable dynamic and interactive knowledge discovery.
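
The stages named in the abstract (vector search, Cypher generation and validation, summarization of the output) can be sketched roughly as follows. This is a toy illustration, not the Ask ARCH implementation: the schema (`Gene`, `Condition`, `ASSOCIATED_WITH`), the entity list, and every function name are hypothetical, a bag-of-words similarity stands in for a real embedding model, a fixed template stands in for LLM query generation, and an in-memory lookup stands in for the Neo4j database. The two gene symbols are taken from the example answer quoted above.

```python
import math
import re
from collections import Counter

# Hypothetical toy schema; the real ARCH Graph schema is not given in the abstract.
ALLOWED_LABELS = {"Gene", "Condition"}
ALLOWED_RELS = {"ASSOCIATED_WITH"}
WRITE_CLAUSES = re.compile(r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP)\b", re.IGNORECASE)

# Toy stand-ins for a vector index of node names and for graph contents.
ENTITY_NAMES = ["acute myeloid leukemia", "chronic myeloid leukemia", "rheumatoid arthritis"]
TOY_ROWS = {"acute myeloid leukemia": [{"symbol": "PICALM"}, {"symbol": "CEBPA"}]}


def embed(text):
    """Bag-of-words counts; a stand-in for a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def vector_search(question, names=ENTITY_NAMES):
    """Step 1: link the question to the most similar known entity name."""
    q = embed(question)
    return max(names, key=lambda n: cosine(q, embed(n)))


def generate_cypher(entity):
    """Step 2: produce a parameterized query (template stand-in for an LLM)."""
    query = (
        "MATCH (g:Gene)-[:ASSOCIATED_WITH]->(c:Condition {name: $name}) "
        "RETURN g.symbol AS symbol LIMIT 25"
    )
    return query, {"name": entity}


def validate_cypher(query):
    """Step 3: return a list of problems; an empty list means the query passed."""
    problems = []
    if WRITE_CLAUSES.search(query):
        problems.append("write clause not allowed")
    for label in re.findall(r"\(\w*:(\w+)", query):
        if label not in ALLOWED_LABELS:
            problems.append(f"unknown label: {label}")
    for rel in re.findall(r"\[\w*:(\w+)", query):
        if rel not in ALLOWED_RELS:
            problems.append(f"unknown relationship: {rel}")
    return problems


def run_query(query, params):
    """Stand-in for a database call; ignores the query text and looks up toy rows."""
    return TOY_ROWS.get(params["name"], [])


def summarize(question, rows):
    """Step 4: turn rows into prose (string formatting stands in for an LLM)."""
    if not rows:
        return "No matching data found."
    return "Genes returned for the question: " + ", ".join(r["symbol"] for r in rows)


def ask(question):
    """Run the full pipeline and return the answer, the data, and the query."""
    entity = vector_search(question)
    query, params = generate_cypher(entity)
    problems = validate_cypher(query)
    if problems:
        raise ValueError("; ".join(problems))
    rows = run_query(query, params)
    return {"answer": summarize(question, rows), "rows": rows, "cypher": query}


if __name__ == "__main__":
    print(ask("What genetic markers are associated with acute myeloid leukemia?"))
```

Returning the rows and the Cypher text alongside the answer mirrors the abstract's point that users receive the underlying data and query for fact-checking. In a production setting the validation step would more plausibly check the query against the live schema or a dry run on the database than against regular expressions.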
