Achieving value with Large Language Models (LLMs) hinges on a reliable data foundation. This is becoming increasingly relevant with the introduction of conversational AI agents that use retrieval-augmented generation (RAG) techniques to extract information from biomedical data. What isn't emphasized enough is the crucial role that well-annotated data, and its accessibility to the models, plays.
In this webinar, we look at how data quality affects the performance of LLMs. To do so, we assess how LLM-powered AI agents query three versions of the same gene expression corpus, each with a different degree of quality:
- Unstructured data from the Gene Expression Omnibus (GEO)
- Structured data from the CREEDS project
- ML-ready data annotated using Elucidata's Polly
Speaker: Abhishek Jha, CEO & Co-Founder at Elucidata