Add Zero-Shot LLM Evidence Retrieval Pipeline (Ahsan et al. 2024) #1135
Open
abhiseksinha-r1 wants to merge 1 commit into sunlabuiuc:master from
Add Zero-Shot LLM Evidence Retrieval Pipeline (Ahsan et al. 2024)
Contributor: Abhisek Sinha (abhisek5@illinois.edu)
Type of Contribution: Dataset + Task + Model
Paper Reference: https://arxiv.org/abs/2309.04550
Summary
This PR implements the zero-shot LLM pipeline for EHR evidence retrieval as proposed in Ahsan et al. (2024) "Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges" (CHIL 2024, PMLR 248:489-505). The implementation follows PyHealth's modular architecture with new dataset, task, and model components.
Project Details
1. MIMIC3NoteDataset - New Dataset Class
A specialized MIMIC-III data loader optimized for NLP and evidence retrieval tasks:
- Loads the noteevents and diagnoses_icd tables by default
- Uses a dataset config (mimic3_note.yaml) exposing the iserror flag for filtering erroneous notes
- Derives charttime values from chartdate
2. EHREvidenceRetrievalTask - New Task Definition
Binary classification task: do a patient's notes support a given query diagnosis?
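As a rough illustration of the task framing above, the sketch below pairs each patient's notes with a query diagnosis and emits a binary label. It does not use the PR's code: PatientRecord, build_samples, and the rule that the label comes from the patient's ICD-9 codes are all assumptions made for illustration, and the PR may derive labels differently.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical container; not the PR's or PyHealth's data model."""
    patient_id: str
    notes: list[str]
    icd9_codes: set[str] = field(default_factory=set)


def build_samples(patients, query_name, query_codes):
    """Create one binary sample per patient that has at least one note.

    Assumption: the label is 1 when any of the patient's ICD-9 codes
    appears in `query_codes`, and 0 otherwise.
    """
    samples = []
    for p in patients:
        if not p.notes:
            continue  # no text to retrieve evidence from
        samples.append({
            "patient_id": p.patient_id,
            "notes": "\n\n".join(p.notes),
            "query_diagnosis": query_name,
            "label": int(bool(p.icd9_codes & set(query_codes))),
        })
    return samples


# Synthetic records, so no MIMIC access is needed to run this.
patients = [
    PatientRecord("p1", ["Pt with hx of afib on warfarin."], {"42731"}),
    PatientRecord("p2", ["Routine follow-up, no complaints."], {"4019"}),
    PatientRecord("p3", [], {"42731"}),
]
samples = build_samples(patients, "atrial fibrillation", {"42731"})
print([(s["patient_id"], s["label"]) for s in samples])
```

Patients without notes are skipped here because there is no text to search; whether the PR does the same is not stated.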
3. ZeroShotEvidenceLLM - New Model Implementation
Implements the two-step zero-shot prompting strategy from the paper:
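The two-step flow can be sketched as follows, on our reading of the paper: first ask the model a yes/no question about the query diagnosis, and only if the answer is yes ask it for the supporting evidence. This is a minimal sketch, not the PR's ZeroShotEvidenceLLM: the prompt wording, the two_step_evidence helper, and the generate callable (any text-in, text-out LLM wrapper) are hypothetical.

```python
from typing import Callable, Optional


def two_step_evidence(
    note: str,
    diagnosis: str,
    generate: Callable[[str], str],
) -> dict:
    """Step 1: yes/no relevance question. Step 2: evidence, only on 'yes'."""
    # Prompt wording is illustrative, not copied from the paper or the PR.
    step1 = (
        f"Read the following clinical note.\n\n{note}\n\n"
        f"Question: Does the patient have {diagnosis}? Answer yes or no."
    )
    answer = generate(step1).strip().lower()
    supported = answer.startswith("yes")

    evidence: Optional[str] = None
    if supported:
        # The second call is made only when step 1 answered 'yes'.
        step2 = (
            f"Read the following clinical note.\n\n{note}\n\n"
            f"Summarize the evidence in the note that the patient has {diagnosis}."
        )
        evidence = generate(step2).strip()
    return {"supported": supported, "evidence": evidence}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    if "Answer yes or no" in prompt:
        return "Yes" if "afib" in prompt else "No"
    return "Note mentions hx of afib on warfarin."


print(two_step_evidence("Pt with hx of afib on warfarin.", "atrial fibrillation", fake_llm))
print(two_step_evidence("Routine follow-up, no complaints.", "atrial fibrillation", fake_llm))
```

Gating the second prompt on the first keeps the model from generating evidence for diagnoses it has already judged unsupported, and it saves one model call per negative note.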
Key features:
- Optional baseline mode (use_cbert_baseline=True)
Files Added/Modified
- pyhealth/datasets/mimic3.py: MIMIC3NoteDataset class
- pyhealth/datasets/configs/mimic3_note.yaml
- pyhealth/datasets/__init__.py
- pyhealth/tasks/ehr_evidence_retrieval.py: EHREvidenceRetrievalTask
- pyhealth/tasks/__init__.py
- pyhealth/models/ehr_evidence_llm.py: ZeroShotEvidenceLLM model
- pyhealth/models/__init__.py
- examples/clinical_tasks/mimic3_note_ehr_evidence_retrieval_llm.py
- docs/api/datasets/pyhealth.datasets.MIMIC3NoteDataset.rst
- docs/api/tasks/pyhealth.tasks.EHREvidenceRetrievalTask.rst
- docs/api/models/pyhealth.models.ZeroShotEvidenceLLM.rst
- tests/core/test_mimic3_note_dataset.py
- tests/core/test_ehr_evidence_llm.py
Ablation Experiments
The example script includes four ablation experiments from the paper:
Usage
Quick Demo (No MIMIC Access Required)
Full Pipeline with MIMIC-III
Testing
# Run all new tests
pytest tests/core/test_mimic3_note_dataset.py -v
pytest tests/core/test_ehr_evidence_llm.py -v
Related Work
Checklist
- Dataset (MIMIC3NoteDataset)
- Task (EHREvidenceRetrievalTask)
- Model (ZeroShotEvidenceLLM)