Tom Hartvigsen
Assistant Professor
Data Science
University of Virginia
(Office: 1919 Ivy Rd., Rm. 339)
Hi! I'm a tenure-track Assistant Professor of Data Science and, by courtesy, Computer Science at the University of Virginia. Before joining UVA in Fall 2023, I was a postdoc at MIT CSAIL working with Marzyeh Ghassemi. I received my PhD in Data Science from WPI, where I was advised by Elke Rundensteiner and Xiangnan Kong.
Research
My research group works on machine learning and natural language processing, aiming to enable responsible model deployment in ever-changing environments, especially in healthcare.
Active directions and highlights:
Continually monitoring and editing knowledge and behavior of big models
Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adapters (NeurIPS'23 + code + blog post)
TAXI: Evaluating Categorical Knowledge Editing for Language Models (ACL'24 + data)
BendVLM: Test-Time Debiasing of Vision-Language Embeddings (NeurIPS'24)
Composable Interventions for Language Models (arXiv'24)
Time series and multi-modality
Are Language Models Actually Useful for Time Series Forecasting? (NeurIPS'24 Spotlight + code)
Language Models Still Struggle to Reason about Time Series (EMNLP'24 + code)
UniTS: A Unified Multi-Task Time Series Model (NeurIPS'24 + code)
Detecting and mitigating harmful biases in language and language models
PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models (COLM'24 + Leaderboard + blog post)
ToxiGen: Using LLMs to detect and mitigate implicit social biases (ACL'22 + dataset). ToxiGen has been used while training Llama2, Code Llama, phi-1.5, phi-2, and other LLMs, and to detect toxicity in econ forums and laws.
Healthcare & Biomedical Data Science
Demographic Bias in Misdiagnosis by Computational Pathology Models (Nature Medicine)
Dissecting the Heterogeneity of "In-the-Wild Stress" from Multimodal Sensor Data (npj Digital Medicine)
Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks (EMNLP'24 + leaderboard)
★ News ★
Jan'25:
Paper on sequential knowledge editing accepted to the Workshop on Knowledgeable Foundation Models at AAAI'25
Dec'24:
Invited talks at UVA's Genome Sciences Seminar Series and UVA's Darden School of Business
New preprint on foundation models for protein phenotypes
Nov'24:
Lab member Xu Ouyang was awarded an iPRIME PhD Fellowship. Congrats, Xu!
New preprint on scaling laws for LLM quantization
Oct'24:
Paper accepted to IEEE BigData on spike train classification
Sep'24:
3 papers accepted to NeurIPS'24!
Are Language Models Actually Useful for Time Series Forecasting? (Spotlight!) - Congrats, Mingtian!
Test-Time Debiasing of Vision-Language Embeddings
UniTS: A Unified Multi-Task Time Series Model
3 papers accepted to EMNLP'24!
MATHWELL: Generating Educational Math Word Problems with Teacher Annotations - Congrats, Bryan!
Language Models Still Struggle to Zero-shot Reason about Time Series
Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks
Aug'24:
Paper accepted to TMLR on using LLMs for robust text classification
July'24:
Paper accepted to COLM'24 on multilingual toxicity in LLMs
Paper accepted to AIES'24 on detecting implicit social biases in VL models
Paper accepted to MICCAI'24 on federated learning for medical imaging
May'24:
Paper accepted to ACL'24 on categorical knowledge editing for LLMs
Apr'24:
Nature Medicine paper on bias in computational pathology
Spring'24: Invited talks at Dartmouth, IBM Research, UCSF/UC Berkeley, and the University of Alabama at Birmingham