AIMI Dataset Index

A community-driven resource of health AI datasets for machine learning in healthcare



EMory BrEast imaging Dataset (EMBED)

EMBED is a racially diverse mammography dataset containing 3.4M screening and diagnostic images from 110,000 patients, collected from 2013 to 2020, with equal representation of Black and White women. The dataset comprises 2D, synthetic 2D (C-view), and 3D (digital breast tomosynthesis, DBT) images. It contains 60,000 annotated lesions linked to structured imaging descriptors and ground-truth pathologic outcomes grouped into six severity classes. This release represents 20% of the total 2D and C-view dataset and is available for research use.

Data Source: Emory University
Number of Sources: Single
Population #: 116,000
Population Unit: adults
Population Representation: breast cancer
Longitudinal Observations: No
Accessibility: Apply for access
Permitted Uses: Non-commercial only
Fees: Free
Data Types: Medical Imaging
Funding Source: NIH/NCATS
Documentation url: View Documentation »
Main url: Visit site »