Dataset Directory
Connecting machine learning practitioners with meaningful datasets
Well-annotated data is required to develop effective machine learning tools for the clinical environment. With the Dataset Directory, we connect machine learning practitioners with accessible and meaningful datasets for their projects.
The organizations listed below either can be contacted about their datasets or offer datasets that can be downloaded directly from their websites.
Cancer Genome Atlas Cervical Kidney Renal Papillary Cell Carcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 26,667 Instances
Cancer Genome Atlas Breast Invasive Carcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 230,167 Instances
Cancer Genome Atlas Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 19,135 Instances
Cancer Genome Atlas Colon Adenocarcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 8,387 Instances
Cancer Genome Atlas Esophageal Carcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 20,593 Instances
Cancer Genome Atlas Kidney Chromophobe
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 9,221 Instances
Cancer Genome Atlas Liver Hepatocellular Carcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 125,397 Instances
Cancer Genome Atlas Low Grade Glioma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 241,183 Instances
Cancer Genome Atlas Lung Adenocarcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 48,931 Instances
Cancer Genome Atlas Ovarian Cancer
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 536,662 Instances
Cancer Genome Atlas Prostate Adenocarcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 16,790 Instances
Cancer Genome Atlas Rectum Adenocarcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 1,786 Instances
Cancer Genome Atlas Sarcoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 1,786 Instances
Cancer Genome Atlas Stomach Adenocarcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 43,908 Instances
Cancer Genome Atlas Thyroid Cancer
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 2,780 Instances
Cancer Genome Atlas Urothelial Bladder Carcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 78,429 Instances
Cancer Genome Atlas Uterine Corpus Endometrial Carcinoma
This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).
The Cancer Imaging Archive, 71,674 Instances
CHESTXRAY14
Frontal-view chest X-ray images labeled for 14 common thoracic disease conditions.
National Institutes of Health, 112,129 Instances
MURA
MURA (musculoskeletal radiographs) is a large dataset of bone X-rays. Algorithms are tasked with determining whether an X-ray study is normal or abnormal.
Stanford ML Group, 40,561 Instances
Osteoarthritis Initiative
The Osteoarthritis Initiative (OAI) is a multi-center, longitudinal, prospective observational study of knee osteoarthritis (OA). Its overall aim is to develop a public-domain research resource to facilitate the scientific evaluation of biomarkers for osteoarthritis as potential surrogate endpoints for disease onset and progression.
NIMH Data Archive, 8,892 Instances