Connecting machine learning practitioners with meaningful datasets

Well-annotated data is required to develop effective machine learning tools for the clinical environment. With the Dataset Directory, we connect machine learning practitioners with accessible and meaningful datasets for their projects.

The organizations listed below either share their datasets on request or host datasets that can be downloaded directly from their websites.

Contact us to add a dataset to our directory
  • Cancer Genome Atlas Cervical Kidney Renal Papillary Cell Carcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 26,667 Instances
  • Cancer Genome Atlas Breast Invasive Carcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 230,167 Instances
  • Cancer Genome Atlas Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 19,135 Instances
  • Cancer Genome Atlas Colon Adenocarcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 8,387 Instances
  • Cancer Genome Atlas Esophageal Carcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 20,593 Instances
  • Cancer Genome Atlas Kidney Chromophobe

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 9,221 Instances
  • Cancer Genome Atlas Liver Hepatocellular Carcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 125,397 Instances
  • Cancer Genome Atlas Low Grade Glioma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 241,183 Instances
  • Cancer Genome Atlas Lung Adenocarcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 48,931 Instances
  • Cancer Genome Atlas Ovarian Cancer

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 536,662 Instances
  • Cancer Genome Atlas Prostate Adenocarcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 16,790 Instances
  • Cancer Genome Atlas Rectum Adenocarcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 1,786 Instances
  • Cancer Genome Atlas Sarcoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 1,786 Instances
  • Cancer Genome Atlas Stomach Adenocarcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 43,908 Instances
  • Cancer Genome Atlas Thyroid Cancer

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 2,780 Instances
  • Cancer Genome Atlas Urothelial Bladder Carcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 78,429 Instances
  • Cancer Genome Atlas Uterine Corpus Endometrial Carcinoma

    This data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA).

    The Cancer Imaging Archive, 71,674 Instances
  • ChestX-ray14

    Frontal-view chest X-ray images labeled for 14 common thoracic disease conditions.

    National Institutes of Health, 112,129 Instances
  • MURA

    MURA (musculoskeletal radiographs) is a large dataset of bone X-rays. Algorithms are tasked with determining whether an X-ray study is normal or abnormal.

    Stanford ML Group, 40,561 Instances
  • Osteoarthritis Initiative

    The Osteoarthritis Initiative (OAI) is a multi-center, longitudinal, prospective observational study of knee osteoarthritis (OA). Its overall aim is to develop a public-domain research resource that facilitates the scientific evaluation of biomarkers for osteoarthritis as potential surrogate endpoints for disease onset and progression.

    NIMH Data Archive, 8,892 Instances