Data Resource – SD

What is the Synthetic Derivative (SD)?

The Synthetic Derivative (SD) is a rich, multi-source repository of data collected from VUMC’s clinical records and de-identified for use in research.

Type of Data

  • Electronic Health Records (EHR)

    Years Available

    1980s to present, most robust starting 2001

    Description

    The SD is a de-identified database created using electronic scrubbing techniques that remove identifiers while maintaining semantic integrity. Names and similar identifiers are replaced, and dates are shifted, in a consistent but anonymized manner. The database includes over 3.9 million records and is structured according to the OMOP Common Data Model, with some custom tables. Because it contains no HIPAA identifiers, research using the SD qualifies as non-human subjects research.
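
To illustrate what "consistent" date shifting can mean in practice, here is a minimal sketch. It is not the SD's actual algorithm, which this page does not describe. It assumes one common approach: each patient gets a fixed offset derived from a keyed hash, so every date for that patient moves by the same number of days and intervals between events are preserved. The names `SECRET_SALT`, `MAX_SHIFT_DAYS`, `shift_days_for`, and `shift_date` are hypothetical.

```python
import hashlib
from datetime import date, timedelta

# Hypothetical values for illustration only; the SD's real parameters are not published here.
SECRET_SALT = "example-salt"
MAX_SHIFT_DAYS = 365  # assumed upper bound on the shift


def shift_days_for(patient_id: str) -> int:
    """Derive a stable per-patient offset in the range [-MAX_SHIFT_DAYS, -1]."""
    digest = hashlib.sha256(f"{SECRET_SALT}:{patient_id}".encode()).digest()
    return -(int.from_bytes(digest[:4], "big") % MAX_SHIFT_DAYS + 1)


def shift_date(patient_id: str, d: date) -> date:
    """Apply the same offset to every date belonging to one patient."""
    return d + timedelta(days=shift_days_for(patient_id))


# Two events for the same (made-up) patient, seven days apart.
admit = date(2015, 3, 1)
discharge = date(2015, 3, 8)
shifted_admit = shift_date("patient-001", admit)
shifted_discharge = shift_date("patient-001", discharge)

# The true calendar dates are hidden, but the length of stay is unchanged.
interval_preserved = (shifted_discharge - shifted_admit) == (discharge - admit)
```

Under this assumption, analyses that depend on elapsed time (length of stay, time to event) remain valid, while analyses tied to exact calendar dates do not.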

    Strengths

    • Contains over 3.9 million de-identified records.
    • Compliant with HIPAA Safe Harbor standards.
    • Integrated with BioVU genomics data.
    • Connected to ImageVU and MicroVU.
    • Can be accessed using a self-service web tool, SD Discover, at no cost to users.

        Limitations

        • Data from the 1980s through 2000 may be incomplete or inconsistent due to evolving documentation practices and system changes.
        • Dates are systematically shifted, preventing exact date recovery.
        • Not all medical record data is included, though new elements are regularly added.
        • Available for research purposes only.

        Availability

        • SD data can be accessed through:

          • SD Discover: A free, self-service user interface for cohort selection and export of select data elements.
          • IDASC custom programming: A billable service for requests with complex phenotype criteria or data requirements. Funding is required to pay for hourly programming and project management costs.
          • Databricks access: A workspace available for investigators who have the expertise to write their own SQL queries. Funding is required to pay for workspace setup and compute costs.
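
For investigators considering the SQL route, the sketch below shows the general shape of a cohort query against OMOP-style tables. It is only an illustration: it runs against a tiny in-memory SQLite database with made-up rows rather than the SD itself, and the SD's actual schema, custom tables, and Databricks SQL dialect may differ. The table and column names (`person`, `condition_occurrence`, `condition_concept_id`, `condition_start_date`) follow the standard OMOP Common Data Model; `TARGET_CONCEPT_ID` is a placeholder value.

```python
import sqlite3

# Build a toy OMOP-like database in memory (fabricated rows, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    year_of_birth INTEGER
);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
);
INSERT INTO person VALUES (1, 1950), (2, 1985), (3, 1972);
INSERT INTO condition_occurrence VALUES
    (10, 1, 201826, '2010-05-01'),
    (11, 2, 201826, '2012-07-15'),
    (12, 3, 317009, '2011-01-20'),
    (13, 1, 201826, '2014-02-02');
""")

TARGET_CONCEPT_ID = 201826  # placeholder concept ID for the condition of interest

# Select each person with the target condition and their earliest recorded date for it.
COHORT_SQL = """
SELECT p.person_id, p.year_of_birth, MIN(c.condition_start_date) AS first_dx
FROM person AS p
JOIN condition_occurrence AS c ON c.person_id = p.person_id
WHERE c.condition_concept_id = ?
GROUP BY p.person_id, p.year_of_birth
ORDER BY p.person_id
"""

cohort = conn.execute(COHORT_SQL, (TARGET_CONCEPT_ID,)).fetchall()
for person_id, year_of_birth, first_dx in cohort:
    print(person_id, year_of_birth, first_dx)
```

Because SD dates are shifted, a value such as `first_dx` should be read as a relative anchor for computing intervals, not as a true calendar date.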

        Contact Person

        • VICTR Big Data team: victrbigdata@vumc.org

        Website