The Synthetic Derivative (SD) is a rich, multi-source repository of data collected from VUMC’s clinical records and de-identified for use in research.
Records span the 1980s to the present, with the most robust coverage starting in 2001.
The SD is a de-identified database created using electronic scrubbing techniques that remove identifiers while maintaining semantic integrity. Identifiers such as names and dates are replaced or shifted consistently, so the data remain anonymized. The database includes over 3.9 million records and is structured according to the OMOP common data model, with some custom tables. Because the SD contains no HIPAA identifiers, research that uses it qualifies as non-human subjects research.
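To illustrate what shifting dates "in a consistent manner" can mean in practice, here is a minimal Python sketch. It does not describe the SD's actual scrubbing method, which is not specified here. The secret key, the 364-day window, the backward direction of the shift, and the names `shift_days_for` and `shift_date` are all assumptions made for illustration. The idea shown is that each person gets one fixed offset, so every date in that person's record moves by the same amount and the intervals between events are preserved.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical secret; a real system would keep this out of the research copy.
SECRET_KEY = b"example-secret-not-a-real-key"
MAX_SHIFT_DAYS = 364  # assumed shift window, chosen only for illustration


def shift_days_for(person_id: str) -> int:
    """Derive a stable offset in [-MAX_SHIFT_DAYS, -1] from a keyed hash of the ID."""
    digest = hmac.new(SECRET_KEY, person_id.encode("utf-8"), hashlib.sha256).digest()
    return -(int.from_bytes(digest[:4], "big") % MAX_SHIFT_DAYS + 1)


def shift_date(person_id: str, original: date) -> date:
    """Shift a date by the person's fixed offset, preserving intervals between events."""
    return original + timedelta(days=shift_days_for(person_id))


admit = date(2010, 3, 1)
discharge = date(2010, 3, 8)
shifted_admit = shift_date("patient-001", admit)
shifted_discharge = shift_date("patient-001", discharge)

# The 7-day length of stay survives the shift.
assert shifted_discharge - shifted_admit == discharge - admit
```

Because the offset is derived from a keyed hash rather than stored, the same person always receives the same shift, while the real calendar dates cannot be recovered without the key.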
SD data can be accessed through: