Synthetic datasets of the UK Biobank cohort
This repository stores synthetic datasets derived from the UK Biobank (UKB) cohort database. The datasets were generated for illustrative purposes, in particular for reproducing specific analyses of the health risks associated with long-term exposure to air pollution in the UKB cohort. The code used to create the synthetic datasets is available and documented in a related GitHub repo, with details provided in the section below. These datasets can be freely used for code testing and for illustrating other examples of analyses on the UKB cohort.
Note: while the synthetic datasets resemble the real ones in several aspects, users should be aware that these data are fake and must not be used to test or make inferences about specific research hypotheses. Even more importantly, these data cannot be considered a reliable description of the original UKB data, and they must not be presented as such.
The original datasets are described in the article by Vanoli et al. in Epidemiology (2024) (DOI: 10.1097/EDE.0000000000001796), which also provides information about the data sources.
Keywords
Epidemiology, UK Biobank, Air Pollution, Cohort Studies
Item Type | Dataset |
---|---|
Capture method | Simulation |
Date | 23 October 2024 |
Language(s) of written materials | English |
Creator(s) | Gasparrini, A |
LSHTM Faculty/Department | Faculty of Public Health and Policy > Dept of Public Health, Environments and Society |
Participating Institutions | London School of Hygiene & Tropical Medicine, London, United Kingdom |
Date Deposited | 26 Feb 2025 14:57 |
Last Modified | 26 Feb 2025 15:04 |
Publisher | Zenodo |