Predictive Modelling for Stillbirths and Neonatal Deaths in Sub-Saharan Africa
Initial release of the reproducible analytical pipeline for data harmonisation and predictive modelling of stillbirths and neonatal deaths in sub-Saharan Africa (SSA), integrating seven contributing studies: the Action Leveraging Evidence to Reduce Perinatal Mortality and Morbidity trial (ALERT), the Every Newborn-INDEPTH study (EN-INDEPTH), the Preterm Birth Initiative (PTBi), the Pregnancy Care Integrating Translational Science Everywhere cohort (PRECISE), the WHO Multi-Country Survey on Maternal and Newborn Health (WHOMCS), the Neonatal Care Outcomes Project Study (NCOPS), and Demographic and Health Surveys (DHS). The unified dataset contains 5,996,390 birth records from 66 countries.
The pipeline implements a structured five-stage harmonisation framework: (1) ethical data acquisition and governance; (2) variable mapping across 13 harmonised domains using standardised domain-prefix naming conventions; (3) value standardisation and recoding using structured case-when logic with regular-expression pattern matching; (4) linkage of environmental and climate data from ERA5, CHIRPS, SRTM, ACAG and MODIS sources at 99.3% completeness; and (5) quality assurance including range validation, cross-tabulations and logical consistency checking. The modelling pipeline benchmarks classical statistical methods (logistic regression, generalised estimating equations), ensemble machine learning (Random Forest, XGBoost, LightGBM, CatBoost) and exploratory deep learning (multilayer perceptrons) across four prediction scenarios and two primary outcomes. Interpretability analysis uses SHapley Additive exPlanations (SHAP) values throughout.
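As an illustration of stage (3), the following is a minimal sketch of case-when style recoding driven by regular expressions. It is not taken from the released pipeline: the rule list, the category labels, and the function name `recode_outcome` are all hypothetical, and the example is written in Python purely for illustration. Rules are evaluated in order and the first match wins, which mirrors how a case-when chain behaves.

```python
import re

# Hypothetical ordered rules (assumption, not the pipeline's actual mapping).
# Order matters: the first pattern that matches determines the label.
OUTCOME_RULES = [
    (re.compile(r"still\s*-?\s*birth|\bsb\b|born dead", re.IGNORECASE), "stillbirth"),
    (re.compile(r"neonatal\s+death|\bnnd\b", re.IGNORECASE), "neonatal_death"),
    (re.compile(r"alive|live\s*birth|survived", re.IGNORECASE), "alive"),
]


def recode_outcome(raw):
    """Map a free-text outcome value from a source study to a harmonised label.

    Returns None for missing or blank input and "unmapped" when no rule
    matches, so unrecognised values can be reviewed rather than silently
    assigned to a category.
    """
    if raw is None or not str(raw).strip():
        return None
    text = str(raw).strip()
    for pattern, label in OUTCOME_RULES:
        if pattern.search(text):
            return label
    return "unmapped"


if __name__ == "__main__":
    for value in ["Fresh stillbirth", "NND", "Live birth", "", "unknown"]:
        print(repr(value), "->", recode_outcome(value))
```

Returning an explicit "unmapped" label instead of a default category keeps unexpected source values visible to the quality assurance checks in stage (5).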
The methodology is documented in Data Harmonisation Documentation Version 6.1 and Statistical Analysis Plan Version 1 (February 2026), both publicly deposited on the Open Science Framework prior to commencement of model development analyses.
Keywords
stillbirth; neonatal death; sub-Saharan Africa; predictive modelling; data harmonization; machine learning; perinatal health; LMIC; low- and middle-income countries; ALERT; DHS; EN-INDEPTH; PRECISE; PTBi; WHOMCS; NCOPS; PRECISE study

| Item Type | Dataset |
|---|---|
| Resource Type | Software (Resource Description: R script) |
| Capture method | Simulation, Other |
| Date | 15 March 2026 |
| Language(s) of written materials | English |
| Creator(s) | Akuze, J |
| Associated roles | Dadelszen, Pv (Other); Waiswa, P (Other); Lawn, JE (Other); Hanson, C (Other) and Volvert, M |
| LSHTM Faculty/Department | Faculty of Epidemiology and Population Health > Dept of Infectious Disease Epidemiology & International Health (2023-); Faculty of Infectious and Tropical Diseases > Dept of Disease Control |
| Participating Institutions | London School of Hygiene & Tropical Medicine, London, United Kingdom |
| Date Deposited | 16 Mar 2026 10:16 |
| Last Modified | 16 Mar 2026 10:16 |
| Publisher | Zenodo |
Explore Further
- Zenodo (Online Data Resource)
- GitHub (Data)
- Software Heritage Archive (Data)
No files available. Please consult associated links.