https://doi.org/10.17037/DATA.00001522
The ALPHA network is an innovative secondary data analysis program aimed at improving our understanding of HIV epidemiology. ALPHA is coordinated by its secretariat in the Department of Population Health (DPH), part of the Faculty of Epidemiology and Population Health at the London School of Hygiene and Tropical Medicine. It comprises 10 autonomous research institutions that share an interest in HIV epidemiology. Each institution has its own research agenda and data management system. All partners pre-date the formation of the network, and all hold population- or community-based longitudinal demographic and HIV surveillance data.
ALPHA leverages the benefits of data pooling: greater statistical power gained by bringing together data from a number of research institutions, and a wider perspective than any single research institution could achieve.
ALPHA assembles datasets on various topics related to demographic and HIV surveillance. These data are referred to as ALPHA data specifications or data specs and are described on the ALPHA metadata page. The ALPHA data specs have a well-defined structure to which each partner of the network has to transform their data. ALPHA is organised around data analysis and HIV research capacity strengthening workshops. At the workshops, partners bring their data and are involved in data analysis training addressing research questions of interest for the particular workshop.
ALPHA is working on a project to produce a sharable set of harmonised data that combines both population-based and clinic data from the partner studies with funding from the Wellcome Trust.
Whilst community-based cohorts and demographic surveillance systems provide a rich source of data, use of the data is often limited because successful analysis requires detailed knowledge of the study's contemporary and historical procedures and of data management practices. To date the ALPHA Network has successfully extracted and harmonised 10 standard data tables from the partner studies. However, these data are still complex and require considerable prior knowledge to use effectively, which in practice means the data can only be used in collaboration with one of the ALPHA staff.
The main project combines a number of activities among them:
This data collection resulted from a study relating to the second activity on data documentation. It contains qualitative data collected as part of scoping work to establish domain experts’ perspectives on the functionality that a user-friendly metadata browser for ALPHA datasets should provide. It contains transcripts of 10 semi-structured Skype interviews conducted with individual researchers and data managers affiliated to the ALPHA network and the Cohort & Longitudinal Studies Enhancement Resources (CLOSER) project. Interviews explored proposed features of the metadata browser, including: provision for viewing all tasks performed in the process of creating a dataset; browsing the steps in each task; task purpose; related concepts; related code scripts; the association between a sub-task and its input data and outputs; and provision for viewing data structure.
A convenience sample of 10 participants was drawn from data managers and researchers affiliated to the ALPHA and CLOSER projects. These two groups of users were considered suitable for identifying the requirements of both internal and external users. All but one interviewee had at least a master’s degree and 5 years' work experience.
The data collection consisted of background material reading and a recorded Skype interview. An information pack was emailed to the study participants prior to the interview. The pack contained the following items: (1) a study background document, (2) an information sheet, (3) a consent form, and (4) a question guide comprising 6 mock-up diagrams of the proposed features and accompanying questions.
Each participant was interviewed over Skype on the features in the mock-up diagrams using the semi-structured question guide. The participants graded each feature’s importance on a provided scale and gave the rationale for their grading. Further, they listed any desired features not included in the mock-ups.
All the interviews were recorded and transcribed verbatim, then checked and cleaned by the lead researcher.
Southern and Eastern Africa, and the United Kingdom.
The lead researcher transcribed the audio files from the interviews, checked the transcripts and cleaned them as needed with the support of software developers in the research team.
Human population
Names of participants and other identifying information such as place names were removed and replaced with pseudonyms.
All participants gave their permission for the transcripts to be archived in an anonymised form for use in future research.
LSHTM ethics ref: 16429
ALPHA, provenance, metadata, data harmonisation, requirements elicitation
English
PhD Thesis: Provenance of “after the fact” harmonised community-based demographic and HIV surveillance data from ALPHA cohorts.
Interview transcripts produced during this study are embargoed until 30/06/2020 to enable sufficient anonymisation of the data. Subsequent access may be granted for secondary analysis for other purposes.
All other accompanying files are public access.
Forename | Surname | Faculty / Dept | Institution | Role |
Chifundo | Kanjala | Population Health | LSHTM | Data Creator |
Arofan | Gregory | | DDI Alliance | Co-Investigator |
Jay | Greenfield | | DDI Alliance | Co-Investigator |
Emma | Slaymaker | Population Health | LSHTM | PhD Supervisor |
Jim | Todd | Population Health | LSHTM | PhD Supervisor |
Forename | Surname | Faculty / Dept | Institution | Role |
Gareth | Knight | | LSHTM | PhD Advisor |
Tito | Castillo | | Guy's and St Thomas' NHS Foundation Trust | PhD Advisor |
David | Beckles | | Independent IT Consultant | PhD Advisor |
Filename | Description | Access status | Licence | Embargo period |
Interview-transcripts.zip | Compressed archive contains: 10 anonymised interview transcripts provided in MS Word | Request access | Data Sharing Agreement | 2020-07-01 |
CiBDoS_Research_Protocol | CiBDoS_Research_Protocol | Open | Creative Commons Attribution (CC-BY) | |
CiBDoS_Requirements_Background | Centre in a Box data documentation (CiBDoS) software requirements elicitation study - Background information | Open | Creative Commons Attribution (CC-BY) | |
CiBDoS_Requirements_Questionnaire | Question guide template for the Centre in a Box software requirements elicitation study | Open | Creative Commons Attribution (CC-BY) | |