Dataset codebook for a 30-year population-based study examining the contribution of remote M. tuberculosis infection to tuberculosis: Data Codebook

Permanent identifier

DOI: 10.17037/DATA.00002829

Description

An anonymised dataset of 57,267 study participants who took part in district-wide surveys in 1980-84 and 1985-89 as part of long-term studies of leprosy in Karonga, Malawi. The second survey was the recruitment phase of a trial of repeat BCG and killed M. leprae. In each survey, a subset of the population, selected by area of residence, received a tuberculin skin test (TST). These tests were done before the trial vaccines were given. Since the early 1980s, all individuals diagnosed with TB in the district have been recorded as part of epidemiological studies of TB and the vaccine trial follow-up. Case ascertainment relied largely on enhanced passive ascertainment, with field staff based at the hospital and peripheral health centres to screen individuals with chronic cough or otherwise suspected to have TB. Dataset contains variables

After the second survey in the 1980s there have been no further total population surveys, but follow-up information is available from other studies conducted in the district, including small population-based surveys and demographic surveillance in different areas of the district. All studies use common identifiers for individuals, so the date individuals were seen, or the date they were reported to have left the district or died, as recorded in other studies, can be used to estimate the date individuals were last known to be alive and in the district.
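
The date logic described above can be sketched in Python. This is a minimal illustration, not the study's actual procedure: the record layout (`StudyRecord`), its field names and the function name are hypothetical, and the rule used here (take the earliest reported departure or death if one exists, otherwise the latest date seen) is an assumption about how the dates are combined.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class StudyRecord:
    """One record for an individual from any linked study (hypothetical layout)."""
    person_id: int
    seen: Optional[date] = None  # date the individual was seen in the district
    left: Optional[date] = None  # reported date of leaving the district
    died: Optional[date] = None  # reported date of death


def last_known_in_district(records: List[StudyRecord]) -> Optional[date]:
    """Estimate the date an individual was last known alive and in the district.

    Assumed rule: use the earliest reported departure or death if there is one,
    otherwise the latest date the individual was seen. Returns None when no
    dates are available.
    """
    exits = [d for r in records for d in (r.left, r.died) if d is not None]
    if exits:
        return min(exits)
    seen = [r.seen for r in records if r.seen is not None]
    return max(seen) if seen else None
```

Because all studies share a common identifier, records for one person can be pooled across studies before calling the function.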

In 2009-2013, individuals recorded as having TST induration sizes of greater than 20 mm in the 1980s surveys were identified. Those who were not already known from continuing studies in the area to have developed TB or to have died or left the district were sought. Interviewers asked those found living in Karonga district about prior treatment for TB and current cough. For those who had left the district or died, a suitable informant was asked about prior treatment for TB as well as dates of departure or death.

Data access

The dataset was collected before data sharing expectations were introduced (pre-2000). As a result, additional ethical clearance must be obtained prior to sharing. Please contact MEIRU directly (info@meiru.mw) for correspondence related to the data.

Data codebook

The following table outlines variables contained within the dataset that have been analysed.

| Variable name | Variable description | Answer code | Answer label | Variable type |
|---|---|---|---|---|
| id | Identifier | | | integer |
| area | Area of residence | 1 | rural and truck stop | integer |
| | | 2 | rural | |
| | | 3 | periurban | |
| | | 4 | urban | |
| | | 5 | rural and trading area | |
| | | 6 | rural and border | |
| sexn | Sex | 0 | female | integer |
| | | 1 | male | |
| age3 | Age (3 categories) | 1 | <15 years | integer |
| | | 2 | 15-29 years | |
| | | 3 | ≥30 years | |
| ageTSTgrp5 | Age at initial TST (5 categories) | 1 | <15 years | integer |
| | | 2 | 15-29 years | |
| | | 3 | 30-44 years | |
| | | 4 | 45-59 years | |
| | | 5 | ≥60 years | |
| skint_date | Date of skin test | | | date |
| exitdate16 | Censor date | | Earliest of date of TB diagnosis, date last known alive in district, etc. | integer |
| lep12 | Participant was skin tested in LEP1 or LEP2 survey | LEP1 | | factor |
| | | LEP2 | | |
| tbCP16 | Confirmed TB | 0 | No | integer |
| | | 1 | Yes | |
| tbhiv016 | HIV negative TB | 0 | No | integer |
| | | 1 | Yes | |
| tbhiv116 | HIV positive TB | 0 | No | integer |
| | | 1 | Yes | |
| largeTST | TST induration size (3 categories) | 0 | <5mm | integer |
| | | 1 | 18-20mm | |
| | | 2 | >20mm | |
| large17 | TST induration size (2 categories) | 0 | <5mm | integer |
| | | 1 | >17mm | |
| tbunlink | Unlinked TB (based on molecular epi data) | 0 | No | integer |
| | | 1 | Yes | |
| tblinked | Linked TB (based on molecular epi data) | 0 | No | integer |
| | | 1 | Yes | |
| nofup | No follow-up information available | 0 | | integer |
| | | 1 | | |
| survgrp | Source of follow up information | 1 | No follow-up information | integer |
| | | 2 | Whole population survey | |
| | | 3 | Sample population survey | |
| | | 4 | Baseline census | |
| | | 5 | Demographic surveillance | |
| | | 6 | Follow-up of those with TST>20mm | |
| | | 9 | Other | |
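
As an illustration of how the answer codes might be applied, the sketch below replaces coded values with their labels for three of the variables. The code-to-label mappings for `area`, `sexn` and `survgrp` are copied from the codebook table; the function name `decode_row` and the representation of a row as a Python dict are assumptions made for this example, since the file format of the dataset is not stated here.

```python
# Code-to-label mappings copied from the codebook table.
AREA = {
    1: "rural and truck stop",
    2: "rural",
    3: "periurban",
    4: "urban",
    5: "rural and trading area",
    6: "rural and border",
}
SEXN = {0: "female", 1: "male"}
SURVGRP = {
    1: "No follow-up information",
    2: "Whole population survey",
    3: "Sample population survey",
    4: "Baseline census",
    5: "Demographic surveillance",
    6: "Follow-up of those with TST>20mm",
    9: "Other",
}

# Variable name -> mapping; variables not listed here are passed through.
LABELS = {"area": AREA, "sexn": SEXN, "survgrp": SURVGRP}


def decode_row(row: dict) -> dict:
    """Return a copy of `row` with coded values replaced by their labels.

    Variables without a mapping, and codes that do not appear in the
    codebook, are left unchanged rather than raising an error.
    """
    return {
        name: LABELS.get(name, {}).get(value, value)
        for name, value in row.items()
    }
```

For example, `decode_row({"id": 7, "area": 3, "sexn": 0})` leaves `id` untouched and maps `area` to "periurban" and `sexn` to "female". Leaving unknown codes unchanged makes unexpected values easy to spot during checking.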