ERASE-TB: cross-sectional HIV, malnutrition and non-communicable diseases dataset – Data Codebook

Persistent identifier



Data produced as part of a study investigating the prevalence and clustering of chronic conditions among members of tuberculosis-affected households in Mozambique, Tanzania and Zimbabwe (ERASE-TB). This is a cross-sectional dataset with one row per person, containing participant demographics and data on BMI category, HIV status, diabetes status (based on HbA1c), anaemia status, hypertension status, and presence of chronic lung disease (determined using spirometry).
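
Because each row is one person and the id variable in the codebook encodes site, household and person, rows can be grouped by household to look at clustering. The sketch below is illustrative only: it assumes the id is stored as text with one site letter, three household digits and two person digits, with or without hyphens between the parts. The example ids, the pattern and the function names are hypothetical and should be checked against the real data.

```python
import re
from collections import defaultdict

# Assumed layout (not confirmed by the codebook): one site letter,
# three household digits, two person digits, hyphens optional,
# e.g. "M-012-03" or "M01203".
ID_PATTERN = re.compile(r"^([A-Za-z])-?(\d{3})-?(\d{2})$")


def split_id(participant_id):
    """Split an id into (site, household, person) under the assumed layout."""
    match = ID_PATTERN.match(participant_id.strip())
    if match is None:
        raise ValueError(f"Unrecognised id format: {participant_id!r}")
    site, household, person = match.groups()
    return site.upper(), household, person


def group_by_household(ids):
    """Group ids by (site, household) so household members sit together."""
    households = defaultdict(list)
    for pid in ids:
        site, household, _ = split_id(pid)
        households[(site, household)].append(pid)
    return dict(households)


# Hypothetical ids, invented for illustration only.
example = group_by_household(["M-012-01", "M-012-02", "T-007-01"])
```

Keying on the (site, household) pair rather than the household digits alone avoids merging households that share a number across sites.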

Data codebook

The following variables are contained in the ERASE-TB chronic conditions dataset. Further information may be found in the accompanying publications (linked from the above DOI) or by contacting the study team.
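
When reading the data, note that the codebook uses "NA" as the not applicable/missing code, while "None" is a genuine category in several variables (for example HIV_Category and Diabetes_Category). The sketch below shows one way to keep the two apart. It assumes a CSV export, which this codebook does not confirm, and the sample rows are invented for illustration.

```python
import csv
import io

NA_CODE = "NA"  # the codebook's code for not applicable / missing


def read_rows(handle):
    """Yield one dict per participant, mapping the 'NA' code to None.

    The string 'None' is left untouched because it is a real category
    (for example in HIV_Category and Diabetes_Category).
    """
    for row in csv.DictReader(handle):
        yield {key: (None if value == NA_CODE else value)
               for key, value in row.items()}


# Hypothetical three-row extract, invented for illustration only.
sample = io.StringIO(
    "HIV_Category,HIV_CD4,Diabetes_Category,Diabetes_Grade\n"
    "None,NA,None,<6.0%\n"
    "Known,350,Screening detected,6.5-6.9%\n"
    "None,NA,NA,NA\n"
)
rows = list(read_rows(sample))
```

If pandas is used instead, passing `keep_default_na=False` together with `na_values=["NA"]` to `read_csv` should give similar behaviour, since the default missing-value list can otherwise turn "None" into a missing value in recent versions.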

Variable name Variable label Answer label Answer code Variable Type
id ID number (first letter for Site, 3 numbers for Household ID, 2 numbers for person ID) SiteID-HouseholdID-PersonID   Numeric
Site Site     String
Sex Participant sex     String
Age_Cat Age categories     String
      10-17 years  
      18-39 years  
      40+ years  
Education Highest education level     String
      None or primary school  
      At least secondary school  
Pregnancy_Status Pregnancy status (Only asked for female participants)     String
    Not pregnant No  
    Pregnant Yes  
    Not applicable NA  
Smoking_Cat_Bin Smoking status     String
    Non-smoker Non smoker  
    Smoker (current and/or former) Smoker (current and/or former)  
    Not applicable/missing NA  
Smoking_PackYears Number of pack-years of smoking (number of years smoked x number of standard packs of 20 cigarettes smoked per day)     String
    Not applicable (Only asked for current and/or former smokers) NA  
AUDIT_Score_Cat Alcohol consumption (AUDIT score)     String
    Never drunk alcohol Never drunk alcohol  
    Alcohol, AUDIT-C negative Alcohol, AUDIT-C negative  
    Alcohol, AUDIT-C positive Alcohol, AUDIT-C positive  
    Not applicable NA  
InsufficientFood Was there any day in the past six months where you did not have enough food?     String
    No No  
    Yes Yes  
    Not applicable NA  
Index_Relation Participant's relationship to index case (person with TB in their household)     String
Hhold_Total_People Total number of people in household     Numeric
Hhold_Crowding_UN Household crowding as defined by the United Nations (>= 3 people per room), calculated from the number of rooms and the total number of people reported as living in the household     String
    No No  
    Yes Yes  
    Not applicable NA  
Hhold_Income_Day Household income per person per day (US dollars)     Numeric
Hhold_IndexBreadwinner Primary earner is the person with TB     String
    No No  
    Yes Yes  
    Not applicable NA  
Hhold_Poverty Household income less than 1.90 USD per person per day     String
    No No  
    Yes Yes  
    Not applicable NA  
Hhold_ResidenceArea Area of residence     String
    Urban Urban  
    Peri-Urban Peri-Urban  
    Rural Rural  
    Not applicable NA  
Comorb_TBEver Participant self-report on whether they have been treated for TB in the past     String
    No No  
    Yes Yes  
HIV_IndexStatus HIV status of index case     String
      Positive, on ART  
      Positive, not on ART  
HIV_Status HIV status of participant (is participant HIV positive?)     String
HIV_Category HIV category     String
    Known Known  
    Screening detected Screening detected  
    None None  
    Not applicable NA  
HIV_CD4_Grade CD4 category     String
    Less than 200 cells/uL <200cells/uL  
    200-499 cells/uL 200-499cells/uL  
    500 cells/uL or more >=500cells/uL  
    Not applicable NA  
Diabetes_Status Participant diabetes status (do they have diabetes?)     String
    Yes Yes  
    No No  
    Not applicable NA  
Diabetes_Category Diabetes category     String
    Known Known  
    Screening detected Screening detected  
    None None  
    Not applicable NA  
Diabetes_Grade HbA1c category     String
    Less than 6.0% <6.0%  
    6.0-6.4% 6.0-6.4%  
    6.5-6.9% 6.5-6.9%  
    7.0% or more >=7.0%  
    Not applicable (Not assessed for participants <18 years) NA  
Hypertension_Status Hypertension status (does participant have hypertension?)     String
    Yes Yes  
    No No  
    Not applicable (Not assessed for participants <18 years) NA  
Hypertension_Category Hypertension category     String
    Known Known  
    Screening detected Screening detected  
    None None  
    Not applicable (Not assessed for participants <18 years) NA  
Hypertension_Grade Blood pressure category     String
    Normal BP Normal BP  
    High-normal BP High-normal BP  
    Grade 1 hypertension Grade 1 hypertension  
    Grade 2 hypertension Grade 2 hypertension  
    Not applicable NA  
BMI_Grade BMI category     String
    Moderate/severe underweight Moderate/severe underweight  
    Mild underweight Mild underweight  
    Healthy weight Healthy weight  
    Overweight Overweight  
    Obese Obese  
BMI_Underweight BMI <18.5kg/m2     String
    No No  
    Yes Yes  
BMI_Obese BMI ≥30kg/m2     String
    No No  
    Yes Yes  
Stunting Height for age category     String
    Normal Normal  
    Mild stunting Mild stunting  
    Moderate stunting Moderate stunting  
    Severe stunting Severe stunting  
    Not applicable (Assessed for participants <18 years only) NA  
TB_Status TB diagnosed     String
    Yes Yes  
    No No  
Smoking_Cat Smoking status     String
    Never smoked Never smoked  
    Former smoker Former smoker  
    Current smoker Current smoker  
    Not applicable NA  
ImpLungFn_Status Impaired lung function (binary variable), determined using spirometry and interpreted according to European Respiratory Society/American Thoracic Society guidelines with the African American reference standard     String
    Yes Yes  
    No No  
    Not applicable NA  
ImpLungFn_Type Chronic lung disease type     String
    None None  
    Not applicable NA  
    Obstruction Obstruction  
    PRISm PRISm  
ImpLungFn_Grade Chronic lung disease severity grading     String
    Not applicable NA  
    Mild Mild  
    Moderate Moderate  
    Severe Severe  
    None None  
FVC_Grade Forced vital capacity grading, as per ERS/ATS criteria with the African American reference standard     String
    None None  
    Not applicable NA  
    Mild Mild  
    Moderate Moderate  
    Severe Severe  
FEV1FVC_Grade Forced expiratory volume in 1 second/forced vital capacity (FEV1/FVC) ratio grading, as per ERS/ATS criteria with the African American reference standard     String
    None None  
    Not applicable NA  
    Mild Mild  
    Moderate Moderate  
    Severe Severe  
Anaemia_Status Anaemia (binary variable) defined using haemoglobin measurements, interpreted using World Health Organization age/sex-specific thresholds     String
    No No  
    Yes Yes  
Anaemia_Grade Anaemia category     String
    None None  
    Mild anaemia Mild anaemia  
    Moderate anaemia Moderate anaemia  
    Severe anaemia Severe anaemia  
HIV_CD4 CD4 count in cells/uL (for participants with HIV)     Numeric
    Participant does not have HIV NA  
HbA1c HbA1c measurement, expressed as a percentage     Numeric
BP_Sys_Overall Systolic Blood Pressure in mmHg     Numeric
BP_Dia_Overall Diastolic Blood Pressure in mmHg     Numeric
Hb Haemoglobin in g/L     Numeric
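
Several categorical variables above can be cross-checked against the numeric measurements using the cut-offs stated in this codebook. The functions below are a minimal sketch of such checks, not the study team's derivation code. The function names are hypothetical, the CD4 band strings are illustrative shorthand rather than the dataset's answer codes, and the HbA1c bands assume values are recorded to one decimal place.

```python
def cd4_band(cd4):
    """Band a CD4 count (cells/uL) using the HIV_CD4_Grade cut-offs."""
    if cd4 is None:
        return None  # NA: participant does not have HIV
    if cd4 < 200:
        return "<200"
    if cd4 < 500:
        return "200-499"
    return ">=500"


def hba1c_grade(hba1c):
    """Band an HbA1c percentage using the Diabetes_Grade cut-offs."""
    if hba1c is None:
        return None  # NA: not assessed for participants under 18 years
    if hba1c < 6.0:
        return "<6.0%"
    if hba1c < 6.5:
        return "6.0-6.4%"
    if hba1c < 7.0:
        return "6.5-6.9%"
    return ">=7.0%"


def below_poverty_line(income_per_day, threshold=1.90):
    """Return 'Yes' if income per person per day (USD) is under the threshold."""
    if income_per_day is None:
        return None  # NA: income not available
    return "Yes" if income_per_day < threshold else "No"
```

For example, a row with HbA1c of 6.7 would be expected to carry a Diabetes_Grade of "6.5-6.9%", and a row with Hhold_Income_Day of 1.20 would be expected to carry Hhold_Poverty "Yes". Any mismatch is worth raising with the study team rather than correcting silently.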