A compiled, pseudo-anonymised quantitative dataset collected as part of the PrEPVacc registration cohort and trial. The registration cohort was set up to prepare a population of HIV-negative individuals at risk of acquiring HIV for possible participation in the PrEPVacc phase IIB HIV prophylactic vaccine and pre-exposure prophylaxis (PrEP) trial. The dataset has two components. The first is the cohort dataset, which was used to develop a regression model predicting HIV incidence. The second is a trial dataset containing the participant characteristics that can be entered into that model to predict the HIV incidence that could have occurred if participants in the active-controlled PrEP component of the trial had never been dispensed PrEP. The PrEPVacc trial was conducted at four study sites: two in Tanzania, one in South Africa, and one in Uganda. The dataset is current as of 31 January 2024.