
capture log close
log using "$logdir/an_timeseries_autocorrelation.txt", text replace

/*********************************
TESTING FOR RESIDUAL AUTOCORRELATION 
**********************************/

/*Test for residual autocorrelation in the overall (unstratified) time series model:
 this is the model-checking stage, and the tsset command cannot be used with gender and age strata*/

/*********************************
PREPARE DATA
*********************************/

use "$datadir/cr_weeklycounts_region_gold_primary", clear
append using "$datadir/cr_weeklycounts_region_aurum_primary"

drop if agestratum == 99
collapse (sum) deaths denominator, by(year week)
gen studypop = 1
local strata = "studypop"

include "$dodir/inc_an_glm_prepare.do"
keep if pandemic == 0

***************************
*Model checking (residuals)
***************************
*(a) Re-fit the unconstrained distributed lag model
*? does it have to be the lag model if we do not think there are any lagged effects?

*glm deaths year_c year_c2 sin* cos* ib10.agestr ib2.gender i.pandemic, family(nb ml) link(log) exposure(denominator) eform base
glm deaths year_c year_c2 sin* cos*, family(nb ml) link(log) exposure(denominator) eform base


*(b) Generate the deviance residuals and plot vs time
predict dres, deviance
scatter dres weekfrom, name(FigA1, replace)

*(c) Partial autocorrelation function (PACF) plot of the deviance residuals - original model
tsset weekfrom
pac dres, title("(a) From original model") name(a2a, replace) yscale(range(-0.2(0.2)0.6)) ylabel(-0.2(0.2)0.6)
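
*A rough sketch of a formal complement to the PACF plot: the Ljung-Box portmanteau (Q) test
*of the null hypothesis that the deviance residuals are white noise.
*ASSUMPTION: lags(10) is chosen for illustration only and is not taken from the analysis plan.
*Deviance residuals from a negative binomial model are only approximately normal,
*so the p-value is best read as a guide rather than an exact test.
wntestq dres, lags(10)
display "Ljung-Box Q = " %6.2f r(stat) ", df = " r(df) ", p = " %5.3f r(p)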

*(d) Include the 1-week lagged residuals in the model (the data are weekly counts), and re-draw the PACF plot
*? why 1-week lagged residuals?

*sort by time so that _n-1 is the previous week
sort weekfrom
gen dresL1 = dres[_n-1]
glm deaths year_c year_c2 sin* cos* dresL1, family(nb ml) link(log) exposure(denominator) eform base

predict dresV2, d
pac dresV2, title("(b) From model adjusted for residual autocorrelation") name(a2b, replace) yscale(range(-0.2(0.2)0.6)) ylabel(-0.2(0.2)0.6)
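
*A possible numerical check alongside the plot: tabulate the autocorrelations, partial
*autocorrelations and cumulative Q statistics of the residuals from the adjusted model.
*ASSUMPTION: lags(10) is illustrative only. dresV2 is missing for the first week because
*dresL1 is missing there, so that week does not contribute.
corrgram dresV2, lags(10)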

graph combine a2a a2b, cols(1) ysize(9) name(FigA2, replace)
graph drop a2a a2b

***************************
*Alternative graphs
***************************

local autocorrelation = 1
glm deaths year_c year_c2 sin* cos*, family(nb ml) link(log) exposure(denominator) eform base
include "$dodir/inc_an_timeseries_estimates.do"

*TODO: tidy up an_timeseries_graphs_sensitivity, run with new estimates and combine with

glm deaths year_c year_c2 sin* cos* dresL1, family(nb ml) link(log) exposure(denominator) eform base
include "$dodir/inc_an_timeseries_estimates.do"

*TODO: tidy up an_timeseries_graphs_sensitivity, run with new estimates and combine with

drop dres*

log close
