Transmission dynamics of monkeypox in the United Kingdom: contact tracing study

Conclusions Analysis of the instantaneous growth rate of monkeypox incidence indicates that the epidemic peaked in the UK as of 9 July and then started to decline. Short serial intervals were more common than short incubation periods suggesting considerable pre-symptomatic transmission, which was validated through linked patient level records. For patients who could be linked through personally identifiable data, four days was the maximum time that transmission was detected before symptoms manifested. An isolation period of 16 to 23 days would be required to detect 95% of people with a potential infection. The 95th centile of the serial interval was between 23 and 41 days, suggesting long infectious periods.

Results The mean age of participants was 37.8 years and 95% reported being gay, bisexual, and other men who have sex with men (1160 out of 1213 reporting). The mean incubation period was estimated to be 7.6 days (95% credible interval 6.5 to 9.9) using the ICC model and 7.8 days (6.6 to 9.2) using the ICRTC model. The estimated mean serial interval was 8.0 days (95% credible interval 6.5 to 9.8) using the ICC model and 9.5 days (7.4 to 12.3) using the ICRTC model. Although the mean serial interval was longer than the incubation period for both models, short serial intervals were more common than short incubation periods, with the 25th centile and the median of the serial interval shorter than the incubation period. For the ICC and ICRTC models, the corresponding estimates ranged from 1.8 days (95% credible interval 1.5 to 1.8) to 1.6 days (1.4 to 1.6) shorter at the 25th centile and 1.6 days (1.5 to 1.7) to 0.8 days (0.3 to 1.2) shorter at the median. Ten of 13 linked patients had documented pre-symptomatic transmission. Case growth slowed from a doubling time of 9.07 days (95% confidence interval 12.63 to 7.08) on 6 May, when the first case of monkeypox was reported in the UK, to a halving time of 29 days (95% confidence interval 38.02 to 23.44) on 1 August.

Main outcome measures The incubation period and serial interval of a monkeypox infection using two bayesian time delay models—one corrected for interval censoring (ICC—interval censoring corrected) and one corrected for interval censoring, right truncation, and epidemic phase bias (ICRTC—interval censoring right truncation corrected). Growth rates of cases by reporting date, when monkeypox virus was confirmed and reported to UKHSA, were estimated using generalised additive models.

To estimate both the serial interval and the incubation period of monkeypox we used a large sample from the UK Health Security Agency (UKHSA) surveillance and contact tracing data. To obtain data on the incubation period we analysed completed case questionnaires and linked infected individuals to probable exposure dates. To obtain the serial interval data we used self-reported symptom onset dates and linked case-contact pairs (linked pairs of primary and secondary cases). We then applied a bayesian model correcting for double interval censoring (ICC) 17 and a bayesian model correcting for double interval censoring, right truncation, and epidemic phase bias (ICRTC) to these data to estimate the serial interval and incubation period distributions of monkeypox.

To understand the transmission dynamics of the monkeypox outbreak, accurate estimates are needed of the time between subsequent infections (generation time) and time from becoming infected to developing symptoms (incubation period). Infection time is rarely observed directly, so the generation time is generally approximated using the serial interval, which is the time from symptom onset in a primary case (an individual with the index infection) to symptom onset in a secondary case (an individual who becomes infected by the primary case). 9 Typical monkeypox symptoms are listed on the National Health Service website and include rash (for example, on the mouth, genitals, anus), high temperature, headache, and muscle aches. 10 Serial interval and incubation period estimates are important for informing policy decisions around post-infection quarantine periods and post-contact isolation periods, respectively, as well as for understanding the dynamics of viral transmission, such as potential transmissibility before symptoms manifest. 11 12 Based on observational studies of monkeypox from the Democratic Republic of the Congo, estimates for the incubation period range from 4 to 14 days and for the serial interval from 8 to 11 days. 13 14 For the Western African clade that is currently circulating in the UK, 15 early research suggests a mean incubation period of between 6.6 and 10.9 days. 16 This estimate is, however, based on limited data and thus far no research pertaining to the serial interval has been released.

Monkeypox, a zoonotic disease, was identified in 1958 in monkeys showing signs of a poxvirus. 1 The disease is caused by a virus belonging to the orthopoxvirus genus and was first detected in humans in 1970 in the Democratic Republic of the Congo. 2 The disease has since become endemic in that region and spread to other central and west African countries. Such spread has resulted in divergence of the virus, with two distinct clades circulating in different regions of Africa. The two clades, the Congo Basin and Western African, show distinct epidemiological characteristics. Surveillance and laboratory studies have found the Congo Basin clade to be the more severe of the two, with higher transmissibility. 3 4 In May 2022, the World Health Organization reported a monkeypox outbreak in several originally non-endemic countries, 5 since linked to the Western African clade. 6 These cases were of considerable concern as they could not be clearly linked to recent travel from an endemic area. On 6 May 2022, monkeypox was detected in England in a patient who had recently travelled to Nigeria. A week later monkeypox was identified in two more people, with no links to the first patient. Between 6 May 2022 and 12 September 2022, 3552 cases of monkeypox were confirmed in the United Kingdom. 7 The international dispersion of the virus has resulted in the largest outbreak of monkeypox reported outside of Africa. In July 2022 WHO declared the outbreak a Public Health Emergency of International Concern. 8

Methods

Epidemiological data

Data were collected on monkeypox from UKHSA health protection teams, targeted testing of infected individuals (with specimens processed by UKHSA affiliated laboratories and NHS laboratories), and questionnaires (collected by UKHSA health protection teams). We defined a confirmed case as an individual with a positive polymerase chain reaction (PCR) test result for monkeypox virus, and a highly probable case as an individual with a positive PCR test result for orthopoxvirus. As of 25 July 2022, both definitions were recognised in the UK as representing a case of monkeypox. UKHSA health protection teams identified pairs of linked individuals through contact tracing. If an individual was identified as a contact by a case and became a case or was already a case, we recorded these as a case-contact pair. In the analysis, we assume that the direction of transmission is based on the date order of symptom onset, because the direction of transmission cannot be otherwise ascertained.

Data preparation

Data were extracted as of 1 August 2022, at which time 2746 people had been identified with monkeypox in the UK. We identified the dates of symptom onset for the case-contact pairs from HPZone (see box 1) by matching pseudo identifier numbers to the line list (see box 1), and we selected only case-contact pairs with a confirmed positive PCR test result for monkeypox for both individuals. From the dataset we removed records with missing data for symptom onset and pseudo identifier number, as well as duplicates. If two records had the same pseudo identifier numbers for both individuals in the case-contact pair we assumed these to represent duplicates. A total of 220 case-contact pairs were reported in HPZone, 79 with a symptom onset date for both individuals in the case-contact pair, forming our serial interval cohort. For each case-contact pair, we refer to the individual with the primary infection as a primary case, and the individual infected by the primary case as the secondary contact.

Box 1: Data source definitions

HPZone: UKHSA health protection teams store data collected during an incident in HPZone.

Line list: The line list contains a list of confirmed infected individuals in the UK obtained from test data (compiled and deduplicated) from UKHSA affiliated laboratories, National Health Service trust laboratories, and HPZone data, along with supplementary data from the case questionnaires.

Questionnaires: Data are obtained from three types of questionnaires: the rapid sexual health questionnaire; a questionnaire administered by health practitioners; and an anonymous self-completed questionnaire. All questionnaires are optional, and individuals are not required to complete all questions.

We identified exposure dates for the incubation period from questionnaire data filled out by cases. Cases had the option of answering “On what date did your illness begin?,” “In the 21 days (3 weeks) before first symptom onset did you have contact with anyone with suspected or confirmed monkeypox infection?,” and “Date of last contact with case.” An exposure date was therefore defined as the last date an individual reported contact with a known case in the 21 days before symptoms manifested, and symptom onset was defined as the date symptoms manifested. This definition of symptom onset describes the date that an individual first noticed their symptoms; however, the true date of symptom onset could have been earlier but not detected. As of 1 August 2022, 650 people had completed questionnaires, 54 of whom had provided information on symptom onset date and had reported the date of last contact with a primary case, forming our incubation period cohort. When negative incubation periods were identified, we assumed that the patient was not infected by the named contact, because a negative incubation period is not possible. If we identified negative serial intervals, we assumed that the record had incorrectly identified the direction of transmission in the case-contact pair, and we reversed the order of the primary case and secondary contact. Negative serial intervals are possible in the presence of pre-symptomatic transmission, although the nature of contact tracing data means that the direction of transmission can be difficult to infer from the data, and positive serial intervals are more likely than negative serial intervals.
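The two rules for negative delays can be illustrated with a minimal Python sketch. The function names and the example dates are hypothetical and are not taken from the study code; the sketch only assumes that each record carries calendar dates for exposure and symptom onset.

```python
from datetime import date


def orient_pair(onset_a: date, onset_b: date):
    """Order a linked pair by symptom onset and return the serial interval.

    Returns (primary_onset, secondary_onset, serial_interval_days).
    The earlier onset is treated as the primary case, so the returned
    interval is never negative.
    """
    if onset_b < onset_a:
        # Assumed direction was wrong: swap primary and secondary.
        onset_a, onset_b = onset_b, onset_a
    return onset_a, onset_b, (onset_b - onset_a).days


def incubation_period(exposure: date, onset: date):
    """Return the incubation period in days, or None when it is negative.

    A negative value means the named contact is assumed not to be the
    infector, so the record is dropped from the incubation cohort.
    """
    days = (onset - exposure).days
    return days if days >= 0 else None
```

Under these assumptions, a pair recorded with onsets on 10 June and 3 June would be reordered so that the 3 June onset is the primary case, giving a serial interval of seven days.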
To investigate potential exposure dates that occurred before symptom onset in a primary case, which would suggest pre-symptomatic transmission, we linked data on the date of symptom onset in the primary case with exposure dates in their secondary contacts. Firstly, we linked case-contact pairs to questionnaire records for each secondary contact, which enabled the primary case and the exposure date to be identified for the secondary contact. We then used the pseudo identifier number to link the identified primary case to the line list to obtain the date of symptom onset. This relies on the assumption that the primary case identified by the secondary contact in the questionnaire is the same primary case identified in the case-contact pair. As of 1 August 2022, 92 records (from 650 questionnaires) showed complete data for exposure date from the questionnaire and the pseudo identifier number needed for linking the datasets. From those, 30 had documented case-contact links in the HPZone data, allowing the primary case to be identified. Of the 30 primary cases, 19 had a symptom onset date recorded in the line list. Three of the 19 records we identified reported a negative incubation period and were excluded. From the remaining 16 records, seven primary cases had personally identifiable information in both the questionnaire and the line list, allowing us to verify whether the individual identified by the secondary contact was also the primary case from the case-contact pair. For the seven primary cases with available personally identifiable information, four matched and three did not match. After excluding the primary cases who did not match on personally identifiable information, 13 case-contact pairs remained. These 13 form our cohort for investigating pre-symptomatic transmission.
When we compared the subsamples obtained through this data processing with the total set of patients (table 1), the mean age and the proportion of patients who reported being gay, bisexual, and other men who have sex with men were consistent across all samples. The subsamples therefore captured the two key personal characteristics of infected individuals in the outbreak.

Table 1: Proportion of patients who reported being gay, bisexual, and other men who have sex with men (GBMSM) and mean age of each study sample compared with the total set of patients

Time delay distribution modelling

Incubation periods and serial intervals are examples of time delay distributions, which describe the distribution of times between two coupled events. For the incubation period, this is the time between the date patients were exposed (primary event) and their symptom onset date (secondary event). For the serial interval, this is the time between the symptom onset date in the primary case (primary event) and the symptom onset date in the secondary contact (secondary event). During an ongoing epidemic, time delay distribution observations are either right truncated or right censored. Right truncation emerges when data are only observed after the second event occurs, such as infections being identified only after cases emerge. Right censoring occurs when an individual is known to have been exposed to an event, but the event has not occurred yet. In the context of our study, a right truncation bias exists because individuals only enter our data after they develop symptoms and seek a test. Right truncation leads to the observed distribution of time delays being biased towards shorter observations: when the primary event occurs close to the final date of observation, the secondary event will be observed only if the delay is short. To adjust for the right truncation, we fitted a double interval censoring and right truncation corrected parametric delay distribution. The right truncation primarily affects recent observations and has less of an influence on older observations. We adapted the method from Ward and Johnsen 17 and Vekaria et al. 18 The double interval censoring corrects for the coarseness of the data, whereby only the date each event occurs is known rather than the time, which leads to a 24 hour window during which each event could have occurred.
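The direction of the right truncation bias can be seen in a small simulation. The Python sketch below is illustrative only: the growth rate, the observation window, and the gamma delay with a true mean of 8 days are assumed values chosen for the example, not estimates from the study.

```python
import math
import random
import statistics


def simulate_right_truncation(growth_rate=0.08, t_final=60.0, n=20000, seed=1):
    """Simulate delays during exponential growth and truncate at t_final.

    Primary event times are drawn with density proportional to
    exp(growth_rate * t) on [0, t_final]. Delays follow a gamma
    distribution with shape 2 and scale 4 (true mean 8 days). A pair is
    observed only if its secondary event falls on or before t_final.
    Returns (mean of all delays, mean of observed delays).
    """
    rng = random.Random(seed)
    all_delays, observed = [], []
    growth_span = math.exp(growth_rate * t_final) - 1.0
    for _ in range(n):
        # Inverse-CDF draw of the primary event time under exponential growth.
        primary = math.log(1.0 + rng.random() * growth_span) / growth_rate
        delay = rng.gammavariate(2.0, 4.0)
        all_delays.append(delay)
        if primary + delay <= t_final:
            observed.append(delay)
    return statistics.fmean(all_delays), statistics.fmean(observed)


true_mean, observed_mean = simulate_right_truncation()
```

In this setup the mean of the observed delays falls below the mean of all simulated delays, because long delays that start near the end of the window are not yet visible. A truncation correction aims to remove that gap.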
In this method, we assume that the primary event (symptom onset in the primary case for serial interval or exposure date for incubation period) for each individual sits within an interval $[e_1, e_2]$, where $e_1$ is the reported event date and $e_2$ is the day after. Similarly, the secondary event time (symptom onset in secondary contact for serial interval or symptom onset for incubation period) sits within an interval $[s_1, s_2]$. Equation 1 (fig 1) shows the probability of observing a given second event time (denoted by a random variable $S$), conditional on the observed first event time (denoted by a random variable $E$) given that the final observation date is $T$.

Fig 1: Equations

Equation 1 could be solved by integrating across the observation intervals. However, this would be computationally expensive. Instead, within our model we included estimated event times for each patient, $z^* \in [z_1, z_2]$ for $z \in \{e, s\}$, as an unobserved variable. Our likelihood function therefore relies on three functions (equation 2, fig 1).

We considered three parametric distributions: gamma, Weibull, and lognormal. For the gamma and Weibull distributions, we parameterised the models for mean, $\theta_1$, and the shape parameter, $\theta_2$, which describes the shape of the distribution, controlling the variance and skewness. For both distributions, it is assumed that the mean follows a normal distribution prior, with mean 5 and standard deviation 1, and that the shape parameters follow a flat prior. For lognormal the model was parameterised in terms of the log mean, $\theta_1$, and log standard deviation, $\theta_2$, parameters. It is assumed both $\theta_1$ and $\theta_2$ follow a standard normal prior distribution. These priors were chosen to be sufficiently informative to penalise unrealistic parameter combinations but specified with low precision to allow the data to maximally inform the estimates. Recentring the priors to alternative means with the same precision yielded consistent results.
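The equations in fig 1 are not reproduced in this text, so the following Python sketch shows only one plausible form of the per-pair likelihood term, assuming it is the delay density evaluated at the latent event times divided by the probability that the secondary event occurs before the final observation date. The lognormal case is used because its distribution function is available in the standard library; the function names and the `right_truncation` switch are hypothetical.

```python
import math


def lognormal_logpdf(x, mu, sigma):
    """Log density of a lognormal distribution at x > 0."""
    z = (math.log(x) - mu) / sigma
    return -math.log(x * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z


def lognormal_cdf(x, mu, sigma):
    """Cumulative distribution function of a lognormal distribution."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def pair_loglik(e_star, s_star, t_final, mu, sigma, right_truncation=True):
    """Log-likelihood contribution of one pair given latent event times.

    e_star and s_star are the latent primary and secondary event times,
    each assumed to lie inside its one-day reporting window, with
    s_star <= t_final. Setting right_truncation=False drops the
    truncation term, which corresponds to treating the probability of
    observing the secondary event before t_final as approximately 1.
    """
    delay = s_star - e_star
    if delay <= 0:
        return -math.inf  # a non-positive delay has zero density
    loglik = lognormal_logpdf(delay, mu, sigma)
    if right_truncation:
        # Condition on the secondary event having occurred before t_final.
        loglik -= math.log(lognormal_cdf(t_final - e_star, mu, sigma))
    return loglik
```

In a full Bayesian fit the latent times would be sampled alongside the distribution parameters; this sketch only evaluates the term for fixed values.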
To fit the model to the data, we used Markov chain Monte Carlo (MCMC) sampling implemented in Stan through the Cmdstanr package, with full model formula (equation 3, fig 1). The data sharing section includes a link to a repository containing the code for the model and the trace results from the MCMC sample. To compare the model fits we calculated the leave-one-out cross validation (LOO) through Pareto smoothed importance sampling, using the LOO package in R. 19 We applied MCMC to each model and evaluated its convergence using the potential scale reduction factor, or $\hat{R}$ (calculated using Cmdstanr 20), where it is desirable to have a value <1.05. From the MCMC output, we obtained a posterior distribution of parameters, which describes the distribution of parameters considered by the model. The MCMC algorithm preferentially selects parameters that better describe the data. From this posterior distribution, credible intervals are calculated and reported for the mean, standard deviation, and cumulative distribution function. We refer to this model as the double interval censoring and right truncation corrected model (ICRTC).

If an epidemic is stable or declining the right truncation bias has less of an effect on the data. In such cases it may be reasonable to consider a model without the right truncation correction, that is, assuming that $P(S<T \mid E=e^*) \approx 1$. Under this assumption the model becomes simplified (equation 4, fig 1). We refer to this model as the double interval censoring corrected model (ICC). Other approaches can be applied to handle right truncation bias. 21 We opted for our approach because the epidemic phase related terms, $P(E=e^*)$, cancel each other out, so we do not need to explicitly describe the phase of the epidemic within the model. Often, other methods introduce further assumptions to handle this term, which risk introducing bias.
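To illustrate what the potential scale reduction factor measures, the Python sketch below computes the classic Gelman-Rubin version from several chains of one parameter. This is a simplified stand-in and an assumption of this example: current Stan tooling reports a rank-normalised split variant, which can give slightly different values. The simulated chains are hypothetical.

```python
import random
import statistics


def potential_scale_reduction(chains):
    """Classic Gelman-Rubin statistic for equal-length chains of one parameter.

    Compares the variance between chain means with the average variance
    within chains; values close to 1 suggest the chains agree.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(0)
# Four chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four chains where two are stuck around a different value (poorly mixed).
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 0.0, 5.0, 5.0)]
```

With these simulated draws, the well mixed chains fall under the 1.05 threshold and the poorly mixed chains fall well above it.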

Instantaneous growth rate

We used a previously described method 22 to estimate the growth rate of monkeypox cases since the start of the outbreak in England. To estimate the exponential growth rate, we need to assume an exponential structure to the data. In a period of constant exponential growth, an epidemic can be approximated using $y(t)=y(0)e^{rt}$, where $y(0)$ is the initial number of cases and $r$ is the exponential growth rate. Following the methods of Ward et al, 22 this can be generalised to an epidemic that is not in an exponential phase, by replacing $rt$ with a smooth function of time, $s(t)$. To estimate this smooth function, we fit a generalised additive model to daily confirmed case counts with a negative binomial error structure and log link. We used the reporting date as it was robust to the (often long) reporting lags associated with specimen date. Cubic regression splines were used with one knot every 14 days. Under this generalised model, the number of cases at time $t$, $y(t)$, is proportional to the exponential of the smooth function with time, $\exp(s(t))$. The time derivative of the smoother, $ds(t)/dt$, is therefore the instantaneous growth rate, $r_s$, and doubling times can be interpreted as $t_D = \log(2)/r_s$. A random effect on the day of week accounts for the average difference in reporting between days.
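The relation between the smoother, the instantaneous growth rate, and the doubling time can be sketched in Python as follows. The quadratic `example_smooth` is a hypothetical stand-in for a fitted smoother, chosen only so that the derivative is easy to check by hand; it is not the fitted model from the study.

```python
import math


def instantaneous_growth_rate(smooth, t, h=1e-4):
    """Central-difference estimate of ds(t)/dt for a smooth function s."""
    return (smooth(t + h) - smooth(t - h)) / (2.0 * h)


def doubling_time(r):
    """Doubling time log(2)/r in days; a negative value is a halving time."""
    if r == 0:
        return math.inf  # flat epidemic: cases neither double nor halve
    return math.log(2.0) / r


def example_smooth(t):
    """Hypothetical log-scale smoother with slowing growth over time."""
    return 0.5 + 0.0764 * t - 0.0006 * t ** 2


r_start = instantaneous_growth_rate(example_smooth, 0.0)
t_double_start = doubling_time(r_start)
```

Because the growth rate is the derivative of the smoother, a positive slope yields a doubling time and a negative slope yields a halving time, reported as a negative number by this sketch.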