SARS-CoV-2 infectivity by viral load, S gene variants and demographic factors and the utility of lateral flow devices to prevent transmission

Background: How SARS-CoV-2 infectivity varies with viral load is incompletely understood. Whether rapid point-of-care antigen lateral flow devices (LFDs) detect most potential transmission sources despite imperfect clinical sensitivity is unknown. Methods: We combined SARS-CoV-2 testing and contact tracing data from England between 01-September-2020 and 28-February-2021. We used multivariable logistic regression to investigate relationships between PCR-confirmed infection in contacts of community-diagnosed cases and index case viral load, S gene target failure (proxy for B.1.1.7 infection), demographics, SARS-CoV-2 incidence, social deprivation, and contact event type. We used LFD performance to simulate the proportion of cases with a PCR-positive contact expected to be detected using one of four LFDs. Results: 231,498/2,474,066(9%) contacts of 1,064,004 index cases tested PCR-positive. PCR-positive results in contacts independently increased with higher case viral loads (lower Ct values) e.g., 11.7%(95%CI 11.5-12.0%) at Ct=15 and 4.5%(4.4-4.6%) at Ct=30. B.1.1.7 infection increased PCR-positive results by ~50%, (e.g. 1.55-fold, 95%CI 1.49-1.61, at Ct=20). PCR-positive results were most common in household contacts (at Ct=20.1, 8.7%[95%CI 8.6-8.9%]), followed by household visitors (7.1%[6.8-7.3%]), contacts at events/activities (5.2%[4.9-5.4%]), work/education (4.6%[4.4-4.8%]), and least common after outdoor contact (2.9%[2.3-3.8%]). Contacts of children were the least likely to test positive, particularly following contact outdoors or at work/education. The most and least sensitive LFDs would detect 89.5%(89.4-89.6%) and 83.0%(82.8-83.1%) of cases with PCR-positive contacts respectively. Conclusions: SARS-CoV-2 infectivity varies by case viral load, contact event type, and age. Those with high viral loads are the most infectious. B.1.1.7 increased transmission by ~50%. The best performing LFDs detect most infectious cases.


Introduction
The global impact of SARS-CoV-2 is profound. 1 There is widespread on-going transmission despite control efforts predominantly focused on quarantining symptomatic cases and population-level selfisolation. 2 The emergence of potentially more transmissible variants, such as B.1.1.7 3 in the UK, has hampered control. However, vaccination offers the prospect of reduced disease and transmission. 4 Intermittent social distancing and self-isolation measures have been imposed in many countries. 5,6 Additional self-isolation measures for "contacts" exposed to SARS-CoV-2 vary, but generally last 7-14 days. 7 While reducing transmission, quarantine/isolation measures have indirect effects on economic productivity, well-being 8 and non-COVID-19-related excess deaths. 9-11 Not all SARS-CoV-2 exposure leads to infection; in some settings only 5-7% of contacts develop COVID-19 12, 13 and modelling suggests ~15% of individuals are responsible for most SARS-CoV-2 transmission. 14 Using isolation selectively for the most infectious could lessen its collateral impacts. 12,13 Several assays for infectivity have been proposed including functional (animal and cell culture) and nucleic acid (viral sub-genomic mRNA) models. 15 Detection of viral protein (antigen) by lateral flow devices (LFDs) has been more closely linked to viral culture infectivity than PCR measurements. 16 However, few of these surrogate measures have been convincingly demonstrated to predict the real-world likelihood of transmission.
Here we use data from the England's national contact tracing and testing programs to explore the relationship between infectivity and SARS-CoV-2 viral load, as measured by PCR cycle threshold (Ct) values. We identify demographic factors associated with infectivity and the impact of the B.1.1.7 variant. We apply our results to a population of PCR-positive individuals to estimate the proportion of infectious individuals detected by viral antigen LFDs under a range of performance conditions.

Methods
Data from community and hospital PCR testing in England between 01-September-2020 and 28-February-2021 were linked with national contact tracing data by the UK Government Department of Health and Social Care. De-identified data extracts included for PCR-confirmed cases and their contacts: the nature of the contact events, demographic details (age, sex, ethnicity), if symptoms were present for cases and the timing of testing relative to symptom onset, and test results.

Cases and contacts
We defined index cases as SARS-CoV-2 PCR-positive individuals with a community-based test performed by three high-throughput national testing facilities (Milton Keynes, Alderley Park, Glasgow), which reported Ct values indicating viral load. Contacts of index cases were defined as all individuals notified to the national contact tracing service from the day of the index cases' positive test until 10 days later with whom the index case had been in close proximity from 48 hours before their symptom onset to 10 days afterwards (further definitions in Supplement).

PCR testing
Combined nose and throat swabs from index cases were processed using the same RNA extraction and Thermo Fisher TaqPath PCR platform in each laboratory (targeting S and N genes, and ORF1ab; see Supplement, including estimation of viral load from Ct values). Only the first positive result per A c c e p t e d M a n u s c r i p t 4 person was included. Index cases without available Ct values were excluded. The B.1.1.7 variant contains a Δ69-70 deletion, resulting in S gene target failure (SGTF). From mid-December 2020 onwards, ≥99% of sequences with Δ69-70 were B.1.1.7 (see supplement), 17 allowing SGTF to be used as a proxy for B.1.1.7. Contacts could be tested PCR-positive through any community or hospitalbased test as results were nationally reported.

Statistical analysis
We determined factors associated with PCR-positive results in contacts, including demographics, viral load and SGTF status of the index case. To identify outcomes most likely representing onward transmission from the index case rather than a third party, we excluded contacts named by more than one index case. We also restricted to positive test results obtained 1-10 days following the index case's test date, i.e., the period when the index case may have been infectious, excluding earlier results in contacts to avoid contacts who were the source for the index case's infection. Given these exclusions, the absolute proportion of contacts testing PCR-positive cannot be interpreted as a secondary attack rate. In contacts with >1 PCR test within the follow-up window, all were considered to identify positive results.
We used multivariable logistic regression to investigate associations between PCR-confirmed infection in contacts (including contacts whether or not they had PCR tests) and the index case's Ct value and SGTF status, the contact event nature, the case's demographics, and incidence and social deprivation index at the contact's home location. We did not adjust for case symptoms, as these may be mediators of viral load's effect on transmission. We used splines to account for non-linearity in continuous variables and screened for all pairwise interactions between main effects (see Supplement).
We performed sensitivity analyses to test our restriction to contacts tested 1-10 days after each index case and including only contacts with PCR tests. We used unadjusted linear regression to investigate the proportion of the variation in Ct values in contacts explained by cases' Ct values.

Simulations of LFD performance
We used our findings to estimate the proportion of potential transmission events where the source case would have been detected using an antigen LFD, using existing data on the performance of four LFDs: Innova, Deep Blue, Orient gene and Abbott. 18 For each source case we simulated a positive or negative LFD result by randomly drawing from the probability of an LFD being positive by the source case's Ct value (Supplement, Figure S1). Each simulation was repeated 1000 times. Additionally, we ran simulations for a range of hypothetical LFD performances.

Ethics
The study was conducted as part of national COVID-19 surveillance under the provisions of Section 251 of the NHS Act 2006 and therefore did not require individual patient consent. It was approved by Public Health England's (PHE's) Research Ethics and Governance Group (PHE's research ethics committee), the UK COVID-19 LFD oversight group and NHS Test and Trace.
A c c e p t e d M a n u s c r i p t 5 Results 3,577,246 SARS-CoV-2 PCR-positive results were available from England between 01-September-2020 and 28-February-2021. 1,818,456 (51%) tests were performed by three national laboratories providing high-throughput community testing with a standardised PCR assay, yielding a first positive test per person in 1,796,139 individuals. 27,893 (2%) were excluded as no Ct value was available, 439,482 (24%) had no recorded contacts, and 264,760 (15%) had only recorded contacts shared with other index cases, leaving 1,064,004 index cases in the analysis ( Figure 1, Table S1). 487,653 (46%) cases had SGTF, consistent with B.1.1.7, increasing to near 100% by 28-February-2021 ( Figure S2). The 1,064,004 index cases had 2,974,596 contacts identified within 10 days of their test of whom 918,758 (31%) had a PCR test within ±10 days of the index case and 638,456 (21%) tested PCRpositive. 2,474,066 (83%) contacts were named only by a single case and are included in the analysis; 231,498 (9%) tested PCR positive 1-10 days after the index case's PCR-positive result, our main outcome measure, i.e., consistent with possible transmission from the index case to the contact.

Predictors of PCR-positive results in contacts
On univariable analysis (Table 2, Figures S3-S6), PCR-positive tests in contacts were associated with lower case Ct values (i.e. higher viral loads), SGTF in the index case, higher incidence in the local population, less social deprivation, white ethnicity and male sex. Household contacts were most likely to be PCR-positive. PCR-positive results were least frequent in contacts of children, with highest rates in contacts of older adults.
Adjusted multivariable analysis showed strong evidence of effect modification (interactions) and non-linear relationships, such that associations are best described graphically (Figures 2-3, S7-S11). Index case Ct value was an important determinant of PCR-positive results in contacts, with an approximately linear decline as Ct value increased, which was independent of the nature of the contact event ( Figure 2). For example, amongst household contacts, with other variables set to median values/baseline categories, rates of PCR-positive tests were 11.7% (95%CI 11.5-12.0%) for index case Ct=15 and 4.5% (4.4-4.6%) for Ct=30. Contacts were most likely to test PCR-positive after household contact (percentage of PCR-positive tests, at median Ct value=20.  Figure 3).
Contacts of children were the least likely to test positive, particularly following contact outdoors or at work or in education (Figure 4). Most contact types had similar rates of PCR-positive results across A c c e p t e d M a n u s c r i p t 6 adult ages, except for household contact where risk increased as age increased above 35 years and contact at work/education, events/activities and outdoors where risk of a PCR-positive result was highest in adults in their 20s.
Associations between PCR-positive results in contacts and sex varied with age ( Figure S8). Broadly, increasing incidence increased PCR-positive contacts, likely reflecting increased acquisition from third parties. There were fewer PCR-positive contacts in areas of greater social deprivation ( Figure  S11) and amongst Black, Asian and minority ethnic groups ( Figure S9-S10).
A sensitivity analysis supported the 1-10 day follow-up window for PCR results in contacts as the association between index case Ct and positive contacts attenuated beyond 10 days, consistent with more acquisition from third parties. There were relatively few contacts testing positive 11-14 days later ( Figure S12). Case Ct values explained only a small proportion of the variability in contact Ct values (unadjusted linear regression coefficient 0.14 [95%CI 0.13-0.14, p<0.001], R-squared = 0.02).

Predictors of PCR-positive results in contacts attending PCR testing
In a sensitivity analysis restricted to contacts who had a PCR test (Table S2, Figures S13-S19), similar relationships were seen between PCR-positive results and index case Ct values, contact type and SGTF (Figures S19). While rates of PCR-positive results remained highest in older adult household contacts, there was attenuation of the lower rates seen in children, consistent with main analysis findings of less transmission from children arising from less testing being required or undertaken in contacts of children ( Figure S18). In contrast to the main analysis, contacts of all non-white ethnic groups (Table S2) and those living in more deprived areas ( Figure S17) were more likely to be PCRpositive, potentially due to differences in access to and use of testing by different ethnic and socioeconomic groups.

Proportion of cases with PCR-positive contacts detected by LFDs
Overall, 85.4% (197,677/231,497) of case-contact pairs with PCR-positive contacts, i.e., plausible onward transmission, had estimated case viral loads of ≥10,000 RNA copies/ml (i.e. Ct ≤24.4) versus 75.2% of all cases (800,020/1,064,004). Index cases with SGTF had lower Ct values, except for results near the detection threshold ( Figure S20).
As antigen LFD sensitivity varies by viral load, we used the distribution of Ct values in case-contact pairs with a PCR-positive contact to simulate the proportion of such cases who would have been detected using antigen LFDs ( Figure 5). The Deep Blue LFD would have detected 85.9% (95%CI 85.8-86.0%) of cases who plausibly subsequently transmitted to a contact, the Innova LFD 83.0% (82.8-83.1%), the Orient Gene LFD 89.5% (89.4-89.6%) and the Abbott LFD 85.8% (85.7-86.0%). Performance was very similar before and after B.1.1.7 expansion (Table S3). The performance characteristics required to detect varying proportions of transmission sources by a novel LFD are illustrated in Figure S21.

Discussion
We have performed a large-scale analysis of combined SARS-CoV-2 contact tracing and testing data from England involving >2 million contacts of PCR-confirmed cases. We show SARS-CoV-2 infectivity is associated with index case viral load, including after adjustment for demographic factors and type A c c e p t e d M a n u s c r i p t 7 of contact event. SGTF, a proxy for the B.1.1.7 variant, increased transmission by ~50% at most viral loads. Onward transmission from children was relatively uncommon compared to adults, although this may partly be due to less testing in their contacts. We confirm earlier findings that household contact is associated with greater rates of transmission compared to workplace, educational or recreational contact outside of homes. 19,20 Except SGTF, it is noteworthy that we found no evidence of significant interactions between Ct values and any other variables in the analysis, i.e. the effect of viral load on infectivity is generalisable across populations and settings. These results are consistent and add to a recent smaller cohort study. 21 Consistent with other reports 3 we found that SGTF increased the proportion of contacts testing PCRpositive, by around 55% at high viral loads (Ct values of 10-20), rising with moderate viral loads to a maximum of 75% (Ct=25) before declining again to below 10% at low viral loads (Ct=34). SGTF also affected how the likelihood of transmission varied with age, contact event type and ethnicity. The higher relative infectiousness at moderate viral loads may represent increased infectiousness of individual virions at viral loads where stochasticity is more important compared to higher viral loads. The attenuation of the relative infectiousness at high Ct values partly arises from greater numbers of wildtype strains exhibiting SGTF due to stochastic failure to detect a single gene at low viral loads. As lower viral loads are less infectious, it may also reflect more PCR-positive contacts acquiring infection from third parties, such that the characteristics of the index case matter less. This is supported by the proportion of contacts testing PCR-positive not tending to zero at very low viral loads.
85.4% PCR-positive contacts had an index case with an estimated viral load of ≥10,000 RNA copies/ml (Ct≤24.4). Hence, 85.4% of infections in contacts are potentially attributable to the 75.2% of cases overall with a viral load of ≥10,000 RNA copies/ml. While differential interventions to prevent onward transmission could be focused on those with high viral loads, our findings suggest that most infected individuals still have some risk of transmitting onwards based on Ct values.
However, we show that several LFDs could have detected most cases that led to onward transmission. These tests offer potential advantages, returning results in 15-30 minutes, not requiring laboratory infrastructure and costing significantly less than PCR. However post-analytic infrastructure is still needed to collect results. Using the estimated sensitivity by viral load of four LFDs, we estimate they would detect 83.0%-89.5% of cases leading to onward transmission. While such performance is not sufficient to replace PCR for testing of all symptomatic individuals, use of LFDs in addition to existing testing, particularly of those who otherwise would not be tested at all (including those without symptoms), would allow many of the most infectious individuals to be identified earlier, potentially preventing onward transmissions and helping to drive reproduction numbers below 1, despite imperfect performance against PCR. The specificity of each LFD is another important consideration, particularly as incidence falls; the false positive rate for the Innova LFD has been previously reported as 0.32% (95%CI 0.20-0.48%), 18 and large-scale evaluations of the other LFDs are on-going. In settings where the positive predictive value of an LFD is insufficiently high, confirmatory PCR testing may be required.
Our study has important limitations. Firstly, ascertaining infection in contacts depends on the contact being reported by the case and the contact being tested. In the UK, PCR testing was only A c c e p t e d M a n u s c r i p t 8 recommended if symptomatic and therefore we do not ascertain most asymptomatic infections. Whilst Ct values are generally slightly lower in those without symptoms, 22 they may nevertheless contribute substantially to transmission. 23 Additionally, access to testing depends on social and demographic factors, e.g. the relationships between PCR-positive results in contacts and ethnicity varied if we conditioned on contact attendance for testing (Table 2 vs. Table S2). Furthermore, contacts who tested positive were tested a median (IQR) 3 (2-5) days after their index case, compared to 2 (1-4) days in contacts who tested negative. It is therefore possible that some negative tests in contacts tested earlier may have been positive if subsequently repeated.
Secondly, our classification of contact events is relatively simple, e.g., we do not have any direct measures of human behaviour, such as proximity or duration of contact. We do not account for the dynamic nature of viral loads over time, 24 relying on a single measurement at varying times post infection. Despite this, the time from symptom onset to testing in cases was relatively consistent, median (IQR) 2 (1-3) days, such that measured Ct values plausibly represent similar stages of infection. We use only a single assay to determine Ct values and it should also be noted that Ct values vary by extraction and PCR platforms and so our findings may equate with different Ct values in other diagnostic workflows. However, to facilitate comparisons with other platforms we use an external calibrant to estimate viral load from Ct values, although digital PCR may allow this to be done with more precision.
It was not possible to account for unobserved third-party transmission, although we designed our study population to minimise this risk. This likely means that some contact events identified as possible transmission events may actually not be the source of the infection in the contact. It is likely that proportionally this effect is greatest at lower viral loads (higher Ct values), as the likelihood of transmission rises with viral load.
In summary, we provide strong evidence that SGTF increases SARS-CoV-2 transmission and that SARS-CoV-2 infectivity increases with increasing viral load. We show that the relative strength of the effect of viral load is consistent across ages, ethnicities, and different types of contact events. Despite this association, most individuals have Ct values compatible with onward transmission. 25 Nevertheless, LFDs can detect most individuals who are potential transmission sources. This supports wider use of LFDs as rapid and regular screens to detect infectiousness in populations at high risk of acquisition, including recent contacts of cases. Further prospective studies will be required to demonstrate whether targeted isolation and/or contact tracing, together with wider use of LFDs in combination with vaccination are effective in preventing ongoing SARS-CoV-2 transmission.
A c c e p t e d M a n u s c r i p t

Role of the funding source
The funder of the study provided access to the data and facilitated data linkage. The funder had no role in study design, data analysis, data interpretation, or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication

Declaration of interests
LYWL declares speaker fees from the Merck group and Servier, outside the submitted work. DWE declares lecture fees from Gilead, outside the submitted work. M.P. and S.R. report employment from Accenture. S.R. reports that DHSC were supplied professional services by Accenture. No other author has a conflict of interest to declare.   Model predictions are plotted after adjustment for index case age (set to the median value, 35 years), case ethnicity (set to white), index of multiple deprivation score at contact's home address (set to median, 14,465), incidence at contact's home address (set to median 350 cases per 100,000 population per week) and index case sex (set to female). The shaded area indicates the 95% confidence interval.