Bad Data and Bad Conclusions Will Lead to Bad Policy – Implausible Claims that Vaping Increases COVID-19 Risk for Youth and Young Adults

In this brief peer review, we argue that the data reported by Gaiha et al (https://doi.org/10.1016/j.jadohealth.2020.07.002) regarding associations between vaping and COVID-19 testing are so suspect that any conclusions drawn from them cannot be relied upon. We discuss six main areas of concern and conclude that the paper should be retracted. The letter below, also attached as supplementary data, was e-mailed to the Editor-in-Chief of the Journal of Adolescent Health.

Qeios, CC-BY 4.0 · Article, September 8, 2020 · Qeios ID: A58MQC · https://doi.org/10.32388/A58MQC

The paper by Gaiha et al is beset with serious problems that render the analyses presented by the authors, and the conclusions reached, unreliable. The concern about COVID-19 is real. However, we worry when policymakers rely upon faulty data to justify policy initiatives.
This has already happened in the case of the Gaiha et al paper, which was cited as the basis for proposals (Krishnamoorthi, 2020; Shaheen et al, 2020) to ban the sale of nicotine vaping products until the COVID-19 pandemic is resolved.
We recognize and share the authors' concerns about youth vaping and acknowledge the scientific debate about the benefits and risks of nicotine vaping. We also recognize that the authors of the paper in question, as well as many of your readers, share a deep concern about adolescent use of tobacco products, including e-cigarettes. In science, however, that does not excuse using flawed data to arrive at predetermined and perhaps misleading conclusions. The sensational claim that vaping increases COVID-19 risk for youth and young adults rests on data that are seriously flawed. The implausible conclusions the authors draw from these flawed data have prompted us to write to you: we feel strongly that the paper should not remain in the scientific literature as it stands, and certainly should not be featured prominently on your journal's website, as it currently is.
Why are we so concerned about this paper? Already other scientists have pointed out the limitations of the paper and posted responses to the paper on Pubpeer (https://pubpeer.com/publications/CEB008BBD48F89272321EB50092793).
These scientists have pointed out an array of problems regarding the make-up of the sample and the apparent oddities reflected in the descriptive statistics. These potential problems are consistent with a flawed dataset and warrant detailed responses from the authors. We also will post our critique of the paper on the preprint server, Qeios.
However, that is not good enough. When errors in data are identified and causal conclusions are touted that are clearly not supported by the evidence, the record needs to be set straight. In this case, we think that the Journal that published this paper has an obligation to set the record straight or otherwise risk losing scientific credibility.
Our concerns focus on six main areas:

1) Implausible Testing Counts
We conducted an analysis of the testing counts implied by the results of the Gaiha et al manuscript to gauge their plausibility. Our approach was to compute an estimate of the absolute number of COVID-19 tests performed on those aged 13-24, using data from the publication (see below), population estimates of ever and current vaping, and the claimed testing rates. Gaiha et al's data imply that approximately 4.8 million tests had been performed on those aged 13-24 as of 14 May. Extrapolating the testing rates implied by Gaiha et al's tables to the whole US population implies that over 60 million persons had been tested by that date. Yet the US had conducted fewer than 10.4 million tests by that date (The COVID Tracking Project, 2020). Moreover, Gaiha et al report on a young sample, which would have been much less likely to be tested, particularly in the early days when test availability was limited (the CDC reports that, as of that week, <5% of all tests conducted were in those <18 years of age) (Centers for Disease Control and Prevention, 2020).
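This back-of-envelope check can be sketched roughly as follows. Note that this is a hedged illustration, not the authors' exact computation: the population size and ever-vaping prevalence below are our own illustrative assumptions, and only the testing rates (5.7% of never users, 17.5% of ever users) come from our reading of Table 1.

```python
# Rough sketch of the implied-testing-count check.
# ASSUMPTIONS (illustrative, not from the paper): a US population aged
# 13-24 of ~51 million and an ever-e-cigarette-use prevalence of ~30%.
pop_13_24 = 51_000_000
p_ever_vape = 0.30            # illustrative prevalence assumption

tested_never = 0.057          # testing rate among never users (Table 1)
tested_ever = 0.175           # testing rate among ever users (Table 1)

# Implied absolute number of tests among 13-24 year olds
implied_tests = pop_13_24 * (p_ever_vape * tested_ever
                             + (1 - p_ever_vape) * tested_never)
print(f"Implied tests in 13-24 year olds: {implied_tests / 1e6:.1f} million")
```

Under these assumptions the implied count lands near the ~4.8 million figure cited above, whereas the CDC's report that fewer than 5% of the roughly 10.4 million tests conducted by mid-May were in those under 18 makes any count of that magnitude implausible.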

2) No difference in COVID-19 positivity
The data in Table 1 of the Gaiha et al paper show that the positivity rate was similar for e-cigarette users and nonusers. Among never users of e-cigarettes, the positivity rate was 0.8/5.7 = 14.0%; among ever users, it was 2.3/17.5 = 13.1%, almost identical. The finding that ever e-cigarette use (but not never use) was significantly related to COVID-19 is therefore almost entirely driven by the claim that e-cigarette users were more likely to be tested. No explanation is provided for why this group of respondents was more likely to be tested.
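The positivity comparison is simple arithmetic on the Table 1 percentages quoted above:

```python
# Positivity = (% of group testing positive) / (% of group tested),
# using the Table 1 percentages as quoted in this letter.
pct_tested_never, pct_positive_never = 5.7, 0.8
pct_tested_ever, pct_positive_ever = 17.5, 2.3

positivity_never = pct_positive_never / pct_tested_never  # ~14.0%
positivity_ever = pct_positive_ever / pct_tested_ever     # ~13.1%

print(f"Never users: {positivity_never:.1%}, ever users: {positivity_ever:.1%}")
```

The near-equality of the two rates is what makes the reported association rest on differential testing rather than differential infection.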

3) Implausible statistics
The paper's data imply that underweight individuals were more likely to be tested, and more likely to test positive, for COVID-19 than overweight individuals, and at even greater relative risk compared with those with obesity. While parallel data (i.e., associations of weight status with COVID-19 testing among youth and young adults) do not appear to be available, the fact that obesity is an acknowledged risk factor for severe illness from COVID-19 at least suggests that these results are questionable.

4) Unrepresentative sample
This was a convenience sample. The precise origins of the sample are unclear. The authors note: "Participants were recruited from Qualtrics' existing online panels using a survey Web link on gaming sites, social media, customer loyalty portals, and through website intercept recruitment." This single sentence describes two completely different, nonoverlapping, and thus contradictory methods: panel members are already known to the panel company and would not need to be recruited via public websites. The recruitment and consent/assent process for participants aged 13-17 is not adequately described.
Fundamentally, this is NOT a random probability sample of the US population. The authors' weighting procedures are complex and not well described. Regardless, because this is not a random sample, statistical weighting does not magically transform a convenience sample into a representative sample of the US population, as is implied. Biases that may be inherent in the convenience sampling procedure are not eliminated. Simply put, the results of the study cannot be generalized to the US population.
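The point that demographic weighting cannot repair selection bias can be illustrated with a small simulation. This is our own illustrative sketch, not an analysis of the actual Qualtrics panel: selection into the sample is driven by an unobserved trait (here labeled "risk-taking") that also drives the outcome, so reweighting on the observed demographic would leave the estimate biased.

```python
import random

random.seed(42)

N = 200_000
population = []
for _ in range(N):
    young = random.random() < 0.5          # observed demographic
    risk_taker = random.random() < 0.3     # unobserved trait
    # the outcome depends on the unobserved trait, not on the demographic
    outcome = random.random() < (0.20 if risk_taker else 0.05)
    # self-selection into the panel depends on the same unobserved trait
    sampled = random.random() < (0.30 if risk_taker else 0.05)
    population.append((young, outcome, sampled))

true_rate = sum(o for _, o, _ in population) / N
sample = [(y, o) for y, o, s in population if s]
sample_rate = sum(o for _, o in sample) / len(sample)

# Reweighting the sample to match the population share of `young` would
# change nothing here, because `young` is independent of both selection
# and outcome; the bias lives entirely in the unobserved trait.
print(f"True rate {true_rate:.3f}, convenience-sample rate {sample_rate:.3f}")
```

In this sketch the convenience sample substantially overestimates the population rate, and no weighting on observed demographics can correct it.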

5) Lack of biological plausibility
Biological plausibility is an important criterion for evaluating data, especially when causal inferences are made or implied.
For example, the paper claims that ever-users of e-cigarettes (a category that includes people who may have tried an e-cigarette only once) were at greater risk of COVID-19 testing and diagnosis than current users. One would logically expect more recent and more frequent use of e-cigarettes to be related to a higher probability of COVID-19 outcomes; that is not what the data in this study reveal. The finding implies that risk is higher for someone who used an e-cigarette in the past but is not presently using one than for someone who is currently using e-cigarettes.
The one behaviorally feasible explanation for this pattern is a reverse-causation mechanism, whereby vapers were infected and stopped vaping as a result of a positive test or actual illness. To assess the viability of this explanation, the authors should have explored, if possible, the temporal relationship between testing and product use.
However, it would require extraordinary and unlikely circumstances for this dynamic to hold: because testing became more common over time, this account would require that many tests occurred early on, when tests were scarce.

6) Causal Inference with Cross-Sectional Data
The flaws in the Gaiha et al data are compounded by the cross-sectional study design, the reliance on self-reported data, and the suggested causal inferences about disease risk. The paper itself makes such claims, e.g., by stating that the data "show that e-cigarette use, and dual use of e-cigarettes and cigarettes are significant underlying risk factors for COVID-19." The university's press release (Stanford Medicine, 2020) and the authors' statements therein go further still in implying causality and failing to highlight limitations, e.g.: "Teens and young adults need to know that if you use e-cigarettes, you are likely at immediate risk of COVID-19 because you are damaging your lungs," said the study's senior author, Bonnie Halpern-Felsher, PhD, professor of pediatrics.
As you are no doubt aware, the Gaiha et al paper published in JAH has garnered substantial media coverage. A New York Times article published last week (https://www.nytimes.com/2020/09/04/health/covid-vaping-smoking.html) certainly leaves one with the impression that youth vaping is contributing to COVID-19. As Sumner et al (2014) make clear, such extensions of correlational data to causal claims are very common in science communications. What makes these exaggerations problematic here is that this paper is already being used as the basis for policy and public health communication. These policies are premised on the relationship being causal, and thus may be misdirected and could have unintended adverse consequences, such as promoting the transition from vaping to smoking (Kenkel et al, 2020; Pesko et al, 2016). Policymaking requires good evidence: clear, consistent, biologically plausible, and reliable data. The data presented in the Gaiha et al paper are suspect and should not be relied upon.
Let us reinforce that we do not dismiss the possibility that vaping could in fact lead to an increased risk of COVID-19, as implied by the authors. We think it is a worthy question to research, but the authors of this paper have gone well beyond the data in reporting and interpreting their findings. Two alternative explanations suggest themselves. First, associations between e-cigarette use and COVID-19 risk could exist but would be better explained by behavioral linkages related to the respondents' personal characteristics: those with a history of using e-cigarettes may be more risk-taking and could have exposed themselves to greater COVID-19 risk. This explanation could account for the apparent association if the testing data themselves were plausible; however, as noted above, the testing rates are implausible to start with, which raises concerns about the validity of the study's data from the outset. A second explanation is that the self-reports on which the paper is based are inaccurate. This is certainly possible; however, the associations observed would require current and past vapers and smokers to differentially misrepresent their COVID-19 testing behavior compared with nonvapers/nonsmokers, which we think is unlikely.
In any case, the data relied upon by the authors of this paper are, on their face, highly suspect and not of the quality required to support credible scientific conclusions, much less the causal conclusions that, as noted above, have already been drawn and have stimulated policymaking. We know this letter may be uncomfortable for you, since one of the authors of the paper (Bonnie Halpern-Felsher, PhD) is on the JAH editorial board. However, that is all the more reason for JAH to be extra cautious in ensuring that all authors who contribute papers to your journal acknowledge limitations and anomalies in their data and do not move beyond the data to issue conclusions that are not fully justified.
We do not write this letter to you lightly. In fact, some of us who have signed it do so with some trepidation, given the sometimes toxic and unprofessional discourse that has come to define the research field of alternative nicotine products. However, we have chosen to speak up because the obvious flaws in the data do not justify the authors' conclusions, which have already gained widespread acceptance in the media. We recognize that no single study is perfect, and that science depends on the accumulation of reliable evidence. We also recognize that mistakes can be made.
Making causal statements beyond what the data allow is one such mistake. However, when mistakes are made, they should be acknowledged and corrected. Unreliable papers have no place in the scientific literature. For the reasons outlined above, we are asking you to retract the paper by Gaiha and colleagues (2020) until the flaws and