Critical analysis of common methodology flaws in e-cigarette surveys

The prevalence of vaping, also known as using e-cigarettes, vapes, vape pens, or electronic nicotine delivery systems (ENDS), has prompted a demand for reliable, evidence-based research.1 However, published literature on the topic of vaping is often unreliable, characterized by serious flaws and a failure to adhere to accepted scientific methodologies. In this narrative review, we analyze 24 popular vaping studies, published in medical journals, that purport to evaluate the association of vaping and smoking initiation, smoking cessation or health outcomes. We analyzed these studies to identify the questions they claimed to address, stated methods, manner of implementation, discussions, and stated conclusions. After critical appraisal, we noted a multiplicity of flaws in these studies, and identified patterns as to the nature of such flaws. Many studies lacked a clear hypothesis statement: to the extent that a hypothesis could be inferred, the methods were not tailored to address the question of interest. Moreover, main outcome measures were poorly identified, and data analysis was further complicated by failure to control for confounding factors. The body of literature on the “gateway” theory for the initiation of smoking was particularly unreliable. Overall, the results and discussion contained numerous unreliable assertions due to poor methods, including data collection that lacked relevance, and assertions that were unfounded. Many researchers claimed to find a causal association while not supporting such findings with meaningful data: the discussions and conclusions of such studies were therefore misleading. Herein, we identify the common flaws in the study design, methodology, and implementation found in published vaping studies.
Our aim is to prompt future researchers to adhere to scientific methods to produce more reliable findings and conclusions in the field of vaping research.

Qeios, CC-BY 4.0 · Article, April 30, 2021 · Qeios ID: IZGWNJ · https://doi.org/10.32388/IZGWNJ

Carl V Phillips1, Cother Hajat2, Emma Stein3, Riccardo Polosa4,5, and the CoEHAR study group6
1. Independent Researcher
2. Public Health Institute, UAE University
3. Independent Researcher
4. Center of Excellence for the Acceleration of HArm Reduction (CoEHAR), University of Catania, Catania, Italy.
5. Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy.
6. Center of Excellence for the Acceleration of Harm Reduction, University of Catania, Italy. Full list of author information is available at the acknowledgment section.


Purpose
As many journal articles on the topic of vaping or tobacco smoking appear misleading or unreliable, we undertook a critical appraisal of such research articles. Herein, we delineate our findings, including common flaws in study design, participant recruitment, data analysis, and other methods that undermine the reliability of vaping studies. The purpose of this paper is threefold: (i.) to help guide researchers who endeavor to improve the quality of study design and methods; (ii.) to prepare readers to critically evaluate the reliability of vaping research and literature; (iii.) to address myths and misconceptions perpetuated by flawed vaping literature.

Methods
We used the Google Scholar search engine (30 November 2020) to obtain the most popular journal articles on vaping research. We used the Google algorithm definition of "popular" i.e., the articles most read and most cited in other literature and policy discussions. We searched behavioral human subjects research on causal claims related to vaping.
Specifically, we ran the search string: "e-cigarette OR 'electronic cigarette' OR vaping OR 'electronic nicotine delivery system.'" We stepped through the articles in order of search results ranking. We then identified the ten most frequently cited articles on each of the following topics: (i.) the effects of vaping on smoking cessation/reduction; (ii.) the effects of vaping on smoking initiation; and (iii.) the health outcomes associated with vaping and/or smoking.
We acknowledge that alternative methods exist to define "popular" and to identify vaping literature, such as a PubMed search. However, we used the Google Scholar algorithm for the purpose of this paper because it better reflects the search methods used by many researchers, policy makers, advocacy groups, health care providers, and patient populations. We conducted a review and critical appraisals of the 24 most popular journal articles on causal claims related to vaping and discuss our findings below.
An analysis of each paper includes a discussion of common study design and methodology flaws. In particular, we critically analyzed papers for significant limitations: improper methods; significant flaws in applying potentially useful methods; and suboptimal participant recruitment and retention. Specifically, papers meeting inclusion criteria (hereinafter, included papers) were analyzed for the following strengths and limitations:
1. Did the study clearly describe the method of investigating causal pathways? Scientific standards require researchers to specify a causal hypothesis, and describe a study design and data collection methods to investigate that hypothesis. If researchers merely discuss a causal association and present statistical data without establishing causation, we highlight such deficiencies.
2. Were the study design and research methods sufficiently robust to control for confounding factors?
3. Do the results support the stated conclusions, without overstatement?
4. Do the researchers present language or data that is misleading, or fail to acknowledge significant limitations?
While many included papers contained idiosyncratic problems, we did not address such flaws as they fell beyond the scope of our analysis. Instead, we highlight the themes of common flaws that warrant focused attention that will guide future researchers.
Our critical appraisal grouped studies according to whether they addressed the effects of vaping on smoking cessation and reduction; the effects of vaping on smoking initiation; and the epidemiology of smoking and/or vaping outcomes (Table 1). We reported on each study separately, presenting strengths and limitations. We also analyzed the studies collectively as a body of literature, highlighting common missteps in study design, methodology, and implementation (Table 2). As many included papers demonstrated common research flaws involving confounding factors, causative associations, and the counterfactual analysis, 3 these terms are set forth in Figure 1.

Results
The 24 journal articles identified by our search method are listed in Table 1. The initial search returned some articles that, upon manual review, were determined not to meet the search criteria. Such articles were replaced by continuing through the search results. Excluded papers were those that did not meet our intended search criteria, eg, those that addressed descriptive epidemiology, chemistry and toxicology, or acute responses to exposure, and analytic papers that were not empirical. Notably, many papers purporting to be scientific literature contained mere subjective impressions. Also excluded were the approximately 20% of search results that addressed lung injuries due to cannabinoid vaping (EVALI), and the approximately 33% of search results that were case studies.
The majority of papers devote their focus to either smoking cessation or initiation, not both; such articles are assigned to one category accordingly, even if the article contains secondary discussion of the other topic (Table 1). One paper included a substantial analysis of both smoking cessation and smoking initiation and is addressed in both sections (Table 1). The remaining papers address the health outcomes associated with vaping and/or smoking (Table 1).
The 24 included papers addressing the effect of vaping on smoking cessation and reduction; the effect of vaping on smoking initiation; and epidemiologic assessments of health outcomes were replete with flaws (Table 2). The effect of vaping on smoking cessation or reduction was discussed in ten of the included papers. These studies had serious limitations. Notably, the researchers failed to acknowledge that vaping as a quit strategy may increase the number of quit attempts, thereby increasing the likelihood of success. Further, many studies lacked a robust design with a multivariate analysis that controls for confounding. Moreover, the researchers often failed to articulate a hypothesis or identify a suspected causal pathway. Finally, the researchers used flawed inclusion/exclusion criteria for study participants, such that former smokers who had already quit by using vaping as a quit strategy were excluded, effectively reducing the number of people who found this method successful.
The effect of vaping on smoking initiation was addressed in 11 of the included papers. Many researchers investigate the possibility that vaping may cause an individual to later begin smoking, commonly called a "gateway effect." The included papers did not reliably establish a causal association between vaping and smoking initiation. Many papers referred to a so-called "gateway effect" as if supported by data, when it is not. Several of such papers had an alarmist tone, lacked meaningful metrics, and lacked relevant descriptions of vaping related behaviors. As such, the authors' conclusions were unreliable.

Effects of vaping on smoking cessation and reduction
An individual who smokes cigarettes may engage in vaping as a strategy to aid smoking cessation or reduction.
Several research studies purport to assess the effect of vaping on smoking cessation and reduction success. A critical appraisal of these studies revealed numerous flaws. Researchers often evaluate the probability of success for a given quit method, yet mistakenly assume that the number of quit attempts is fixed. In fact, education as to a novel quit strategy may prompt additional quit attempts. Thus, the quit method (eg, vaping) warrants credit for prompting an additional quit attempt.
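The point about quit attempts can be made concrete with simple arithmetic. The sketch below is illustrative only: the function name and every number in it are hypothetical and are not drawn from any of the reviewed studies. It holds the per-attempt success rate fixed and changes only the share of smokers who make an attempt.

```python
def expected_quitters(smokers, attempt_pct, success_pct):
    """Expected successful quitters in one period (percentages given as integers)."""
    return smokers * attempt_pct * success_pct / 10_000

# Hypothetical inputs for illustration; none come from the reviewed literature.
quitters_before = expected_quitters(1000, attempt_pct=30, success_pct=10)

# Same success rate per attempt, but a new option prompts more smokers to try.
quitters_after = expected_quitters(1000, attempt_pct=40, success_pct=10)

print(quitters_before, quitters_after)
```

Under these assumed numbers, an analysis that compares only success per attempt would report no difference between methods, even though the number of people who quit rises by a third.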
Research study designs should include a multivariate analysis, and control for confounding, assessing factors such as vaping status, smoking status, cessation and reduction goals, number and method of quit attempts.
Several researchers failed to clearly state the causal pathway they were investigating. For example, if an individual has a successful smoking quit attempt, and would not have successfully quit in the absence of vaping, then vaping caused that cessation. However, other potential causal pathways exist that the researchers did not explore. Consider the individual who would not otherwise have made a quit attempt, yet does so (and succeeds) because the option of vaping motivates the quit attempt. This second pathway includes both intentional quit attempts, and unintentional quitting (dubbed, the "accidental quitter" phenomenon), whereby someone who smokes tries vaping without the intention of switching, but finds it so appealing that they switch.
The researchers also selected flawed inclusion/exclusion criteria. For example, a given population may include many former smokers who have made successful quit attempts by switching to vaping. Researchers who conduct a study in such a population, but include only active smokers, will be evaluating only those already less likely to quit by switching (as suggested by the fact that others did, and they did not). Moreover, excluding former smokers who successfully quit by vaping creates a biased participant population. In our literature review of vaping effects on smoking cessation, many researchers did not account for these trends. The numerous anecdotal reports of those who found vaping to be an effective aid to smoking cessation may inspire future researchers to formulate robust study designs and participant recruitment methodology.
Epidemiological studies assessing population trends may note the incidence and prevalence of vaping and smoking. However, such studies generally lack the specificity needed to establish a causal association between vaping and smoking cessation. Moreover, incidence and prevalence of vaping-related behaviors tend to fluctuate due to a confluence of variables, such as changing technology, marketing, and media coverage. Research on general population trends should include relevant data points in their analysis and not overstate their conclusions.
The specific strengths and limitations of the studies examining the effect of vaping on smoking cessation and reduction are the following:

Gomajee et al. (2019) 15
In this study, researchers examined a large, nationally representative cohort in France, comprising more than 5000 current smokers and 2000 former smokers. Recruitment began in 2012 and participants were followed for an average of two years. Data were collected on recent e-cigarette use, and the time point that a participant began vaping regularly: the latter data point is meaningful and often omitted from research. The multivariate models included a good collection of data on socioeconomic and demographic factors, as well as psychological, behavioral, and health traits. This comprehensive data collection was important to control for confounding. The results suggest that vaping substantially reduces the quantity of cigarettes consumed and increases the likelihood of cessation. The implicit counterfactual question is whether the act of vaping causes someone to smoke less. The study design was sound and while there were some limitations, the conclusions seem reliable.

Hitchman et al. (2015) 17
This was a two-period cohort study in Great Britain, starting in 2012 (a time when vaping was fairly well established there), that recruited people who smoked at baseline, asked whether they vaped, and measured vaping and smoking status a year later. The study design embodies a stock-flow problem that undermines the credibility of the results. Moreover, a cohort study is a weak study design for answering the main question of interest, unless it starts while vaping is still rare in a population or captures retrospective data and includes people who already quit smoking (effectively incorporating good case-control methods). Without that, the study population selects for smokers who differ from the average smoker in important ways.
As a result of this design flaw, this study found that subjects who both smoked and vaped at baseline were less likely to be smoking abstinent at follow-up, compared to those who did not vape at baseline. The same association can also be observed for every other smoking cessation method: anyone who already tried it and is still smoking is less likely to quit over a given period than people who still smoke but have never tried it. However, it is impossible to discern that with the stock-flow bias and the "more dedicated smoker" confounding problem present.
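A small deterministic model may help show how the stock-flow problem alone can produce this association. The sketch below is not the study's data or method: the function, its parameters, and all values are hypothetical, and it makes the simplifying assumption that some fixed share of smokers ("switchers") quit as soon as they try vaping while everyone else quits only at a background rate.

```python
def cohort_quit_rates(n=1000, switcher_share=0.4, tried_before=0.5,
                      try_during=0.25, background_quit=0.05):
    """Quit rates seen by a cohort that recruits only current smokers."""
    tried = n * tried_before            # tried vaping before baseline
    never_tried = n - tried             # had not tried vaping at baseline

    # Switchers who tried vaping before baseline have already quit smoking,
    # so a smokers-only cohort never recruits them.
    already_quit = tried * switcher_share
    vapers_at_baseline = tried - already_quit   # tried vaping, still smoking

    # Follow-up: baseline vapers quit only at the background rate.
    quits_vapers = vapers_at_baseline * background_quit

    # Some never-triers try vaping during follow-up; the switchers among them
    # quit, and the rest quit at the background rate.
    new_switch_quits = never_tried * try_during * switcher_share
    quits_nonvapers = (new_switch_quits
                       + (never_tried - new_switch_quits) * background_quit)

    return {
        "excluded_prior_quitters": already_quit,
        "rate_baseline_vapers": quits_vapers / vapers_at_baseline,
        "rate_baseline_nonvapers": quits_nonvapers / never_tried,
    }

stock_flow_result = cohort_quit_rates()
print(stock_flow_result)
```

In this assumed scenario, vaping is the cause of most quitting, yet the cohort records a lower quit rate among baseline vapers than among baseline non-vapers, because the people for whom vaping worked left the smoking population before recruitment.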
The authors focus on the details of vaping behavior (measured at follow-up) and overlook the role of the behavior as an effect of preferences and choices rather than a cause. People who used closed-system e-cigarettes were less likely to be smoking abstinent than non-vapers, and those who vaped closed systems less-than-daily were particularly unlikely to have become abstinent. Those who vaped open systems daily were considerably more likely to be smoking abstinent than average, while those who used open systems less-than-daily were a bit less likely. When discussing the implications of these associations, the authors suggest that the only possible causal story is that the vaping behaviors caused the different smoking outcomes.
The authors had substantial data on points relevant to causation and causal pathways, yet data analyses were not presented on many interesting questions worthy of exploration. This failure was probably a result of the authors not thinking through the counterfactual claims and causal pathways. These authors include in their model a collection of covariates related to the propensity to quit smoking. While it seems reasonable to include all these variables, it is a mistake to include them in the analysis without giving serious consideration to their potential causal role.

Biener and Hargraves (2014) 8
This article discussed a 2014 follow-up of participants in two U.S. cities, who smoked in 2012. The authors emphasized the result that those who reported intensive (daily) vaping at follow-up were much more likely to have quit smoking than those who did not vape at all, while those who vaped intermittently were much less likely to have quit. The survey was characterized by a notably granular assessment of vaping frequency, which aided the reliability and relevance of the data analysis. The authors conclude that the vaping behavior caused the smoking cessation outcome. They did not acknowledge that quitting smoking may cause someone to be more likely to vape every day rather than less often (if they quit smoking by switching).
Further, these results seem largely driven by the fact that quitting smoking causes some people to not vape at all (if they chose to become completely abstinent), whereas they might have vaped occasionally as part of their smoking routine had they not quit. This is another issue with interpreting the association of intermittent vaping and continued smoking as the former causing the latter: alternative causal pathways were not considered. There is some useful information in this research. For example, the study shows that for this population, even with this relatively early baseline, most people who smoked were aware of e-cigarettes and many had tried them, and thus we can infer that many of those who were inclined to switch to vaping would have already done so, creating the stock-flow bias.

Grana et al. (2014) 16
This brief, one-page article describes a one-year follow-up of a U.S. cohort study, which began in 2011. The data and analysis appear flawed, and the study design did not account for the stock-flow problem, thereby introducing serious limitations. Further, participant traits in the sample population suggest that the researchers did not appropriately consider inclusion/exclusion criteria, and instead introduced bias into the study. These limitations were not accounted for in the data interpretation or discussion, and the conclusions appeared to be unreliable and misleading.

Martínez et al. (2020) 20
The researchers followed a population of current smokers who had volunteered for a smoking cessation trial (2016-2017, across the USA, online recruitment) and who also vaped. The study design failed to account for stock-flow bias. The authors assess data collected retrospectively, and report a substantial reduction in the quantity of cigarettes consumed after someone started vaping. However, the authors also claim to find that adopting vaping caused smokers to have a substantial increase in nicotine use and nicotine dependence. This is, of course, a function of how they chose to define and quantify nicotine use and dependence, since there is no established definition of these, let alone standard metrics. It appears that these comparisons were driven mainly by the number of "vaping sessions" per day compared to previous smoking sessions.
Notably, metrics that work well for smoking may not be applicable to vaping. Smoking sessions almost always consist of one whole cigarette. By contrast, a vaping "session" is characterized by different patterns. In addition, nicotine delivery intensity (already highly variable for smoking, though almost never measured) varies even more for vaping.
Thus, attempts to measure "increases" are unreliable. Actual increases might have occurred, but the data collection in this study seems inadequate for this metric. These limitations undermined the reliability of the conclusions.

Gmel et al. (2016) 13
This study followed about 5000 Swiss men aged 20 years, with a baseline survey (sometime between 2010 and 2012, at their induction into mandatory military service) and a follow-up a year or two later. At that time vaping was relatively rare in this population, and sales of nicotine-containing e-liquid were banned in Switzerland, though it was available in neighboring countries. The vaping exposure measure was any consumption in the last 12 months, assessed at follow-up only. Smoking exposure was also measured over the last 12 months, but was assessed at baseline as well and included a frequency measure.
This paper recognizes that vaping may prompt additional attempts to quit smoking, as vapers had more self-identified lifetime smoking quit attempts. The study design does not account for the stock-flow problem, because only the people who smoked were asked about vaping, so anyone who had already quit smoking with vaping was not included among the vapers. There is also the obvious causation in the other direction because smokers who make more unsuccessful quit attempts are more likely to try vaping during at least one of them. It might also be that the majority of vapers were only vaping non-nicotine e-liquid.
The successful smoking cessation (between baseline and follow-up) results are difficult to interpret. People who vaped at follow-up were less likely to have quit smoking than those who did not, based on a small effective sample size, but this association is a combination of the actual effects of vaping (perhaps without nicotine), the stock-flow problem, the fact that those making unsuccessful quit attempts are more likely to have at least tried vaping during the follow-up period (or before), and the fact that occasional vaping appeals to many dedicated smokers. While the authors claim that vaping does not aid smoking cessation, the research has substantial limitations.

Etter and Bullen (2014) 11
In this study, researchers used a rolling worldwide convenience sample of people who vaped (volunteers recruited via internet e-cigarette retailers and vaper social media), collecting baseline information from 2010 through 2013, with follow-up surveys one month and one year after that. This sampling method seemed likely to select for more dedicated vapers. Researchers found that a large portion of subjects were still vaping at follow-up, and a large portion of those who still smoked at baseline had quit. However, since the sampling properties are unknown, the results may not be generalizable beyond the survey population.
The authors claim that vaping prevents "relapse" to smoking. However, it is difficult to estimate what the baseline rate of resuming smoking would have been, absent vaping. Moreover, the researchers did not collect data on participant characteristics to propose causal pathways and control for confounding. As such, it is difficult to determine whether a particular variation on vaping behavior was associated with an outcome. Unfortunately, the authors did not focus on these answerable questions.

Warner (2016) 26
This research examined representative U.S. surveys of 12th graders in 2014. The reported results focus on how smoking status was associated with vaping. The paper characterizes most vaping as being concentrated among smokers and rare among never-smokers. The author notes that vaping may aid teenagers in smoking cessation or reduction.
The analysis addresses the issues of different measures of usage, and the impact such measures may have on results. In particular, the author examines how classifications such as "any use ever" or "any use in the last 30 days" will tend to include many dabblers who do not vape regularly. Classifications that include dabblers with those who vape regularly lack granularity and undermine the reliability of causal claims. There is the interesting result that intensity of current smoking does not seem to be associated with vaping, but the existence of current smoking is when measured dichotomously (yes/no). Figuring out if this is a robust and generalizable relationship and exploring why it happens could be useful.

Giovenco and Delnevo (2018) 12
This cross-sectional study assessed people who currently smoked in representative samples of the US population, surveyed in 2014 and 2015, plus those who had quit since 2010. The authors report that vaping daily was strongly associated with smoking cessation, while vaping less than daily was strongly associated with ongoing smoking. The authors do not assert whether the association is causative.
The cross-sectional study design was sound, as were the inclusion/exclusion criteria; for example, researchers did not exclude almost everyone in the population who quit via vaping, as many cohort studies did. A further strength is that it does not merely pick one measure of the vaping exposure, but contrasts the results for different measures. As the authors note, there may be a lot of occasional vapers who are just doing it to deal with smoke-free situations. In addition, many people who still smoke and occasionally vape may have tried to switch but decided they did not like it enough to do so, yet still vape some because they became accustomed to it. These differences relate to the stock-flow problem. They further demonstrate the need to disaggregate a given population of vapers by specific traits, recognizing that they are a heterogeneous population. The authors note that switching is causing a lot of smoking cessation (someone who is vaping because that is what is keeping them from smoking will probably vape daily). One implication of this is that intensive vaping, often denigrated as bad, may actually be a marker for vaping "doing its job" of causing smoking cessation.

Brown et al. (2014) 10
In this research article, primarily descriptive epidemiology (a representative sample from Great Britain, 2012), the authors refrain from drawing causal conclusions, but present statistics relevant to assessing the causal connection between vaping and smoking behaviors.
The study examined people who currently or recently smoked (ie, quitting within the previous year). Several observations are informative about causative or contributing factors. For example, among people who never vaped, those who had already quit smoking were far less likely to be interested in trying vaping. This is evidence of a potential causal pathway that is ignored in other studies: people who have already quit smoking are not particularly interested in a substitute, so quitting smoking causes never-vaping. In 2012, there was extensive awareness of vaping and a commonly held belief that its health risks were lower than those of smoking cigarettes. The stock-flow problem was a limitation of this study, and retrospective questions may have been useful. A strength of the article was that the publication included a display of the survey instrument.

Effect of vaping on smoking initiation
Of the 24 included studies, ten addressed the potential effect of vaping on smoking initiation. One health risk from vaping investigated by researchers is the possibility that those who initiate vaping are more likely to subsequently initiate smoking. Often dubbed a "gateway effect," this potential causal association is often asserted as if proven by data, when it is not.
The flaws in studies addressing the "gateway effect" have been discussed at length. 28 The studies we analyzed lacked sound research methods, and as such, could not reliably establish causation or identify a gateway effect. Moreover, health behaviors related to vaping and smoking were described with insufficient detail as to the duration, amount, and frequency of the vaping/smoking. This renders participant classification uninformative, and the resulting data unreliable.
For example, the phrases "tried vaping" and "was a vaper" may describe two very different levels of vaping exposure, yet these participants may be classified together in a research study. Moreover, researchers should be sufficiently culturally competent to explore causal pathways. For example, "tried vaping, discovered an appreciation for nicotine, and as a result took the opportunity to start smoking when it was presented" is a plausible causal pathway, whereas "became a dedicated vaper and then switched to smoking" might represent someone who would have become a smoker anyway.
Further, the propensity to initiate tobacco use in the absence of vaping is also poorly established, and as such, constitutes a questionable metric. This is particularly troublesome when the claims of a "gateway effect" may be exploited, without support, to create concern regarding other risk taking behaviors, such as illicit drug use. Discussions in health literature should be grounded in data and not be unduly alarmist.
Further, it is important to categorize participants according to nicotine preference: about half the population likes being under the influence of nicotine and half does not. This variation alone guarantees a substantially higher smoking uptake among vapers (and vice versa). This is partially a result of physiology and psychological characteristics, and partially a matter of attitude. Most people who do not use any nicotine product are actively averse to doing so. Thus people who never vape, smoke, or use any other tobacco product will inevitably initiate one such product less often than users of tobacco products.
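The confounding role of nicotine preference can be illustrated with a two-stratum calculation. The sketch below is a hedged illustration only: the function name and all probabilities are hypothetical, chosen so that within each preference group vaping has no effect whatsoever on smoking uptake.

```python
def smoking_uptake_by_vaping(share_likers=0.5,
                             p_vape_liker=0.30, p_vape_nonliker=0.02,
                             p_smoke_liker=0.20, p_smoke_nonliker=0.01):
    """Smoking uptake among vapers and non-vapers when vaping has no causal effect.

    Within each stratum, smoking uptake is independent of vaping by construction.
    """
    strata = [
        (share_likers, p_vape_liker, p_smoke_liker),            # like nicotine
        (1 - share_likers, p_vape_nonliker, p_smoke_nonliker),  # do not
    ]
    vapers = sum(w * pv for w, pv, _ in strata)
    vapers_who_smoke = sum(w * pv * ps for w, pv, ps in strata)
    nonvapers = sum(w * (1 - pv) for w, pv, _ in strata)
    nonvapers_who_smoke = sum(w * (1 - pv) * ps for w, pv, ps in strata)
    return vapers_who_smoke / vapers, nonvapers_who_smoke / nonvapers

uptake_vapers, uptake_nonvapers = smoking_uptake_by_vaping()
crude_risk_ratio = uptake_vapers / uptake_nonvapers
print(uptake_vapers, uptake_nonvapers, crude_risk_ratio)
```

With these assumed inputs, the crude comparison shows vapers taking up smoking at roughly twice the rate of non-vapers, although the stratum-specific risk ratio is exactly 1. An analysis that does not measure or adjust for nicotine preference cannot distinguish this pattern from a causal gateway.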
The studies we analyzed did not control for confounding or use methods designed to account for heterogeneity among participants. As such, the limitations were significant, and the researchers could not credibly make causal claims.
Finally, many studies contained data findings suggesting that vaping behavior may replace would-be smoking.
Population studies reveal trends of increased rates of vaping associated with decreased rates of smoking. However, the role of vaping in preventing smoking initiation has not been fully investigated, and is a meaningful topic for future research.

Barrington-Trimis et al. (2018) 4
The authors used cohort data for older American teenagers, discussing vaping and smoking without identifying a clear hypothesis, or clearly analyzing exposures, outcomes, or causal pathways. They nevertheless allude to causal associations suggesting that vaping is a gateway to smoking. The authors do not adequately control for the confounding factor of smoking status, but include as covariates sex, race, and academic grade level. Moreover, they appear to aggregate datasets from diverse cohorts without accounting for differences in the study settings (eg, time, location). Finally, the reporting of statistics and classifications lacks detail, undermining the reliability of the stated findings and conclusions.

Leventhal et al. (2016) 18
This paper is derived from one of the datasets used in the previous entry, consisting of a 6-month follow-up of 10th graders in Los Angeles, USA. While it does not embody all deficiencies noted in the above studies, it contains numerous flaws and seems unreliable. The paper was critically examined here. The variables are weakly defined. The authors claim to find an association with a higher level of vaping intensity ("frequent" vaping is defined as 3 times per month and "heavy" smoking is defined as 2 cigarettes per day on smoking days). The authors do not adequately control for confounding variables. There is a common misconception that a dose-response relationship is suggestive of causation rather than confounding; however, confounding often has a dose-response relationship. The data findings seem to support the conclusion that having a greater taste for nicotine is associated with the propensity to vape more and to smoke more.
However, the causal association is not clearly established.

Bold et al. (2018) 9
The researchers examined patterns of consumption of cigarettes and e-cigarettes; the use of other tobacco products; and a socio-economic status (SES) index. This choice of exposure and outcome measures is flawed, and the collection of covariates was not adequately controlled for. All measures are dichotomous measures of having used the product, even just once, within the past month at the time of each of the three survey waves (they had other measures, but chose this one). Using such a measure means that many of the "gateway" events might be occurring among subjects who already smoked. Someone who vapes but does not smoke during a particular month might already be an occasional smoker who just did not happen to smoke that month (perhaps because they had a supply of vapes and no supply of cigarettes at the time). Worse, those who also smoked in the previous period are apparently included in the "vaped and then later smoked" outcome. The authors report associations with no analysis addressing whether associations are causal. They report an upward trend across survey waves in e-cigarette consumption prevalence and quantity, and attribute this to an alarming social secular time trend, while simultaneously acknowledging that their participants are aging, a confounding variable. Thus, causation could not be established, and causal claims seem unreliable.

Goldenson (2017) 14
In this paper, researchers assess whether a small group of California teenagers' choice of nicotine density in their vapes is associated with subsequent smoking. The study design involves a flawed model (it assumes that each step from one of their arbitrary nicotine density categories to the next will always have the same effect on the outcome). There are barely-noticeable changes in the univariate statistics and crosstabs, from baseline to follow-up. Yet these became dramatic ratios in their multivariate model, and this disconnect is not explained. Confounding factors are not adequately addressed.
Further, the authors reference associations as if they are causal, without support.
The main drawback of this paper is that most of the subjects who reported vaping higher nicotine concentrations were already smokers at baseline. It is their greater prevalence and intensity of smoking at follow-up that drives all the main results. The authors do include a useful observation regarding nicotine preference: some people like to consume a lot of nicotine, some do not like it, and others are in between. The same group who vaped high-nicotine products also smoked, and smoked more. The authors imply that vaping higher nicotine concentrations causes smoking, without acknowledging the alternative explanation: that people who like a lot of nicotine prefer products that deliver a solid dose of nicotine more than do people who do not like nicotine. Such apparent flaws detract from the reliability of the literature.
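The alternative explanation described above can be illustrated with a minimal simulation sketch. It is not drawn from any of the reviewed papers: the function names, the share of subjects who like nicotine, and every probability below are hypothetical assumptions chosen only to show the mechanism. Smoking is made to depend solely on a latent taste for nicotine and never on vaping, so any crude association between high-nicotine vaping and later smoking is produced by confounding alone.

```python
import random

def simulate_preference_confounding(n=20000, seed=0):
    """Count (vaped_high, smoked_later) pairs within each nicotine-preference stratum.

    All probabilities are hypothetical. Smoking depends only on the latent
    preference, never on vaping, so the true causal effect of vaping is zero.
    """
    rng = random.Random(seed)
    # tables[likes_nicotine][vaped_high][smoked_later] -> count
    tables = {True: [[0, 0], [0, 0]], False: [[0, 0], [0, 0]]}
    for _ in range(n):
        likes_nicotine = rng.random() < 0.3            # assumed share
        p_vape_high = 0.6 if likes_nicotine else 0.1   # assumed
        p_smoke = 0.5 if likes_nicotine else 0.05      # assumed, independent of vaping
        vaped_high = int(rng.random() < p_vape_high)
        smoked_later = int(rng.random() < p_smoke)
        tables[likes_nicotine][vaped_high][smoked_later] += 1
    return tables

def pooled(tables):
    """Collapse the two strata into one crude 2x2 table."""
    return [[tables[True][v][s] + tables[False][v][s] for s in (0, 1)]
            for v in (0, 1)]

def odds_ratio(table):
    """Odds ratio of later smoking, high-nicotine vapers vs. everyone else."""
    (d, c), (b, a) = table  # table[vaped_high][smoked_later]
    return (a * d) / (b * c)

if __name__ == "__main__":
    tables = simulate_preference_confounding()
    print("crude OR:", round(odds_ratio(pooled(tables)), 2))
    print("OR among those who like nicotine:", round(odds_ratio(tables[True]), 2))
    print("OR among those who do not:", round(odds_ratio(tables[False]), 2))
```

Under these assumed parameters the crude odds ratio should come out well above 1, while the odds ratios within each preference stratum should stay close to 1. In real data the preference is unmeasured, so the stratified comparison is not available to the analyst.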

Unger et al. (2016) 25
This is a cohort study of Hispanic teenagers, followed into young adulthood, in Los Angeles, CA, USA, examining smoking, vaping, and cannabis use. The results show an apparent association, yet the authors failed to control for confounding (the only covariates were demographic variables, use of alcohol, and use of other tobacco products). Despite this, the authors present apparent associations as causal claims, with inadequate discussion of alternative causal pathways. Definitions for usage data were also questionable, eg, the dichotomous "any use in the last month", which causes the problem previously noted: someone who dabbles in tobacco/nicotine product use (25-year-olds dabble), vaping sometimes and smoking sometimes, would be a "gateway" case if they happened to have vaped but not smoked for one month in 2014 and happened to have smoked during one month in 2015. As such, the discussions and conclusions were unreliable, particularly as to causal claims.

Gmel et al. (2016) 13
In this study, the measure of smoking initiation was "had not smoked (at all) in the year before age-20 baseline, but had in the year before age-21 or -22 follow-up" with the exposure of interest being "vaped (at all) in the last year before follow-up." These definitions for smoking initiation and vaping seem to create skewed classifications, rendering apparent associations questionable. Moreover, the researchers did not seem to adequately control for confounding. Measuring vaping exposure only at follow-up reduced the stock-flow problem for their smoking cessation results, but it exacerbated the difficulty in interpreting the smoking initiation results. It is unclear whether someone who smoked for the first time during follow-up tried both smoking and vaping for the first time, or whether they were already vaping and were caused to smoke as a result.
Such lack of clarity undermines the reliability of the research.
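The misclassification created by a dichotomous "any use in the past month" measure, noted for several of the studies above, can be sketched as follows. This is an illustrative simulation only: the function name, the share of "dabblers", and the monthly use probability are hypothetical assumptions. Each dabbler's behavior is stable across both waves and vaping has no effect on smoking, yet the two-wave design still records "gateway" cases.

```python
import random

def simulate_past_month_measure(n=50000, seed=1):
    """Two survey waves using a dichotomous 'any use in the past month' measure.

    Hypothetical population: 20% are stable 'dabblers' who, in any given
    month, vape with probability 0.3 and smoke with probability 0.3,
    independently. Everyone else uses neither product. Nobody changes
    behavior between waves.
    """
    rng = random.Random(seed)
    exposed = exposed_cases = unexposed = unexposed_cases = 0
    cases_already_occasional_smokers = 0
    for _ in range(n):
        dabbler = rng.random() < 0.2
        p_use = 0.3 if dabbler else 0.0
        vaped_w1 = rng.random() < p_use
        smoked_w1 = rng.random() < p_use
        smoked_w2 = rng.random() < p_use
        if smoked_w1:
            continue  # classified as a baseline smoker and excluded
        if vaped_w1:
            exposed += 1
            if smoked_w2:
                exposed_cases += 1  # recorded as a "gateway" case
                cases_already_occasional_smokers += int(dabbler)
        else:
            unexposed += 1
            unexposed_cases += int(smoked_w2)
    return {
        "risk_exposed": exposed_cases / exposed,
        "risk_unexposed": unexposed_cases / unexposed,
        "gateway_cases": exposed_cases,
        "cases_already_occasional_smokers": cases_already_occasional_smokers,
    }

if __name__ == "__main__":
    print(simulate_past_month_measure())
```

In this sketch every recorded "gateway" case is someone who was already an occasional smoker at baseline but happened not to smoke in the measured month, which is the scenario described in the text.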

Spindle et al. (2017) 24
This two-period cohort study included students from a mid-tier college in Virginia, USA, during 2014-15. A strength of the study was its survey of ongoing health status and behaviors. As a result it has better deconfounder variables for "risk taking" inclinations, though still nothing to control for having a taste for nicotine or an aversion to nicotine products.
However, the numerical data had significant flaws. For example, 30% of participants recorded as "ever-vapers" at baseline were recorded as "never having vaped in their lives" at follow-up. The authors do not adequately investigate a causal pathway. Moreover, the study design does not adequately control for confounding, rendering causal claims unreliable.
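An inconsistency of the kind described above (baseline "ever-vapers" who report never having vaped at follow-up) is straightforward to check before any modeling is done. The sketch below is a hypothetical illustration: the function name and the record layout are assumptions, not the structure of the study's dataset.

```python
def recanting_rate(records):
    """Share of baseline ever-vapers who report 'never vaped' at follow-up.

    `records` is an iterable of (ever_vaped_baseline, ever_vaped_followup)
    boolean pairs; this layout is an assumption made for illustration.
    """
    baseline_ever = [r for r in records if r[0]]
    if not baseline_ever:
        return 0.0
    recanted = sum(1 for r in baseline_ever if not r[1])
    return recanted / len(baseline_ever)

if __name__ == "__main__":
    # Ten baseline ever-vapers, three of whom recant, plus two never-vapers.
    sample = [(True, True)] * 7 + [(True, False)] * 3 + [(False, False)] * 2
    print(recanting_rate(sample))
```

A high recanting rate suggests that the exposure variable is measured with substantial error, which should be reported and considered before any association is interpreted.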

Miech et al. (2017) 22
This article addresses a one-year follow-up (in 2015) of a sample from a U.S. national survey of 12th graders from the previous year. The article seems misleading in assuming a causal association between vaping and smoking, without support and without adequately identifying causal pathways. The authors looked only at never smokers at baseline, eliminating some of the fatal flaws of other studies in this category. They also made an attempt at reducing the proclivity variations in the population by restricting one of their analyses to subjects who reported a belief that smoking poses "great risk" (though since that is almost everyone, that restriction did not matter much). Still, they had obvious uncontrolled confounding (they had only a handful of demographic covariates), and thus the association remained inevitable. The same confounding problem exists within a subpopulation who have not yet smoked and who affirm the risks of smoking. As such this study embodied serious limitations.

Primack et al. (2015) 23
This two-period follow-up study included a sample of American older teens and young adults, with baseline during 2012-13 and follow-up a year later. Included subjects were "non-susceptible" never smokers, a weakly defined classification that introduced limitations to the study. This flawed classification may support the suggestion that vaping corrupts people who were unlikely to ever smoke, and may promote the flawed gateway theory. This paper also suffers from the problem of a small effective sample size for its main outcome. There are only 16 positive exposures (vapers at baseline). Moreover, while the control covariates were somewhat meaningful, they could not adequately control for confounding. The article implies that drug consumers must be chasing harms and drawn to risk taking behaviors. As such, this article has a tone of fearmongering, a serious limitation.

Chatterjee et al. (2018) 5
In this paper, the authors reviewed longitudinal studies that assessed a gateway effect between vaping and smoking. However, the authors did not critically analyze the underlying studies, but merely reiterated the proffered conclusions, many of which had serious limitations impacting reliability. As such, this paper is likely to mislead readers.

Levy et al. (2018) 19
This article is the one popular paper that addresses the fact, unacknowledged in the "gateway" literature, that smoking uptake among teenagers drops dramatically when vaping uptake increases. A serious limitation in this article is that the authors present data for consumption prevalence while making claims regarding uptake incidence. These correspond to different hypotheses, which call for different analyses. For the populations and exposure in question (where incidence is presumed to be recent), prevalence may be a reasonable proxy for incidence; however, the analysis remains unclear. Moreover, the authors do not identify important time data points, eg, whether smoking predated vaping. This data point is essential to the reliability of any causal claim. Finally, the authors allude to flaws in gateway theory studies, without applying that knowledge to the context of the research article.

Beard et al. (2019) 5
This ecological study of vaping and smoking examined population prevalence and incidence statistics in England from 2006 to 2017. One of the major confounding problems for individual associations, namely that most people who add vaping to smoking may be particularly dedicated smokers, is transformed into a comparatively minor source of bias. For smoking cessation, the ecological analysis avoids the stock-flow problem by not restricting analysis to those who are vaping at a particular time, and the long time series available avoids the problem of missing those who were most interested in switching and so switched before the first data were collected. However, this still brings with it the complication of properly modeling the diminishing marginal "effectiveness" of vaping, as those who are the most promising candidates for switching are depleted from the at-risk population.

Epidemiology of smoking, vaping, and health outcomes
Of the 24 included papers, four addressed the epidemiology of vaping, smoking and related health outcomes.
These four articles contained many common flaws. For example, researchers attempted to assess the non-acute effects of vaping in a population of former smokers without acknowledging an inherent limitation: the characteristic clinical traits of this population include the consequences of prior smoking, which mask the non-acute effects of vaping. In such a population, it is difficult to determine whether morbidity and mortality outcomes are attributable to vaping or prior smoking. It is also difficult to design a study and identify potential participants to control for likely confounding factors.
There is ample evidence to suggest that the individual chemical exposures from vaping cause either a fraction of the risk posed by smoking, or even a slight health benefit. The plausible range here is an order of magnitude smaller than the variation in the residual health effects from former smoking, which vary based not merely on the existence of former smoking (typically the sole metric), but other factors, eg, the duration and quantity of former smoking (occasionally measured), time since quitting (occasionally measured), intensity of use and puffing behavior (rarely measured).
Classifications of smoking status also lacked granularity. For example, many studies merely classify smoking status generally (eg, current, former, or never) without accounting for duration of smoking, time since quitting, or frequency and amount of tobacco use.
These flaws were found consistently in each of the four articles analyzed, rendering their conclusions misleading.
Long-term prospective studies of appropriately categorized participants would be useful to compare the health outcomes associated with vaping to those of smoking. It is also essential to control for confounding. The specific strengths and limitations of these articles are as follows:

Bhatta and Glantz (2019) 6
This article was retracted, apparently due to issues surrounding the authors' legal access to the data, not due to fatal flaws in methods and analysis. 7 Despite retraction, this paper continues to be rated as among the most popular (as measured by the Google Scholar algorithm); further, citations in both academic articles and political documents continue to occur. This paper examines the association between vaping and myocardial infarction, using a large population-representative longitudinal dataset. However, the researchers fail to report that most myocardial infarctions occurred before the subjects started vaping. Researchers Rodu and Plurphanswat publicized the problem; successfully campaigned for the retraction; and conducted a new analysis categorizing myocardial infarctions that occurred before vaping initiation as having occurred in non-vapers. Rodu and Plurphanswat found a strong protective association with vaping, contrary to the prior researchers' misleading claim.

Alzahrani, Pena, Temesgen, and Glantz (2018) 3
While not retracted, this article seems misleading. The researchers examined a retrospective reporting of myocardial infarction cases in a large dataset, and claimed to have found an association with vaping. The datasets lack indicators to establish whether the myocardial infarctions occurred before vaping initiation. This issue had been noted before the journal version of the paper appeared. One critique noted the failure to control for the effects of smoking/prior smoking. That critique also noted that the effect estimate doubled, without explanation, after the fourth author was added to the paper. The use of a retrospective dataset that lacked metrics as to timing of onset is questionable in this context. In particular, if vaping behaviors do not precede the myocardial infarction, claims as to causation lack credibility.
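The temporal-ordering requirement discussed for these two papers can be expressed as a simple classification rule: an event counts toward the exposed group only when the exposure began before the event. The sketch below is a hypothetical illustration, not the code used in any of the analyses mentioned; the function name, the field names, and the handling of ties are assumptions.

```python
def classify_event(mi_age, vape_start_age):
    """Classify one subject by whether vaping preceded the myocardial infarction.

    Both arguments are ages in years, or None when the event or exposure
    never occurred. Treating a tie (same age) as unexposed is an assumption;
    a real analysis would need finer-grained dates.
    """
    if mi_age is None:
        return "no_event"
    if vape_start_age is None or vape_start_age >= mi_age:
        return "event_unexposed"  # the infarction cannot have been caused by later vaping
    return "event_exposed"

if __name__ == "__main__":
    # Hypothetical subjects: (age at MI, age at vaping initiation)
    subjects = [(52, 58), (60, 55), (47, None), (None, 30)]
    for mi_age, vape_start_age in subjects:
        print(mi_age, vape_start_age, classify_event(mi_age, vape_start_age))
```

When a dataset does not record the timing of either the event or the exposure, this rule cannot be applied at all, which is the limitation noted above for retrospective data without onset dates.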

McConnell et al. (2016) 21
In this paper, the researchers fail to identify the timing of onset of exposures/outcomes, and fail to address confounding factors. The study used survey data on older high school students' behavior and self-reported health status, seeming to claim that vaping causes acute respiratory symptoms (bronchitis, wheezing). The study seemed to show an association between self-reported respiratory symptoms and e-cigarette use; however, this association disappears when controlling for tobacco smoking and second-hand smoke exposure. Socioeconomic status was also a potential confounding factor that went unacknowledged. Data regarding the timing of onset (as to exposures or health conditions) are flawed. As such, no causal claim between vaping and health outcomes can be established. The causal claims of interest are also unclear.
Moreover, definitions were not reflective of real world vaping behaviors. Notably, the datasets were treated as a single cross-sectional study: in reality, it was the 12th-year of a longitudinal study, and relevant exposure and outcome data seem to have been available but omitted. As such, the findings in this paper seem unreliable.

Wills et al. (2018) 27
This cross-sectional survey examined respiratory health among vapers in Hawaii, USA aged 55 years and older, and seemed to show an association of e-cigarette use with self-reported chronic respiratory conditions (asthma as well as COPD). However, the researchers did not seem to control for confounding or classify participants by relevant health status.
As such, the health outcomes of asthma and COPD seem more likely to be caused by the confounding factor, smoking.
Notably, vaping behavior was defined as "one puff ever;" yet the smoking variable was not similarly defined: in a population with a mean age of 55 years, smokers were likely to have smoked cigarettes more frequently and for a longer duration. The use of these definitions and classifications in this population is a serious flaw in study design and methodology.

Discussion
A critical review of the included literature revealed numerous flaws, and limitations notably outweighed strengths.
Some studies contained interesting data points worthy of future research, but lacked generalizability beyond conditions specific to the study. The lessons available from the papers in this review are predominantly negative. There are several papers that are solid workaday building blocks, but their generalizable lesson is simple: "don't overreach." Most of the included papers offer only errors from which to learn. The questions most researchers address are far more difficult than typical epidemiology questions. We found no studies that employed carefully designed, fit-for-purpose methods to try to address the particular challenges of answering these difficult research questions. Our analysis provides several specific lessons.
First, none of the included papers proposes a valid hypothesis, and none assesses what associations we should expect to find if they were truly based on causal pathways. Determining causal association is very complex, particularly in the context of vaping and smoking behavioral research. The study designs in this research do not account for such complexity.
Second, changing the exposure and result measurements from "vaped at least once in the last 30 days and smoked every day for the last month" to, for example, "vapes daily and smoked at least once in the past week" would be more relevant for public and clinical health purposes.
Third, proposed causal claims must be made precisely, in terms of exposure(s) and outcome(s), and with hypotheses about the various potential causal pathways. The research should then be designed to assess whether the results support the primary hypothesis of interest. Attention to causal pathways would avoid many of the problems noted in the included papers. However, we acknowledge the challenge of addressing multiple causal pathways that would produce a particular association, and the difficulty in distinguishing them, as is the case with the gateway studies.
Fourth, it is important to recognize the pathway by which vaping inspires additional quit attempts that would not otherwise happen. Overlooking this pathway is a common failure in research design. The stock-flow problem could be avoided by recognizing, for example, that the pathways to smoking plus vaping at a particular point in time include discovering that vaping is not a satisfying complete substitute, while the pathways to being among nonsmokers include discovering that it is.
Fifth, causal pathways are most often considered in research with regard to confounding and identifying which variables should and should not be used as control covariates. The use of causal pathway analysis would also be useful in these areas of research, but was not contemplated in the papers we reviewed.
Finally, using conventional epidemiology methods to assess complicated causal questions is not appropriate for real world science such as vaping. For example, trying to identify health effects of vaping in a cohort of former smokers is quite challenging, as it is nearly impossible to reliably distinguish what can only be a tiny signal from enormous noise. This is made worse by flawed measures of smoking history and vaping patterns.
One of our aims in preparing this analytical review was to identify common, avoidable methodologic mistakes and to provide simple lessons for conducting more robust research. One important lesson is to identify relevant research questions. Useful questions are those that are precise, contingent, nuanced, and focused on quantifications that are motivated by externally defined questions rather than what is convenient to do with a dataset. Perhaps the most practical methodological advice to the field would consist of creating lists of valid testable hypotheses and considerations for how to test them.
Another aim was to empower readers of vaping literature to critically analyze the studies, findings, and conclusions of papers they may read. Skepticism as to the validity of conclusions may be warranted, because they are often misleading and unsupported. This is not currently a field where trust in "the scientific literature" is warranted or where readers would be able to extract reliable information without consideration of the methodological issues pointed out in this review. Readers, however, should be able to come away from the present paper with a better collection of ideas about how to assess what research results really show, and whether the authors' claims are accurate.
It is also worth noting that due to the slow pace of the publishing process, and the even slower rate of data releases, almost all publications in the journal literature are already out of date by the time they appear. In an area of study where technology is improving year-to-year, fads and social acceptability have changed multiple times, and dominant messaging can change month-to-month, timing is important. Non-academic literature and pre-prints may then be a useful adjunct to the more formal peer-reviewed journal articles. Authors should keep this in mind and make sure to emphasize the timing of their study and the fact that any estimates and conclusions they offer will probably no longer describe the present reality by the time the reader encounters their paper.

Conclusion
Our critical appraisal reveals common, preventable flaws, the identification of which may guide future researchers.
One striking result of the review is that a large portion of the high-ranking papers came out of research institutions funded by the U.S. government, thereby de facto not supporting a tobacco harm reduction agenda. This reflects both an American dominance in the discourse, and a dominance of anti-vaping partisans amongst those whose actions determine popularity.
However, this does not mean there is a trove of good research out there that answers the big questions, but merely did not make the popularity cut. There is not. Notably, papers discussing the effect of vaping on smoking initiation shared common flaws. By contrast, papers addressing the effect of vaping on smoking cessation or reduction demonstrated a broader variety of flaws, yet common themes emerged. Our analysis of common flaws and limitations may guide future researchers to conduct more robust studies and, concomitantly, produce more reliable literature. There are countless sources of good building-block information that can be pieced together to provide knowledge. In order to provide useful information, research questions should be precise, contingent, nuanced and focused on quantifications that are motivated by externally defined questions. Such research necessitates proactive design, rather than utilizing already existing, but not fit-for-purpose, datasets.

FIGURES AND TABLES
Causal relationship
• A causal relationship between an exposure (eg, tobacco) and a health outcome (eg, cancer) may be identified if the exposure preceded the outcome, and the outcome would not have occurred in the absence of the exposure.

Counterfactual
• Researchers often conduct retrospective observational studies in which both the exposure and the outcome have already occurred, and changing the variable is no longer possible;
• Counterfactual methods assess for causal relationships by exploring how the health of individuals in a cohort may be different if the exposure was absent;
• Researchers often use study designs comparing two groups of people, such that individuals in Group A are exposed to a suspected causative agent (eg, tobacco), and individuals in Group B are not. If the health outcomes for the two groups are different (eg, significantly more individuals in Group A than Group B develop lung cancer), researchers may infer that the exposure is causing the outcome. Notably, research scenarios and study participants often have many variables, which complicates the task of determining causation.

Confounding factors
• Factors that complicate the task of determining causation in research are called "confounding factors."
• In some research studies, the exposures and the outcome may both be present, yet this is not proof that the exposure caused the outcome.
• For example, study participants who use tobacco (exposure) may also drink coffee, reside in an urban setting, and be aged ≥ 40 years. The three potential confounding factors here are coffee consumption, urban residence, and age. Researchers must control for such confounding factors to determine which exposure is truly causative.

Reporting Results
Relate results to hypothesis clearly and specifically.
Included papers make causal claims that are unreliable because they fail to specify which health outcome/exposures are the subject of the conclusion.
They also offer conclusions and summary statements unsupported by data.

Counterfactual Analysis
Researchers state a hypothesis but fail to state a counterfactual claim that could guide data analysis and clarify interpretations.
In the included papers the authors often fail to state their hypothesis clearly and have not discussed counterfactual claims.

Causal Pathway Identification
Identifying and specifying the suspected causal pathway is needed to explain why "reverse causation" is not plausible. It is also needed to determine which variables are potential confounders, and assess for their impact on causation.

[...] the participant population. The extent to which results cannot be extended beyond such populations should be specified.
[...] generalizability of their findings beyond their participant population.

Switching: Tobacco-related Behaviors
Time trends in switching behavior are likely to occur, with the population of potential switchers being depleted. Thus switching rates will not be constant but rather will be affected by some expected attrition.
In the included studies, the researchers do not account for this trend.
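The depletion described above can be made concrete with a short deterministic sketch. The two-group structure and every parameter value are hypothetical assumptions chosen for illustration: one group of smokers is assumed to be likely to switch in any given year, the other unlikely. The per-person effect of vaping is held constant, yet the aggregate switching rate among remaining smokers falls as the promising candidates leave the at-risk population.

```python
def switching_rates(years=5, share_high=0.3, p_high=0.4, p_low=0.02):
    """Yearly switching rate among remaining smokers (all inputs hypothetical).

    share_high: initial share of smokers who are promising switchers
    p_high, p_low: assumed annual switching probabilities in each group
    """
    high, low = share_high, 1.0 - share_high
    rates = []
    for _ in range(years):
        switched = high * p_high + low * p_low
        rates.append(switched / (high + low))  # rate among those still smoking
        high *= 1.0 - p_high                   # promising switchers deplete quickly
        low *= 1.0 - p_low
    return rates

if __name__ == "__main__":
    for year, rate in enumerate(switching_rates(), start=1):
        print(f"year {year}: {rate:.3f}")
```

A model that assumes a constant switching rate would misread this decline as vaping becoming less effective, when in this sketch it reflects only the changing composition of the remaining smokers.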

Stock-Flow Problem
Many studies attempt to measure the proportion of people who quit smoking, ie, the "flow" of people. However, many people who have already quit smoking successfully due to vaping are excluded from studies. Thus, the impact of vaping on smoking cessation is not effectively measured in many studies due to inclusion/exclusion criteria.
For many candidate exposure measures, the "stock" of people who vape comprise a disproportionately high number of people who are not likely to switch, and exclude those who previously quit smoking by switching. A study cohort with such a skewed proportion of participants creates bias in the study.
In all included studies, the "stock-flow" problem is a serious limitation. While some of the researchers included steps to mitigate the impact of the stock-flow problem, it remained significant and unacknowledged.
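The stock-flow problem can be illustrated with a minimal simulation sketch. It does not reproduce any included study: the function name and all probabilities are hypothetical assumptions. Smokers for whom vaping works are assumed to quit before the baseline survey, so a study that enrolls current smokers observes only the remaining "stock" of vaping smokers, for whom vaping did not work.

```python
import random

def simulate_stock_flow(n=40000, seed=2):
    """Compare follow-up quit rates among current smokers enrolled at baseline.

    Hypothetical assumptions: 40% of smokers tried vaping before baseline;
    half of those switched completely and are no longer smokers at baseline;
    everyone still smoking at baseline quits during follow-up with the same
    5% background probability.
    """
    rng = random.Random(seed)
    switched_before_baseline = 0
    # stock[vapes_at_baseline] -> [enrolled, quit_during_followup]
    stock = {True: [0, 0], False: [0, 0]}
    for _ in range(n):
        tried_vaping = rng.random() < 0.4
        if tried_vaping and rng.random() < 0.5:
            switched_before_baseline += 1  # quit via vaping; never enrolled
            continue
        quit_followup = rng.random() < 0.05
        stock[tried_vaping][0] += 1
        stock[tried_vaping][1] += int(quit_followup)
    triers = switched_before_baseline + stock[True][0]
    return {
        "quit_rate_vaping_smokers": stock[True][1] / stock[True][0],
        "quit_rate_nonvaping_smokers": stock[False][1] / stock[False][0],
        "switched_before_baseline": switched_before_baseline,
        "share_of_triers_who_switched": switched_before_baseline / triers,
    }

if __name__ == "__main__":
    print(simulate_stock_flow())
```

In this sketch the enrolled vaping and non-vaping smokers quit at about the same rate, which a naive analysis would read as vaping having no effect on cessation, even though roughly half of those who tried vaping had already quit smoking and were excluded by design.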

Impact of Smoking Cessation on Vaping Behaviors
Smoking cessation causes particular vaping behaviors (eg, increase in vaping frequency to daily use; purchasing vaping paraphernalia), irrespective of whether or not those behaviors impact smoking cessation.
The authors who looked at these variables did not acknowledge this issue.

Other Potential Biases: Longitudinal Studies
A longitudinal cohort study design is potentially more informative than other study designs; however, a case-control study design may be more informative for researching the topics of vaping and smoking cessation. The overall study design, methods, data collection, implementation, and interpretation are essential to determining the value of the study.
The included studies reveal multiple methodology flaws, such as the stock-flow problem and cross-sectional surveys of subjects with no meaningful retrospective questions.

Retrospective and Motivational Questions
Retrospective studies comprising participant questionnaires are informative only to the extent that meaningful questions elicit accurate information on relevant metrics.
The retrospective studies included do contain some meaningful questions; however, the overall study designs are not sufficiently comprehensive to produce reliable responses or data.

Confounding: Individual Propensities
Confounding may occur because many participants engage in multiple behaviors (eg, use of illicit drugs, tobacco use, vaping of nicotine and/or cannabis) which may or may not be reported. Such variables warrant assessment to determine association(s) with main outcome measures, and whether any associations are causative.
The included studies discussing "gateway" behaviors do not adequately address confounding variables, and therefore, do not reliably discuss causation.

Intractable Confounding
Research reveals heterogeneity among those who vape and/or smoke as to an essential metric: some individuals like consuming nicotine and some do not.
Studies must distinguish participants according to these traits.
In the included studies, researchers fail to design studies that recognize the heterogeneity of participants according to the key trait of nicotine preference.

Gateway Theory
The "Gateway" theory posits that a person who vapes is more likely to begin using other substances (eg, illicit drugs, cannabis, smoking cigarettes) than someone who never vaped.
The "Gateway" theory is often stated as a foregone conclusion, but it must be supported with data and reliable research. Unsupported assertions of a "Gateway" theory warrant skepticism, as the other behaviors of concern have significant barriers to adoption (eg, more harmful, less flavorful, less convenient, and more expensive).
The included studies embracing the "Gateway" theory do so without scientific support.

Methodology Flaws
For research to be reliable, sound methods are required, including recruiting and retaining an adequate number of participants, accurate measurements, and reasonable follow-up.
The included studies reveal many preventable methodology flaws, including small sample size, poor participant follow up and retention, and unreliable measurements.

Small Effect Size
The main outcome measure of the study should be measurable and salient. If potential confounders obscure researchers' ability to assess the main outcome, the study will be uninformative.
Many of the included studies addressing health effects attempt to discuss an outcome measure that is too small to have been meaningfully assessed.

Real World Science
If research findings conflict with real world, common sense observations, the researchers should explain this apparent inconsistency.
In many included papers, authors assert that vaping reduces the likelihood of smoking cessations and/or promotes smoking initiation. Such assertions are not