Acceptability of chatbot versus General Practitioner consultations for healthcare conditions varying in terms of perceived stigma and severity (Preprint)

Aims: This study aimed to assess how the perceived stigma and severity of health issues are associated with the acceptability of three health consultation sources: i) a chatbot, ii) a General Practitioner (GP), or iii) a GP-chatbot combination. Methods: Between May and June 2019, an online study, advertised via Facebook, was completed by a convenience sample of 237 participants from the UK. The design was an online factorial simulation experiment with three within-subject factors (health issue stigma, health issue severity, and consultation source) and six between-subject covariates, resulting in 12 conditions in which participants rated the acceptability of each consultation source for each health condition. Both research questions were analysed with a single mixed-model analysis of variance (ANOVA). Results: More severe health issues decreased the acceptability of chatbots as a consultation source, F(2, 372) = 118.14, p < .001, partial η2 = 0.388, while more stigmatised health issues increased the acceptability of chatbots as a consultation source, F(2, 372) = 12.99, p < .001, partial η2 = 0.065. There was no significant association between participants' characteristics and the acceptability of consultation sources. Conclusions: Chatbots may be more acceptable for consultations regarding more stigmatised health issues and less acceptable for conditions of higher perceived severity.
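As a consistency check, the reported partial η2 effect sizes can be recovered from each F statistic and its degrees of freedom via the standard identity η2p = (F·df1)/(F·df1 + df2). The short sketch below applies this identity to the two F values reported in the Results; only the formula and the reported statistics are used, nothing else is assumed about the analysis:

```python
def partial_eta_squared(f_stat: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom,
    using eta_p^2 = (F * df1) / (F * df1 + df2)."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

# Severity x consultation source effect: F(2, 372) = 118.14
print(round(partial_eta_squared(118.14, 2, 372), 3))  # 0.388

# Stigma x consultation source effect: F(2, 372) = 12.99
print(round(partial_eta_squared(12.99, 2, 372), 3))   # 0.065
```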

The participants were presented with three health issues for each high/low stigma and high/low severity condition. They were asked to rate the acceptability of each of the three consultation sources for each experimental factor. In total, participants completed 36 acceptability ratings (three for each condition) and were blinded to the predicted stigma or severity of the health issue. This was to mitigate the possibility that the stigma or severity of a presented health issue might be interpreted differently.
The health issues were identified and selected via a pilot study of 37 participants recruited through convenience sampling. Participants rated their perceived stigma and severity of 40 health issues, identified from the NHS's Health A to Z directory (NHS, 2019), on a five-point Likert scale (1 = low perceived stigma/severity, 5 = high perceived stigma/severity). By consensus of the research team, mean scores under two were classified as low perceived stigma or severity, and mean scores over 3.5 were classified as high perceived stigma or severity.
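The pilot's threshold rule can be sketched as follows. This is a minimal illustration only: the function names, the example health issues, and the ratings are hypothetical and are not taken from the study's data.

```python
# Sketch of the pilot study's classification rule: mean Likert ratings
# (1-5 scale) below 2 are labelled "low", above 3.5 "high", and anything
# in between falls outside the experimental stimulus set.
# All issues and ratings below are hypothetical examples.

def classify(mean_rating: float) -> str:
    """Classify a mean perceived stigma/severity rating."""
    if mean_rating < 2.0:
        return "low"
    if mean_rating > 3.5:
        return "high"
    return "excluded"

def mean(ratings: list) -> float:
    return sum(ratings) / len(ratings)

# Hypothetical pilot ratings for two illustrative health issues.
pilot_ratings = {
    "issue A": [1, 2, 1, 1, 2],   # mean 1.4 -> low
    "issue B": [4, 5, 4, 3, 5],   # mean 4.2 -> high
}

labels = {issue: classify(mean(r)) for issue, r in pilot_ratings.items()}
print(labels)  # {'issue A': 'low', 'issue B': 'high'}
```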

Procedure
Participants were presented with the information page and asked to consent to the survey.
Participants were then asked about their knowledge of chatbots.

Results
The interaction between the severity of the health issues and the acceptability of the different consultation sources was significant (Table 2). GPs and the GP-chatbot combination were found to be more acceptable than chatbots for more severe health issues, but not for less severe ones.
There was a significant interaction between the level of stigma of the health issue and the acceptability of the different consultation sources. The greater acceptability of GPs over chatbots was attenuated in the high stigma conditions (Table 2).
There was a significant three-way interaction between the stigma and severity of the health issue and the acceptability of the different consultation sources (Table 2). The differences between the sources were only evident in the high severity conditions. In the low severity conditions, there was no clear evidence that participants preferred one consultation source over another. None of the participant characteristics significantly influenced the acceptability ratings (Table 3).

Discussion
The results showed that for health conditions with low perceived severity, chatbots, GPs, and chatbot-GP combinations were judged to be approximately equally acceptable.
However, for conditions of high perceived severity, GPs were judged more acceptable than a chatbot-GP combination, which in turn was judged more acceptable than chatbots alone, with this difference attenuated in conditions of high stigma. There was no clear evidence that participant characteristics influenced acceptability ratings. These findings provide evidence that chatbots may be a viable method of consultation for more stigmatised health issues. However, considering that acceptance of chatbots as a consultation source is low for health issues that are both more stigmatised and of high severity, it is apparent that the severity of a health issue is a more important determinant of acceptability than stigma.

Participant acceptability of a GP-chatbot combination
As argued by Fadhil (2018a), combining a chatbot with a GP did increase the chatbot's acceptability relative to a chatbot alone. It is nevertheless interesting that participants were not more accepting of a GP-chatbot combination than of a GP alone. One would expect a GP-chatbot combination to be the most acceptable consultation source, as patients receive the same benefits as seeing a GP about their health issues, with the added benefit of triage and validation by chatbots and AI. The low acceptance scores could be explained by the negativity effect (Reeder & Brewer, 1979), whereby negative attitudes are weighted more heavily than positive attitudes when forming an evaluation. One response would be to implement chatbots in areas where they are more likely to be accepted, for example as a consultation source for more stigmatised or less severe health issues. As more people use and accept chatbots in these areas, acceptance may inadvertently increase as people start to perceive the need for the technology (Rogers, 1962).

Limitations
The study had several limitations. First, as the study was advertised online, people who use the internet regularly were more likely to be exposed to the study advert. Consequently, self-selection bias may have resulted in a more technology-accepting population participating in the study, reducing its population validity. Secondly, the number of excluded cases equated to almost one third of the total participants; there is therefore a risk of bias due to this missing data (assuming it is not missing at random). This may explain why there was insufficient evidence to conclude that participant characteristics influenced the acceptability ratings of the different consultation sources.
Regarding the study's measures, self-reported willingness is not real-life behaviour enactment. Willingness is a measure of intention, and there is a well-established gap between the intention to perform a behaviour and its actual enactment (Bhattacherjee & Sanford, 2009; Sheeran & Webb, 2016). This study's results may therefore not provide an accurate representation of how acceptability influences a person's use of a chatbot in healthcare. This limitation exists due to the lack of a validated measure of acceptability.
This was a simulation study; the conditions participants undertook were not real situations, which limits the predictive validity of these results. If a person has, or has previously experienced, a specific health issue, they may be more or less accepting of using a chatbot as a consultation source than if they merely imagine the health issue. The health issues used in this study may also not be accurate representations of more/less stigmatised or high/low severity health issues, reducing the face validity of the experiment.