Journal of Tropical Diseases and Parasitology, 2023, Vol. 21, Issue (4): 223-227. doi: 10.3969/j.issn.1672-2302.2023.04.010

• CONTROL STUDIES •

Analysis on the factors affecting infection among close contacts of COVID-19 based on random forest and multi-factor interactive logistic regression models: A case study in Tongling City

ZHANG Fan1, QI Ping2

  1. Tongling Center for Disease Control and Prevention, Tongling 244000, Anhui Province, China
  2. Department of Mathematics and Computer Science, Tongling University
  • Received: 2023-04-03 Online: 2023-08-20 Published: 2023-08-23
  • Contact: QI Ping E-mail: 527973792@qq.com; qiping929@tlu.edu.cn

Abstract:

Objective To analyze the factors influencing infection among close contacts of patients with coronavirus disease 2019 (COVID-19) in Tongling, and the interactions among those factors, to provide evidence for formulating targeted prevention and control strategies.
Methods Data were collected on close contacts linked to local COVID-19 cases reported in Tongling from March 14 to 30, 2022. Strongly correlated influencing factors were first screened with the random forest algorithm, and a multi-factor interactive logistic regression model was then fitted to analyze the infection risk among close contacts, its influencing factors and the interactions among them.
Results The overall infection rate among close contacts of patients with COVID-19 in Tongling was 1.95% (101/5 168). The random forest algorithm identified 8 influencing factors by importance score: contact mode, contact frequency, relationship with the associated case, contact location, clinical status of the associated case, age, gender and occupation. The multi-factor interactive logistic regression model showed that the infection risk of close contacts was positively related to “living together” (r=0.382, P<0.05) and “frequent contact” (r=0.139, P<0.05). As for interaction effects, the infection risk was positively related to “living together” + “family” (r=0.761, P<0.05), “age≤10” + “relative” (r=0.252, P<0.05) and “colleagues or friends” + “frequent contact” (r=0.132, P<0.05), and negatively related to “no-direct-contact-in-common-space” + “occasional contact” (r=-0.122, P<0.05) and “age>60” + “occasional contact” (r=-0.221, P<0.05). Compared with the traditional logistic regression model, the multi-factor interactive logistic regression model improved accuracy, precision, recall and F1 score by 8.04%, 13.24%, 4.44% and 7.45%, respectively.
Conclusion Combining random forest with a complete quadratic logistic regression model can uncover interaction effects among influencing factors in multi-factor data with limited samples, which may provide a sound basis for disease prevention and control.
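
The interaction modelling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name any software, the random forest screening step is omitted, and the data are synthetic. The factor names in `NAMES`, the helper functions and the simulated coefficients are all hypothetical. For 0/1 indicator variables a squared term equals the variable itself, so a complete quadratic model reduces to main effects plus pairwise products, which is what `expand_with_interactions` builds. The fit uses plain batch gradient descent rather than a statistical package.

```python
import math
import random
from itertools import combinations


def expand_with_interactions(row):
    """Return the main effects followed by all pairwise products of a 0/1 row."""
    pairs = [row[i] * row[j] for i, j in combinations(range(len(row)), 2)]
    return list(row) + pairs


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic log-loss; returns (bias, weights)."""
    n, d = len(X), len(X[0])
    b, w = 0.0, [0.0] * d
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * d
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return b, w


# Synthetic data only: hypothetical indicator names, invented coefficients.
random.seed(0)
NAMES = ["living_together", "family", "frequent_contact"]
rows, labels = [], []
for _ in range(400):
    row = [random.randint(0, 1) for _ in NAMES]
    # Simulated risk is driven mainly by the living_together x family product.
    logit = -2.0 + 0.5 * row[2] + 2.5 * row[0] * row[1]
    labels.append(1 if random.random() < sigmoid(logit) else 0)
    rows.append(row)

X = [expand_with_interactions(r) for r in rows]
term_names = NAMES + [f"{a} x {b}" for a, b in combinations(NAMES, 2)]
bias, weights = fit_logistic(X, labels)

# Print one fitted coefficient per main effect and per interaction term.
for name, wt in zip(term_names, weights):
    print(f"{name:40s} {wt:+.3f}")
```

In this sketch, a coefficient on a product term such as `living_together x family` plays the role of the interaction effects reported in the Results; a real analysis would also need significance tests, which are not shown here.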

Key words: COVID-19, Close contacts, Influencing factors, Random forest model, Tongling City
