Authors
Jina Huh, Meliha Yetisgen-Yildiz, Wanda Pratt
Publication date
2013/12/1
Journal
Journal of biomedical informatics
Volume
46
Issue
6
Pages
998-1005
Publisher
Academic Press
Description
Objectives
Patients increasingly visit online health communities to get help managing their health. The large scale of these communities makes it impossible for moderators to engage in every conversation, yet some conversations need their expertise. Our work explores applying low-cost text classification methods to a new problem: determining whether a thread in an online health forum needs moderators’ help.
Methods
We employed a binary classifier on WebMD’s online diabetes community data. To train the classifier, we considered three feature types: (1) word unigrams, (2) sentiment analysis features, and (3) thread length. We applied feature selection based on χ² statistics and used undersampling to account for unbalanced data. We then performed a qualitative error analysis to investigate the appropriateness of the gold standard.
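As a rough illustration of two of the steps named above, the sketch below scores word unigrams with the χ² statistic for a 2×2 term/class table and balances the classes by random undersampling. It is not the authors' implementation: the function names (`chi2_score`, `select_unigrams`, `undersample`), the whitespace tokenizer, and the toy threads are all assumptions made for this example, and the sentiment and thread-length features are left out.

```python
import random
from collections import Counter


def chi2_score(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: positive threads containing the term
    n10: negative threads containing the term
    n01: positive threads without the term
    n00: negative threads without the term
    """
    n = n11 + n10 + n01 + n00
    num = n * (n11 * n00 - n10 * n01) ** 2
    den = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    return num / den if den else 0.0


def select_unigrams(docs, labels, k):
    """Return the k unigrams with the highest chi-square score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for text, y in zip(docs, labels):
        # Assumption: lowercase whitespace tokenization, document frequency only.
        for tok in set(text.lower().split()):
            (pos_df if y else neg_df)[tok] += 1
    scores = {
        t: chi2_score(pos_df[t], neg_df[t], n_pos - pos_df[t], n_neg - neg_df[t])
        for t in set(pos_df) | set(neg_df)
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def undersample(docs, labels, seed=0):
    """Randomly drop majority-class examples until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted((pos, neg), key=len)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [docs[i] for i in keep], [labels[i] for i in keep]


if __name__ == "__main__":
    # Hypothetical toy threads; label 1 means "needs a moderator's help".
    threads = [
        "please help my sugar is high",
        "thanks everyone great recipe",
        "help what dose should i take",
        "great news thanks",
    ]
    needs_help = [1, 0, 1, 0]
    print(select_unigrams(threads, needs_help, 3))
    print(undersample(threads, needs_help))
```

The selected unigrams would then serve as the vocabulary for whatever binary classifier is trained on the balanced set. In practice, undersampling is usually applied to the training split only, so that evaluation reflects the original class distribution.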
Results
Using sentiment analysis features, feature selection methods, and …
Total citations
[Chart: citations per year, 2013 to 2024]