Gaussian process based text categorization for healthy information

Sih Huei Chen, Yuan Shan Lee, Tzu Chiang Tai, Jia Ching Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

With the development of medical technology, more and more people are paying attention to their health. A large amount of health information can now be obtained easily from the web, so text categorization is important for analyzing it. In this work, we propose a text categorization system based on a Gaussian process. The proposed system has two parts: feature learning and classification. In the first part, we apply latent Dirichlet allocation (LDA) to obtain the proportions of K latent topics for each document, and the resulting K-dimensional vector is used as that document's feature. In the classification part, a Gaussian process (GP) performs the text categorization: ten classes of text documents are categorized using the one-versus-one approach. The experimental results show that the proposed system performs well in text categorization, especially when the training dataset is small.
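The two-part pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, a hypothetical nine-document toy corpus with three classes (the paper uses ten), an arbitrary K = 3, and an RBF kernel, none of which are specified in the abstract.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Hypothetical toy corpus (assumption): three health-related classes.
docs = [
    "vegetables fruit vitamins balanced diet",
    "protein fiber diet calories meal",
    "sugar salt diet vitamins meal",
    "running jogging workout muscles stretching",
    "gym workout muscles training cardio",
    "cardio running training stretching endurance",
    "insomnia sleep bedtime melatonin rest",
    "sleep nap rest fatigue bedtime",
    "melatonin insomnia fatigue sleep dreams",
]
labels = ["nutrition"] * 3 + ["exercise"] * 3 + ["sleep"] * 3

K = 3  # number of latent topics (assumed value, not from the paper)

# Feature learning: word counts -> K-dimensional topic proportions.
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=K, random_state=0)
topic_features = lda.fit_transform(counts)  # shape (n_docs, K)

# Classification: one binary GP classifier per pair of classes,
# combined by the one-versus-one scheme.
gp = GaussianProcessClassifier(
    kernel=RBF(length_scale=1.0),  # kernel choice is an assumption
    multi_class="one_vs_one",
    random_state=0,
)
gp.fit(topic_features, labels)

# Categorize an unseen document with the fitted vectorizer, LDA, and GP.
new_doc = ["jogging and stretching before a gym workout"]
new_features = lda.transform(vectorizer.transform(new_doc))
predicted = gp.predict(new_features)
print(predicted)
```

With a corpus this small the predicted label is not meaningful; the sketch only shows how the K-dimensional topic vector from LDA becomes the input to the pairwise GP classifiers.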

Original language: English
Title of host publication: Proceedings of 2015 International Conference on Orange Technologies, ICOT 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 30-33
Number of pages: 4
ISBN (Electronic): 9781467382373
State: Published - 22 Jun 2016
Event: 3rd International Conference on Orange Technologies, ICOT 2015 - Hong Kong, Hong Kong
Duration: 19 Dec 2015 to 22 Dec 2015

Publication series

Name: Proceedings of 2015 International Conference on Orange Technologies, ICOT 2015

Conference

Conference: 3rd International Conference on Orange Technologies, ICOT 2015
Country/Territory: Hong Kong
City: Hong Kong
Period: 19/12/15 to 22/12/15

Keywords

  • classification
  • feature learning
  • Gaussian process
  • Latent Dirichlet Allocation
  • text categorization
