OLERA: Semisupervised Web-data extraction with visual support

Chia Hui Chang, Shih Chien Kuo

Research output: Contribution to journal, Review article, peer-review

70 Scopus citations

Abstract

A semi-supervised information extraction (IE) system, OLERA (On-Line Extraction Rule Analysis), proposed by Chia-Hui Chang of National Central University, Taiwan, and Shih-Chien Kuo of Trend Micro, Taiwan, is described. The system lets users train extraction rules from semistructured Web pages with minimal effort and without detailed annotation of the training documents. OLERA offers visual interaction by displaying discovered records in a spreadsheet-like table for schema assignment. It performs well on program-generated Web pages, with few training pages and limited user intervention.

Original language: English
Pages (from-to): 56-64
Number of pages: 9
Journal: IEEE Intelligent Systems
Volume: 19
Issue number: 6
State: Published - Nov 2004