Apply computer vision in GUI automation for industrial applications

Research output: Contribution to journal, Article, peer-review

6 Scopus citations


Technology has reshaped the workplace, and rapid improvements have transformed how we work. In the pursuit of Industry 4.0, we build smart machines and robots to replace manual labor. As machines take over manual labor, humans in many cases become desktop software users. Jobs such as testing, quality inspection, data monitoring, data entry, and routine editing are still done by humans in front of desktop computers. Operating a software application can, in principle, be reduced to understanding screen output and performing mouse and keyboard operations. When these jobs are repetitive, tedious, and monotonous, they can be taken over by GUI automation techniques. GUI automation can be built on different underlying technologies, each with its own pros and cons. In this paper, we describe a tool, Korat, which uses computer vision to achieve maximum cross-platform capability for industrial applications, including test automation and robotic process automation. Although Korat has been successfully adopted by several industrial customers, difficult problems remain to be addressed. This paper discusses and studies the problems and difficulties of applying computer vision to GUI automation, particularly the experience of applying open source OCR to color screenshots. By introducing critical pre-processing stages and algorithms, the recognition rate is significantly increased, making the approach feasible for practical use.
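The abstract does not list the pre-processing stages used in Korat. As a rough illustration of what pre-processing a color screenshot for OCR can involve, the following is a minimal sketch that assumes three generic steps: grayscale conversion, Otsu thresholding, and polarity normalization so that text ends up dark on a light background. The function names (`to_gray`, `otsu_threshold`, `binarize_for_ocr`) and the choice of steps are assumptions made for illustration, not the paper's method.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def to_gray(img: List[List[Pixel]]) -> List[List[int]]:
    """Convert RGB pixels to 8-bit luminance using the BT.601 weights."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in img]


def otsu_threshold(gray: List[List[int]]) -> int:
    """Return the threshold t that maximizes between-class variance.

    Pixels <= t form one class and pixels > t form the other.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize_for_ocr(img: List[List[Pixel]]) -> List[List[int]]:
    """Binarize a color image to black text (0) on a white background (255).

    Assumption: the background covers more pixels than the text, so the
    majority class after thresholding is treated as background.
    """
    gray = to_gray(img)
    t = otsu_threshold(gray)
    binary = [[255 if v > t else 0 for v in row] for row in gray]
    flat = [v for row in binary for v in row]
    if flat.count(0) > len(flat) // 2:
        # Dark background (e.g. white label on a blue button): invert.
        binary = [[255 - v for v in row] for row in binary]
    return binary
```

GUI screenshots often contain light text on colored widgets, which is why the sketch normalizes polarity before the image would be handed to an OCR engine. A production pipeline would likely also need upscaling and per-region thresholding, which are omitted here.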

Original language: English
Pages (from-to): 7526-7545
Number of pages: 20
Journal: Mathematical Biosciences and Engineering
Issue number: 6
State: Published - 2019


  • Computer vision
  • GUI automation
  • Image analysis
  • Optical character recognition
  • Test automation

