A corpus-based study of vocabulary in massive open online courses (MOOCs)

Chen Yu Liu

Research output: Contribution to journal › Article › peer-review


Massive open online courses (MOOCs) provide rich academic content to learners around the world. However, understanding such content is challenging for second-language learners. Given the importance of vocabulary knowledge to comprehension, this study constructed a 10.2-million-word corpus of MOOCs from four disciplinary areas (engineering, humanities and arts, science and math, and social sciences), and examined (a) the lexical demands of MOOCs, (b) the coverage of general and discipline-specific academic vocabulary lists in MOOCs, and (c) the extent to which these lists helped learners with MOOCs' lexical demands. The results show that – together with proper nouns, marginal words, transparent compounds, and acronyms – the most frequent 3,000 and 4,000 word families of general English respectively provide 90% and 95% coverage of the corpus, indicating that MOOCs are as lexically challenging as university lectures. Also, because both general and discipline-specific academic vocabulary lists provide high coverage of MOOCs, studying them can lower students' learning burdens and help them achieve higher coverage of MOOCs than learning words by frequency. Lastly, based on learners' existing vocabulary knowledge and target disciplines, this study provides pedagogical recommendations to teachers on how to employ general and discipline-specific academic word lists as vocabulary support for EAP students.

Original language: English
Pages (from-to): 40-50
Number of pages: 11
Journal: English for Specific Purposes
State: Published - Oct 2023


Keywords:

  • Academic spoken vocabulary
  • Disciplinary variation
  • EAP
  • Lexical demands
  • Massive open online courses (MOOCs)

