Early constructionist approaches to language (Fillmore, Kay, and O’Connor 1988, inter alia) devoted copious attention to the inextricable mix of the idiomatic and the productive aspects of language use. As construction grammar has steadily grown in its reach and influence over recent decades, however, it has also proven resistant to computational modeling. Most resistant to such modeling has been the poorly covered lexico-grammatical territory that falls between the frozen items found in dictionaries on the one hand and the maximally productive rules of grammar on the other. This resistance is especially unfortunate because it is precisely this terrain that early constructionist research persistently highlighted as centrally important yet neglected by mainstream modular theories. The current proposal is motivated by StringNet, a theoretically grounded model that we have developed over the past eight years and that has provided uniquely promising results in capturing this lexico-grammatical territory. The current version of StringNet and our classroom field-testing with learners have revealed specific sources of potential for further breakthroughs, both in capturing more lexico-grammatical constructions in the model and in making these patterns and their meaning more readily discoverable by independent users. The purpose of the proposed research is to make fundamental refinements that fulfill this potential and make the patterns and their meaning even more accessible and intelligible to users. At the heart of this new potential is the unique opportunity the model affords for detecting contextual correlates of meaning. These contextual features, however, are latent in the current model, distributed throughout billions of patterns and thus out of reach for discovery by current users.
Our proposed approach consists in bootstrapping from the relations indexed in the current model into a more refined model with additional higher-order relations. Importantly, in contrast to black-box distributed language models such as vector space models (VSMs), our model retains its transparency and navigability, so the higher-order relations of the new model remain traceable and therefore intelligible to users. This makes our proposed model and our tools for accessing it uniquely suited to supporting learner-centered language exploration and discovery. We have implemented and published a proof of concept of the core idea behind the proposed refinements (Tsao and Wible 2013), with promising results and rich implications for the design of the proposed model. We propose to use the new StringNet design to construct an academic StringNet from a corpus of 50 million words of academic journal articles. The new general English StringNet and the academic StringNet will be cross-indexed to allow for detailed comparisons and therefore deeper research into the distinctive aspects of academic English beyond simple academic word and formula lists.
Effective start/end date: 1/08/18 → 31/07/20
Keywords
- Lexico-grammatical constructions
- Construction grammar
- English learning resource
- Computational language model