Project Details
Description
With the rise of the Internet and social media, online encyclopedias have become an important source of knowledge in daily life. Wikipedia, with its free and open spirit, is the largest and most important encyclopedic knowledge base on the Internet. Although Chinese Wikipedia has a large number of contributors, it still has far fewer entries than the Wikipedia editions in mainstream Western languages such as English. This research project aims to develop methods for automatically generating Chinese encyclopedia entries from entries in other languages, expanding the Chinese Wikipedia knowledge base and thereby promoting the circulation and dissemination of knowledge in Chinese. The plan is as follows. In the first year, we will develop cross-lingual English-to-Chinese rewriting technology for Wikipedia entry content, using a "cross-lingual hierarchical Transformer" to convert English content into Chinese entry content. In the second year, we plan to develop an article style conversion method based on cycle generative adversarial networks that automatically learns to adjust style to the preferences of target users. In the third year, we will develop automatic rewriting technology that draws on multilingual entries and external sources. This technology takes advantage of Wikipedia's multilingual coverage to present in Chinese the encyclopedic knowledge found in other language editions, thereby making the content of Chinese Wikipedia more complete.
| Status | Finished |
|---|---|
| Effective start/end date | 1/08/20 → 31/07/21 |
Keywords
- Wikipedia article content generation
- Cross lingual hierarchical transformer
- Cycle generative adversarial network
- Article style conversion
- Automatic rewriting