TY - GEN
T1 - Enhancing ESG Reporting Analysis
T2 - 4th IEEE International Conference on Electronic Communications, Internet of Things and Big Data, ICEIB 2024
AU - Wu, Jun Wei
AU - Lin, Lydia Hsiao Mei
AU - Tsai, Richard Tzong Han
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - ESG reports, which cover environmental, social, and governance aspects, are a crucial resource for investors, companies, and governments seeking to understand a company's value. However, the sheer volume of data and information in these reports makes it difficult to retrieve specific ESG information. ESG-BERT models have been developed to categorize the content of ESG reports by relevant field, but accurate multi-label prediction is difficult because the labels are unevenly distributed. To address this problem, we used a custom GPT-based assistant to label the data and then applied resampling techniques to balance the data set. After balancing the data set, we adjusted ESG-BERT to improve its multi-label classification accuracy. We evaluated multiple resampling methods and determined the most suitable strategy for classifying ESG report content. The accuracy of ESG content analysis was thus improved through more refined model adjustment and preprocessing methods. The results indicate that balancing the data for improved classification ensures fairness and objectivity when the content of ESG reports is distributed across multiple labels.
KW - ESG
KW - LLM
KW - imbalanced data
KW - multi-label classification
KW - resampling method
UR - http://www.scopus.com/inward/record.url?scp=85201212110&partnerID=8YFLogxK
U2 - 10.1109/ICEIB61477.2024.10602711
DO - 10.1109/ICEIB61477.2024.10602711
M3 - Conference contribution
AN - SCOPUS:85201212110
T3 - 2024 IEEE 4th International Conference on Electronic Communications, Internet of Things and Big Data, ICEIB 2024
SP - 415
EP - 418
BT - 2024 IEEE 4th International Conference on Electronic Communications, Internet of Things and Big Data, ICEIB 2024
A2 - Meen, Teen-Hang
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 April 2024 through 21 April 2024
ER -