CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations

Jiahao Zhao, Jingwei Zhu, Minghuan Tan, Min Yang, Renhao Li, Di Yang, Chenhao Zhang, Guancheng Ye, Chengming Li, Xiping Hu, Derek F. Wong

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Citation (Scopus)

Abstract

In this paper, we introduce a novel psychological benchmark, CPsyExam, constructed from questions sourced from Chinese examination systems. CPsyExam is designed to prioritize psychological knowledge and case analysis separately, recognizing the significance of applying psychological knowledge to real-world scenarios. We collect 22k questions from 39 psychology-related subjects across four Chinese examination systems. From the pool of 22k questions, we utilize 4k to create the benchmark that offers balanced coverage of subjects and incorporates a diverse range of case analysis techniques. Furthermore, we evaluate a range of existing large language models (LLMs), spanning from open-sourced to proprietary models. Our experiments and analysis demonstrate that CPsyExam serves as an effective benchmark for enhancing the understanding of psychology within LLMs and enables the comparison of LLMs across various granularities.

Original language: English
Title of host publication: Main Conference
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Publisher: Association for Computational Linguistics (ACL)
Pages: 11248-11260
Number of pages: 13
ISBN (electronic): 9798891761964
Publication status: Published - 2025
Externally published
Event: 31st International Conference on Computational Linguistics, COLING 2025 - Abu Dhabi, United Arab Emirates
Duration: 19 Jan 2025 to 24 Jan 2025

Publication series

Name: Proceedings - International Conference on Computational Linguistics, COLING
Volume: Part F206484-1
ISSN (print): 2951-2093

Conference

Conference: 31st International Conference on Computational Linguistics, COLING 2025
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 19/01/25 to 24/01/25
