尹志 (2025-01-31 17:05):
#paper https://doi.org/10.48550/arXiv.2403.07183 Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews. A paper on how large language models are actually being used, with a concrete case study of LLM use in peer reviews at top AI conferences (ICLR 2024, NeurIPS 2023, CoRL 2023, and EMNLP 2023). The study estimates that 6.5% to 16.9% of these reviews may have been substantially modified by LLMs, and such reviews share some interesting traits: lower reported confidence, submission close to the deadline, and reviewers less willing to respond to author rebuttals. See the original paper for more curious findings. The paper also lists the adjectives AI most likes to use, such as "commendable", "meticulous", and "intricate": very AI-sounding indeed, haha. Looks like reviewers will need to be more accountable to authors from now on.
arXiv, 2024-03-11T21:51:39Z. DOI: 10.48550/arXiv.2403.07183
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews
Weixin Liang, Zachary Izzo, Yaohui Zhang, Haley Lepp, Hancheng Cao, Xuandong Zhao, Lingjiao Chen, Haotian Ye, Sheng Liu, Zhi Huang, Daniel A. McFarland, James Y. Zou
Abstract:
We present an approach for estimating the fraction of text in a large corpus which is likely to be substantially modified or produced by a large language model (LLM). Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world LLM-use at the corpus level. We apply this approach to a case study of scientific peer review in AI conferences that took place after the release of ChatGPT: ICLR 2024, NeurIPS 2023, CoRL 2023 and EMNLP 2023. Our results suggest that between 6.5% and 16.9% of text submitted as peer reviews to these conferences could have been substantially modified by LLMs, i.e. beyond spell-checking or minor writing updates. The circumstances in which generated text occurs offer insight into user behavior: the estimated fraction of LLM-generated text is higher in reviews which report lower confidence, were submitted close to the deadline, and from reviewers who are less likely to respond to author rebuttals. We also observe corpus-level trends in generated text which may be too subtle to detect at the individual level, and discuss the implications of such trends on peer review. We call for future interdisciplinary work to examine how LLM use is changing our information and knowledge practices.
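The core idea of the abstract's maximum likelihood model can be sketched as a two-component mixture: each document (or token count vector) is assumed to come from either an AI-generated or a human-written reference distribution, and the corpus-level fraction alpha is the single parameter fitted by maximum likelihood. The sketch below is a simplified illustration under assumed toy token probabilities, not the authors' exact estimator (the paper estimates the reference distributions from expert-written and AI-generated reviews and fits alpha over word occurrence statistics):

```python
import numpy as np

def log_likelihood(alpha, counts, p_ai, p_human):
    """Corpus log-likelihood under the mixture alpha*P_ai + (1-alpha)*P_human.

    counts: observed token counts per vocabulary item across the corpus.
    p_ai, p_human: per-token probabilities under the two reference models
    (here assumed given; in the paper they are estimated from reference texts).
    """
    mix = alpha * p_ai + (1 - alpha) * p_human
    # Clip to avoid log(0) at the boundaries alpha = 0 or 1.
    return float(np.sum(counts * np.log(np.clip(mix, 1e-12, None))))

def estimate_alpha(counts, p_ai, p_human, grid=None):
    """Grid-search MLE for the corpus-level LLM-modified fraction alpha."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 1001)
    lls = [log_likelihood(a, counts, p_ai, p_human) for a in grid]
    return float(grid[int(np.argmax(lls))])
```

For example, with toy reference distributions over a two-word vocabulary and counts generated at a true mixing fraction of 0.3, `estimate_alpha` recovers a value near 0.3. A grid search is used here purely for transparency; the one-dimensional likelihood could equally be maximized in closed form or with a scalar optimizer.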