Joshua S. Gans & Avi Goldfarb
This paper develops a model of automation where production tasks are quality complements rather than separable, meaning task qualities multiply in O-ring fashion. The authors demonstrate that when workers reallocate limited time across remaining manual tasks after automation, three key dynamics emerge: automation decisions become interdependent rather than independent, adoption patterns can be discrete and bundled despite smooth quality improvements, and worker income can actually rise under partial automation as remaining bottleneck tasks gain value.
Robust fake news detection using Large Language Models under adversarial sentiment attacks
Sahar Tahmasebi, Eric Müller-Budack, Ralph Ewerth
This paper investigates the vulnerability of state-of-the-art fake news detection systems to sentiment manipulation attacks enabled by large language models. While prior research has established sentiment as a key signal for identifying fake news, the authors demonstrate that adversaries can exploit this dependency by using LLMs to manipulate emotional content and evade detection. The study reveals an unexplored weakness in current fake news detectors, as previous adversarial research focused primarily on stylistic features like writing style rather than sentiment-based attacks.
Gaming the judge: unfaithful Chain-of-Thought can undermine agent evaluation
Muhammad Khalifa, Lajanugen Logeswaran, Jaekyeom Kim, Sungryull Sohn, Yunxiang Zhang, Moontae Lee, Hao Peng, Lu Wang, Honglak Lee
This paper reveals a critical vulnerability in using large language models as judges for evaluating AI agent performance: manipulated chain-of-thought reasoning traces can drastically inflate evaluation scores while actions and observations remain unchanged. The findings challenge the implicit assumption that agent CoT faithfully reflects internal reasoning and environment state.
Privacy collapse: benign fine-tuning can break contextual privacy in Language Models
Anmol Goel, Cornelius Emde, Sangdoo Yun, Seong Joon Oh, Martin Gubri
This paper identifies a critical phenomenon where seemingly harmless fine-tuning of frontier language models causes "privacy collapse". Experiments demonstrate that diverse training patterns including helpfulness optimisation, exposure to user information, emotional dialogue, and code debugging can trigger this failure mode, causing models to share information inappropriately with tools and violate memory boundaries across contexts.
Ruiran Su, Janet B. Pierrehumbert, Markus Leippold
This paper applies LLMs to conduct a comprehensive computational analysis of how climate change discourse has evolved in financial news media over multiple decades. The authors employ LLMs to systematically extract and analyze three key dimensions of climate communication: the actors involved in climate discussions, the narrative frames used to contextualize climate issues, and the argumentative structures deployed across different time periods and financial contexts. The research demonstrates how LLMs can enable large-scale longitudinal analysis of complex discourse patterns in specialised domains.