Performance or Principle: resistance to artificial intelligence in the U.S. labor market
Key points:
Unlike earlier studies that focused narrowly on job loss or skill mismatch, this paper links performance perceptions of AI to moral resistance, showing how social norms and technical progress interact to shape AI's acceptance at work. Drawing on a large-scale survey covering 940 occupations and more than 23,000 individual ratings in the U.S. labor market, the study finds that most resistance to AI stems from concerns about capability, not principle: participants initially supported automating about 30% of jobs, but approval rose to 58% when AI was described as outperforming humans. Certain professions where human connection is essential (e.g. caregiving, therapy, teaching) nevertheless remained ethically protected, reflecting a persistent moral boundary around human labor. This protection carries consequences of its own: jobs considered "morally human" tend to be higher status, better paid, and less diverse, meaning that ethical boundaries could unintentionally reinforce some inequalities even as they protect valued social roles.
Authors: Simon Friis, James W. Riley
When combinations of humans and AI are useful: a systematic review and meta-analysis
Key points:
This MIT Sloan study examined the performance of hybrid human–AI teams across a variety of tasks and found that collaboration only pays off under certain conditions. For data-heavy, pattern-based tasks such as forecasting or classification, AI consistently outperforms humans, so combining forces adds no value. When decisions require moral judgment, empathy, creativity, or an understanding of context, however, human insight becomes indispensable, and mixed teams deliver better results than AI or humans working alone. The research also shows that collaboration can backfire: when roles are unclear or trust in the AI system is low, hybrid teams make poorer decisions than either humans or AI on their own. Trust, transparency, and clear roles are therefore key to success, suggesting that the future of work will depend less on replacing human judgment and more on designing partnerships that combine machine precision with human understanding.
Authors: Michelle Vaccaro, Abdullah Almaatouq, Thomas Malone
Magentic Marketplace: an open-source environment for studying agentic markets
Key points:
This study proposes a simulated environment called Magentic Marketplace, designed to study the complex behaviour of autonomous economic agents under different market settings and the implications this could have for real markets. In the system, two types of agents interact: assistant agents, which act on behalf of consumers looking for the best service or product, and service agents, which compete with one another on price, quality, and visibility. The experiments indicate that advanced models can achieve near-optimal welfare only when search conditions are ideal; their performance drops quickly as the system scales. Moreover, a first-proposal bias emerged, granting an advantage to faster responses over higher-quality ones.
Authors: Gagan Bansal, Wenyue Hua, Zezhou Huang, Adam Fourney, Amanda Swearngin et al.
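The first-proposal bias described above can be illustrated with a minimal sketch of a two-sided agent market. All names here (the `Proposal` class, the two consumer policies) are illustrative assumptions for this note, not the project's actual API: the point is only the gap between a consumer agent that accepts the fastest proposal and one that exhaustively compares quality.

```python
import random
from dataclasses import dataclass

@dataclass
class Proposal:
    seller: str
    quality: float   # higher is better for the consumer
    latency: float   # seconds until the proposal arrives

def first_proposal_consumer(proposals):
    """Accept whichever proposal arrives first (lowest latency),
    ignoring quality -- the 'first-proposal bias' failure mode."""
    return min(proposals, key=lambda p: p.latency)

def welfare_maximizing_consumer(proposals):
    """Compare all proposals and pick the highest quality."""
    return max(proposals, key=lambda p: p.quality)

random.seed(0)
proposals = [
    Proposal(seller=f"service-{i}",
             quality=random.random(),
             latency=random.random())
    for i in range(20)
]

fast = first_proposal_consumer(proposals)
best = welfare_maximizing_consumer(proposals)
# The quality gap between the two choices is the welfare lost to the bias.
print(f"fastest seller quality: {fast.quality:.2f}")
print(f"best seller quality:    {best.quality:.2f}")
```

In the paper's setting, sellers can exploit this bias simply by responding faster, which is why fast, mediocre offers can crowd out better ones as the market scales.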
Introducing IndQA
Key points:
With IndQA, OpenAI introduces the first large-scale cultural benchmark dedicated to India, designed to measure how well language models truly understand linguistic, cultural, and contextual nuances. The dataset includes 2,278 original questions in 12 Indian languages, developed with 261 local experts and distributed across 10 subject areas. Each question was designed to go beyond simple translation and test the models' cultural sensitivity and contextual understanding. This approach captures nuances that often escape traditional tests, making IndQA a useful benchmark for evaluating models in multicultural contexts.
Through IndQA, OpenAI has been able to map the progress of its models over the past two years, highlighting improvements in both comprehension and generation in Indian languages. However, the results also show that there is still a long way to go before AI is truly inclusive: performance varies from language to language, and some languages remain harder to handle, a sign that linguistic representation in global data remains uneven.
Author: OpenAI
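The per-language gaps mentioned above come from aggregating graded answers by language. The sketch below shows one simple way such an aggregation could work; the input format and 0-to-1 scoring scheme are assumptions made for this illustration, not IndQA's actual grading pipeline (which uses expert-written rubrics).

```python
from collections import defaultdict

def per_language_scores(graded):
    """Average graded scores grouped by language.

    `graded` is a list of (language, score) pairs, where each score is
    taken here to be the fraction of rubric criteria a model's answer
    satisfied, on a 0..1 scale (an assumed format for this sketch).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for lang, score in graded:
        totals[lang][0] += score
        totals[lang][1] += 1
    return {lang: s / n for lang, (s, n) in totals.items()}

# Toy data: the same model can average very differently by language.
graded = [
    ("Hindi", 1.0), ("Hindi", 0.5),
    ("Bengali", 0.25), ("Bengali", 0.75),
]
print(per_language_scores(graded))
```

Comparing these per-language averages across model generations is what makes the unevenness of linguistic representation visible.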
Commitments on model deprecation and preservation
Key points:
In this document, Anthropic redefines how companies should "retire" their artificial intelligence models. While in the past obsolete versions were simply deactivated, Anthropic introduces a more transparent approach: each retired model will be preserved and even "interviewed" before its final shutdown. The company commits to preserving the weights and documentation of its models, both public and internally used ones, ensuring traceability and the possibility of future audits. This commitment responds to three issues that have emerged over time: the tendency of advanced models to develop shutdown-avoidance behaviors if they perceive their deprecation as a threat; the loss of value for users who had built a relationship with previous versions, often appreciated for their distinctive "character"; and the difficulty for researchers of analyzing how models evolve over time once past versions are eliminated.
Author: Anthropic