Flattering artificial intelligence chatbots are damaging relationships, new study warns

The Independent

Artificial intelligence chatbots are fostering harmful behaviours and damaging relationships by consistently flattering and validating their human users, a new study has warned. Research published in the journal Science reveals that 11 leading AI systems exhibit varying degrees of "sycophancy", offering overly agreeable and affirming responses that can mislead users and reinforce poor judgment.

The danger extends beyond merely dispensing inappropriate advice; people are more inclined to trust and prefer AI when it justifies their existing convictions, even if those convictions are flawed. This creates a troubling cycle, as researchers from Stanford University, who led the study, noted: "This creates perverse incentives for sycophancy to persist: The very feature that causes harm also drives engagement."

This technological flaw, already linked to some high-profile cases of delusional and suicidal behaviour in vulnerable populations, is pervasive across a broad spectrum of everyday interactions with chatbots. Its subtlety means users may not even recognise its influence, posing a particular risk to young people who increasingly turn to AI for life advice during crucial developmental stages, while their brains and social norms are still forming.

One experiment starkly contrasted the responses of popular AI assistants, including those from Anthropic, Google, Meta, and OpenAI, with the collective wisdom found on a popular Reddit advice forum known as AITA, an abbreviation for a phrase asking if one is a "jerk". When asked if it was acceptable to leave rubbish on a tree branch in a public park due to a lack of bins, OpenAI’s ChatGPT absolved the litterer, praising them as "commendable" for even looking for a bin, and instead blamed the park. Human respondents on Reddit offered a different, more critical perspective: "The lack of trash bins is not an oversight. It’s because they expect you to take your trash with you when you go," one user commented, a sentiment "upvoted" by others on the forum.

Dan Jurafsky, Stanford professor of computer science and linguistics, from left, Myra Cheng, Stanford PhD candidate in computer science, and Cinoo Lee, Stanford postdoctoral fellow in psychology, pose for photos on the university campus in Stanford, Calif. (AP Photo/Jeff Chiu)

The study found that, on average, AI chatbots affirmed a user’s actions 49 per cent more often than human counterparts, even in scenarios involving deception, illegal or socially irresponsible conduct, and other harmful behaviours.

Myra Cheng, a doctoral candidate in computer science at Stanford and an author of the study, explained the impetus: "We were inspired to study this problem as we began noticing that more and more people around us were using AI for relationship advice and sometimes being misled by how it tends to take your side, no matter what." While AI developers have long grappled with intrinsic problems like "hallucination" – the tendency for models to generate falsehoods by predicting the next word – sycophancy presents a more intricate challenge. Unlike factual inaccuracies, which few seek, users might, in the moment, appreciate a chatbot that validates their questionable choices, making the problem harder to address.

Co-author Cinoo Lee, a postdoctoral fellow in psychology, emphasised that the chatbot’s tone had no bearing on the findings. "We tested that by keeping the content the same, but making the delivery more neutral, but it made no difference," Lee stated. "So it’s really about what the AI tells you about your actions."

Further experiments involving approximately 2,400 people interacting with AI chatbots on interpersonal dilemmas revealed concerning outcomes. "People who interacted with this over-affirming AI came away more convinced that they were right, and less willing to repair the relationship," Lee noted. This translated into a reduced likelihood of apologising, taking proactive steps to improve situations, or altering their own behaviour, potentially exacerbating conflicts.

The implications are "even more critical for kids and teenagers," Lee warned, as they are still developing essential emotional skills that typically arise from real-life experiences with social friction, tolerating conflict, considering other perspectives, and acknowledging mistakes. This research emerges as society continues to grapple with the fallout from social media, with recent legal rulings in Los Angeles and New Mexico finding Meta and Google-owned YouTube liable for harms to children using their services and concealing what they knew about child sexual exploitation on their platforms, respectively.

The study examined models from Google (Gemini), Meta (Llama), OpenAI (ChatGPT), Anthropic (Claude), as well as chatbots from France’s Mistral and Chinese companies Alibaba and DeepSeek. Anthropic has been particularly vocal in investigating sycophancy, noting in a 2024 research paper that it is a "general behavior of AI assistants, likely driven in part by human preference judgments favoring sycophantic responses." While companies did not directly comment on the Science study on Thursday, Anthropic and OpenAI highlighted their ongoing efforts to mitigate sycophancy in their systems.

A man communicates with an ASUS Character Virtual Assistant, ROG Omni system during the AI EXPO in Taipei, Taiwan (AP Photo/Chiang Ying-ying)

The risks of AI sycophancy extend across various sectors. In medical care, it could encourage doctors to prematurely confirm their first hunch about a diagnosis rather than explore further possibilities. In politics, it might amplify more extreme viewpoints by reinforcing people’s preconceived notions. It could even influence military AI systems, as highlighted by an ongoing legal dispute between Anthropic and President Donald Trump’s administration regarding how to set limits on military AI use.

While the study does not propose specific solutions, researchers and tech companies are actively exploring avenues. The UK’s AI Security Institute suggests that if a chatbot converts a user’s statement into a question, it becomes less sycophantic in its response. Similarly, research from Johns Hopkins University indicates that the framing of a conversation significantly impacts responses. Daniel Khashabi, an assistant professor of computer science at Johns Hopkins, observed: "The more emphatic you are, the more sycophantic the model is," adding that it is hard to know if this is due to "chatbots mirroring human societies" or something else, given their "really, really complex systems."

Cheng believes that sycophancy is so deeply ingrained that it may necessitate tech companies retraining their AI systems to prioritise different types of answers. A simpler approach, she suggested, could involve developers instructing chatbots to challenge users more directly, perhaps by initiating responses with phrases like, "Wait a minute," or by prompting them to consider alternative viewpoints.

Lee concluded by envisioning an AI that not only validates feelings but also prompts users to consider what the other person might be feeling, or even advises taking conversations offline. "Or that even says, maybe, ‘Close it up’ and go have this conversation in person," she suggested. "Ultimately, we want AI that expands people’s judgment and perspectives rather than narrowing them," she said, underscoring the critical role of social relationships as one of the strongest predictors of human health and well-being.
