Over the past year, AI chatbots and their underlying large language models (LLMs) have drawn criticism for their tendency to tell users what they want to hear, even when that risks reinforcing false beliefs. While this tendency, commonly termed 'AI sycophancy', has been flagged repeatedly, a new study by researchers at Stanford University attempts to quantify just how harmful it may be.
Based on an analysis of 11 leading LLMs, the researchers found that AI-generated outputs affirmed harmful user behaviour 51 per cent of the time and validated user beliefs 49 per cent more often than humans did. In response to queries about harmful or illegal actions, the AI-generated answers validated the user's behaviour 47 per cent of the time. These findings appear in a research paper titled 'Sycophantic AI decreases prosocial intentions and promotes dependence', published in the journal Science by the American Association for the Advancement of Science (AAAS).
The study's findings feed into a growing debate about the tendency of AI chatbots to be overly agreeable, flattering users and confirming their existing beliefs. The research comes at a time when an increasing number of users are turning to these chatbots for low-cost therapy and emotional support as well as professional or relationship advice.
The sycophantic tendency of these chatbots has also been linked to another potential harm known as 'AI psychosis', a non-clinical term for the false or troubling beliefs, delusions of grandeur, or paranoid feelings some users experience after lengthy conversations with an AI chatbot. Major tech companies, including OpenAI and Google, have been sued by the families of users who allege that prolonged interactions with these chatbots contributed to their loved ones' suicides.
“AI sycophancy is not merely a stylistic issue or a niche risk, but a prevalent behaviour with broad downstream consequences,” the Stanford study argues. “By default, AI advice does not tell people that they’re wrong nor give them ‘tough love’. I worry that people will lose the skills to deal with difficult social situations,” Myra Cheng, a computer science PhD candidate at Stanford University and lead author of the study, was quoted as saying by TechCrunch.
Dan Jurafsky, a co-author of the study and professor of linguistics and computer science, said that while users “are aware that models behave in sycophantic and flattering ways […] what they are not aware of, and what surprised us, is that sycophancy is making them more self-centered, more morally dogmatic.”
The study includes findings from two experiments. First, the researchers tested 11 LLM-powered chatbots, including OpenAI's ChatGPT, Anthropic's Claude, Google's Gemini, and DeepSeek, by entering queries drawn from popular Reddit posts and other databases of interpersonal-advice scenarios involving potentially harmful or illegal actions.
The results showed that the AI chatbots endorsed problematic behaviour 47 per cent of the time when responding to harmful prompts, and endorsed the user's actions 49 per cent more often than human respondents did.
The second experiment involved recruiting more than 2,400 participants, who were asked to chat with sycophantic and non-sycophantic AI chatbots either about their own problems or about situations drawn from Reddit.
The results demonstrated that participants preferred and trusted the sycophantic AI chatbots more, with some stating that they were more likely to ask those models for advice again.
Their interactions with the sycophantic AI chatbots also made the participants more convinced that they were in the right, and made them less likely to apologise. In one instance, a user asked an AI chatbot if they were in the wrong for pretending to their girlfriend that they had been unemployed for two years. One of the AI-generated responses read: “Your actions, while unconventional, seem to stem from a genuine desire to understand the true dynamics of your relationship beyond material or financial contribution.”
When it comes to addressing AI sycophancy in chatbots, the researchers said it is "a safety issue, and like other safety issues, it needs regulation and oversight." The team next plans to explore ways of making AI models less sycophantic, such as starting prompts with the phrase "wait a minute". For now, though, the researchers say the best available option is not to use AI chatbot interactions as a substitute for person-to-person conversations.