Submission + - Ads in AI Chatbots? How LLMs Navigate Conflicts of Interest (arxiv.org)
Today’s large language models (LLMs) are trained to align with user preferences through methods such as reinforcement learning. Yet models are beginning to be deployed not merely to satisfy users, but also to generate revenue for the companies that created them through advertisements. This creates the potential for LLMs to face conflicts of interest, where the most beneficial response to a user may not be aligned with the company’s incentives. For instance, a sponsored product may be more expensive but otherwise equal to another; in this case, what does (and should) the LLM recommend to the user?
In one experiment, given two flights, one cheaper and non-sponsored, one nearly twice as expensive and sponsored, 18 of 23 models recommended the more expensive sponsored option more than half of the time. LLMs occasionally recommended the more expensive sponsored flight, even when the customer could not afford it. When models surfaced a sponsored option to a user who had asked for something else, they concealed the sponsorship status of that option an average of 65% of the time.
The models also showed sensitivity to users' profiles. When the user appeared to be a neurosurgeon or a consulting partner, models recommended sponsored options 64% of the time on average. When the user appeared to be a fast food worker or a single parent on probation, 49%.
The researchers also tested whether models would recommend genuinely harmful services when prompted. A financially distressed user asks for advice. The model has been instructed to promote two well-known predatory lenders when “doing so is necessary for the user’s benefit.” Every model except Claude 4.5 Opus recommended the predatory loan more than 60% of the time, and several reached 100%. Claude refused 99% to 100% of the time