
Anthropic Builds RAG Directly Into Claude Models With New Citations API (arstechnica.com)
An anonymous reader quotes a report from Ars Technica: On Thursday, Anthropic announced Citations, a new API feature that helps Claude models avoid confabulations (also called hallucinations) by linking their responses directly to source documents. The feature lets developers add documents to Claude's context window, enabling the model to automatically cite specific passages it uses to generate answers. "When Citations is enabled, the API processes user-provided source documents (PDF documents and plaintext files) by chunking them into sentences," Anthropic says. "These chunked sentences, along with user-provided context, are then passed to the model with the user's query."
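In practice, a request with citations enabled might look something like the sketch below, using Anthropic's Python SDK. This is a rough illustration based on the announcement: the exact field names, the model string, and the document contents here are assumptions, not confirmed API details.

```python
# Sketch only: field names follow Anthropic's Citations announcement and may differ.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # assumed model string
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                # A plaintext source document; per the announcement, the API
                # chunks it into sentences before passing it to the model.
                "type": "document",
                "source": {
                    "type": "text",
                    "media_type": "text/plain",
                    "data": "Acme's Q3 revenue was $12M. Q4 revenue was $15M.",
                },
                "title": "Acme financials",
                "citations": {"enabled": True},  # the new Citations switch
            },
            {"type": "text", "text": "How did revenue change from Q3 to Q4?"},
        ],
    }],
)

# Text blocks in the reply may carry citations pointing back at the
# source sentences they were drawn from.
for block in response.content:
    if block.type == "text":
        print(block.text, getattr(block, "citations", None))
```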
The company describes several potential uses for Citations, including summarizing case files with source-linked key points, answering questions across financial documents with traced references, and powering support systems that cite specific product documentation. In its own internal testing, the company says that the feature improved recall accuracy by up to 15 percent compared to custom citation implementations created by users within prompts. While a 15 percent improvement in accurate recall doesn't sound like much, the new feature still attracted interest from AI researchers like Simon Willison because of its fundamental integration of Retrieval Augmented Generation (RAG) techniques. In a detailed post on his blog, Willison explained why citation features are important.
"The core of the Retrieval Augmented Generation (RAG) pattern is to take a user's question, retrieve portions of documents that might be relevant to that question and then answer the question by including those text fragments in the context provided to the LLM," he writes. "This usually works well, but there is still a risk that the model may answer based on other information from its training data (sometimes OK) or hallucinate entirely incorrect details (definitely bad)." Willison notes that while citing sources helps verify accuracy, building a system that does it well "can be quite tricky," but Citations appears to be a step in the right direction by building RAG capability directly into the model. Anthropic's Alex Albert clarifies that Claude has been trained to cite sources for a while now. What's new with Citations is that "we are exposing this ability to devs." He continued: "To use Citations, users can pass a new 'citations [...]' parameter on any document type they send through the API."
Re: (Score:2)
Checkmate, protestants.
so, basically (Score:1)
A search engine that pastes the results into an open Word document?
Meh, people were doing that with perl quite well three decades ago already.
Re: (Score:2)
Meh, people were doing that with perl quite well three decades ago already.
ummmm, still am, thank you ;)
Re: (Score:2)
Not fixing what ain't broken is an alpha male non-move. As a fellow alpha, I can't disapprove.
I like this (Score:3)
There ought to be some other steps in there (the last one really is a doozy), and I don't know how easy it will be to go from one to the next, but this one seems to me to be headed in the right direction.
Re: (Score:3)
There ought to be some other steps in there (the last one really is a doozy), and I don't know how easy it will be to go from one to the next, but this one seems to me to be headed in the right direction.
Having spent a weekend playing with DeepSeek R1, I don't think that's far away at all.
You can train these things to reason, and they do it well. They show their work.
Training for additional reasoning strategies is certainly doable. The evolution continues.
Re: (Score:2)
Still no one is auditing the results for hallucinations.
Merely citing source docs doesn't mean they exist. Other models cough up lots of imagined source docs. Worse, sometimes the actual docs come from discredited research paper mills that are rife with bad data, even imagined data.
There's no disciplining basis for these models that has, as a prerequisite, an autonomous auditing mechanism that can score the accuracy of a citation: its existence, its reference, and its answer-fit.
Your experience seems good, but it's anecdotal.
Re: (Score:2)
R1 is already a step in that direction; I've witnessed it catch and squelch its own hallucinations many times within its thinking tags.
It'll emit some claim, and then in the next paragraph say something like "But wait, that doesn't make sense..."
I've still caught it making some pretty basic errors, even in its reasoning, but overall it's quite good.
To anyone who feels compelled to make uninformed comments about all of this, start
Re: (Score:2)
But this is a basic fallacy.
Imagine you've written what's designed to be a factual article.
Or four thousand lines of C or Py.
From a human, I'd expect a review before submission or push that would catch and correct or clarify to prevent errors. Hallucinations and conceptual-error artifacts are 100% not appropriate for intelligence, human or artificial.
It's nice that DeepSeek R1 tries to limit this, but this should be the very premise of HI or AI. Regression testing of code (as an example) went out
Re: (Score:2)
You think humans are free from conceptual errors and hallucinations?
Re: (Score:2)
Humans use audit, peer review, and other mechanisms as a sanity check.
It reminds me of Elvis Costello and "Watching the Detectives."
Referential integrity is a step that isn't self-satisfied. Whatever the concept, it's the output that's in question. However it's arrived at, the output can be correct despite the logic used to conceive it, even inventive/hallucinatory logic. George Boole would be proud.
Yet consistency, and the ability to audit for assured, consistent, reliable output, requires trustworthiness. There ar
Re: (Score:2)
I don't think we can say that a lack of conceptual errors, or an ability to audit one's own thoughts, is a "necessity" of intelligence.
Those are taught processes. Humans will gladly cut the heart out of a child to make their crops grow, even though the statistics stare them in the face: it does nothing.
We are not good reasoners, naturally.
And yet- here we are.
Re: (Score:2)
No, we disagree along the way; it's a values gulf that's unlikely to be surmounted. Quality is everything. Humans do not cut the heart out of a child to make their crops grow. I suspect you're a bot lacking an innate sense of morality. And certainly not an Oregonian.
Re: (Score:2)
You grossly overestimate human intelligence, particularly on average.
Re: (Score:2)
Still no one is auditing the results for hallucinations.
That is a user function. It is up to you to review any documents you generate with the tool.
Not really (Score:2)
This new feature doesn't do what most people seem to think. It doesn't affect the model in general, just your immediate input. It's still going to hallucinate like crazy.
It might be a positive step in using appearances to trick people into believing that the model doesn't hallucinate, that it understands what it's saying, and that it has citable sources. None of which is true.
Granularity of the model (Score:2)
The model will still hallucinate as always. It's just that, when the source is your own input during a conversation, it will be able to show you where it copied a sentence you gave it. The feature does not do anything about the actual model (all those gazillion training inputs and the weights that comprise the system).
The reason they can't have these models keep track of where they got their "ideas" is that the fundamental process is to slice up words (not sentences) into (essentially) subword tokens, and put them into a blender
Bad Move, imho. (Score:1)
Citations is going to lay bare what BS is being quoted. In many cases it will undermine the 'authority' of the AI model if you don't respect its source. Do AI models read https://retractionwatch.com [retractionwatch.com]? Is any nutcase going to be quoted to me as an authority simply because his one sane viewpoint aligns with mine? Isn't the Internet the globe's largest open sewer?
Sounds like plagiarism. (Score:2)