AI detection startup GPTZero scanned all 4,841 papers accepted by the prestigious Conference on Neural Information Processing Systems (NeurIPS), held last month in San Diego. The company found 100 hallucinated citations across 51 papers that it confirmed as fake, GPTZero tells TechCrunch.
Having a paper accepted by NeurIPS is a résumé-worthy achievement in the world of AI. Given that these are the leading minds of AI research, one might expect they'd use LLMs for the catastrophically boring task of writing citations.
So caveats abound with this finding: 100 confirmed hallucinated citations across 51 papers is not statistically significant. Each paper has dozens of citations, so out of tens of thousands of citations overall, that is, statistically, close to zero. (If each of the 4,841 papers averaged, say, 30 references, that would be roughly 145,000 citations in total, putting the confirmed fakes below 0.1%.)
It's also important to note that an inaccurate citation doesn't negate the paper's research. As NeurIPS told Fortune, which was first to report on GPTZero's analysis, "Even if 1.1% of the papers have one or more incorrect references due to the use of LLMs, the content of the papers themselves [is] not necessarily invalidated."
But having said all that, a faked citation is not nothing, either. NeurIPS prides itself on its "rigorous scholarly publishing in machine learning and artificial intelligence," it says. And each paper is peer-reviewed by multiple people who are instructed to flag hallucinations.
Citations are also a form of currency for researchers. They're used as a career metric to show how influential a researcher's work is among their peers. When AI makes them up, it dilutes their value.
No one can fault the peer reviewers for not catching a handful of AI-fabricated citations given the sheer volume involved. GPTZero is also quick to point this out. The point of the exercise was to provide specific data on how AI slop sneaks in via "a submission tsunami" that has "strained these conferences' review pipelines to the breaking point," the startup says in its report. GPTZero even points to a May 2025 paper called "The AI Conference Peer Review Crisis" that discussed the problem at premier conferences, including NeurIPS.
Still, why couldn't the researchers themselves fact-check the LLM's work for accuracy? Surely they should know the actual list of papers they used for their work.
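Spot-checking a reference list is not even hard to automate. Below is a minimal sketch in Python, assuming the `requests` library, that queries Crossref's public works API for each cited title. This is purely illustrative and is not GPTZero's method: Crossref is one of several lookup services, title matching is fuzzy, and preprints or workshop papers may not be indexed, so a miss flags a citation for manual review rather than proving it was hallucinated.

```python
# Minimal sketch: look up each cited title against Crossref's public
# works API. A title with no indexed match is a candidate for manual
# review, not proof of fabrication.
import requests


def find_best_match(cited_title: str):
    """Return (matched_title, DOI) for the closest Crossref record, or None."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": cited_title, "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items or not items[0].get("title"):
        return None
    return items[0]["title"][0], items[0].get("DOI", "")


if __name__ == "__main__":
    # Hypothetical reference list: the first title is a real paper,
    # the second is made up and should return no close match.
    references = [
        "Attention Is All You Need",
        "A Completely Fabricated Citation About Neural Networks",
    ]
    for ref in references:
        print(f"{ref!r} -> {find_best_match(ref)}")
```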
What the whole thing really points to is one big, ironic takeaway: If the world's leading AI experts, with their reputations at stake, can't ensure their LLM usage is accurate in the details, what does that mean for the rest of us?

