Storyline
New methods reveal challenges and solutions in AI model behavior and citation reliability
Recent research documents hallucinated and non-resolving citation URLs in the output of large language models and deep research agents: hallucination rates range from 3% to 13%, and non-resolving rates reach up to 18%.
Evidence trail (top sources)
Top sources (1 domain). Domains are deduplicated; counts indicate coverage, not truth. 1 top source shown.
Limited source diversity in top sources.
Overview
- Score total: 1.21
- Momentum 24h: 2
- Posts: 2
- Origins: 2
- Source types: 2
- Duplicate ratio: 0%
Why now
- Increasing use of deep research agents amplifies the impact of citation hallucinations.
- Growing complexity of fine-tuned models demands scalable auditing methods.
- Open-source tools and novel methods are now available to address these challenges.
Why it matters
- Citation reliability is critical for trustworthiness of AI-generated research and claims.
- Detecting hidden model behaviors without a reference model supports AI safety and interpretability.
- Tools like urlhealth enable automated correction and validation of AI outputs.
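The distinction between a URL that no longer resolves and one that was likely hallucinated can be illustrated with a minimal Python sketch. This is not the urlhealth API, which the brief does not describe. The function names (`http_status`, `classify_citation`, `non_resolving_rate`), the label strings, and the idea of using an archive lookup to separate link rot from possible hallucination are all assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Send a HEAD request; return the HTTP status, or None on network failure."""
    try:
        req = urllib.request.Request(
            url, method="HEAD", headers={"User-Agent": "citation-check-sketch"}
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (OSError, ValueError):
        # DNS failure, timeout, refused connection, or malformed URL.
        return None


def classify_citation(
    url: str,
    probe: Callable[[str], Optional[int]] = http_status,
    has_archive: Optional[Callable[[str], bool]] = None,
) -> str:
    """Label a cited URL (hypothetical labels, chosen for this sketch).

    `has_archive` is an assumed, caller-supplied lookup that reports whether
    any archived snapshot of the URL exists.
    """
    status = probe(url)
    if status is not None and 200 <= status < 400:
        return "resolves"
    if has_archive is None:
        return "non_resolving"
    # Assumption: a dead URL with an archived copy once existed (link rot);
    # a dead URL with no archived copy may never have existed.
    return "link_rot" if has_archive(url) else "possibly_hallucinated"


def non_resolving_rate(labels: Iterable[str]) -> float:
    """Fraction of labels other than 'resolves'; 0.0 for an empty input."""
    labels = list(labels)
    if not labels:
        return 0.0
    return sum(label != "resolves" for label in labels) / len(labels)
```

Because `probe` and `has_archive` are injected, the classifier can be exercised offline with stub functions. A single HEAD request is a coarse signal: some servers reject HEAD or block automated clients, so a production check would likely need retries and a GET fallback.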
Continuity snapshot
- Trend status: insufficient_history.
- Continuity stage: emerging_confirmed.
- Current status: open.
- 2 current source-linked posts are attached to this storyline.
All evidence
Reddit discussion on reference model-free behavioral discovery in AuditBench (via Reddit)
Top publishers (this list)
- arxiv.org (1)
- Reddit discussion on reference model-free behavioral discovery in AuditBench (via Reddit) (1)
Top origin domains (this list)
- Unknown (2)