AI Pulse
📄 Paper Digest

Making AI Shut Up: Detecting and Fixing Whisper Hallucinations

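Whisper, the speech recognition model, has a known flaw: given audio that contains no speech, it will still "fill in" coherent text. This is called hallucination. The researchers found that the activation patterns inside the model, specifically in its audio encoder, reveal whether it is making things up. They use a technique called a sparse autoencoder (SAE) to decompose those internal signals into sparse features, then nudge the features the way you would turn a steering wheel. The result: on non-speech audio, the hallucination rate drops from 72.63% to 14.11% for Whisper small and from 86.88% to 27.33% for large-v3, with only a small cost on normal speech (a small increase in word error rate). The method needs no retraining and comes close to fine-tuning-based approaches. It is not something you can use tomorrow, but it shows how a model's own internal signals can be used to correct its mistakes, which is a more transparent way to fix AI.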
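To make the "steering wheel" idea concrete, here is a minimal sketch of SAE latent-space steering in NumPy. It is an illustration under loud assumptions, not the paper's implementation: the SAE weights are random rather than trained, the dimensions are toy sizes, the feature indices in `halluc_features` are made up, and rescaling the selected latents is just one plausible steering rule. The names `sae_encode`, `sae_decode`, and `steer` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_latent = 8, 32  # toy sizes, not Whisper's real dimensions

# Hypothetical, randomly initialised SAE weights (a real SAE would be trained).
W_enc = rng.normal(size=(d_model, d_latent)) / np.sqrt(d_model)
b_enc = np.zeros(d_latent)
W_dec = rng.normal(size=(d_latent, d_model)) / np.sqrt(d_latent)
b_dec = np.zeros(d_model)


def sae_encode(x):
    """Map activations to non-negative SAE latents (ReLU encoder)."""
    return np.maximum(x @ W_enc + b_enc, 0.0)


def sae_decode(z):
    """Map SAE latents back to activation space."""
    return z @ W_dec + b_dec


def steer(x, feature_ids, scale=0.0):
    """Rescale selected SAE latents and write only the change back into x."""
    z = sae_encode(x)
    z_steered = z.copy()
    z_steered[..., feature_ids] *= scale
    # Adding the delta (rather than the full reconstruction) leaves the
    # part of x that the SAE fails to reconstruct untouched.
    return x + sae_decode(z_steered) - sae_decode(z)


x = rng.normal(size=(4, d_model))  # 4 frames of fake encoder activations
halluc_features = [3, 17]          # hypothetical feature indices
x_steered = steer(x, halluc_features, scale=0.0)
```

In a real setup, `x` would come from a Whisper encoder layer and the steered activations would be passed on to the rest of the model. With `scale=1.0` the function returns `x` unchanged, which is a handy sanity check.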

📄 Original Abstract (English)

Whisper, a widely adopted ASR model, is known to suffer from hallucinations - coherent transcriptions generated for non-speech audio entirely disconnected from the input. We investigate whether hallucinations can be detected and mitigated through Whisper's internal representations. We extract audio encoder activations and evaluate two representation spaces: raw Whisper activations and Sparse AutoEncoder (SAE) latents. We show that both spaces encode linearly separable hallucination-related information, with discriminative power concentrated in a sparse feature subset and increasing toward deeper encoder layers. We propose two steering strategies: activation-space steering and SAE latent-space steering. SAE-based steering reduces hallucination rate from 72.63% to 14.11% for Whisper small and from 86.88% to 27.33% for Whisper large-v3 on the full non-speech test set, with small WER degradation on speech data, approaching the performance of fine-tuning-based methods.
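The abstract's claim that hallucination information is "linearly separable" and "concentrated in a sparse feature subset" can be illustrated with a linear probe. The sketch below is an assumption-heavy toy: the data is synthetic rather than real Whisper activations, the `informative` dimensions are chosen by hand, and the probe is plain logistic regression trained by gradient descent, which may differ from the classifier the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 20
informative = [2, 11]  # hypothetical "hallucination-related" dimensions

# Synthetic stand-in for pooled encoder activations; label 1 = hallucinated.
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[:, informative] += 3.0 * (y[:, None] - 0.5)  # shift only the sparse subset

X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


# Logistic-regression probe trained with plain gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(300):
    p = sigmoid(X_train @ w + b)
    w -= 0.5 * X_train.T @ (p - y_train) / len(y_train)
    b -= 0.5 * np.mean(p - y_train)

accuracy = np.mean((sigmoid(X_test @ w + b) > 0.5) == y_test)
top_features = np.argsort(-np.abs(w))[:2]
print(f"held-out accuracy: {accuracy:.2f}")
print("largest-weight features:", sorted(top_features.tolist()))
```

If a simple probe like this reaches high held-out accuracy and its largest weights sit on a handful of dimensions, that is the kind of evidence the abstract describes: the signal is linearly readable and carried by few features.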

Original paper on arXiv
