AI Pulse
📄 Paper Explainer

From a single photo to an explorable 3D scene that drops straight into a game engine

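Today's AI systems that generate 3D scenes from a single image either get the geometry wrong (the result looks like a blurry cloud) or produce output that cannot be used directly in games or simulation. FLAT is the first to decode the compressed latents of a video diffusion model directly into triangle primitives, the same kind of surface that game engines use. It relies on a ray-centered rotation parameterization that makes triangle orientation learnable, and on a new product window function that lets gradients flow back cleanly during training. The results: geometric accuracy significantly better than state-of-the-art feedforward methods, competitive visual quality, and, after a lightweight test-time refinement step, fully opaque game assets that render in real time. For teams working on 3D content generation or automated game-asset pipelines, this is the kind of work you could try tomorrow: it turns AI-generated scenes from something you can only look at into something you can drag straight into an engine.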

📄 Original Abstract

Generating explorable 3D scenes from a single image requires strong generative priors and accurate geometric representations suitable for downstream use. Current video diffusion models offer high-quality generation and implicitly encode multi-view geometric structure in latent space. However, existing feedforward latent scene decoders typically output volumetric 3D Gaussians that lack a well-defined surface, limiting their use in simulation or standard graphics pipelines. This motivates decoding surface-aligned primitives that are not only renderable but also closer to explicit geometric assets. We ask whether compressed video diffusion latents can be mapped directly to explicit surface primitives in a single pass. To this end, we introduce FLAT and, for the first time, show that triangle splats can be decoded directly from video diffusion latents. Compared with decoding 3D Gaussians, predicting flat primitives is notoriously more challenging due to high sensitivity to primitive orientations, oftentimes leading to poor gradient flow. FLAT addresses this with two key ingredients: a ray-centered rotation parameterization for triangle regression and a novel product window function that improves gradient flow during differentiable triangle rendering. On standard benchmarks, FLAT achieves significantly better geometric accuracy while maintaining competitive visual quality compared to state-of-the-art feedforward baselines. We further show that a lightweight test-time refinement step converts the predicted triangle soup into a fully opaque, game-engine-ready representation that supports real-time rendering. By evaluating 3DGS, 2DGS, and triangle splatting variants under an identical training setup, we provide the first systematic analysis of representation tradeoffs in feedforward scene generation. The project page is available at https://flat-splat.github.io
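
The abstract names a "product window function" but does not give its formula. To make the gradient-flow idea concrete, here is a minimal sketch of one possible product-form window. It is purely an illustrative assumption and not FLAT's actual definition: it multiplies one sigmoid per triangle edge, each applied to the signed distance from a 2D point to that edge. The function names and the `sharpness` parameter are hypothetical.

```python
import numpy as np

def edge_signed_distances(p, tri):
    """Signed distance from 2D point p to each edge line of a
    counter-clockwise triangle; positive means the inner side."""
    p = np.asarray(p, dtype=float)
    tri = np.asarray(tri, dtype=float)
    dists = []
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        e = b - a
        # Left-hand normal of the edge points inward for CCW vertex order.
        n = np.array([-e[1], e[0]]) / np.linalg.norm(e)
        dists.append(np.dot(p - a, n))
    return np.array(dists)

def product_window(p, tri, sharpness=20.0):
    """Hypothetical soft triangle indicator: product of per-edge sigmoids.
    Close to 1 well inside the triangle, close to 0 outside, and smooth
    across edges, so it has a nonzero gradient near the boundary."""
    d = edge_signed_distances(p, tri)
    return float(np.prod(1.0 / (1.0 + np.exp(-sharpness * d))))

# Illustrative triangle with counter-clockwise vertices.
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
inside = product_window((0.25, 0.25), tri)
on_edge = product_window((0.5, 0.0), tri)
outside = product_window((2.0, 2.0), tri)
```

A hard inside/outside test would give zero gradient almost everywhere. A smooth window like this one keeps a usable gradient around the edges, which is the general property the abstract attributes to its window function.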
