A 3D "Face-Change": One Model, Two Angles, Two Meanings
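Have you ever seen a 3D model that looks like a cat from the left and a dog from the right? Until now, building this kind of "visual illusion" 3D object meant choosing between two flawed options: optimization-based methods that crawl along at a snail's pace (hours of optimization) and tend to oversaturate colors, or naive stitching that leaves obvious seams. This paper takes a clever two-step approach. In step one, it generates the shapes for both viewing angles at the same time in 3D space, then blends them seamlessly into a single geometry, much like kneading dough. In step two, it applies the matching texture for each viewing angle separately. The whole process takes only 3-5 minutes and needs no additional training, and the resulting object reads clearly from each angle while keeping natural geometry. It is not a tool you can pick up tomorrow, but it shows how AI can capture the idea that the same object can stand for different things from different viewpoints, which brings real 3D creative design one step closer.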
📄 Original abstract (English)
Creating 3D visual illusions, a single 3D mesh that reveals entirely different semantics from various viewing angles, is a fascinating but tough challenge. Existing optimization-based methods are slow and can produce oversaturated colors. In contrast, naive stitching approaches fail to produce geometrically coherent objects. This results in visible unnatural seams and semantic leaks. In this paper, we present a fast and training-free framework for generating text-driven 3D visual illusions. Our approach decouples the generation into two stages. First, we propose a cross-space dual-branch denoising process. This process dynamically decodes 3D latents into voxel space for CLIP-guided orientation alignment and Signed Distance Field (SDF) blending, which ensures seamless geometric fusion. Second, we introduce a view-conditioned texture synthesis module that projects and aggregates view-specific 2D diffusion priors onto the fused geometry. Extensive experiments demonstrate that our method generates highly realistic, dual-semantic 3D illusions in just 3-5 minutes. It significantly outperforms existing methods in geometric integrity, semantic recognizability, and efficiency. Project page: https://siang1105.github.io/JanusMesh.github.io/
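To make the "kneading dough" idea of SDF blending more concrete, here is a minimal sketch in Python with NumPy. It is not the paper's implementation: the abstract does not give the blending formula, so this uses a generic polynomial smooth-minimum union as a stand-in, and the two spheres, the grid size, and the names sphere_sdf, smooth_union, and k are all illustrative assumptions. It only shows why blending signed distance fields gives a smooth joint where a hard union leaves a crease.

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(points - np.asarray(center), axis=-1) - radius

def smooth_union(d1, d2, k=0.1):
    """Polynomial smooth minimum of two SDFs (an assumed stand-in, not the
    paper's formula). k sets the width of the blend region."""
    h = np.clip(0.5 + 0.5 * (d2 - d1) / k, 0.0, 1.0)
    return d2 * (1.0 - h) + d1 * h - k * h * (1.0 - h)

# Voxel grid covering the cube [-1, 1]^3.
n = 32
axis = np.linspace(-1.0, 1.0, n)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)

# Two hypothetical shapes standing in for the two view-specific geometries.
shape_a = sphere_sdf(grid, center=(-0.25, 0.0, 0.0), radius=0.5)
shape_b = sphere_sdf(grid, center=(0.25, 0.0, 0.0), radius=0.5)

hard = np.minimum(shape_a, shape_b)            # plain union: sharp crease at the joint
fused = smooth_union(shape_a, shape_b, k=0.1)  # blended union: rounded joint
occupancy = fused < 0.0                        # voxels inside the fused shape
```

Away from the joint the blended field matches the plain union exactly; only where the two distances are within k of each other does the surface get pulled outward into a smooth fillet. A mesh could then be extracted from the zero level set of fused, for example with a marching cubes routine.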