Control3D: Towards Controllable Text-to-3D Generation

Bibliographic Details
Title: Control3D: Towards Controllable Text-to-3D Generation
Authors: Chen, Yang; Pan, Yingwei; Li, Yehao; Yao, Ting; Mei, Tao
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
More Details: Recent advances in large-scale text-to-image diffusion models have inspired a significant breakthrough in text-to-3D generation, which pursues 3D content creation solely from a given text prompt. However, existing text-to-3D techniques lack a crucial ability in the creative process: interactively controlling and shaping the synthetic 3D content according to users' desired specifications (e.g., a sketch). To alleviate this issue, we present the first attempt at text-to-3D generation conditioned on an additional hand-drawn sketch, namely Control3D, which enhances controllability for users. In particular, a 2D conditioned diffusion model (ControlNet) is remoulded to guide the learning of a 3D scene parameterized as a NeRF, encouraging each view of the 3D scene to align with the given text prompt and hand-drawn sketch. Moreover, we exploit a pre-trained differentiable photo-to-sketch model to directly estimate the sketch of the image rendered from the synthetic 3D scene. This estimated sketch, along with each sampled view, is further enforced to be geometrically consistent with the given sketch, pursuing more controllable text-to-3D generation. Through extensive experiments, we demonstrate that our proposal can generate accurate and faithful 3D scenes that align closely with the input text prompts and sketches.
Comment: ACM Multimedia 2023
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2311.05461
Accession Number: edsarx.2311.05461
Database: arXiv
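
Illustrative sketch: The abstract describes two training signals: guidance from a sketch-conditioned diffusion model, and a consistency term between the sketch estimated from each rendered view and the user's sketch. The record does not say how either term is computed or how they are weighted, so the following is only a toy illustration of combining two such terms. The function names, the mean-squared-error distance, and the `weight` parameter are all assumptions made here, not details taken from the paper.

```python
def sketch_consistency_loss(estimated_sketch, target_sketch):
    """Mean squared difference between two same-sized 2D sketch maps.

    Hypothetical stand-in: the abstract does not state which distance
    the authors use. Inputs are lists of rows of floats (e.g. in [0, 1]).
    """
    if len(estimated_sketch) != len(target_sketch):
        raise ValueError("sketch maps must have the same number of rows")
    total = 0.0
    count = 0
    for est_row, tgt_row in zip(estimated_sketch, target_sketch):
        if len(est_row) != len(tgt_row):
            raise ValueError("sketch rows must have the same length")
        for est, tgt in zip(est_row, tgt_row):
            total += (est - tgt) ** 2
            count += 1
    if count == 0:
        raise ValueError("sketch maps must not be empty")
    return total / count


def total_loss(guidance_loss, estimated_sketch, target_sketch, weight=1.0):
    """Add a weighted sketch-consistency term to a guidance loss value.

    `guidance_loss` is assumed to be a number already computed elsewhere
    (for example by a diffusion-based objective); `weight` is an assumed
    balancing factor, not a value reported in the source.
    """
    consistency = sketch_consistency_loss(estimated_sketch, target_sketch)
    return guidance_loss + weight * consistency
```

In an actual pipeline the estimated sketch would come from a differentiable photo-to-sketch model applied to a rendered view, and both terms would be tensors in an autodiff framework so that gradients reach the NeRF parameters; plain Python lists are used here only to keep the example self-contained.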