📸 NVComposer

Generative Novel View Synthesis with Sparse and Unposed Images

🌍 Project Page | 📃 ArXiv Preprint | 🧑‍💻 Github Repository

Welcome to the demo of NVComposer. Follow the steps below to explore its capabilities:

  1. Choose camera movement mode: Spherical Mode or Rotation & Translation Mode.
  2. Customize the camera trajectory: Adjust the spherical parameters or rotation/translations along the X, Y, and Z axes.
  3. Upload images: You can upload up to 4 images as input conditions.
  4. Set sampling parameters (optional): Tweak the settings and click the Generate button.

⏱️ ZeroGPU Time Limit: Hugging Face ZeroGPU has an inference time limit of 180 seconds. You may need to log in with a free account to use this demo. A large number of sampling steps may lead to a timeout (GPU Abort). In that case, please consider logging in with a Pro account or running the demo on your local machine.

🤗 Please 🌟 star our GitHub repo and click on the ❤️ like button above if you find our work helpful.

Camera Mode

Spherical Mode lets you control the camera by specifying its position on a sphere centered on the scene. Adjust the Polar Angle (vertical rotation), Azimuth Angle (horizontal rotation), and Radius (distance from the center of the anchor view) to define the camera's viewpoint. The anchor view is assumed to lie on this sphere at the specified radius, at zero polar angle and zero azimuth angle, oriented toward the origin.
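As a rough illustration of the spherical parameterization described above, the sketch below converts a polar angle, azimuth angle, and radius into a camera position and a viewing direction toward the origin. The axis convention (y up, anchor view on the +z axis) and the function names `spherical_to_camera_position` and `look_at_origin` are assumptions made for this example; they are not taken from the NVComposer code, which may use a different convention.

```python
import math


def spherical_to_camera_position(polar_deg, azimuth_deg, radius):
    """Return the camera centre (x, y, z) on a sphere around the origin.

    Assumed convention (hypothetical, not NVComposer's): y is up, and the
    anchor view (polar = 0, azimuth = 0) sits on the +z axis at `radius`.
    """
    polar = math.radians(polar_deg)      # vertical rotation
    azimuth = math.radians(azimuth_deg)  # horizontal rotation
    x = radius * math.cos(polar) * math.sin(azimuth)
    y = radius * math.sin(polar)
    z = radius * math.cos(polar) * math.cos(azimuth)
    return (x, y, z)


def look_at_origin(position):
    """Return the unit viewing direction from `position` toward the origin."""
    norm = math.sqrt(sum(c * c for c in position))
    return tuple(-c / norm for c in position)


# Anchor view: zero polar and azimuth angles at radius 1.
anchor = spherical_to_camera_position(0.0, 0.0, 1.0)
# A target view raised by 30 degrees and rotated 30 degrees sideways.
target = spherical_to_camera_position(30.0, 30.0, 1.0)
```

Under this convention the anchor camera sits at (0, 0, 1) and looks along the negative z axis. Changing only the radius moves the camera along its viewing ray without changing its orientation.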


🧐 Reminder: As a generative model, NVComposer may occasionally produce unexpected outputs. Try adjusting the random seed, sampling steps, or CFG scales to explore different results.
🤔 Longer Generation: If you need a longer video, increase the trajectory extension value in the advanced sampling settings and run on your own GPU. This extends the defined camera trajectory by repeating it, producing a longer output. It also requires smaller rotation or translation scales to keep transitions smooth, and it increases generation time.
🤗 Limitation: This is the initial beta version of NVComposer. Its generalizability may be limited in certain scenarios, and artifacts can appear with large camera motions due to constraints of the current foundation model. We’re actively working on an improved version with enhanced datasets and a more powerful foundation model, and we welcome collaboration opportunities with the community.
✨ We welcome your feedback and questions. Thank you!

Quick Examples
Input Image 1 (Anchor View) | Input Image 2 | Camera Mode | Polar Angle (Theta) | Azimuth Angle (Phi) | Radius | Rotation X | Rotation Y | Rotation Z | Translation X | Translation Y | Translation Z | Classifier-Free Guidance Scale | Extra Classifier-Free Guidance Scale | DDIM Sample Steps | Output Video | Number of Input Image(s)