Apple releases open-source model that instantly turns 2D photos into 3D views
3 months ago
- #3D Gaussian Splatting
- #View Synthesis
- #Real-time Rendering
- SHARP is a method for photorealistic view synthesis from a single image, regressing 3D Gaussian scene parameters in under a second on a GPU.
- The 3D Gaussian representation can be rendered in real time, producing high-resolution images for nearby views. Because the representation is metric, camera movements can be specified at real-world scale.
- SHARP sets a new state of the art on the perceptual metrics LPIPS and DISTS (lower is better), reducing LPIPS by 25–34% and DISTS by 21–43% compared to prior models, while also being significantly faster.
- Installation involves creating a Python environment and installing dependencies via pip. The model checkpoint is downloaded automatically or can be manually specified.
- The output is 3D Gaussian splats (.ply files) compatible with various renderers, following the OpenCV coordinate convention.
- Rendering videos along camera trajectories is supported but requires a CUDA GPU. The gsplat renderer is slow to initialize on first launch.
- The paper includes quantitative and qualitative evaluations, with video comparisons available on an examples page.
- Users are encouraged to cite the provided paper if they find the work useful. The codebase acknowledges multiple open-source contributions.
- Licenses for the code and models are provided, with details in the repository's LICENSE and LICENSE_MODEL files.
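
Because the output is a standard .ply file in the OpenCV convention (+x right, +y down, +z forward), it can be inspected and adapted for other renderers with a few lines of code. The following is a minimal sketch, not part of the SHARP codebase: `read_ply_header` and `opencv_to_opengl` are hypothetical helper names, and the property names in the demo header (`x`, `y`, `z`, `opacity`) are illustrative assumptions rather than SHARP's confirmed layout. It reads the PLY header to report how many Gaussians a file holds, and flips a point into the OpenGL-style convention (+y up, camera looking along -z) that some viewers expect.

```python
import io
from typing import BinaryIO


def read_ply_header(stream: BinaryIO) -> dict:
    """Parse a PLY header: format, element counts, and property names."""
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    header = {"format": None, "elements": {}}
    current = None  # name of the element whose properties are being read
    for raw in iter(stream.readline, b""):
        tokens = raw.decode("ascii").split()
        if not tokens or tokens[0] == "comment":
            continue
        if tokens[0] == "end_header":
            return header
        if tokens[0] == "format":
            header["format"] = tokens[1]
        elif tokens[0] == "element":
            current = tokens[1]
            header["elements"][current] = {"count": int(tokens[2]), "properties": []}
        elif tokens[0] == "property" and current is not None:
            # The property name is always the last token on the line.
            header["elements"][current]["properties"].append(tokens[-1])
    raise ValueError("missing end_header")


def opencv_to_opengl(point):
    """Convert OpenCV axes (y down, z forward) to OpenGL axes (y up, z backward)."""
    x, y, z = point
    return (x, -y, -z)


# Demo on an in-memory header; the property list here is illustrative only.
demo = (
    b"ply\nformat binary_little_endian 1.0\n"
    b"element vertex 3\n"
    b"property float x\nproperty float y\nproperty float z\n"
    b"property float opacity\nend_header\n"
)
info = read_ply_header(io.BytesIO(demo))
print(info["elements"]["vertex"]["count"], "Gaussians")
print(opencv_to_opengl((0.5, 1.0, 2.0)))
```

To inspect a real output file, open it in binary mode, for example `read_ply_header(open(path, "rb"))`, where `path` points at a .ply file produced by the model. Only the header is read, so this works on large files without loading the splat data.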