Generating Cats with KPN Filtering
- #KPN-filtering
- #generative-modeling
- #image-generation
- The post explores generative modeling of cat images using kernel prediction network (KPN) denoising in pixel space.
- Unlike typical diffusion models that operate in a latent space, this approach applies KPN bilateral filters and predicts low-rank targets directly.
- KPN filters bring a useful regularization bias and an efficient GPU implementation, making them suitable for edge devices.
- The model is trained on 64x64 cat images with an architecture built from an 8x8 patch transformer followed by upscaling convolutions (a minimal architecture sketch follows the list).
- Training gradually noises images toward pure Gaussian noise and trains the network to predict the original image with L2 and LPIPS losses (see the training-step sketch after the list).
- Bilateral filtering can only produce convex combinations of existing pixels; this limitation is mitigated by a low-capacity network that predicts a color drift (bias) term added after filtering.
- Non-convex filtering is further enabled by leaving the bilateral weights unnormalized and passing them through a tanh activation, which lets the filter generate new colors and detail (see the filtering sketch after the list).
- The filtering network is a simplified variant of Neural Partitioning Pyramids and Procedural Kernel Networks.
- Color drift is predicted by a small U-Net operating on low-frequency components, which improves color fidelity while keeping the KPN filtering path amenable to quantization (see the color-drift sketch after the list).
- Generated samples after 5k epochs serve as a proof of concept, though results are not yet impressive.
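
The post does not spell out the backbone in code; below is a minimal sketch under assumed hyperparameters (hidden width, depth, head count, output channel count) combining an 8x8 patchifier, a standard Transformer encoder, and transposed-convolution upscaling back to 64x64. The class and argument names are hypothetical.

```python
import torch
import torch.nn as nn

class PatchTransformerKPNBackbone(nn.Module):
    """Hypothetical backbone: 8x8 patch transformer + upscaling convolutions.

    Widths, depth and head count are assumptions; the post only states
    64x64 inputs, 8x8 patches and upscaling convolutions.
    """
    def __init__(self, dim=256, depth=6, heads=4, out_channels=32):
        super().__init__()
        self.patchify = nn.Conv2d(3, dim, kernel_size=8, stride=8)   # 64x64 -> 8x8 tokens
        self.pos = nn.Parameter(torch.zeros(1, 8 * 8, dim))          # learned positions
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=4 * dim,
                                           batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # Three stride-2 transposed convs: 8x8 -> 16 -> 32 -> 64
        self.upscale = nn.Sequential(
            nn.ConvTranspose2d(dim, dim // 2, 4, stride=2, padding=1), nn.GELU(),
            nn.ConvTranspose2d(dim // 2, dim // 4, 4, stride=2, padding=1), nn.GELU(),
            nn.ConvTranspose2d(dim // 4, out_channels, 4, stride=2, padding=1),
        )

    def forward(self, x):                                # x: (B, 3, 64, 64)
        tokens = self.patchify(x)                        # (B, dim, 8, 8)
        b, d, h, w = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2) + self.pos   # (B, 64, dim)
        seq = self.encoder(seq)
        feat = seq.transpose(1, 2).reshape(b, d, h, w)
        return self.upscale(feat)                        # (B, out_channels, 64, 64) filter params
```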
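
A single training step, as summarized above, could look roughly like the following. The exact noise schedule is not given in the summary, so a simple linear blend toward Gaussian noise is assumed; `model` stands for the full pipeline (backbone, KPN filtering, color drift), and the `lpips` package supplies the perceptual loss.

```python
import torch
import lpips  # pip install lpips; VGG-based perceptual loss

# Move lpips_loss to the same device as the data in practice.
lpips_loss = lpips.LPIPS(net='vgg')

def training_step(model, x0, optimizer, lambda_lpips=0.5):
    """One step: noise a clean batch x0 (scaled to [-1, 1]) and predict it back."""
    b = x0.shape[0]
    t = torch.rand(b, 1, 1, 1, device=x0.device)   # per-sample noise level (assumed uniform)
    eps = torch.randn_like(x0)
    xt = (1.0 - t) * x0 + t * eps                  # blend toward pure Gaussian noise

    pred = model(xt, t)                            # predicted clean image
    loss = torch.mean((pred - x0) ** 2)            # L2 reconstruction
    loss = loss + lambda_lpips * lpips_loss(pred, x0).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```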
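
The unnormalized, tanh-activated filtering described above might be applied like this; the kernel size, the choice to share one kernel across RGB channels, and the omission of an explicit bilateral range term are all assumptions of this sketch.

```python
import torch
import torch.nn.functional as F

def apply_kpn_filter(x, weights, bias, kernel_size=3):
    """Apply per-pixel predicted kernels to image x.

    x:       (B, 3, H, W) noisy image
    weights: (B, k*k, H, W) per-pixel kernel weights from the backbone
    bias:    (B, 3, H, W) predicted color drift, added after filtering
    """
    b, c, h, w = x.shape
    k = kernel_size
    weights = torch.tanh(weights)                        # unnormalized, squashed to (-1, 1)

    patches = F.unfold(x, k, padding=k // 2)             # (B, 3*k*k, H*W) neighborhoods
    patches = patches.view(b, c, k * k, h, w)
    out = (patches * weights.unsqueeze(1)).sum(dim=2)    # weighted sum per pixel
    return out + bias                                    # color drift restores low frequencies
```

Because the weights are neither normalized nor constrained to be positive, the output is not restricted to a convex combination of input pixels, which is what allows the filter to synthesize new colors and detail rather than only averaging what is already there.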
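
Finally, a hypothetical low-frequency color-drift head: the network runs at reduced resolution and its output is upsampled bilinearly, so it only contributes smooth color corrections. A plain convolutional stack stands in for the small U-Net; widths and the downsampling factor are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowFreqColorDrift(nn.Module):
    """Hypothetical color-drift head predicting the post-filter bias term."""
    def __init__(self, in_ch=3, base=16, down_factor=4):
        super().__init__()
        self.down_factor = down_factor
        self.net = nn.Sequential(                 # stand-in for the small U-Net
            nn.Conv2d(in_ch, base, 3, padding=1), nn.GELU(),
            nn.Conv2d(base, base, 3, padding=1), nn.GELU(),
            nn.Conv2d(base, 3, 3, padding=1),
        )

    def forward(self, x):                         # x: (B, 3, 64, 64)
        lo = F.avg_pool2d(x, self.down_factor)    # 64x64 -> 16x16 low-frequency view
        drift = self.net(lo)                      # predict bias at low resolution
        return F.interpolate(drift, size=x.shape[-2:],
                             mode='bilinear', align_corners=False)
```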