Trio Diffusion
My Experiment with Infinite Image Generation
"What if I could generate images that go on forever?" It seemed like a fun problem to tackle.

This project continues my TailorGAN work but with diffusion models, aiming to generate images of arbitrary size with no real boundaries through auto-regressive patch generation.
The Basic Idea
Standard diffusion models generate fixed-size images. For larger outputs, you’re limited to upscaling or tiling. My approach: generate images piece by piece, auto-regressively, where each new patch conditions on three neighboring patches in an L-shape configuration.
The model looks at a “trio” of patches (top-left, top-right, bottom-left) and generates the missing bottom-right patch, creating seamless spatial continuity.
Trio Configuration:
[Top-Left] [Top-Right]
[Bottom-Left] [To-Be-Generated]
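To make the layout concrete, here is a minimal sketch of how one training example could be cut from a larger image: three context patches plus the bottom-right target. The names `crop` and `split_trio`, the single-channel list-of-rows image format, and the square patch size are all assumptions for illustration, not the project's actual data pipeline.

```python
from typing import List, Tuple

Patch = List[List[float]]  # single-channel patch stored as rows of pixel values


def crop(image: Patch, top: int, left: int, size: int) -> Patch:
    """Cut a size x size patch whose top-left pixel is at (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def split_trio(image: Patch, row: int, col: int, size: int) -> Tuple[Tuple[Patch, Patch, Patch], Patch]:
    """Return ((top_left, top_right, bottom_left), target) for the 2x2 patch
    window whose top-left patch sits at grid position (row, col)."""
    y, x = row * size, col * size
    top_left = crop(image, y, x, size)
    top_right = crop(image, y, x + size, size)
    bottom_left = crop(image, y + size, x, size)
    target = crop(image, y + size, x + size, size)  # the patch the model must generate
    return (top_left, top_right, bottom_left), target


# Toy 4x4 "image" whose pixel value encodes its position (row * 4 + column).
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
context, target = split_trio(image, row=0, col=0, size=2)
```

With this toy image, `context` holds the three L-shaped neighbors and `target` holds the bottom-right 2x2 block that the model would be asked to reconstruct.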
Loss Function Architecture
The model uses a RobustCombinedLoss with five key components:
- MSE Loss (well-known) - Base diffusion reconstruction loss
- Perceptual Loss (well-known) - VGG-based perceptual similarity
- LPIPS Loss (well-known) - Learned perceptual metric
- Edge Loss (well-known) - Preserves high-frequency details
- Boundary Loss - added for spatial continuity
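The text above does not say how the five terms are combined. A weighted sum is one common choice, sketched below. The function name `combine_losses` and every weight value are hypothetical placeholders, not the settings used in RobustCombinedLoss.

```python
from typing import Dict

# Hypothetical weights for illustration only; the real values are not given here.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "mse": 1.0,
    "perceptual": 0.1,
    "lpips": 0.1,
    "edge": 0.05,
    "boundary": 0.5,
}


def combine_losses(terms: Dict[str, float], weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of already-computed scalar loss terms."""
    missing = set(weights) - set(terms)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * terms[name] for name in weights)


# Example: every component evaluates to 1.0, so the total equals the sum of the weights.
total = combine_losses({"mse": 1.0, "perceptual": 1.0, "lpips": 1.0, "edge": 1.0, "boundary": 1.0})
```

Keeping the weights in one dictionary makes it easy to run ablations, for example setting `"boundary"` to 0.0 to compare against training without the continuity term.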
Boundary Loss Details
The Boundary Continuity Loss extracts edge regions from generated patches and compares them to expected boundaries from context patches using both pixel-level MSE and gradient-based MSE (via Sobel filters). This helps reduce visible seams at patch boundaries, though it’s just one piece of the auto-regressive generation puzzle.
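A rough, dependency-free sketch of this idea follows. It compares a thin strip along the top and left edges of the generated patch with the adjacent strip of the neighboring context patch, using pixel MSE plus MSE on Sobel gradients. The strip width, the `grad_weight` parameter, the clamped border handling, and the choice to compare directly adjacent strips are all assumptions. A real training loop would do this on tensors rather than Python lists.

```python
from typing import List, Tuple

Patch = List[List[float]]


def mse(a: Patch, b: Patch) -> float:
    """Mean squared error between two equally sized 2D arrays."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def sobel(img: Patch) -> Tuple[Patch, Patch]:
    """Horizontal and vertical Sobel responses, with border pixels clamped."""
    h, w = len(img), len(img[0])

    def px(y: int, x: int) -> float:
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx[y][x] = (px(y - 1, x + 1) + 2 * px(y, x + 1) + px(y + 1, x + 1)) \
                     - (px(y - 1, x - 1) + 2 * px(y, x - 1) + px(y + 1, x - 1))
            gy[y][x] = (px(y + 1, x - 1) + 2 * px(y + 1, x) + px(y + 1, x + 1)) \
                     - (px(y - 1, x - 1) + 2 * px(y - 1, x) + px(y - 1, x + 1))
    return gx, gy


def strip_loss(gen: Patch, ctx: Patch, grad_weight: float) -> float:
    """Pixel MSE plus weighted Sobel-gradient MSE between two strips."""
    ggx, ggy = sobel(gen)
    cgx, cgy = sobel(ctx)
    return mse(gen, ctx) + grad_weight * (mse(ggx, cgx) + mse(ggy, cgy))


def boundary_loss(generated: Patch, top_neighbor: Patch, left_neighbor: Patch,
                  width: int = 2, grad_weight: float = 1.0) -> float:
    """Compare the generated patch's top/left strips with the touching strips
    of the patch above it and the patch to its left."""
    top_gen, top_ctx = generated[:width], top_neighbor[-width:]
    left_gen = [row[:width] for row in generated]
    left_ctx = [row[-width:] for row in left_neighbor]
    return strip_loss(top_gen, top_ctx, grad_weight) + strip_loss(left_gen, left_ctx, grad_weight)


zeros = [[0.0] * 4 for _ in range(4)]
ones = [[1.0] * 4 for _ in range(4)]
seam_penalty = boundary_loss(ones, zeros, zeros)  # flat patch that mismatches both neighbors
```

For the bottom-right patch in the trio, `top_neighbor` would be the top-right patch and `left_neighbor` the bottom-left patch.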
Early Results
The current model shows promise with improved local continuity compared to naive tiling. The boundary loss effectively reduces visible seams, though generating complex and realistic objects with long-range coherence remains challenging and requires further experimentation.
Generation Process:
- Patches along the top and left edges of the initial image condition on real images
- All subsequent patches condition on previously generated content
- Auto-regressive generation enables theoretically infinite image extension
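The process above can be sketched as a raster-order loop over a patch grid. Here `extend_grid` and `generate_patch` are hypothetical names, and the stand-in generator is a toy arithmetic rule rather than a diffusion sampler. The only part meant to mirror the description is the order in which patches are filled and which three neighbors each one sees.

```python
from typing import Callable, List, Sequence, TypeVar

P = TypeVar("P")  # whatever type represents a patch


def extend_grid(top_row: Sequence[P], left_col: Sequence[P],
                generate_patch: Callable[[P, P, P], P]) -> List[List[P]]:
    """Fill a patch grid in raster order.

    top_row:  real patches for row 0 (including the top-left corner).
    left_col: real patches for column 0 of rows 1, 2, ...
    generate_patch(top_left, top_right, bottom_left) returns the new patch.
    """
    grid: List[List[P]] = [list(top_row)]
    for left in left_col:
        above = grid[-1]
        row = [left]
        for j in range(1, len(top_row)):
            # Context is the L-shaped trio; after the first row and column it
            # consists partly or wholly of previously generated patches.
            row.append(generate_patch(above[j - 1], above[j], row[j - 1]))
        grid.append(row)
    return grid


# Toy stand-in for the model: patches are plain numbers and the "generator"
# extrapolates a plane through the three context values.
def toy_generator(top_left: int, top_right: int, bottom_left: int) -> int:
    return top_right + bottom_left - top_left


grid = extend_grid(top_row=[0, 1, 2, 3], left_col=[10, 20], generate_patch=toy_generator)
```

Because each step only needs the row above and the patch to the left, the same loop can keep appending rows or columns for as long as desired, which is where the "theoretically infinite" extension comes from.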