[Feature Request] taesd VAE (distilled VAE) #36
Comments
I've taken a look, and it appears that the VAE model structure differs from the original. I'll find some time to study whether corresponding support needs to be added.
By the way, I believe the speed bottleneck is not in the VAE image generation phase but rather in the UNet sampling phase.
With LCM support, the VAE now accounts for ~50% of the time (on my CPU). The VAE is also the point of peak memory consumption.
@leejet The operation that consumes the most memory in the VAE is im2col: up to 1 GB for a 512 x 512 image and 2 GB for a 512 x 768 image. As a memory optimization, I am considering splitting the input into chunks and merging them in the output, but compute time will probably be a little slower.
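For a rough sense of why im2col dominates memory, here is a minimal back-of-the-envelope sketch. The layer shape used (128 input channels, 3 x 3 kernel, 16-bit elements, output at full image resolution) is an assumption for illustration and is not taken from the stable-diffusion.cpp source; `im2col_bytes` is a hypothetical helper, not a library function.

```python
def im2col_bytes(out_h, out_w, in_channels, kernel_size, bytes_per_element):
    """Size in bytes of the im2col matrix for one 2D convolution.

    im2col unrolls every output position into a row holding the full
    receptive field, so the matrix has out_h * out_w rows and
    in_channels * kernel_size**2 columns.
    """
    rows = out_h * out_w
    cols = in_channels * kernel_size * kernel_size
    return rows * cols * bytes_per_element


# Assumed layer shape (hypothetical): 128 channels, 3x3 kernel, 2-byte elements.
for h, w in [(512, 512), (512, 768)]:
    size = im2col_bytes(h, w, in_channels=128, kernel_size=3, bytes_per_element=2)
    print(f"{h} x {w}: {size / 2**20:.0f} MiB")
```

The buffer grows linearly with the number of output pixels, which is why processing the image in chunks reduces the peak: only one chunk's im2col matrix has to exist at a time.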
Tiled decoding is being discussed here: madebyollin/taesd#8
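To make the chunking idea concrete, here is a minimal pure-Python sketch of tiled decoding. It is not the approach from the linked discussion or from stable-diffusion.cpp. `fake_decode` is a hypothetical stand-in decoder (nearest-neighbour upsampling), and both the 8x upscale factor and the tile size are assumptions.

```python
SCALE = 8  # assumed latent-to-pixel upscale factor


def fake_decode(latent):
    """Hypothetical stand-in decoder: nearest-neighbour upsample of a 2D list."""
    out = []
    for row in latent:
        wide = [v for v in row for _ in range(SCALE)]
        out.extend(list(wide) for _ in range(SCALE))
    return out


def tiled_decode(latent, decode_fn, tile=8):
    """Decode non-overlapping latent tiles one at a time and stitch the results."""
    h, w = len(latent), len(latent[0])
    out = [[None] * (w * SCALE) for _ in range(h * SCALE)]
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            chunk = [row[x:x + tile] for row in latent[y:y + tile]]
            piece = decode_fn(chunk)  # only this tile is decoded at once
            for dy, piece_row in enumerate(piece):
                start = x * SCALE
                out[y * SCALE + dy][start:start + len(piece_row)] = piece_row
    return out


# Small single-channel example latent (values are arbitrary).
latent = [[(y * 24 + x) % 7 for x in range(24)] for y in range(16)]
full = fake_decode(latent)
tiled = tiled_decode(latent, fake_decode, tile=8)
print(tiled == full)
```

The stand-in decoder only looks at one latent value per output pixel, so tiling reproduces the full result exactly. A real convolutional decoder reads neighbouring values, so non-overlapping tiles would likely show seams at the borders. Overlapping the tiles and blending the overlap is one way to hide them, at the cost of extra compute.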
@leejet I have been working on implementing this autoencoder since yesterday, but for some reason I'm getting a somewhat over-saturated image, and I don't know what I have wrong.
https://github.com/FSSRepo/stable-diffusion.cpp/tree/taesd-impl
Ported from: https://github.com/madebyollin/taesd/blob/main/taesd.py
AutoEncoderKL compute buffer: 1664 MB
TinyAutoEncoder compute buffer: 416 MB
@FSSRepo do you know if your model has VAE finetuning?
I don't know.
@FSSRepo in any case, good job :) The speed improvement and low memory usage are insane.
I have not yet checked whether there is memory fragmentation, since many operations are repeated, so memory usage could still be improved.
"taesd is tiny, distilled version of a stable diffusion vae."
Image generation results with this VAE (showcased on their GitHub) look nearly identical, so supporting it in stable-diffusion-cpp could increase generation speed.
GitHub: https://github.com/madebyollin/taesd