convert to a rest service api, rather than a cli tool #674

Open
HarvieKrumpet opened this issue May 7, 2025 · 2 comments
Comments

@HarvieKrumpet

I found it more convenient to just spawn the tool as a tool call augmenting an existing inference model. I also gave up on relying on SDK wrappers around your tool and just spawn it programmatically. But of course it's only one render per spawn, and quite a conglomeration of args to feed it. If the tool had a REST service, a single JSON body could hold all the parameters, be more robust on types, and eliminate the huge string builder required to do this.

What is more important, though, is that the tool could remain running, listening for future runs. It could either return a base64 (WebP or AVIF) string for the image, or just write the file directly as it does now. Because the process never exits, subsequent runs would skip the initial boot-up it currently needs.

If it were a REST service, the tool could be "remote" and have a dedicated computer, whereas now it shares space with an active LLM. It could even run in the cloud and serve up Flux renders, reachable at localhost or any domain.
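To make the idea concrete, here is a sketch of a single JSON request body next to today's flattened argument list. The endpoint, field names, and CLI flags here are illustrative assumptions, not an actual API of this project:

```python
import json

# Hypothetical request body for a would-be endpoint like POST /txt2img.
# Field names are illustrative; types are carried by JSON, not strings.
payload = {
    "prompt": "a photo of a cat",
    "negative_prompt": "blurry",
    "width": 512,
    "height": 512,
    "steps": 20,
    "seed": 42,
    "output": "base64/webp",  # or "file" to write to disk as the CLI does now
}
body = json.dumps(payload)

# The same request today, flattened into CLI arguments (flags illustrative):
argv = ["sd",
        "-p", payload["prompt"],
        "-n", payload["negative_prompt"],
        "-W", str(payload["width"]),
        "-H", str(payload["height"]),
        "--steps", str(payload["steps"])]
```

The JSON form keeps numbers as numbers and avoids quoting/escaping issues, and one resident server process could accept many such bodies without re-loading the model.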

@stduhpf
Contributor

stduhpf commented May 7, 2025

I'm "working" (not very actively) on a simple server for that: #367. It's a bit annoying because if anything goes wrong during inference, the server kills itself. An arguably better option would be to use a koboldcpp server, which does support image generation based on this project.

@HarvieKrumpet
Author

I went ahead and wrote a WASM/browser front-end based on the kobold approach. Thanks for turning me on to that project; it is what I was hoping for, and it turned out really well. However, the diffusion service is not cancelable, which is critical for this to be practical. Cancelability is something I think only you can add, and it is the missing link for many applications of a service/server-based image generator. Currently, stopping a run requires fully aborting the process, which incurs all the overhead of initializing it again. Given your CLI-based design, this made perfect sense at the time. Maybe a solution is to fully separate the initialization/loading portion from the generation portion, so that if the generation side crashes (or is canceled), it can just be fired up again.
The other missing link is for the diffusion service to formally emit step-progress info, rather than clients needing to trap/redirect messages from the spawned process. The same applies to the kobold approach, since it has the same problem. With these two changes, we could fire up dedicated diffusion servers separate from LLM inference servers, run generations simultaneously with the LLM, and dedicate hardware to just imaging. I can show you some snaps of what I have on Discord if you want.
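A sketch of the contrast, with both the log format and the event schema being purely illustrative assumptions: today a client has to scrape the spawned process's output for progress lines, whereas a server could push the same information as structured events (e.g. over SSE or a WebSocket).

```python
import json
import re

# What a client does today: scrape a progress line from the redirected
# stdout/stderr of the spawned process (line format is illustrative).
log_line = "  |==================>           | 12/20 - 4.51it/s"

m = re.search(r"(\d+)/(\d+)", log_line)
step, total = int(m.group(1)), int(m.group(2))

# The same information as a structured progress event a server might push
# instead (schema is a hypothetical example, not an existing API):
event = json.dumps({"type": "progress", "step": step, "total": total})
```

A structured event stream would also give the server a natural place to accept a cancel request between steps, instead of the client killing the whole process.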
