convert to a rest service api, rather than a cli tool #674

Open
HarvieKrumpet opened this issue May 7, 2025 · 2 comments
Comments

@HarvieKrumpet

I found it more convenient to just spawn the tool as a tool call augmenting an existing inference model. I also gave up on relying on SDK wrappers around your tool and just spawn it programmatically. But of course it's only one render per spawn, and quite a conglomeration of args to feed it. If the tool had a REST service, a single JSON body could hold all the parameters, be more robust on types, and eliminate the huge string builder required to do this.

What is more important, though, is that the tool could remain running, listening for future runs. It could either return a base64 (WebP or AVIF) string for the image, or just write the file directly as it does now. Because the process never exits, subsequent runs would skip the initial boot-up it currently needs.

If it were a REST service, the tool could be "remote" and have a dedicated computer, whereas now it shares space with an active LLM. It could even run in the cloud and serve up Flux renders, reachable at localhost or any domain.
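To make the idea concrete, here is a sketch of a single JSON request body next to today's flattened argument list. The endpoint, field names, and CLI flags here are illustrative assumptions, not an actual API of this project:

```python
import json

# Hypothetical request body for a would-be endpoint like POST /txt2img.
# Field names are illustrative; types are carried by JSON, not strings.
payload = {
    "prompt": "a photo of a cat",
    "negative_prompt": "blurry",
    "width": 512,
    "height": 512,
    "steps": 20,
    "seed": 42,
    "output": "base64/webp",  # or "file" to write to disk as the CLI does now
}
body = json.dumps(payload)

# The same request today, flattened into CLI arguments (flags illustrative):
argv = ["sd",
        "-p", payload["prompt"],
        "-n", payload["negative_prompt"],
        "-W", str(payload["width"]),
        "-H", str(payload["height"]),
        "--steps", str(payload["steps"])]
```

The JSON form keeps numbers as numbers and avoids quoting/escaping issues, and one resident server process could accept many such bodies without re-loading the model.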

@stduhpf
Contributor

stduhpf commented May 7, 2025

I'm "working" (not very actively) on a simple server for that: #367. It's a bit annoying because if anything goes wrong during inference, the server kills itself. An arguably better option would be to use a koboldcpp server, which does support image generation based on this project.

@HarvieKrumpet
Author

I went ahead and wrote a WASM/browser front-end based on the kobold approach. Thanks for turning me on to that project; it is what I was hoping for, and it turned out really well. However, the diffusion service is not cancelable, which is critical for this to be practical. Cancelability is something I think only you can add, and it is the missing link for many applications of a service/server-based image generator. Currently, stopping a run requires fully aborting the process, which incurs all the overhead of initializing it again. Given your CLI-based design, this made perfect sense at the time. Maybe a solution is to fully separate the initialization/loading portion from the generation portion, so that if the generation side crashes (or is canceled), it can just be fired up again.
The other missing link is for the diffusion service to formally emit step-progress info, rather than clients needing to trap/redirect messages from the spawned process. The same applies to the kobold approach, since it has the same problem. With these two changes, we could fire up dedicated diffusion servers separate from LLM inference servers, run generations simultaneously with the LLM, and dedicate hardware to just imaging. I can show you some snaps of what I have on Discord if you want.
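A sketch of the contrast, with both the log format and the event schema being purely illustrative assumptions: today a client has to scrape the spawned process's output for progress lines, whereas a server could push the same information as structured events (e.g. over SSE or a WebSocket).

```python
import json
import re

# What a client does today: scrape a progress line from the redirected
# stdout/stderr of the spawned process (line format is illustrative).
log_line = "  |==================>           | 12/20 - 4.51it/s"

m = re.search(r"(\d+)/(\d+)", log_line)
step, total = int(m.group(1)), int(m.group(2))

# The same information as a structured progress event a server might push
# instead (schema is a hypothetical example, not an existing API):
event = json.dumps({"type": "progress", "step": step, "total": total})
```

A structured event stream would also give the server a natural place to accept a cancel request between steps, instead of the client killing the whole process.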
