I just set up a new dedicated AI server that is quite fast by my standards. I have it running with OpenWebUI and would like to integrate it with other services. I think it would be cool to have something like copilot where I can be writing code in a text editor and have it add a readme function or something like that. I have also used some RAG stuff and like it, but I think it would be cool to have a RAG that can access live data, like having the most up to date docker compose file and nginx configs for when I ask it about server stuff. So, what are you integrating your AI stuff with, and how can I get started?

  • piefood@feddit.online · 3 days ago

    I have it integrated with a few things:

    What did you use for RAG? I’ve been getting interested in that, and could use a few pointers.

  • hedgehog@ttrpg.network · edited · 3 months ago

    Right now I have Ollama / Open-WebUI, Kokoro FastAPI, ComfyUI, Wan2GP, and FramePack Studio set up. I recently (as in yesterday) configured an API key middleware with Traefik and placed it in front of Ollama and Comfy, but nothing is using them yet.

    I’ll probably try out Devstral with one of the agentic coding frameworks, like Void or Anon Kode. I may also try out one of the FOSS writing studios (like Plot Bunni) and connect my own Ollama instance. I could use NovelCrafter but paying a subscription fee to use my own server for the compute intensive part feels silly to me.

    I tried to use Open Notebook (basically a replacement for NotebookLM) with Ollama and Kokoro, using Kokoro FastAPI as my OpenAI endpoint, but it turns out it only supported, and required, text embeddings from OpenAI, so I couldn’t run it fully locally. At some point, if they don’t fix that, I’m planning to either add support myself or set up some routes with Traefik so that the ones Open Notebook uses point to the service I want to use.

    ETA: n8n is one of the services I plan to set up next, and I’ll likely end up integrating both Ollama and Comfy workflows into it.

          • hedgehog@ttrpg.network · 3 days ago

            Ah, gotcha. Nothing had been using them yet because I’d only just gotten the API key configured the day prior. But I already had Traefik running several dozen self hosted services that I use all the time, so the only “new” piece was adding API key support to Traefik.
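
            For anyone wondering what API key support on the proxy buys you in practice, here’s a minimal client-side sketch. The header name, hostname, and model name are all my own placeholders (Traefik doesn’t mandate any particular header), and Ollama’s /api/generate is the endpoint being fronted:

```python
import json
import urllib.request

def auth_headers(key: str, header_name: str = "X-Api-Key") -> dict:
    # The header name must match whatever the Traefik middleware is
    # configured to check; X-Api-Key is just a common convention.
    return {header_name: key, "Content-Type": "application/json"}

def generate(base_url: str, key: str, prompt: str) -> str:
    # Hit Ollama's /api/generate through the proxy. The model name is
    # illustrative; nothing in this sketch actually makes the call.
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps({"model": "devstral", "prompt": prompt, "stream": False}).encode(),
        headers=auth_headers(key),
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]

# e.g. generate("https://ollama.example.com", "my-key", "Say hello.")
```

            Any tool that lets you set a custom header (or that sends its “API key” as a header you can match on) can then talk to the same instance from outside the network.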

            One of my planned projects is an all-in-one, self-hostable, FOSS, AI augmented novel-planning, novel-writing, ebook and audiobook studio. I’m envisioning being able to replace Scrivener, Sudowrite, Vellum, and then also have an integrated audiobook studio, but making it so that at every step you could easily import or export artifacts to / from other tools.

            Since I also run a tabletop RPG, and there’s a lot of overlap in terms of desirable functionality with novel planning and ttrpg planning, I plan to build it to be capable in that regard, too.

            In both cases, the critical AI functionality that I want to implement (and that afaik hasn’t been done well) is elegantly handling concepts from the world building section. For example:

            • Automatic State tracking, where a scene following the outline is written or generated, and the changes to state are calculated based off the text.
              • Example: the MC started with $100 and spends $5 buying a magazine. Now MC has a magazine and $95
              • Example: a character leaves the scene, heading to another location
              • Example: a minor character overhears a secret conversation about the villain’s plan
              • Example: a character is killed
            • Manual State tracking
              • Example: MC left the Macguffin with their mentor, but off page the mentor was killed and the Macguffin was stolen by the villain
              • Example: MC thinks something happened, but they misinterpreted it, so the user edits the automatically calculated state with a clarification: this is what MC thinks; this is what actually happened
            • Syncing state changes with timelines.
              • Example: a scene in chapter 8 is a flashback to before the start of the book, so nothing that’s happened since then has happened yet
              • Example: after having written the first draft, you realize you should have introduced the Macguffin much earlier, so you edit a scene in chapter 3 to include a mention of it. The timeline is updated to incorporate that information.
              • Example: you move a scene from chapter 7 to chapter 4 for the sake of pacing. This causes the state at the start of the scene to be analyzed, the changes in the scene to be propagated, and any conflicts to be noted, both in this scene and any following ones; e.g., MC had $95 in chapter 4 and $60 in chapter 7 and lost their wallet in this scene, so now MC should have lost a wallet containing $95 and won’t be able to make the purchases they made between this scene and chapter 7
              • Example: You add a new scene in chapter 5 after having already written chapters 6-20. The changes in state due to this scene are propagated out and any resulting conflicts are noted
            • Information concealing
              • Example: MC doesn’t know that the Macguffin has been stolen, and neither does the reader. But if you tell the LLM that it’s been stolen at this point, the generated text will often immediately give this away
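
            One way the automatic-tracking idea could be sketched (this is my own toy model, not an existing implementation): treat each scene as a list of state deltas and compute the world state as a fold over scenes in timeline order, so moving or editing a scene only requires re-folding from that point forward:

```python
from dataclasses import dataclass

@dataclass
class Delta:
    key: str        # e.g. "mc.money", "mc.items"
    change: object  # a number to add, or ("add", item) to append

def apply_delta(state: dict, d: Delta) -> dict:
    # Pure function: returns a new state dict rather than mutating,
    # which makes re-folding after a scene move cheap to reason about.
    new = dict(state)
    if isinstance(d.change, tuple) and d.change[0] == "add":
        new[d.key] = list(new.get(d.key, [])) + [d.change[1]]
    else:
        new[d.key] = new.get(d.key, 0) + d.change
    return new

def state_after(initial: dict, scenes: list) -> dict:
    # Fold every scene's deltas, in timeline (not chapter) order.
    state = dict(initial)
    for deltas in scenes:
        for d in deltas:
            state = apply_delta(state, d)
    return state

# The magazine example: MC starts with $100, spends $5, gains a magazine.
scene = [Delta("mc.money", -5), Delta("mc.items", ("add", "magazine"))]
final = state_after({"mc.money": 100, "mc.items": []}, [scene])
```

            The hard parts listed above (conflicts, concealment, flashbacks) would layer on top of a core like this, but the fold-over-deltas shape is what makes “re-propagate from chapter 4” tractable at all.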

            Another critical feature is to have versioning, both automated and manual, such that a user can roll back to a previous version, tag points in time as Rough Draft, Second Draft, etc…

            I’d also like to build an alpha / beta reader function - share a link and allow readers to give feedback (like comments in particular sections, highlights, emoji reactions, as well as reporting on things like reading behavior - they reread this section or went back after reading this section - that could be indicative of confusing writing), and also enable soliciting the same sort of feedback from AIs, and building tools to combine and analyze the feedback.

            I could go on about the things I’d love to build in that app, but then I’d be here all day.

            I don’t have that tool built yet, obviously, but it has a need to integrate with everything I’ve worked on - LLMs, embeddings, image generation, audio generation - heck, even video generation could be useful, but that’s a whole different story on its own.

            That app will need to be able to connect to such services from the browser or the backend directly, depending on the user’s preferences and how the services are configured.

            In the meantime, having API key support means I can use my self hosted services with other tools.

            • the FOSS NotebookLM clone supports that.
            • I still haven’t touched N8N, but I’d been (and still am) planning to.
            • I’d been toying with subbing to Novelcrafter, which allows you to connect to an ollama instance.
            • I learned about PlotBunni around the time of this comment and spun up my own instance, then forked the project and added support for API keys and made some other bug fixes… I started adding support for storing data on the server and synchronizing it but never fully got that working before having to set the project aside to focus on my day job.
            • I can now use the Comfy UI Remote app outside of my own network (I think I was already able to do this before by configuring a service user in my auth provider and enabling basic authentication, with a base64-encoded username/password as the Bearer token), which is nice because Comfy is a pain to use on a phone.
            • Likewise with Kokoro - there is (or was - unsure if it’s been fixed) a bug in the web client that means only Chrome browsers can use it, but because I added API key support to the server, I can expose the service and access it from outside my network with a different client running on my phone
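
            The base64 trick mentioned above for the Comfy app is just standard HTTP Basic auth built by hand; whether a given proxy accepts that value in a Bearer/token field depends entirely on how it’s configured, but the encoding itself is this:

```python
import base64

def basic_auth_token(username: str, password: str) -> str:
    # HTTP Basic auth is base64("username:password"). Some mobile clients
    # only expose a token field, so the same value gets pasted in there.
    return base64.b64encode(f"{username}:{password}".encode()).decode()

def basic_auth_header(username: str, password: str) -> dict:
    return {"Authorization": f"Basic {basic_auth_token(username, password)}"}
```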

            I’ve been pretty busy and haven’t really touched any of this in over a month now, but it’s certainly not for lack of use cases.

  • SmokeyDope@lemmy.world · edited · 3 months ago

    VSCode + the Roo plugin seems to be all the hotness for coders leveraging ‘agentic teams’, so I spent a bit of time playing around with it. Most local models don’t do tool calling very well; I need to see if Devstral works better without throwing errors. I hear real professionals use the Claude API for that kind of stuff.

    I’m only vaguely familiar with getting computers to send, receive, and manipulate data with each other on a local network, so I got a very basic Python script going, pointed at kobold.cpp’s OpenAI-compatible API, to send prompts and receive replies instead of using the default web UI app, just to learn how it works under the hood.

    One of my next projects will be creating an extremely simple web-based UI that my ereader’s basic web browser can connect to. Kobold has something similar with the /noscript subpage, but even that is too much for my Kobo reader. I intend to somehow leverage a gemtext-to-HTML proxy like ducking or newswaffle to make the page rendering output dead simple.
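
    For what it’s worth, “dead simple” is reachable with nothing but stdlib HTML generation; here’s a rough sketch of the kind of page an old e-reader browser tends to handle fine (no JS, no CSS, one form — the /ask route is made up):

```python
import html

def render_page(reply: str = "") -> str:
    # Bare-bones HTML: a single form plus the escaped model reply.
    # No scripts or styles, so even very limited browsers can render it.
    body = f"<p>{html.escape(reply)}</p>" if reply else ""
    return (
        "<html><body>"
        '<form method="post" action="/ask">'
        '<input name="prompt" size="40">'
        '<input type="submit" value="Ask">'
        "</form>"
        f"{body}"
        "</body></html>"
    )
```

    Serving it from a stdlib http.server handler and forwarding the prompt to kobold.cpp’s API would be the remaining glue.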

    One of these days I’m going to get a Pi Zero, attach it to a relay, and see if I can get a model to send a signal to turn a light on and off. Those home automation people with the smart houses that integrate LLMs into things look so cool.
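
    If you get that far, the fiddly part is mapping free-form model output onto a binary GPIO signal; a sketch of how I’d guess that looks, with gpiozero guarded so the parsing half runs anywhere (the pin number and the ON/OFF phrasing convention are assumptions):

```python
def parse_light_command(reply: str):
    # Pull an ON/OFF decision out of free-form model text; return None
    # when the reply is ambiguous so the relay never toggles on a guess.
    text = reply.strip().upper()
    if "ON" in text and "OFF" not in text:
        return True
    if "OFF" in text:
        return False
    return None

def set_relay(on: bool, pin: int = 17) -> None:
    # Hypothetical wiring: relay signal line on GPIO17. gpiozero ships on
    # Raspberry Pi OS; the guarded import keeps this runnable elsewhere.
    try:
        from gpiozero import OutputDevice
    except ImportError:
        print(f"(no GPIO here) would set pin {pin} -> {'on' if on else 'off'}")
        return
    relay = OutputDevice(pin)
    relay.on() if on else relay.off()
```

    Prompting the model to answer with exactly ON or OFF makes the parse step far more reliable than scraping a chatty reply.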

    • HumanPerson@sh.itjust.works (OP) · 3 months ago

      I looked into Roo and was able to get it to interact with Ollama, but not to actually work. While looking into Roo I found Cline, which works a lot better. I’d like to figure out a way to get it to work with the authenticated proxy API hosted by Open WebUI so I can access models externally, but it’s still pretty cool.

      • SmokeyDope@lemmy.world · edited · 3 months ago

        If you’re running into the issue of an app wanting an API key for your local Ollama’s OpenAI-compatible web API and refusing to work without one, I found that any random characters work. If you port forward your host computer, you should be able to access the web UI on an external network using the public IP.

        Here’s the dead simple Python program I used to send and receive text through the kobold.cpp engine’s web API. I’m not sure how similar Ollama is, but afaik an OpenAI-compatible API means it should all work close to the same (I think? lol!) if you give it a shot. Make sure to set the .py file you make as executable and run it from a terminal with ./filename.py to see the output in real time. It should make a log text file in the same dir as the program too. Just use your host computer’s local IP if the PC running the Python script is on the same network.

        import requests
        
        # Configuration
        API_URL = "http://10.0.0.xx:5001/api/v1/generate"
        PROMPT = "Tell me a short story about a robot learning to dance."
        OUTPUT_FILE = "output.txt"
        
        # Define the API request data
        data = {
            "prompt": PROMPT,
            "max_length": 200,      # Adjust response length
            "temperature": 0.7,     # Control randomness (0=deterministic, 1=creative)
            "top_p": 0.9,           # Focus on high-probability tokens
        }
        
        # Send the request to kobold.cpp
        response = requests.post(API_URL, json=data, timeout=300)  # generous timeout so slow generations don't raise
        
        if response.status_code == 200:
            # Extract the generated text
            result = response.json()
            generated_text = result["results"][0]["text"]
            
            # Save to a text file
            with open(OUTPUT_FILE, "w") as f:
                f.write(generated_text)
            print(f"Response saved to {OUTPUT_FILE}!")
        else:
            print(f"Error: {response.status_code} - {response.text}")