Working with Podman Desktop AI Lab

Throughout this tutorial, we’ve been working with OpenAI’s remote models. However, wouldn’t it be nice if we could work with models on our local machine, without incurring costs?

Podman Desktop is a GUI tool that helps with running and managing containers on our local machine, but thanks to its AI Lab extension it can also run AI models locally. With Quarkus and LangChain4j, it then becomes trivial to start developing against these models. Let’s find out how!

Installing Podman Desktop AI

First, if you haven’t yet, you must download and install Podman Desktop on your operating system. The instructions can be found on the Podman Desktop website (https://podman-desktop.io).

For Windows/macOS users, if you can, give the Podman machine at least 8GB of memory and 4 CPUs (generative AI models are resource hungry!). The model can run with fewer resources, but it will be significantly slower.
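
If you prefer the command line over the GUI, you can also resize an existing Podman machine with podman machine set (a quick sketch; the machine must be stopped first, and the memory value is in MiB):

podman machine stop
podman machine set --cpus 4 --memory 8192
podman machine start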

Once installed, go ahead and start the application and go through the setup process. After that, you should see an "AI Lab" extension in the left menu. If you don’t, you may need to install the extension first. For that, go to Extensions → Catalog and install Podman AI Lab.

[Image: the AI Lab extension in Podman Desktop]

Go ahead and click on it, and in the AI Lab, select the "Catalog".

[Image: the Podman Desktop AI Lab catalog]

You should now see a list of available AI models to choose from. You can also import different ones (e.g. from Hugging Face), but we will use one of the InstructLab models that are already available.

If you haven’t heard of InstructLab, it’s an open source project for enhancing large language models (LLMs) used in generative artificial intelligence (Gen AI) applications. You can even contribute to it yourself!

To start using the model, we’ll first need to download it, so go ahead and do that with the download button next to the instructlab/merlinite-7b-lab-GGUF entry (this might take a little while).

Once downloaded, you can create a new model service by clicking on the rocket button that appears where you previously clicked the download button.

You will be taken to the "Creating Model Service" page, where you can set the port that should be exposed for the service. Podman Desktop assigns a random available port by default, but let’s set it to 35000 so the port is easier to remember when we configure our Quarkus application.

[Image: creating the merlinite model service in Podman Desktop]

After a few moments, your very own model service will be running locally on your laptop! You can check the details on the Service details page, including some samples to test out the service with cURL (or even Java!).

Now it’s time to go back to our Quarkus application.

The InstructLab model service exposes an OpenAI-compatible API, so we can continue to use the quarkus-langchain4j-openai extension for this exercise.
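
OpenAI-compatible means the standard chat completions endpoint is available on the service. As a quick smoke test, a request along these lines should work (a sketch; the exact sample shown on the Service details page may differ slightly):

curl http://localhost:35000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "instructlab/merlinite-7b-lab-GGUF",
        "messages": [{"role": "user", "content": "Say hello!"}]
      }'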

Creating a new project

Some of the exercises we completed previously are not compatible with non-OpenAI models, so let’s create a fresh project using either Maven or the Quarkus CLI:

Maven:

mvn "io.quarkus.platform:quarkus-maven-plugin:create" -DprojectGroupId="com.redhat.developers" -DprojectArtifactId="quarkus-podman-ai-app" -DprojectVersion="1.0-SNAPSHOT" -Dextensions=langchain4j-openai,rest
cd quarkus-podman-ai-app

Quarkus CLI:

quarkus create app -x rest,langchain4j-openai com.redhat.developers:quarkus-podman-ai-app:1.0-SNAPSHOT
cd quarkus-podman-ai-app

Connect to the InstructLab Model

Add the following properties in the application.properties file available in src/main/resources:

quarkus.langchain4j.openai.base-url=http://localhost:35000/v1 (1)
# Configure openai server to use a specific model
quarkus.langchain4j.openai.chat-model.model-name=instructlab/merlinite-7b-lab-GGUF (2)
# Set timeout to 3 minutes
quarkus.langchain4j.openai.timeout=180s
# Enable logging of both requests and responses
quarkus.langchain4j.openai.log-requests=true
quarkus.langchain4j.openai.log-responses=true
1 Port 35000 is what we defined in Podman Desktop when we created the model service.
2 The model we selected to serve. If you chose a different model in Podman Desktop, you’ll need to update it here as well.

Create the AI service

Let’s create an interface for our AI service.

Create a new Assistant.java Java interface in src/main/java in the com.redhat.developers package with the following contents:

package com.redhat.developers;

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;
import jakarta.enterprise.context.SessionScoped;

@RegisterAiService
@SessionScoped
public interface Assistant {

    @SystemMessage({
            "You are a Java developer who likes to over engineer things" (1)
    })
    String chat(@UserMessage String userMessage);
}
1 Gives some context to the model

Create the prompt-based resource

Now we’re going to implement another REST resource that accepts prompts.

Create a new AIResource.java Java class in src/main/java in the com.redhat.developers package with the following contents:

package com.redhat.developers;

import jakarta.inject.Inject;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;

@Path("/ai")
public class AIResource {

    @Inject
    Assistant assistant;

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String prompt() {
        // feel free to update this message to any question you may have for the LLM.
        String message = "Generate a class that returns the square root of a given number";
        return assistant.chat(message);
    }
}
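
If you’d rather pass your own prompt instead of editing the hardcoded message, you could add a second endpoint to the class that reads it from a query parameter. This is a minimal sketch of our own (the /custom path and method name are made up for illustration, and it needs an extra jakarta.ws.rs.QueryParam import):

@GET
@Path("/custom")
@Produces(MediaType.TEXT_PLAIN)
public String customPrompt(@QueryParam("message") String message) {
    // the caller supplies the prompt, e.g. /ai/custom?message=Tell me a joke
    return assistant.chat(message);
}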

Invoke the endpoint

Let’s ask our model to create a class that returns the square root of a given number:
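
If the application isn’t running yet, start it in Quarkus dev mode first (assuming you use the Maven wrapper generated with the project):

./mvnw quarkus:dev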

You can check your prompt implementation by pointing your browser to http://localhost:8080/ai

You can also run the following command:

curl -w '\n' http://localhost:8080/ai

An example of output (remember, your result will likely be different):

Here is a simple Java class to calculate the square root of a given number using the built-in `Math` class in Java:

```java
public class SquareRootCalculator {
    public static void main(String[] args) {
        int num = 16; // square root of 16 is 4.0
        double result = Math.sqrt(num);
        System.out.println("Square root of " + num + ": " + result);
    }
}
```

Alternatively, if you want to handle negative numbers or non-integer inputs, you can use the `Math.sqrt()` function directly:

```java
public class SquareRootCalculator {
    public static void main(String[] args) {
        double num = -16; // square root of -16 is -4.0
        double result = Math.sqrt(num);
        System.out.println("Square root of " + num + ": " + result);
    }
}
```

This will allow you to calculate the square root of any given number, positive or negative, and handle non-integer inputs.
Depending on your local resources, this might take up to a few minutes. If you run into timeouts, you can try increasing the quarkus.langchain4j.openai.timeout value in the application.properties file. If you’re running on Windows/macOS, you could also try giving the Podman machine more CPU/memory resources.

Notice that (at least in our case) the LLM responded with a Java class, since we told it in the SystemMessage to respond as a Java developer.

Going further

Feel free to play around with the different models Podman Desktop AI Lab provides. You will notice that some are faster than others, and some will respond better to specific questions than others, based on how they have been trained.
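
Switching to a different model usually only requires updating two properties to match the new model service you created (the values below are placeholders for whatever you configured in Podman Desktop):

quarkus.langchain4j.openai.base-url=http://localhost:<your-service-port>/v1
quarkus.langchain4j.openai.chat-model.model-name=<your-model-name>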

If you want to help improve the answers generated by the InstructLab model, feel free to contribute to the project.