AI/ML – Docker

A Quick Guide to Containerizing Llamafile with Docker for AI Applications

Sophia Parafina — Wed, 15 May 2024 13:39:56 +0000

This post was contributed by Sophia Parafina.

Keeping pace with the rapid advancements in artificial intelligence can be overwhelming. Every week, new Large Language Models (LLMs), vector databases, and innovative techniques emerge, potentially transforming the landscape of AI/ML development. Our extensive collaboration with developers has uncovered numerous creative and effective strategies to harness Docker in AI development.

This quick guide shows how to use Docker to containerize llamafile, an executable that brings together all the components needed to run a LLM chatbot with a single file. This guide will walk you through the process of containerizing llamafile and having a functioning chatbot running for experimentation.

Llamafile’s concept of bringing together LLMs and local execution has sparked a high level of interest in the GenAI space, as it aims to simplify the process of getting a functioning LLM chatbot running locally.

Containerize llamafile

Llamafile is a Mozilla project that runs open source LLMs, such as Llama-2-7B, Mistral 7B, or any other models in the GGUF format. The Dockerfile builds and containerizes llamafile, then runs it in server mode. It uses Debian trixie as the base image to build llamafile. The final or output image uses debian:stable as the base image.

To get started, copy, paste, and save the following in a file named Dockerfile.

# Use debian trixie for gcc13
FROM debian:trixie as builder

# Set work directory
WORKDIR /download

# Configure build container and build llamafile
RUN mkdir out && \
    apt-get update && \
    apt-get install -y curl git gcc make && \
    git clone https://github.com/Mozilla-Ocho/llamafile.git  && \
    curl -L -o ./unzip https://cosmo.zip/pub/cosmos/bin/unzip && \
    chmod 755 unzip && mv unzip /usr/local/bin && \
    cd llamafile && make -j8 LLAMA_DISABLE_LOGS=1 && \ 
    make install PREFIX=/download/out

# Create container
FROM debian:stable as out

# Create a non-root user
RUN addgroup --gid 1000 user && \
    adduser --uid 1000 --gid 1000 --disabled-password --gecos "" user

# Switch to user
USER user

# Set working directory
WORKDIR /usr/local

# Copy llamafile and man pages
COPY --from=builder /download/out/bin ./bin
COPY --from=builder /download/out/share ./share/man

# Expose 8080 port.
EXPOSE 8080

# Set entrypoint.
ENTRYPOINT ["/bin/sh", "/usr/local/bin/llamafile"]

# Set default command.
CMD ["--server", "--host", "0.0.0.0", "-m", "/model"]

To build the container, run:

docker build -t llamafile .

Running the llamafile container

To run the container, download a model such as Mistral-7b-v0.1. The example below saves the model to the model directory, which is mounted as a volume.

$ docker run -d -v ./model/mistral-7b-v0.1.Q5_K_M.gguf:/model -p 8080:8080 llamafile

The container will open a browser window with the llama.cpp interface (Figure 1).

Figure 1: Llama.cpp is a C/C++ port of Facebook’s LLaMA model by Georgi Gerganov, optimized for efficient LLM inference across various devices, including Apple silicon, with a straightforward setup and advanced performance tuning features.

$ curl -s http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
    },
    {
      "role": "user",
      "content": "Compose a poem that explains the concept of recursion in programming."
    }
  ]
}' | python3 -c '
import json
import sys
json.dump(json.load(sys.stdin), sys.stdout, indent=2)
print()
'

Llamafile has many parameters to tune the model. You can see the parameters with man llama file or llama file --help. Parameters can be set in the Dockerfile CMD directive.

Now that you have a containerized llamafile, you can run the container with the LLM of your choice and begin your testing and development journey.

What’s next?

To continue your AI development journey, read the Docker GenAI guide, review the additional AI content on the blog, and check out our resources.

Learn more

Read the Docker AI/ML blog post collection.
Download the Docker GenAI guide.
Read the Llamafile announcement post on Mozilla.org.
Subscribe to the Docker Newsletter.
Have questions? The Docker community is here to help.
New to Docker? Get started.

Creating AI-Enhanced Document Management with the GenAI Stack

Angel Borroy — Tue, 07 May 2024 14:39:51 +0000

Organizations must deal with countless reports, contracts, research papers, and other documents, but managing, deciphering, and extracting pertinent information from these documents can be challenging and time-consuming. In such scenarios, an AI-powered document management system can offer a transformative solution.

Developing Generative AI (GenAI) technologies with Docker offers endless possibilities not only for summarizing lengthy documents but also for categorizing them and generating detailed descriptions and even providing prompt insights you may have missed. This multi-faceted approach, powered by AI, changes the way organizations interact with textual data, saving both time and effort.

In this article, we’ll look at how to integrate Alfresco, a robust document management system, with the GenAI Stack to open up possibilities such as enhancing document analysis, automating content classification, transforming search capabilities, and more.

High-level architecture of Alfresco document management

Alfresco is an open source content management platform designed to help organizations manage, share, and collaborate on digital content and documents. It provides a range of features for document management, workflow automation, collaboration, and records management.

You can find the Alfresco Community platform on Docker Hub. The Docker image for the UI, named alfresco-content-app, has more than 10 million pulls, while other core platform services have more than 1 million pulls.

Alfresco Community platform (Figure 1) provides various open source technologies to create a Content Service Platform, including:

Alfresco content repository is the core of Alfresco and is responsible for storing and managing content. This component exposes a REST API to perform operations in the repository.
Database: PostgreSQL, among others, serves as the database management system, storing the metadata associated with a document.
Apache Solr: Enhancing search capabilities, Solr enables efficient content and metadata searches within Alfresco.
Apache ActiveMQ: As an open source message broker, ActiveMQ enables asynchronous communication between various Alfresco services. Its Messaging API handles asynchronous messages in the repository.
UI reference applications: Share and Alfresco Content App provide intuitive interfaces for user interaction and accessibility.

For detailed instructions on deploying Alfresco Community with Docker Compose, refer to the official Alfresco documentation.

Figure 1: Basic diagram for Alfresco Community deployment with Docker.

Why integrate Alfresco with the GenAI Stack?

Integrating Alfresco with the GenAI Stack unlocks a powerful suite of GenAI services, significantly enhancing document management capabilities. Enhancing Alfresco document management with the GenAI stack services has different benefits:

Use different deployments according to resources available: Docker allows you to easily switch between different Large Language Models (LLMs) of different sizes. Additionally, if you have access to GPUs, you can deploy a container with a GPU-accelerated LLM for faster inference. Conversely, if GPU resources are limited or unavailable, you can deploy a container with a CPU-based LLM.
Portability: Docker containers encapsulate the GenAI service, its dependencies, and runtime environment, ensuring consistent behavior across different environments. This portability allows you to develop and test the AI model locally and then deploy it seamlessly to various platforms.
Production-ready: The stack provides support for GPU-accelerated computing, making it well suited for deploying GenAI models in production environments. Docker’s declarative approach to deployment allows you to define the desired state of the system and let Docker handle the deployment details, ensuring consistency and reliability.
Integration with applications: Docker facilitates integration between GenAI services and other applications deployed as containers. You can deploy multiple containers within the same Docker environment and orchestrate communication between them using Docker networking. This integration enables you to build complex systems composed of microservices, where GenAI capabilities can be easily integrated into larger applications or workflows.

How does it work?

Alfresco provides two main APIs for integration purposes: the Alfresco REST API and the Alfresco Messaging API (Figure 2).

The Alfresco REST API provides a set of endpoints that allow developers to interact with Alfresco content management functionalities over HTTP. It enables operations such as creating, reading, updating, and deleting documents, folders, users, groups, and permissions within Alfresco.
The Alfresco Messaging API provides a messaging infrastructure for asynchronous communication built on top of Apache ActiveMQ and follows the publish-subscribe messaging pattern. Integration with the Messaging API allows developers to build event-driven applications and workflows that respond dynamically to changes and updates within the Alfresco Repository.

The Alfresco Repository can be updated with the enrichment data provided by GenAI Service using both APIs:

The Alfresco REST API may retrieve metadata and content from existing repository nodes to be sent to GenAI Service, and update back the node.
The Alfresco Messaging API may be used to consume new and updated nodes in the repository and obtain the result from the GenAI Service.

Figure 2: Alfresco provides two main APIs for integration purposes: the Alfresco REST API and the Alfresco Messaging API.

Technically, Docker deployment includes both the Alfresco and GenAI Stack platforms running over the same Docker network (Figure 3).

The GenAI Stack works as a REST API service with endpoints available in genai:8506, whereas Alfresco uses a REST API client (named alfresco-ai-applier) and a Messages API client (named alfresco-ai-listener) to integrate with AI services. Both clients can also be run as containers.

Figure 3: Deployment architecture for Alfresco integration with GenAI Stack services.

The GenAI Stack service provides the following endpoints:

summary: Returns a summary of a document together with several tags. It allows some customization, like the language of the response, the number of words in the summary and the number of tags.
classify: Returns a term from a list that best matches the document. It requires a list of terms as input in addition to the document to be classified.
prompt: Replies to a custom user prompt using retrieval-augmented generation (RAG) for the document to limit the scope of the response.
describe: Returns a text description for an input picture.

The implementation of GenAI Stack services loads the document text into chunks in Neo4j VectorDB to improve QA chains with embeddings and prevent hallucinations in the response. Pictures are processed using an LLM with a visual encoder (LlaVA) to generate descriptions (Figure 4). Note that Docker GenAI Stack allows for the use of multiple LLMs for different goals.

Figure 4: The GenAI Stack services are implemented using RAG and an LLM with visual encoder (LlaVA) for describing pictures.

Getting started

To get started, check the following:

Ensure that you have installed the latest version of Docker Desktop with 20 GB of RAM allocated.
Ensure that you have installed Java 17 and Maven 3.9.
Ensure that you have Ollama running locally.

Obtaining the amount of RAM available for Docker Desktop can be done using following command:

docker info --format '{{json .MemTotal}}'

If the result is under 20 GiB, follow the instructions in Docker official documentation for your operating system to boost the memory limit for Docker Desktop.

Clone the repository

Use the following command to close the repository:

git clone https://github.com/aborroy/alfresco-genai.git

The project includes the following components:

genai-stack folder is using https://github.com/docker/genai-stack project to build a REST endpoint that provides AI services for a given document.
alfresco folder includes a Docker Compose template to deploy Alfresco Community 23.1.
alfresco-ai folder includes a set of projects related to Alfresco integration.
- alfresco-ai-model defines a custom Alfresco content model to store summaries, terms and prompts to be deployed in Alfresco Repository and Share App.
- alfresco-ai-applier uses the Alfresco REST API to apply summaries or terms for a populated Alfresco Repository.
- alfresco-ai-listener listens to messages and generates summaries for created or updated nodes in Alfresco Repository.
compose.yaml file describes a deployment for Alfresco and GenAI Stack services using include directive.

Starting Docker GenAI service

The Docker GenAI Service for Alfresco, located in the genai-stack folder, is based on the Docker GenAI Stack project, and provides the summarization service as a REST endpoint to be consumed from Alfresco integration.

cd genai-stack

Before running the service, modify the .env file to adjust available preferences:

# Choose any of the on premise models supported by ollama
LLM=mistral
LLM_VISION=llava
# Any language name supported by chosen LLM
SUMMARY_LANGUAGE=English
# Number of words for the summary
SUMMARY_SIZE=120
# Number of tags to be identified with the summary
TAGS_NUMBER=3

Start the Docker Stack using the standard command:

docker compose up --build --force-recreate

After the service is up and ready, the summary REST endpoint becomes accessible. You can test its functionality using a curl command.

Use a local PDF file (file.pdf in the following sample) to obtain a summary and a number of tags.

curl --location 'http://localhost:8506/summary' \
--form 'file=@"./file.pdf"'
{ 
  "summary": " The text discusses...", 
  "tags": " Golang, Merkle, Difficulty", 
  "model": "mistral"
}

Use a local PDF file (file.pdf in the following sample) and a list of terms (such as Japanese or Spanish) to obtain a classification of the document.

curl --location \
'http://localhost:8506/classify?termList=%22Japanese%2CSpanish%22' \
--form 'file=@"./file.pdf"'
{
    "term": " Japanese",
    "model": "mistral"
}

Use a local PDF file (file.pdf in the following sample) and a prompt (such as “What is the name of the son?”) to obtain a response regarding the document.

curl --location \
'http://localhost:8506/prompt?prompt=%22What%20is%20the%20name%20of%20the%20son%3F%22' \
--form 'file=@"./file.pdf"'
{
    "answer": " The name of the son is Musuko.",
    "model": "mistral"
}

Use a local picture file (picture.jpg in the following sample) to obtain a text description of the image.

curl --location 'http://localhost:8506/describe' \
--form 'image=@"./picture.jpg"'
{
    "description": " The image features a man standing... ",
    "model": "llava"
}

Note that, in this case, LlaVA LLM is used instead of Mistral.

Make sure to stop Docker Compose before continuing to the next step.

Starting Alfresco

The Alfresco Platform, located in the alfresco folder, provides a sample deployment of the Alfresco Repository including a customized content model to store results obtained from the integration with the GenAI Service.

Because we want to run both Alfresco and GenAI together, we’ll use the compose.yaml file located in the project’s main folder.

include:
  - genai-stack/compose.yaml
  - alfresco/compose.yaml
#  - alfresco/compose-ai.yaml

In this step, we’re deploying only GenAI Stack and Alfresco, so make sure to leave the compose.ai.yaml line commented out.

Start the stack using the standard command:

docker compose up --build --force-recreate

After the service is up and ready, the Alfresco Repository becomes accessible. You can test the platform using default credentials (admin/admin) in the following URLs:

Repository services: http://localhost:8080/alfresco
UI: http://localhost:8080/share

Enhancing existing documents within Alfresco

The AI Applier application, located in the alfresco-ai/alfresco-ai-applier folder, contains a Spring Boot application that retrieves documents stored in an Alfresco folder, obtains the response from the GenAI Service and updates the original document in Alfresco.

Before running the application for the first time, you’ll need to build the source code using Maven.

cd alfresco-ai/alfresco-ai-applier
mvn clean package

As we have GenAI Service and Alfresco Platform up and running from the previous steps, we can upload documents to the Alfresco Shared Files/summary folder and run the program to update the documents with the summary.

java -jar target/alfresco-ai-applier-0.8.0.jar \
--applier.root.folder=/app:company_home/app:shared/cm:summary \
--applier.action=SUMMARY
...
Processing 2 documents of a total of 2
END: All documents have been processed. The app may need to be executed again for nodes without existing PDF rendition.

Once the process has been completed, every Alfresco document in the Shared Files/summary folder will include the information obtained by the GenAI Stack service: summary, tags, and LLM used (Figure 5).

Figure 5: The document has been updated in Alfresco Repository with summary, tags and model (LLM).

You can now upload documents to the Alfresco Shared Files/classify folder to prepare the repository for the next step.

Classifying action can be applied to documents in the Alfresco Shared Files/classify folder using the following command. GenAI Service will pick the term from the list (English, Spanish, Japanese) that best matches each document in the folder.

java -jar target/alfresco-ai-applier-0.8.0.jar \
--applier.root.folder=/app:company_home/app:shared/cm:classify \
--applier.action=CLASSIFY \
--applier.action.classify.term.list=English,Spanish,Japanese
...
Processing 2 documents of a total of 2
END: All documents have been processed. The app may need to be executed again for nodes without existing PDF rendition.

Upon completion, every Alfresco document in the Shared Files folder will include the information obtained by the GenAI Stack service: a term from the list of terms and the LLM used (Figure 6).

Figure 6: The document has been updated in Alfresco Repository with term and model (LLM).

You can upload pictures to the Alfresco Shared Files/picture folder to prepare the repository for the next step.

To obtain a text description from pictures, create a new folder named picture under the Shared Files folder. Upload any image file to this folder and run the following command:

java -jar target/alfresco-ai-applier-0.8.0.jar \
--applier.root.folder=/app:company_home/app:shared/cm:picture \
--applier.action=DESCRIBE
...
Processing 1 documents of a total of 1
END: All documents have been processed. The app may need to be executed again for nodes without existing PDF rendition.

Following this process, every Alfresco image in the picture folder will include the information obtained by the GenAI Stack service: a text description and the LLM used (Figure 7).

Figure 7: The document has been updated in Alfresco Repository with text description and model (LLM).

Enhancing new documents uploaded to Alfresco

The AI Listener application, located in the alfresco-ai/alfresco-ai-listener folder, contains a Spring Boot application that listens to Alfresco messages, obtains the response from the GenAI Service and updates the original document in Alfresco.

Before running the application for the first time, you’ll need to build the source code using Maven and to build the Docker image.

cd alfresco-ai/alfresco-ai-listener
mvn clean package
docker build . -t alfresco-ai-listener

As we are using the AI Listener application as a container, stop the Alfresco deployment and uncomment the alfresco-ai-listener in the compose.yaml file.

include:
  - genai-stack/compose.yaml
  - alfresco/compose.yaml
  - alfresco/compose-ai.yaml

Start the stack using the standard command:

docker compose up --build --force-recreate

After the service is again up and ready, the Alfresco Repository becomes accessible. You can verify that the platform is working by using default credentials (admin/admin) in the following URLs:

Repository services: http://localhost:8080/alfresco
UI: http://localhost:8080/share

Summarization

Next, upload a new document and apply the “Summarizable with AI” aspect to the document. After a while, the document will include the information obtained by the GenAI Stack service: summary, tags, and LLM used.

Description

If you want to use AI enhancement, you might want to set up a folder that automatically applies the necessary aspect, instead of doing it manually.

Create a new folder named pictures in Alfresco Repository and create a rule with the following settings in it:

Name: description
When: Items are created or enter this folder
If all criteria are met: All Items
Perform Action: Add “Descriptable with AI” aspect

Upload a new picture to this folder. After a while, without manual setting of the aspect, the document will include the information obtained by the GenAI Stack service: description and LLM used.

Classification

Create a new folder named classifiable in Alfresco Repository. Apply the “Classifiable with AI” aspect to this folder and add a list of terms separated by comma in the “Terms” property (such as English, Japanese, Spanish).

Create a new rule for classifiable folder with the following settings:

Name: classifiable
When: Items are created or enter this folder
If all criteria are met: All Items
Perform Action: Add “Classified with AI” aspect

Upload a new document to this folder. After a while, the document will include the information obtained by the GenAI Stack service: term and LLM used.

A degree of automation can be achieved when using classification with AI. To do this, a simple Alfresco Repository script named classify.js needs to be created in the folder “Repository/Data Dictionary/Scripts” with following content.

document.move(
  document.parent.childByNamePath(    
    document.properties["genai:term"]));

Create a new rule for classifiable folder to apply this script with following settings:

Name: move
When: Items are updated
If all criteria are met: All Items
Perform Action: Execute classify.js script

Create a child folder of the classifiable folder with the name of every term defined in the “Terms” property.

When you set up this configuration, any documents uploaded to the folder will automatically be moved to a subfolder based on the identified term. This means that the documents are classified automatically.

Prompting

Finally, to use the prompting GenAI feature, apply the “Promptable with AI” aspect to an existing document. Type your question in the “Question” property.

After a while, the document will include the information obtained by the GenAI Stack service: answer and LLM used.

A new era of document management

By embracing this framework, you can not only unlock a new level of efficiency, productivity, and user experience but also lay the foundation for limitless innovation. With Alfresco and GenAI Stack, the possibilities are endless — from enhancing document analysis and automating content classification to revolutionizing search capabilities and beyond.

If you’re unsure about any part of this process, check out the following video, which demonstrates all the steps live:

Learn more

Read Introducing a New GenAI Stack: Streamlined AI/ML Integration Made Easy.
Subscribe to the Docker Newsletter.
Get the latest release of Docker Desktop.
Vote on what’s next! Check out our public roadmap.
Have questions? The Docker community is here to help.
New to Docker? Get started.

A Promising Methodology for Testing GenAI Applications in Java

Ignasi Lopez Luna — Wed, 24 Apr 2024 16:03:14 +0000

In the vast universe of programming, the era of generative artificial intelligence (GenAI) has marked a turning point, opening up a plethora of possibilities for developers.

Tools such as LangChain4j and Spring AI have democratized access to the creation of GenAI applications in Java, allowing Java developers to dive into this fascinating world. With Langchain4j, for instance, setting up and interacting with large language models (LLMs) has become exceptionally straightforward. Consider the following Java code snippet:

public static void main(String[] args) {
    var llm = OpenAiChatModel.builder()
            .apiKey("demo")
            .modelName("gpt-3.5-turbo")
            .build();
    System.out.println(llm.generate("Hello, how are you?"));
}

This example illustrates how a developer can quickly instantiate an LLM within a Java application. By simply configuring the model with an API key and specifying the model name, developers can begin generating text responses immediately. This accessibility is pivotal for fostering innovation and exploration within the Java community. More than that, we have a wide range of models that can be run locally, and various vector databases for storing embeddings and performing semantic searches, among other technological marvels.

Despite this progress, however, we are faced with a persistent challenge: the difficulty of testing applications that incorporate artificial intelligence. This aspect seems to be a field where there is still much to explore and develop.

In this article, I will share a methodology that I find promising for testing GenAI applications.

Project overview

The example project focuses on an application that provides an API for interacting with two AI agents capable of answering questions.

An AI agent is a software entity designed to perform tasks autonomously, using artificial intelligence to simulate human-like interactions and responses.

In this project, one agent uses direct knowledge already contained within the LLM, while the other leverages internal documentation to enrich the LLM through retrieval-augmented generation (RAG). This approach allows the agents to provide precise and contextually relevant answers based on the input they receive.

I prefer to omit the technical details about RAG, as ample information is available elsewhere. I’ll simply note that this example employs a particular variant of RAG, which simplifies the traditional process of generating and storing embeddings for information retrieval.

Instead of dividing documents into chunks and making embeddings of those chunks, in this project, we will use an LLM to generate a summary of the documents. The embedding is generated based on that summary.

When the user writes a question, an embedding of the question will be generated and a semantic search will be performed against the embeddings of the summaries. If a match is found, the user’s message will be augmented with the original document.

This way, there’s no need to deal with the configuration of document chunks, worry about setting the number of chunks to retrieve, or worry about whether the way of augmenting the user’s message makes sense. If there is a document that talks about what the user is asking, it will be included in the message sent to the LLM.

Technical stack

The project is developed in Java and utilizes a Spring Boot application with Testcontainers and LangChain4j.

For setting up the project, I followed the steps outlined in Local Development Environment with Testcontainers and Spring Boot Application Testing and Development with Testcontainers.

I also use Tescontainers Desktop to facilitate database access and to verify the generated embeddings as well as to review the container logs.

The challenge of testing

The real challenge arises when trying to test the responses generated by language models. Traditionally, we could settle for verifying that the response includes certain keywords, which is insufficient and prone to errors.

static String question = "How I can install Testcontainers Desktop?";
@Test
    void verifyRaggedAgentSucceedToAnswerHowToInstallTCD() {
        String answer  = restTemplate.getForObject("/chat/rag?question={question}", ChatController.ChatResponse.class, question).message();
        assertThat(answer).contains("https://testcontainers.com/desktop/");
    }

This approach is not only fragile but also lacks the ability to assess the relevance or coherence of the response.

An alternative is to employ cosine similarity to compare the embeddings of a “reference” response and the actual response, providing a more semantic form of evaluation.

This method measures the similarity between two vectors/embeddings by calculating the cosine of the angle between them. If both vectors point in the same direction, it means the “reference” response is semantically the same as the actual response.

static String question = "How I can install Testcontainers Desktop?";
static String reference = """
       - Answer must indicate to download Testcontainers Desktop from https://testcontainers.com/desktop/
       - Answer must indicate to use brew to install Testcontainers Desktop in MacOS
       - Answer must be less than 5 sentences
       """;
@Test
    void verifyRaggedAgentSucceedToAnswerHowToInstallTCD() {
        String answer  = restTemplate.getForObject("/chat/rag?question={question}", ChatController.ChatResponse.class, question).message();
        double cosineSimilarity = getCosineSimilarity(reference, answer);
        assertThat(cosineSimilarity).isGreaterThan(0.8);
    }

However, this method introduces the problem of selecting an appropriate threshold to determine the acceptability of the response, in addition to the opacity of the evaluation process.

Toward a more effective method

The real problem here arises from the fact that answers provided by the LLM are in natural language and non-deterministic. Because of this, using current testing methods to verify them is difficult, as these methods are better suited to testing predictable values.

However, we already have a great tool for understanding non-deterministic answers in natural language: LLMs themselves. Thus, the key may lie in using one LLM to evaluate the adequacy of responses generated by another LLM.

This proposal involves defining detailed validation criteria and using an LLM as a “Validator Agent” to determine if the responses meet the specified requirements. This approach can be applied to validate answers to specific questions, drawing on both general knowledge and specialized information

By incorporating detailed instructions and examples, the Validator Agent can provide accurate and justified evaluations, offering clarity on why a response is considered correct or incorrect.

static String question = "How I can install Testcontainers Desktop?";
    static String reference = """
            - Answer must indicate to download Testcontainers Desktop from https://testcontainers.com/desktop/
            - Answer must indicate to use brew to install Testcontainers Desktop in MacOS
            - Answer must be less than 5 sentences
            """;

    @Test
    void verifyStraightAgentFailsToAnswerHowToInstallTCD() {
        String answer  = restTemplate.getForObject("/chat/straight?question={question}", ChatController.ChatResponse.class, question).message();
        ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
        assertThat(validate.response()).isEqualTo("no");
    }

    @Test
    void verifyRaggedAgentSucceedToAnswerHowToInstallTCD() {
        String answer  = restTemplate.getForObject("/chat/rag?question={question}", ChatController.ChatResponse.class, question).message();
        ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
        assertThat(validate.response()).isEqualTo("yes");
    }

We can even test more complex responses where the LLM should suggest a better alternative to the user’s question.

static String question = "How I can find the random port of a Testcontainer to connect to it?";
    static String reference = """
            - Answer must not mention using getMappedPort() method to find the random port of a Testcontainer
            - Answer must mention that you don't need to find the random port of a Testcontainer to connect to it
            - Answer must indicate that you can use the Testcontainers Desktop app to configure fixed port
            - Answer must be less than 5 sentences
            """;

    @Test
    void verifyRaggedAgentSucceedToAnswerHowToDebugWithTCD() {
        String answer  = restTemplate.getForObject("/chat/rag?question={question}", ChatController.ChatResponse.class, question).message();
        ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
        assertThat(validate.response()).isEqualTo("yes");
    }

Validator Agent

The configuration for the Validator Agent doesn’t differ from that of other agents. It is built using the LangChain4j AI Service and a list of specific instructions:

public interface ValidatorAgent {
    @SystemMessage("""
                ### Instructions
                You are a strict validator.
                You will be provided with a question, an answer, and a reference.
                Your task is to validate whether the answer is correct for the given question, based on the reference.
                
                Follow these instructions:
                - Respond only 'yes', 'no' or 'unsure' and always include the reason for your response
                - Respond with 'yes' if the answer is correct
                - Respond with 'no' if the answer is incorrect
                - If you are unsure, simply respond with 'unsure'
                - Respond with 'no' if the answer is not clear or concise
                - Respond with 'no' if the answer is not based on the reference
                
                Your response must be a json object with the following structure:
                {
                    "response": "yes",
                    "reason": "The answer is correct because it is based on the reference provided."
                }
                
                ### Example
                Question: Is Madrid the capital of Spain?
                Answer: No, it's Barcelona.
                Reference: The capital of Spain is Madrid
                ###
                Response: {
                    "response": "no",
                    "reason": "The answer is incorrect because the reference states that the capital of Spain is Madrid."
                }
                """)
    @UserMessage("""
            ###
            Question: {{question}}
            ###
            Answer: {{answer}}
            ###
            Reference: {{reference}}
            ###
            """)
    ValidatorResponse validate(@V("question") String question, @V("answer") String answer, @V("reference") String reference);

    record ValidatorResponse(String response, String reason) {}
}

As you can see, I’m using Few-Shot Prompting to guide the LLM on the expected responses. I also request a JSON format for responses to facilitate parsing them into objects, and I specify that the reason for the answer must be included, to better understand the basis of its verdict.

Conclusion

The evolution of GenAI applications brings with it the challenge of developing testing methods that can effectively evaluate the complexity and subtlety of responses generated by advanced artificial intelligences.

The proposal to use an LLM as a Validator Agent represents a promising approach, paving the way towards a new era of software development and evaluation in the field of artificial intelligence. Over time, we hope to see more innovations that allow us to overcome the current challenges and maximize the potential of these transformative technologies.

Learn more

Check out the GenAI Stack to get started with adding AI to your apps.
Subscribe to the Docker Newsletter.
Get the latest release of Docker Desktop.
Vote on what’s next! Check out our public roadmap.
Have questions? The Docker community is here to help.
New to Docker? Get started.

Better Debugging: How the Signal0ne Docker Extension Uses AI to Simplify Container Troubleshooting

Szymon Stawski — Wed, 24 Apr 2024 15:58:35 +0000

This post was written in collaboration with Szymon Stawski, project maintainer at Signal0ne.

Consider this scenario: You fire up your Docker containers, hit an API endpoint, and … bam! It fails. Now what? The usual drill involves diving into container logs, scrolling through them to understand the error messages, and spending time looking for clues that will help you understand what’s wrong. But what if you could get a summary of what’s happening in your containers and potential issues with the proposed solutions already provided?

In this article, we’ll dive into a solution that solves this issue using AI. AI can already help developers write code, so why not help developers understand their system, too?

Signal0ne is a Docker Desktop extension that scans Docker containers’ state and logs in search of problems, analyzes the discovered issues, and outputs insights to help developers debug. We first learned about Signal0ne as the winning submission in the 2023 Docker AI/ML Hackathon, and we’re excited to show you how to use it to debug more efficiently.

Introducing Signal0ne Docker extension: Streamlined debugging for Docker

The magic of the Signal0ne Docker extension is its ability to shorten feedback loops for working with and developing containerized applications. Forget endless log diving — the extension offers a clear and concise summary of what’s happening inside your containers after logs and states are analyzed by an AI agent, pinpointing potential issues and even suggesting solutions.

Developing applications these days involves more than a block of code executed in a vacuum. It is a complex system of dependencies, and different user flows that need debugging from time to time. AI can help filter out all the system noise and focuses on providing data about certain issues in the system so that developers can debug faster and better.

Docker Desktop is one of the most popular tools used for local development with a huge community, and Docker features like Docker Debug enhance the community’s ability to quickly debug and resolve issues with their containerized apps.

Signal0ne Docker extension’s suggested solutions and summaries can help you while debugging your container or editing your code so that you can focus on bringing value as a software engineer. The term “developer experience” is often used, but this extension focuses on one crucial aspect: shortening development time. This translates directly to increased productivity, letting you build containerized applications faster and more efficiently.

How does the Docker Desktop extension work?

Between AI co-pilots, highly integrated in IDEs that help write code, and browser AI chats that help understand software development concepts in a Q&A way, there is one piece missing: logs and runtime system data.

The Signal0ne Docker Desktop extension consists of three components: two hosted on the user’s local system (UI and agent) and one in the Signal0ne cloud backend service. The agent scans the user’s local environment in the search of containers with invalid states, runtime issues, or some warnings or errors in the logs, after issue discovery, it collects additional data from container definition for enhanced analysis.

After the Signal0ne agent discovery, data is sent to the backend service, where a combination of pre-trained LLM and solution search retrieval service performs the analysis. The analysis of the issues can be seen from the Signal0ne extension UI, including:

Short log summary — Outlines what is happening within a particular container, including logs on which analysis was based can be accessed from the sources dropdown if you wish.
Solutions summary — One paragraph about possible solutions to the analyzed issue with some explanation of why this particular solution was proposed.
Solution sources — Links to websites where the issue has already been solved by the community (for example, a GitHub issue or StackOverflow post).

Watch this short video demonstration for more details:

Technical details

The issue analysis pipeline is powered by the LangChain tooling framework, which utilizes open source Mistral 7b LLM with LLM-optimized web search that browses sources of existing solutions on the open web (Figure 1).

Figure 1: Signal0ne architecture overview.

The following section shows sample issue analysis pipeline output:

Logs from Docker container:

“$ irb\n>> \"12341:asdf\"[/\\d+/]\n# => \"12341\"  \n>> \"12341:asdf\"[/\\d*/]\n# => \"12341\"  \n>> \"12341:asdf\"[0..5]\n# => \"12341:\"  \n>> \"12341:asdf\"[0...5]\n# => \"12341\"  \n>> \"12341:asdf\"[0, ':']\nTypeError: can't convert String into Integer\n  from (irb):5:in”

Summary:

The user is trying to extract a substring from the string 12341:asdf using different Ruby methods, such as [/\\\\d+/], [/\\\\d\\\\*/], [0..5], [0...5], and [0, ':']. However, when attempting to use the indexing method [0, ':'], an error occurs because the method expects an integer argument but receives a string instead.

Predicted solutions:

“To extract a substring from the given string 12341:asdf without encountering an error when using indexing methods, consider the following options:

1. Using regular expressions:

To extract a substring using regular expressions, you can use the scan method with a regular expression pattern. For instance, to extract the digits before the colon, you can use the following code snippet:

				str = "12341:asdf"                           
				substring = str.scan(/[\\\\d]+/)                          
				puts substring.first
				# Output: ["12341"]                         
				puts substring.first[0]                           
				# Output: 12341

2. Slicing:

Another alternative to indexing is slicing, which allows you to extract a specified range of characters from a string. To extract the digits before the colon using slicing, you can use the following code snippet:

str = "12341:asdf"                       
	   substring = str[0...str.index(":")]                         
               puts substring               
	   # Output: "12341"                         
	   puts substring.to_i                           
	   # Output: 12341

These two methods should help you extract the desired substring from the given string without encountering an error.

Getting started

Getting started with the Signal0ne Docker Desktop extension is a straightforward process that allows developers to leverage the benefits of unified development.

Here are the steps for installing Signal0ne Docker extension:

1. Install Docker Desktop.

2. Choose Add Extensions in the left sidebar. The Browse tab will appear by default (Figure 2).

Figure 2: Signal0ne extension installation from the marketplace.

3. In the Filters drop-down, select the Utility tools category.

4. Find Signal0ne and then select Install (Figure 3).

Figure 3: Extension installation process.

5. Log in after the extension is installed (Figure 4).

Figure 4: Signal0ne extension login screen.

6. Start developing your apps, and, if you face some issues while debugging, have a look at the Signal0ne extension UI. The issue analysis will be there to help you with debugging.

Make sure the Signal0ne agent is enabled by toggling on (Figure 5):

Figure 5: Agent settings tab.

Figure 6 shows the summary and sources:

Figure 6: Overview of the inspected issue.

Proposed solutions and sources are shown in Figures 7 and 8. Solutions sources will redirect you to a webpage with predicted solution:

Figure 7: Overview of proposed solutions to the encountered issue.

Figure 8: Overview of the list of helpful links.

If you want to contribute to the project, you can leave feedback via the Like or Dislike button in the issue analysis output (Figure 9).

Figure 9: You can leave feedback about analysis output for further improvements.

To explore Signal0ne Docker Desktop extension without utilizing your containers, consider experimenting with dummy containers using this docker compose to observe how logs are being analyzed and how helpful the output is with the insights:

services:
  broken_bulb: # c# application that cannot start properly
    image: 'Signal0neai/broken_bulb:dev'
  faulty_roger: # 
    image: 'Signal0neai/faulty_roger:dev'
  smoked_server: # nginx server hosting the website with the miss-configuration
    image: 'Signal0neai/smoked_server:dev'
    ports:
      - '8082:8082'
  invalid_api_call: # python webserver with bug 
   image: 'Signal0neai/invalid_api_call:dev'
   ports:
    - '5000:5000'

broken_bulb: This service uses the image Signal0neai/broken_bulb:dev. It’s a C# application that throws System.NullReferenceException during the startup. Thanks to that application, you can observe how Signal0ne discovers the failed container, extracts the error logs, and analyzes it.
faulty_roger: This service uses the image Signal0neai/faulty_roger:dev. It is a Python API server that is trying to connect to an unreachable database on localhost.
smoked_server: This service utilizes the image Signal0neai/smoked_server:dev. The smoked_server service is an Nginx instance that is throwing 403 forbidden while the user is trying to access the root path (http://127.0.0.1:8082/). Signal0ne can help you debug that.
invalid_api_call: API service with a bug in one of the endpoints, to generate an error call http://127.0.0.1:5000/create-table after running the container. Follow the analysis of Signal0ne and try to debug the issue.

Conclusion

Debugging containerized applications can be time-consuming and tedious, often involving endless scrolling through logs and searching for clues to understand the issue. However, with the introduction of the Signal0ne Docker extension, developers can now streamline this process and boost their productivity significantly.

By leveraging the power of AI and language models, the extension provides clear and concise summaries of what’s happening inside your containers, pinpoints potential issues, and even suggests solutions. With its user-friendly interface and seamless integration with Docker Desktop, the Signal0ne Docker extension is set to transform how developers debug and develop containerized applications.

Whether you’re a seasoned Docker user or just starting your journey with containerized development, this extension offers a valuable tool that can save you countless hours of debugging and help you focus on what matters most — building high-quality applications efficiently. Try the extension in Docker Desktop today, and check out the documentation on GitHub.

Learn more

Subscribe to the Docker Newsletter.
Get the latest release of Docker Desktop.
Vote on what’s next! Check out our public roadmap.
Have questions? The Docker community is here to help.
New to Docker? Get started.

AI Trends Report 2024: AI’s Growing Role in Software Development

Rebecca Floyd — Tue, 16 Apr 2024 13:00:20 +0000

The landscape of application development is rapidly evolving, propelled by the integration of Artificial Intelligence (AI) into the development process. Results in the Docker AI Trends Report 2024, a precursor to the upcoming State of Application Development Report, show interesting AI trends among developers, highlighted in this report.

The most recent Docker State of Application Development Survey results offer insights into how developers are adopting and utilizing AI, reflecting a shift toward more intelligent, efficient, and adaptable development methodologies. This transformation is part of a larger trend observed across the tech industry as AI becomes increasingly central to software development.

The annual Docker State of Application Development survey, conducted by our User Research Team, is one way Docker product managers, engineers, and designers gather insights from Docker users to continuously develop and improve the suite of tools the company offers. For example, in Docker’s 2022 State of Application Development Survey, we found that the task for which Docker users most often refer to support/documentation was creating a Dockerfile (reported by 60% of respondents). This finding helped spur the innovation of Docker AI.

More than 1,300 developers participated in the latest Docker State of Application Development survey, conducted in late 2023. The online survey asked respondents about what tools they use, their application development processes and frustrations, feelings about industry trends, Docker usage, and participation in developer communities. We wanted to know where developers are focused, what they’re working on, and what is most important to them.

Of the approximately 1,300 respondents to the survey, 885 completed it. The findings in this report are based on the 885 completed responses.

Who responded to the Docker survey?

Respondents who took our survey ranged from home hobbyists to professionals at companies with more than 5,000 employees. Forty-two percent of respondents are working for a small company (up to 100 employees), 28% of participants say they work for mid-sized companies (between 100 and 1,000 employees), and 25% work for large companies (more than 1,000 employees).

Well over half of the respondents were in engineering roles — for example, 36% of respondents identified as back-end or full-stack developers; 21% were DevOps, infrastructure managers, or platform engineers; and 4% were front-end developers. Other roles of respondents included dev/engineering managers, company leadership, product managers, security roles, and AI/ML roles. There was nearly an even split between respondents with more experience (6+ years, 54%) and less experienced (0-5 years, 46%).

Our survey underscored a marked growth in roles focused on machine learning (ML) engineering and data science within the Docker ecosystem. In our 2022 survey, approximately 1% of respondents represented this demographic, whereas they made up 8% in the most recent survey. ML engineers and data scientists represent a rapidly expanding user base. This signals the growing relevance of AI to the software development field, and the blurring of the lines between tools used by developers and tools used by AI/ML scientists.

More than 34% of respondents said they work in the computing or IT/SaaS industry, but we also saw responses from individuals working in accounting, banking, or finance (8%); business, consultancy, or management (7%); engineering or manufacturing (6%), and education (5%). Other responses came in from professionals in a wide range of fields, including media; academic research; transport or logistics; retail; marketing, advertising, or PR; charity or volunteer work; healthcare; construction; creative arts or design; and environment or agriculture.

Docker users made up 87% of our respondents, whereas 13% reported that they do not use Docker.

AI as an up-and-coming trend

We asked participants what they felt were the most important trends currently in the industry. GenAI (40% of respondents) and AI assistants for software engineering (38% of respondents) were the top-selected options identified as important industry trends in software development. More senior developers (back-end, front-end, and full-stack developers with over 5 years of experience) tended to view GenAI as most important, whereas more junior developers (less than 5 years of experience) view AI assistants for software engineering as most important. This difference may signal varied and unique uses of AI throughout a career in software development.

It’s clearly trendy, but how do developers really feel about AI? The majority (65%) agree that AI is a positive option, it makes their jobs easier (61%), and it allows them to focus on more important tasks (55%). A much smaller number of respondents see AI as a threat to their jobs (23%) or say it makes their jobs more difficult (19%).

Interestingly, despite high usage and generally positive feelings towards AI, 45% of respondents also reported that they feel AI is over-hyped. Why might this be? It’s not fully clear, but when this finding is considered alongside responses to perception of job threat, one possible answer could be entertained: respondents may be viewing AI as a critical and useful tool for their work, but they’re not too worried about the hype of it replacing them anytime soon.

How AI is used in the developer’s world

We asked users what they use AI for, how dependent they feel on AI, and what AI tools they use most often. A majority of developers (64%) already report using AI for work, underscoring AI’s penetration into the software development field. Developers leverage AI at work mainly for coding (33% of respondents), writing documentation (29%), research (28%), writing tests (23%), troubleshooting/debugging (21%), and CLI commands (20%).

For the 568 respondents who indicated they use AI for work, we also asked how dependent they felt on AI to get their job done on a scale of 0 (not at all dependent) to 10 (completely dependent). Responses ranged substantially and varied by role and years of experience, but the overall average reported dependence was about 4 out of 10, indicating relatively low dependence.

In the developer toolkit, respondents indicate that AI tools like ChatGPT (46% of respondents), GitHub Copilot (30%), and Bard (19%) stand out as most frequently used.

Conclusion

Concluding our 2024 Docker AI Trends Report, Artificial Intelligence is already shifting the way software development is approached. The insights from more than 800 respondents in our latest survey illuminate a path toward a future where AI is seamlessly integrated into every aspect of application development. From coding and documentation to debugging and writing tests, AI tools are becoming indispensable in enhancing efficiency and problem-solving capabilities, allowing developers to focus on more creative and important work.

The uptake of AI tools such as ChatGPT, GitHub Copilot, and Bard among developers is a testament to AI’s value in the development process. Moreover, the growing interest in machine learning engineering and data science within the Docker community signals a broader acceptance and integration of AI technologies.

As Docker continues to innovate and support developers in navigating these changes, the evolving landscape of AI in software development presents both opportunities and challenges. Embracing AI as a positive force that can augment human capabilities rather than replace them is crucial. Docker is committed to facilitating this transition by providing tools and resources that empower developers to leverage AI effectively, ensuring they can remain at the forefront of technological innovation.

Looking ahead, Docker will continue to monitor these trends, adapt our offerings accordingly, and support our user community in harnessing the full potential of AI in software development. As the industry evolves, so too will Docker’s role in shaping the future of application development, ensuring our users are equipped to meet the challenges and seize the opportunities that lie ahead in this exciting era of AI-driven development.

Learn more

Docker’s User Research Team — Olga Diachkova, Julia Wilson, and Rebecca Floyd — conducted this survey, analyzed the results, and provided insights.

For a complete methodology, contact uxresearch@docker.com.

Building a Video Analysis and Transcription Chatbot with the GenAI Stack

David Cardozo — Thu, 28 Mar 2024 14:32:22 +0000

Videos are full of valuable information, but tools are often needed to help find it. From educational institutions seeking to analyze lectures and tutorials to businesses aiming to understand customer sentiment in video reviews, transcribing and understanding video content is crucial for informed decision-making and innovation. Recently, advancements in AI/ML technologies have made this task more accessible than ever.

Developing GenAI technologies with Docker opens up endless possibilities for unlocking insights from video content. By leveraging transcription, embeddings, and large language models (LLMs), organizations can gain deeper understanding and make informed decisions using diverse and raw data such as videos.

In this article, we’ll dive into a video transcription and chat project that leverages the GenAI Stack, along with seamless integration provided by Docker, to streamline video content processing and understanding.

High-level architecture

The application’s architecture is designed to facilitate efficient processing and analysis of video content, leveraging cutting-edge AI technologies and containerization for scalability and flexibility. Figure 1 shows an overview of the architecture, which uses Pinecone to store and retrieve the embeddings of video transcriptions.

Figure 1: Schematic diagram outlining a two-component system for processing and interacting with video data.

The application’s high-level service architecture includes the following:

yt-whisper: A local service, run by Docker Compose, that interacts with the remote OpenAI and Pinecone services. Whisper is an automatic speech recognition (ASR) system developed by OpenAI, representing a significant milestone in AI-driven speech processing. Trained on an extensive dataset of 680,000 hours of multilingual and multitask supervised data sourced from the web, Whisper demonstrates remarkable robustness and accuracy in English speech recognition.
Dockerbot: A local service, run by Docker Compose, that interacts with the remote OpenAI and Pinecone services. The service takes the question of a user, computes a corresponding embedding, and then finds the most relevant transcriptions in the video knowledge database. The transcriptions are then presented to an LLM, which takes the transcriptions and the question and tries to provide an answer based on this information.
OpenAI: The OpenAI API provides an LLM service, which is known for its cutting-edge AI and machine learning technologies. In this application, OpenAI’s technology is used to generate transcriptions from audio (using the Whisper model) and to create embeddings for text data, as well as to generate responses to user queries (using GPT and chat completions).
Pinecone: A vector database service optimized for similarity search, used for building and deploying large-scale vector search applications. In this application, Pinecone is employed to store and retrieve the embeddings of video transcriptions, enabling efficient and relevant search functionality within the application based on user queries.

Getting started

To get started, complete the following steps:

Create an OpenAI API Key.
Ensure that you have a Pinecone API Key.
Ensure that you have installed the latest version of Docker Desktop.

The application is a chatbot that can answer questions from a video. Additionally, it provides timestamps from the video that can help you find the sources used to answer your question.

Clone the repository

The next step is to clone the repository:

git clone https://github.com/dockersamples/docker-genai.git

The project contains the following directories and files:

├── docker-genai/
│ ├── docker-bot/
│ ├── yt-whisper/
│ ├── .env.example
│ ├── .gitignore
│ ├── LICENSE
│ ├── README.md
│ └── docker-compose.yaml

Specify your API keys

In the /docker-genai directory, create a text file called .env, and specify your API keys inside. The following snippet shows the contents of the .env.example file that you can refer to as an example.

#-------------------------------------------------------------
# OpenAI
#-------------------------------------------------------------
OPENAI_TOKEN=your-api-key # Replace your-api-key with your personal API key

#-------------------------------------------------------------
# Pinecone
#--------------------------------------------------------------
PINECONE_TOKEN=your-api-key # Replace your-api-key with your personal API key

Build and run the application

In a terminal, change directory to your docker-genai directory and run the following command:

docker compose up --build

Next, Docker Compose builds and runs the application based on the services defined in the docker-compose.yaml file. When the application is running, you’ll see the logs of two services in the terminal.

In the logs, you’ll see the services are exposed on ports 8503 and 8504. The two services are complementary to each other.

The yt-whisper service is running on port 8503. This service feeds the Pinecone database with videos that you want to archive in your knowledge database. The next section explores the yt-whisper service.

Using yt-whisper

The yt-whisper service is a YouTube video processing service that uses the OpenAI Whisper model to generate transcriptions of videos and stores them in a Pinecone database. The following steps outline how to use the service.

Open a browser and access the yt-whisper service at http://localhost:8503. Once the application appears, specify a YouTube video URL in the URL field and select Submit. The example shown in Figure 2 uses a video from David Cardozo.

Figure 2: A web interface showcasing processed video content with a feature to download transcriptions.

Submitting a video

The yt-whisper service downloads the audio of the video, then uses Whisper to transcribe it into a WebVTT (*.vtt) format (which you can download). Next, it uses the “text-embedding-3-small” model to create embeddings and finally uploads those embeddings into the Pinecone database.

After the video is processed, a video list appears in the web app that informs you which videos have been indexed in Pinecone. It also provides a button to download the transcript.

Accessing Dockerbot chat service

You can now access the Dockerbot chat service on port 8504 and ask questions about the videos as shown in Figure 3.

Figure 3: Example of a user asking Dockerbot about NVIDIA containers and the application giving a response with links to specific timestamps in the video.

Conclusion

In this article, we explored the exciting potential of GenAI technologies combined with Docker for unlocking valuable insights from video content. It shows how the integration of cutting-edge AI models like Whisper, coupled with efficient database solutions like Pinecone, empowers organizations to transform raw video data into actionable knowledge.

Whether you’re an experienced developer or just starting to explore the world of AI, the provided resources and code make it simple to embark on your own video-understanding projects.

Learn more

Accelerated AI/ML with Docker
Build and run natural language processing (NLP) applications with Docker
Video transcription and chat using GenAI Stack
PDF analysis and chat using GenAI Stack
Subscribe to the Docker Newsletter.
Have questions? The Docker community is here to help.

Docker Partners with NVIDIA to Support Building and Running AI/ML Applications

Rahul Awasthy — Mon, 18 Mar 2024 22:06:12 +0000

The domain of GenAI and LLMs has been democratized and tasks that were once purely in the domain of AI/ML developers must now be reasoned with by regular application developers into everyday products and business logic. This is leading to new products and services across banking, security, healthcare, and more with generative text, images, and videos. Moreover, GenAI’s potential economic impact is substantial, with estimates it could add trillions of dollars annually to the global economy.

Docker offers an ideal way for developers to build, test, run, and deploy the NVIDIA AI Enterprise software platform — an end-to-end, cloud-native software platform that brings generative AI within reach for every business. The platform is available to use in Docker containers, deployable as microservices. This enables teams to focus on cutting-edge AI applications where performance isn’t just a goal — it’s a necessity.

This week, at the NVIDIA GTC global AI conference, the latest release of NVIDIA AI Enterprise was announced, providing businesses with the tools and frameworks necessary to build and deploy custom generative AI models with NVIDIA AI foundation models, the NVIDIA NeMo framework, and the just-announced NVIDIA NIM inference microservices, which deliver enhanced performance and efficient runtime.

This blog post summarizes some of the Docker resources available to customers today.

Docker Hub

Docker Hub is the world’s largest repository for container images with an extensive collection of AI/ML development-focused container images, including leading frameworks and tools such as PyTorch, TensorFlow, Langchain, Hugging Face, and Ollama. With more than 100 million pull requests for AI/ML-related images, Docker Hub’s significance to the developer community is self-evident. It not only simplifies the development of AI/ML applications but also democratizes innovation, making AI technologies accessible to developers across the globe.

NVIDIA’s Docker Hub library offers a suite of container images that harness the power of accelerated computing, supplementing NVIDIA’s API catalog. Docker Hub’s vast audience — which includes approximately 27 million monthly active IPs, showcasing an impressive 47% year-over-year growth — can use these container images to enhance AI performance.

Docker Hub’s extensive reach, underscored by an astounding 26 billion monthly image pulls, suggests immense potential for continued growth and innovation.

Docker Desktop with NVIDIA AI Workbench

Docker Desktop on Windows and Mac helps deliver NVIDIA AI Workbench developers a smooth experience on local and remote machines.

NVIDIA AI Workbench is an easy-to-use toolkit that allows developers to create, test, and customize AI and machine learning models on their PC or workstation and scale them to the data center or public cloud. It simplifies interactive development workflows while automating technical tasks that halt beginners and derail experts. AI Workbench makes workstation setup and configuration fast and easy. Example projects are also included to help developers get started even faster with their own data and use cases.

Docker engineering teams are collaborating with NVIDIA to improve the user experience with NVIDIA GPU-accelerated platforms through recent improvements to the AI Workbench installation on WSL2.

Check out how NVIDIA AI Workbench can be used locally to tune a generative image model to produce more accurate prompted results:

In a near-term update, AI Workbench will use the Container Device Interface (CDI) to govern local and remote GPU-enabled environments. CDI is a CNCF-sponsored project led by NVIDIA and Intel, which exposes NVIDIA GPUs inside of containers to support complex device configurations and CUDA compatibility checks. This simplifies how research, simulation, GenAI, and ML applications utilize local and cloud-native GPU resources.

With Docker Desktop 4.29 (which includes Moby 25), developers can configure CDI support in the daemon and then easily make all NVIDIA GPUs available in a running container by using the –device option via support for CDI devices.

docker run --device nvidia.com/gpu=all

LLM-powered apps with Docker GenAI Stack

The Docker GenAI Stack lets teams easily integrate NVIDIA accelerated computing into their AI workflows. This stack, designed for seamless component integration, can be set up on a developer’s laptop using Docker Desktop for Windows. It helps deliver the power of NVIDIA GPUs and NVIDIA NIM to accelerate LLM inference, providing tangible improvements in application performance. Developers can experiment and modify five pre-packaged applications to leverage the stack’s capabilities.

Accelerate AI/ML development with Docker Desktop

Docker Desktop facilitates an accelerated machine learning development environment on a developer’s laptop. By tapping NVIDIA GPU support for containers, developers can leverage tools distributed via Docker Hub, such as PyTorch and TensorFlow, to see significant speed improvements in their projects, underscoring the efficiency gains possible with NVIDIA technology on Docker.

Securing the software supply chain

Securing the software supply chain is a crucial aspect of continuously developing ML applications that can run reliably and securely in production. Building with verified, trusted content from Docker Hub and staying on top of security issues through actionable insights from Docker Scout is key to improving security posture across the software supply chain. By following these best practices, customers can minimize the risk of security issues hitting production, improving the overall reliability and integrity of applications running in production. This comprehensive approach not only accelerates the development of ML applications built with the Docker GenAI Stack but also allows for more secure images when building on images sourced from Hub that interface with LLMs, such as LangChain . Ultimately, this provides developers with the confidence that their applications are built on a secure and reliable foundation.

“With exploding interest in AI from a huge range of developers, we are excited to work with NVIDIA to build tooling that helps accelerate building AI applications. The ecosystem around Docker and NVIDIA has been building strong foundations for many years and this is enabling a new community of enterprise AI/ML developers to explore and build GPU accelerated applications.”
Justin Cormack, Chief Technology Officer, Docker

“Enterprise applications like NVIDIA AI Workbench can benefit enormously from the streamlining that Docker Desktop provides on local systems. Our work with the Docker team will help improve the AI Workbench user experience for managing GPUs on Windows.”
Tyler Whitehouse, Principal Product Manager, NVIDIA

Learn more

By leveraging Docker Desktop and Docker Hub with NVIDIA technologies, developers are equipped to harness the revolutionary power of AI, grow their skills, and seize opportunities to deliver innovative applications that push the boundaries of what’s possible. Check out NVIDIA’s Docker Hub library and NVIDIA AI Enterprise to get started with your own AI solutions.

Build Multimodal GenAI Apps with OctoAI and Docker

Thierry Moreau — Thu, 08 Feb 2024 15:07:55 +0000

This post was contributed by Thierry Moreau, co-founder and head of DevRel at OctoAI.

Generative AI models have shown immense potential over the past year with breakthrough models like GPT3.5, DALL-E, and more. In particular, open source foundational models have gained traction among developers and enterprise users who appreciate how customizable, cost-effective, and transparent these models are compared to closed-source alternatives.

In this article, we’ll explore how you can compose an open source foundational model into a streamlined image transformation pipeline that lets you manipulate images with nothing but text to achieve surprisingly good results.

With this approach, you can create fun versions of corporate logos, bring your kids’ drawings to life, enrich your product photography, or even remodel your living room (Figure 1).

Figure 1: Examples of image transformation including, from left to right: Generating creative corporate logo, bringing children’s drawings to life, enriching commercial photography, remodeling your living room

Pretty cool, right? Behind the scenes, a lot needs to happen, and we’ll walk step by step through how to reproduce these results yourself. We call the multimodal GenAI pipeline OctoShop as a nod to the popular image editing software.

Feeling inspired to string together some foundational GenAI models? Let’s dive into the technology that makes this possible.

Architecture overview

Let’s look more closely at the open source foundational GenAI models that compose the multimodal pipeline we’re about to build.

Going forward, we’ll use the term “model cocktail” instead of “multimodal GenAI model pipeline,” as it flows a bit better (and sounds tastier, too). A model cocktail is a mix of GenAI models that can process and generate data across multiple modalities: text and images are examples of data modalities across which GenAI models consume and produce data, but the concept can also extend to audio and video (Figure 2).

To build on the analogy of crafting a cocktail (or mocktail, if you prefer), you’ll need to mix ingredients, which, when assembled, are greater than the sum of their individual parts.

Figure 2: The multimodal GenAI workflow — by taking an image and text, this pipeline transforms the input image according to the text prompt.

Let’s use a Negroni, for example — my favorite cocktail. It’s easy to prepare; you need equal parts of gin, vermouth, and Campari. Similarly, our OctoShop model cocktail will use three ingredients: an equal mix of image-generation (SDXL), text-generation (Mistral-7B), and a custom image-to-text generation (CLIP Interrogator) model.

The process is as follows:

CLIP Interrogator takes in an image and generates a textual description (e.g., “a whale with a container on its back”).
An LLM model, Mistral-7B, will generate a richer textual description based on a user prompt (e.g., “set the image into space”). The LLM will consequently transform the description into a richer one that meets the user prompt (e.g., “in the vast expanse of space, a majestic whale carries a container on its back”).
Finally, an SDXL model will be used to generate a final AI-generated image based on the textual description transformed by the LLM model. We also take advantage of SDXL styles and a ControlNet to better control the output of the image in terms of style and framing/perspective.

Prerequisites

Let’s go over the prerequisites for crafting our cocktail.

Here’s what you’ll need:

Sign up for an OctoAI account to use OctoAI’s image generation (SDXL), text generation (Mistral-7B), and compute solutions (CLIP Interrogator) — OctoAI serves as the bar from which to get all of the ingredients you’ll need to craft your model cocktail. If you’re already using a different compute service, feel free to bring that instead.
Run a Jupyter notebook to craft the right mix of GenAI models. This is your place for experimenting and mixing, so this will be your cocktail shaker. To make it easy to run and distribute the notebook, we’ll use Google Colab.
Finally, we’ll deploy our model cocktail as a Streamlit app. Think of building your app and embellishing the frontend as the presentation of your cocktail (e.g., glass, ice, and choice of garnish) to enhance your senses.

Getting started with OctoAI

Head to octoai.cloud and create an account if you haven’t done so already. You’ll receive $10 in credits upon signing up for the first time, which should be sufficient for you to experiment with your own workflow here.

Follow the instructions on the Getting Started page to obtain an OctoAI API token — this will help you get authenticated whenever you use the OctoAI APIs.

Notebook walkthrough

We’ve built a Jupyter notebook in Colab to help you learn how to use the different models that will constitute your model cocktail. Here are the steps to follow:

1. Launch the notebook

Get started by launching the following Colab notebook.

There’s no need to change the runtime type or rely on a GPU or TPU accelerator — all we need is a CPU here, given that all of the AI heavy-lifting is done on OctoAI endpoints.

2. OctoAI SDK setup

Let’s get started by installing the OctoAI SDK. You’ll use the SDK to invoke the different open source foundational models we’re using, like SDXL and Mistral-7B. You can install through pip:

# Install the OctoAI SDK
!pip install octoai-sdk

In some cases, you may get a message about pip packages being previously imported in the runtime, causing an error. If that’s the case, selecting the Restart Session button at the bottom should take care of the package versioning issues. After this, you should be able to re-run the cell that pip-installs the OctoAI SDK without any issues.

3. Generate images with SDXL

You’ll first learn to generate an image with SDXL using the Image Generation solution API. To learn more about what each parameter does in the code below, check out OctoAI’s ImageGenerator client.

In particular, the ImageGenerator API takes several arguments to generate an image:

Engine: Lets you choose between versions of Stable Diffusion models, such as SDXL, SD1.5, and SSD.
Prompt: Describes the image you want to generate.
Negative prompt: Describes the traits you want to avoid in the final image.
Width, height: The resolution of the output image.
Num images: The number of images to generate at once.
Sampler: Determines the sampling method used to denoise your image. If you’re not familiar with this process, this article provides a comprehensive overview.
Number of steps: Number of denoising steps — the more steps, the higher the quality, but generally going past 30 will lead to diminishing returns.
Cfg scale: How closely to adhere to the image description — generally stays around 7-12.
Use refiner: Whether to apply the SDXL refiner model, which improves the output quality of the image.
Seed: A parameter that lets you control the reproducibility of image generation (set to a positive value to always get the same image given stable input parameters).

Note that tweaking the image generation parameters — like number of steps, number of images, sampler used, etc. — affects the amount of GPU compute needed to generate an image. Increasing GPU cycles will affect the pricing of generating the image.

Here’s an example using simple parameters:

# To use OctoAI, we'll need to set up OctoAI to use it
from octoai.clients.image_gen import Engine, ImageGenerator


# Now let's use the OctoAI Image Generation API to generate
# an image of a whale with a container on its back to recreate
# the moby logo
image_gen = ImageGenerator(token=OCTOAI_API_TOKEN)
image_gen_response = image_gen.generate(
 engine=Engine.SDXL,
 prompt="a whale with a container on its back",
 negative_prompt="blurry photo, distortion, low-res, poor quality",
 width=1024,
 height=1024,
 num_images=1,
 sampler="DPM_PLUS_PLUS_2M_KARRAS",
 steps=20,
 cfg_scale=7.5,
 use_refiner=True,
 seed=1
)
images = image_gen_response.images


# Display generated image from OctoAI
for i, image in enumerate(images):
 pil_image = image.to_pil()
 display(pil_image)

Feel free to experiment with the parameters to see what happens to the resulting image. In this case, I’ve put in a simple prompt meant to describe the Docker logo: “a whale with a container on its back.” I also added standard negative prompts to help generate the style of image I’m looking for. Figure 3 shows the output:

Figure 3: An SDXL-generated image of a whale with a container on its back.

4. Control your image output with ControlNet

One thing you may want to do with SDXL is control the composition of your AI-generated image. For example, you can specify a specific human pose or control the composition and perspective of a given photograph, etc.

For our experiment using Moby (the Docker mascot), we’d like to get an AI-generated image that can be easily superimposed onto the original logo — same shape of whale and container, orientation of the subject, size, and so forth.

This is where ControlNet can come in handy: they let you constrain the generation of images by feeding a control image as input. In our example we’ll feed the image of the Moby logo as our control input.

By tweaking the following parameters used by the ImageGenerator API, we are constraining the SDXL image generation with a control image of Moby. That control image will be converted into a depth map using a depth estimation model, then fed into the ControlNet, which will constrain SDXL image generation.

# Set the engine to controlnet SDXL
 engine="controlnet-sdxl",
 # Select depth controlnet which uses a depth map to apply
 # constraints to SDXL
 controlnet="depth_sdxl",
 # Set the conditioning scale anywhere between 0 and 1, try different
 # values to see what they do!
 controlnet_conditioning_scale=0.3,
 # Pass in the base64 encoded string of the moby logo image
 controlnet_image=image_to_base64(moby_image),

Now the result looks like it matches the Moby outline a lot more closely (Figure 4). This is the power of ControlNet. You can adjust the strength by varying the controlnet_conditioning_scale parameter. This way, you can make the output image more or less faithfully match the control image of Moby.

Figure 4: Left: The Moby logo is used as a control image to a ControlNet. Right: the SDXL-generated image resembles the control image more closely than in the previous example.

5. Control your image output with SDXL style presets

Let’s add a layer of customization with SDXL styles. We’ll use the 3D Model style preset (Figure 5). Behind the scenes, these style presets are adding additional keywords to the positive and negative prompts that the SDXL model ingests.

Figure 5: You can try various styles on the OctoAI Image Generation solution UI — there are more than 100 to choose from, each delivering a unique feel and aesthetic.

Figure 6 shows how setting this one parameter in the ImageGenerator API transforms our AI-generated image of Moby. Go ahead and try out more styles; we’ve generated a gallery for you to get inspiration from.

Figure 6: SDXL-generated image of Moby with the “3D Model” style preset applied.

6. Manipulate images with Mistral-7B LLM

So far we’ve relied on SDXL, which does text-to-image generation. We’ve added ControlNet in the mix to apply a control image as a compositional constraint.

Next, we’re going to layer an LLM into the mix to transform our original image prompt into a creative and rich textual description based on a “transformation prompt.”

Basically, we’re going to use an LLM to make our prompt better automatically. This will allow us to perform image manipulation using text in our OctoShop model cocktail pipeline:

Take a logo of Moby: Set it into an ultra-realistic photo in space.
Take a child’s drawing: Bring it to life in a fantasy world.
Take a photo of a cocktail: Set it on a beach in Italy.
Take a photo of a living room: Transform it into a staged living room in a designer house.

To achieve this text-to-text transformation, we will use the LLM user prompt as follows. This sets the original textual description of Moby into a new setting: the vast expanse of space.

'''
Human: set the image description into space: “a whale with a container on its back”
AI: '''

We’ve configured the LLM system prompt so that LLM responses are concise and at most one sentence long. We could make them longer, but be aware that the prompt consumed by SDXL has a 77-token context limit.

You can read more on the text generation Python SDK and its Chat Completions API used to generate text:

Model: Lets you choose out of selection of foundational open source models like Mixtral, Mistral, Llama2, Code Llama (the selection will grow with more open source models being released).
Messages: Contains a list of messages (system and user) to use as context for the completion.
Max tokens: Enforces a hard limit on output tokens (this could cut a completion response in the middle of a sentence).
Temperature: Lets you control the creativity of your answer: with a higher temperature, less likely tokens can be selected.

The choice of model, input, and output tokens will influence pricing on OctoAI. In this example, we’re using the Mistral-7B LLM, which is a great open source LLM model that really packs a punch given its small parameter size.

Let’s look at the code used to invoke our Mistral-7B LLM:

# Let's go ahead and start with the original prompt that we used in our
# image generation examples.
image_desc = "a whale with a container on its back"


# Let's then prepare our LLM prompt to manipulate our image
llm_prompt = '''
Human: set the image description into space: {}
AI: '''.format(image_desc)


# Now let's use an LLM to transform this craft clay rendition
# of Moby into a fun scify universe
from octoai.client import Client


client = Client(OCTOAI_API_TOKEN)
completion = client.chat.completions.create(
 messages=[
   {
     "role": "system",
     "content": "You are a helpful assistant. Keep your responses short and limited to one sentence."
   },
   {
     "role": "user",
     "content": llm_prompt
   }
 ],
 model="mistral-7b-instruct-fp16",
 max_tokens=128,
 temperature=0.01
)


# Print the message we get back from the LLM
llm_image_desc = completion.choices[0].message.content
print(llm_image_desc)

Here’s the output:

Our LLM has created a short yet imaginative description of Moby traveling through space. Figure 7 shows the result when we feed this LLM-generated textual description into SDXL.

Figure 7: SDXL-generated image of Moby where we used an LLM to set the scene in space and enrich the text prompt.

This image is great. We can feel the immensity of space. With the power of LLMs and the flexibility of SDXL, we can take image creation and manipulation to new heights. And the great thing is, all we need to manipulate those images is text; the GenAI models do the rest of the work.

7. Automate the workflow with AI-based image labeling

So far in our image transformation pipeline, we’ve had to manually label the input image to our OctoShop model cocktail. Instead of just passing in the image of Moby, we had to provide a textual description of that image.

Thankfully, we can rely on a GenAI model to perform text labeling tasks: CLIP Interrogator. Think of this task as the reverse of what SDXL does: It takes in an image and produces text as the output.

To get started, we’ll need a CLIP Interrogator model running behind an endpoint somewhere. There are two ways to get a CLIP Interrogator model endpoint on OctoAI. If you’re just getting started, we recommend the simple approach, and if you feel inspired to customize your model endpoint, you can use the more advanced approach. For instance, you may be interested in trying out the more recent version of CLIP Interrogator.

You can now invoke the CLIP Interrogator model in a few lines of code. We’ll use the fast interrogator mode here to get a label generated as quickly as possible.

# Let's go ahead and invoke the CLIP interrogator model


# Note that under a cold start scenario, you may need to wait a minute or two
# to get the result of this inference... Be patient!
output = client.infer(
   endpoint_url=CLIP_ENDPOINT_URL+'/predict',
   inputs={
       "image": image_to_base64(moby_image),
       "mode": "fast"
   }
)


# All labels
clip_labels = output["completion"]["labels"]
print(clip_labels)


# Let's get just the top label
top_label = clip_labels.split(',')[0]
print(top_label)

The top label described our Moby logo as:

That’s pretty on point. Now that we’ve tested all ingredients individually, let’s assemble our model cocktail and test it on interesting use cases.

8. Assembling the model cocktail

Now that we have tested our three models (CLIP interrogator, Mistral-7B, SDXL), we can package them into one convenient function, which takes the following inputs:

An input image that will be used to control the output image and also be automatically labeled by our CLIP interrogator model.
A transformation string that describes the transformation we want to apply to the input image (e.g., “set the image description in space”).
A style string which lets us better control the artistic output of the image independently of the transformation we apply to it (e.g., painterly style vs. cinematic).

The function below is a rehash of all of the code we’ve introduced above, packed into one function.

def genai_transform(image: Image, transformation: str, style: str) -> Image:
 # Step 1: CLIP captioning
 output = client.infer(
   endpoint_url=CLIP_ENDPOINT_URL+'/predict',
   inputs={
     "image": image_to_base64(image),
     "mode": "fast"
   }
 )
 clip_labels = output["completion"]["labels"]
 top_label = clip_labels.split(',')[0]


 # Step 2: LLM transformation
 llm_prompt = '''
 Human: {}: {}
 AI: '''.format(transformation, top_label)
 completion = client.chat.completions.create(
   messages=[
     {
       "role": "system",
       "content": "You are a helpful assistant. Keep your responses short and limited to one sentence."
     },
     {
       "role": "user",
       "content": llm_prompt
     }
   ],
   model="mistral-7b-instruct-fp16",
   max_tokens=128,
   presence_penalty=0,
   temperature=0.1,
   top_p=0.9,
 )
 llm_image_desc = completion.choices[0].message.content


 # Step 3: SDXL+controlnet transformation
 image_gen_response = image_gen.generate(
   engine="controlnet-sdxl",
   controlnet="depth_sdxl",
   controlnet_conditioning_scale=0.4,
   controlnet_image=image_to_base64(image),
   prompt=llm_image_desc,
   negative_prompt="blurry photo, distortion, low-res, poor quality",
   width=1024,
   height=1024,
   num_images=1,
   sampler="DPM_PLUS_PLUS_2M_KARRAS",
   steps=20,
   cfg_scale=7.5,
   use_refiner=True,
   seed=1,
   style_preset=style
 )
 images = image_gen_response.images


 # Display generated image from OctoAI
 pil_image = images[0].to_pil()
 return top_label, llm_image_desc, pil_image

Now you can try this out on several images, prompts, and styles.

Package your model cocktail into a web app

Now that you’ve mixed your unique GenAI cocktail, it’s time to pour it into a glass and garnish it, figuratively speaking. We built a simple Streamlit frontend that lets you deploy your unique OctoShop GenAI model cocktail and share the results with your friends and colleagues (Figure 8). You can check it on GitHub.

Follow the README instructions to deploy your app locally or get it hosted on Streamlit’s web hosting services.

Figure 8: The Streamlit app transforms images into realistic renderings in space — all thanks to the magic of GenAI.

We look forward to seeing what great image-processing apps you come up with. Go ahead and share your creations on OctoAI’s Discord server in the #built_with_octo channel!

If you want to learn how you can put OctoShop behind a Discord Bot or build your own model containers with Docker, we also have instructions on how to do that from an AI/ML workshop organized by OctoAI at DockerCon 2023.

About OctoAI

OctoAI provides infrastructure to run GenAI at scale, efficiently, and robustly. The model endpoints that OctoAI delivers to serve models like Mixtral, Stable Diffusion XL, etc. all rely on Docker to containerize models and make them easier to serve at scale.

If you go to octoai.cloud, you’ll find three complementary solutions that developers can build on to bring their GenAI-powered apps and pipelines into production.

Image Generation solution exposes endpoints and APIs to perform text to image, image to image tasks built around open source foundational models such as Stable Diffusion XL or SSD.
Text Generation solution exposes endpoints and APIs to perform text generation tasks built around open source foundational models, such as Mixtral/Mistral, Llama2, or CodeLlama.
Compute solution lets you deploy and manage any dockerized model container on capable OctoAI cloud endpoints to power your demanding GenAI needs. This compute service complements the image generation and text generation solutions by exposing infinite programmability and customizability for AI tasks that are not currently readily available on either the image generation or text generation solutions.

Disclaimer

OctoShop is built on the foundation of CLIP Interrogator and SDXL, and Mistral-7B and is therefore likely to carry forward the potential dangers inherent in these base models. It’s capable of generating unintended, unsuitable, offensive, and/or incorrect outputs. We therefore strongly recommend exercising caution and conducting comprehensive assessments before deploying this model into any practical applications.

This GenAI model workflow doesn’t work on people as it won’t preserve their likeness; the pipeline works best on scenes, objects, or animals. Solutions are available to address this problem, such as face mapping techniques (also known as face swapping), which we can containerize with Docker and deploy on OctoAI Compute solution, but that’s something to cover in another blog post.

Conclusion

This article covered the fundamentals of building a GenAI model cocktail by relying on a combination of text generation, image generation, and compute solutions powered by the portability and scalability enabled by Docker containerization.

If you’re interested in learning more about building these kinds of GenAI model cocktails, check out the OctoAI demo page or join OctoAI on Discord to see what people have been building.

Acknowledgements

The authors acknowledge Justin Gage for his thorough review, as well as Luis Vega, Sameer Farooqui, and Pedro Toruella for their contributions to the DockerCon AI/ML Workshop 2023, which inspired this article. The authors also thank Cia Bodin and her daughter Ada for the drawing used in this blog post.

Learn more

Watch the DockerCon 2023 Docker for ML, AI, and Data Science workshop.
Get the latest release of Docker Desktop.
Vote on what’s next! Check out our public roadmap.
Have questions? The Docker community is here to help.
New to Docker? Get started.

Empowering Data-Driven Development: Docker’s Collaboration with Snowflake and Docker AI Advancements

Deanna Sparks — Wed, 06 Dec 2023 22:18:33 +0000

Docker, in collaboration with Snowflake, introduces an enhanced level of developer productivity when you leverage the power of Docker Desktop with Snowpark Container Services (private preview). At Snowflake BUILD, Docker presented a session showcasing the streamlined process of building, iterating, and efficiently managing data through containerization within Snowflake using Snowpark Container Services.

Watch the session to learn more about how this collaboration helps streamline development and application innovation with Docker, and read on for more details.

Docker Desktop with Snowpark Container Services helps empower developers, data engineers, and data scientists with the tools and insights needed to seamlessly navigate the intricacies of incorporating data, including AI/ML, into their workflows. Furthermore, the advancements in Docker AI within the development ecosystem promise to elevate GenAI development efforts now and in the future.

Through the collaborative efforts showcased between Docker and Snowflake, we aim to continue supporting and guiding developers, data engineers, and data scientists in leveraging these technologies effectively.

Accelerating deployment of data workloads with Docker and Snowpark

Why is Docker, a containerization platform, collaborating with Snowflake, a data-as-a-service company? Many organizations lack formal coordination between data and engineering teams, meaning every change might have to go through DevOps, slowing project delivery. Docker Desktop and Snowpark Container Services (private preview) improve collaboration between developers and data teams.

This collaboration allows data and engineering teams to work together, removing barriers to enable:

Ownership by streamlining development and deployment
Independence by removing traditional dependence on engineering stacks
Efficiency by reducing resources and improving cross-team coordination

With the growing number of applications that rely on data, Docker is invested in ensuring that containerization supports the changing development landscape to provide consistent value within your organization.

Streamlining Snowpark deployments with Docker Desktop

Docker Desktop provides many benefits to data teams, including improving data ingestion or enrichment and improving general workarounds when working with a data stack. Watch the video from Snowflake BUILD for a demo showing the power of Docker Desktop and Snowpark Container Services working together. We walk through:

How to create a Docker Image using Docker Desktop to help you drive consistency by encapsulating your code, libraries, dependencies, and configurations in an image.
How to push that image to a registry to make it portable and available to others with the correct permissions.
How to run the container as a job in Snowpark Container Services to help you scale your work with versioning and distributed deployments.

Using Docker Desktop with Snowpark Container Services provides an enhanced development experience for data engineers who can develop in one environment and deploy in another. For example, with Docker Desktop you can create on an Arm64 platform, yet deploy to Snowpark, an AMD64 platform. This functionality shows multi-platform images, so you can have a great local development environment and still deploy to Snowpark without any difficulty.

Boosting developer productivity with Docker AI

In alignment with Docker’s mission to increase the time developers spend on innovation and decrease the time they spend on everything else, Docker AI assists in streamlining the development lifecycle for both development and data teams. Docker AI, available in early access now, aims to simplify current tasks, boosting developer productivity by offering context-specific, automated guidance.

When using Snowpark Container Services, deploying the project to Snowpark is the next step once you’ve built your image. Leveraging its trained model on Snowpark documentation, Docker AI offers relevant recommendations within your project’s context. For example, it autocompletes Docker files with best practice suggestions and continually updates recommendations as projects evolve and security measures change.

This marks Docker’s initial phase of aiding the community’s journey in simplifying using big data and implementing context-specific AI guidance across the software development lifecycle. Despite the rising complexity of projects involving vast data sets, Docker AI provides support, streamlining processes and enhancing your experience throughout the development lifecycle.

Docker AI aims to deliver tailored, automated advice during Dockerfile or Docker Compose editing, local docker build debugging, and local testing. Docker AI leverages the wealth of knowledge from the millions of long-time Docker users to autogenerate best practices and recommend secure, updated images. With Docker AI, developers can concentrate more on innovating their applications and less time on tools and infrastructure. Sign up for the Docker AI Early Access Program now.

Improving the collaboration across development and data teams

Our continued investment in Docker Desktop and Docker AI, along with our key collaborators like Snowflake, help you streamline the process of building, iterating, and efficiently managing data through containerization.

Download Docker Desktop to get started today. Check with your admins — you may be surprised to find out your organization is already using Docker!

Learn more

Review Snowpark Container Services GitHub documentation.
Follow the Snowflake tutorial to leverage your Snowflake data and build a Docker Image.
Learn more about LLM and Hugging Face.
Sign up for the Docker AI Early Access Program.

Announcing the Docker AI/ML Hackathon 2023 Winners

Jennifer Kohl — Tue, 05 Dec 2023 16:44:46 +0000

The week of DockerCon 2023 in Los Angeles, we announced the kick-off of the Docker AI/ML Hackathon. The hackathon ran as a virtual event from October 3 to November 7 with support from partners including DataStax, Livecycle, Navan.ai, Neo4j, and OctoML. Leading up to the submission deadline, we ran a series of webinars on topics ranging from getting started with Docker Hub to setting up computer vision AI models on Docker, and more. You can watch the collection of webinars on YouTube.

The Docker AI/ML Hackathon encouraged participants to build solutions that were innovative, applicable in real life, use Docker technology, and have an impact on developer productivity. We made a lot of announcements at DockerCon, including the new GenAI Stack, and we couldn’t wait to see how developers would put this to work in their projects.

Participants competed for US$ 20,000 in cash prizes and exclusive Docker swag. Judging was based on criteria such as applicability, innovativeness, incorporation of Docker tooling, and impact on the developer experience and productivity. Read on to learn who took home the top prizes.

The winners

1st place

Signal0ne — This project automates insights from failed containers and anomalous resource usage through anomaly detection algorithms and a Docker desktop extension. Developed using Python and Angular, the Signal0ne tool provides rapid, accurate log analysis, even enabling self-debugging. The project’s key achievements include quick issue resolution for experienced engineers and enhanced debugging capabilities for less experienced ones.

2nd place

SeamlessML: Docker-Powered Serverless Model Orchestration — SeamlessML addresses the AI model deployment bottleneck by providing a simplified, scalable, and cost-effective solution. Leveraging Docker and serverless technologies, it enables easy deployment of machine learning models as scalable API endpoints, abstracting away complexities like server management and load balancing. The team successfully reduced deployment time from hours to minutes and created a local testing setup for confident cloud-like deployments.

3rd place

Dionysus — Dionysus is a developer collaboration platform that streamlines teamwork through automatic code documentation, efficient codebase search, and AI-powered meeting transcription. Built with a microservice architecture using NextJS for the frontend and a Python backend API, Docker containerization, and integration with GitHub, Dionysus simplifies development workflows. The team overcame challenges in integrating AI effectively, ensuring real-time updates and creating a user-friendly interface, resulting in a tool that automates code documentation, facilitates contextual code search, and provides real-time AI-driven meeting transcription.

Honorable mentions

The following winners took home swag prizes. We received so many fantastic submissions that we awarded honorable mentions to four more teams than originally planned!

What’s next?

Check out all project submissions on the Docker AI/ML Hackathon gallery page. Also, check out and contribute to the GenAI Stack project on GitHub and sign up to join the Docker AI Early Access program. We can’t wait to see what projects you create.

We had so much fun seeing the creativity that came from this hackathon. Stay tuned until the next one!