Creating AI-Enhanced Document Management with the GenAI Stack

Organizations must deal with countless reports, contracts, research papers, and other documents, but managing, deciphering, and extracting pertinent information from these documents can be challenging and time-consuming. In such scenarios, an AI-powered document management system can offer a transformative solution.

Developing Generative AI (GenAI) technologies with Docker offers endless possibilities: not only summarizing lengthy documents, but also categorizing them, generating detailed descriptions, and even surfacing insights you may have missed through custom prompts. This multi-faceted approach, powered by AI, changes the way organizations interact with textual data, saving both time and effort.

In this article, we’ll look at how to integrate Alfresco, a robust document management system, with the GenAI Stack to open up possibilities such as enhancing document analysis, automating content classification, transforming search capabilities, and more.


High-level architecture of Alfresco document management 

Alfresco is an open source content management platform designed to help organizations manage, share, and collaborate on digital content and documents. It provides a range of features for document management, workflow automation, collaboration, and records management.

You can find the Alfresco Community platform on Docker Hub. The Docker image for the UI, named alfresco-content-app, has more than 10 million pulls, while other core platform services have more than 1 million pulls.

Alfresco Community platform (Figure 1) provides various open source technologies to create a Content Service Platform, including:

  • Alfresco content repository: The core of Alfresco, responsible for storing and managing content. This component exposes a REST API to perform operations in the repository.
  • Database: PostgreSQL, among others, serves as the database management system, storing the metadata associated with a document.
  • Apache Solr: Enhancing search capabilities, Solr enables efficient content and metadata searches within Alfresco.
  • Apache ActiveMQ: As an open source message broker, ActiveMQ enables asynchronous communication between various Alfresco services. Its Messaging API handles asynchronous messages in the repository.
  • UI reference applications: Share and Alfresco Content App provide intuitive interfaces for user interaction and accessibility.

For detailed instructions on deploying Alfresco Community with Docker Compose, refer to the official Alfresco documentation.

Figure 1: Basic diagram for Alfresco Community deployment with Docker.

Why integrate Alfresco with the GenAI Stack?

Integrating Alfresco with the GenAI Stack unlocks a powerful suite of GenAI services, significantly enhancing document management capabilities. This integration provides several benefits:

  • Use different deployments according to available resources: Docker allows you to easily switch between Large Language Models (LLMs) of different sizes. Additionally, if you have access to GPUs, you can deploy a container with a GPU-accelerated LLM for faster inference. Conversely, if GPU resources are limited or unavailable, you can deploy a container with a CPU-based LLM.
  • Portability: Docker containers encapsulate the GenAI service, its dependencies, and runtime environment, ensuring consistent behavior across different environments. This portability allows you to develop and test the AI model locally and then deploy it seamlessly to various platforms.
  • Production-ready: The stack provides support for GPU-accelerated computing, making it well suited for deploying GenAI models in production environments. Docker’s declarative approach to deployment allows you to define the desired state of the system and let Docker handle the deployment details, ensuring consistency and reliability.
  • Integration with applications: Docker facilitates integration between GenAI services and other applications deployed as containers. You can deploy multiple containers within the same Docker environment and orchestrate communication between them using Docker networking. This integration enables you to build complex systems composed of microservices, where GenAI capabilities can be easily integrated into larger applications or workflows.

How does it work?

Alfresco provides two main APIs for integration purposes: the Alfresco REST API and the Alfresco Messaging API (Figure 2).

  • The Alfresco REST API provides a set of endpoints that allow developers to interact with Alfresco content management functionalities over HTTP. It enables operations such as creating, reading, updating, and deleting documents, folders, users, groups, and permissions within Alfresco. 
  • The Alfresco Messaging API provides a messaging infrastructure for asynchronous communication built on top of Apache ActiveMQ and follows the publish-subscribe messaging pattern. Integration with the Messaging API allows developers to build event-driven applications and workflows that respond dynamically to changes and updates within the Alfresco Repository.

The Alfresco Repository can be updated with the enrichment data provided by the GenAI Service using both APIs:

  • The Alfresco REST API may retrieve metadata and content from existing repository nodes, send them to the GenAI Service, and update the nodes with the results.
  • The Alfresco Messaging API may be used to consume events for new and updated nodes in the repository and apply the results obtained from the GenAI Service.
Figure 2: Alfresco provides two main APIs for integration purposes: the Alfresco REST API and the Alfresco Messaging API.

Technically, the Docker deployment includes both the Alfresco and GenAI Stack platforms running on the same Docker network (Figure 3). 

The GenAI Stack works as a REST API service with endpoints available at genai:8506, whereas Alfresco uses a REST API client (named alfresco-ai-applier) and a Messaging API client (named alfresco-ai-listener) to integrate with the AI services. Both clients can also be run as containers.

Figure 3: Deployment architecture for Alfresco integration with GenAI Stack services.

The GenAI Stack service provides the following endpoints:

  • summary: Returns a summary of a document together with several tags. It allows some customization, like the language of the response, the number of words in the summary and the number of tags.
  • classify: Returns a term from a list that best matches the document. It requires a list of terms as input in addition to the document to be classified.
  • prompt: Replies to a custom user prompt using retrieval-augmented generation (RAG) for the document to limit the scope of the response.
  • describe: Returns a text description for an input picture.

The implementation of the GenAI Stack services splits the document text into chunks and loads them into the Neo4j vector database to improve QA chains with embeddings and prevent hallucinations in the response. Pictures are processed using an LLM with a visual encoder (LLaVA) to generate descriptions (Figure 4). Note that the Docker GenAI Stack allows for the use of multiple LLMs for different goals.

Figure 4: The GenAI Stack services are implemented using RAG and an LLM with a visual encoder (LLaVA) for describing pictures.

Getting started 

To get started, make sure that Docker Desktop is installed and has enough memory allocated. You can check the amount of RAM available to Docker Desktop with the following command:

docker info --format '{{json .MemTotal}}'

The value is reported in bytes. If the result is under 20 GiB, follow the instructions in the official Docker documentation for your operating system to increase the memory limit for Docker Desktop.

Clone the repository

Use the following command to clone the repository:

git clone https://github.com/aborroy/alfresco-genai.git

The project includes the following components:

  • The genai-stack folder uses the https://github.com/docker/genai-stack project to build a REST endpoint that provides AI services for a given document.
  • The alfresco folder includes a Docker Compose template to deploy Alfresco Community 23.1.
  • The alfresco-ai folder includes a set of projects related to the Alfresco integration:
    • alfresco-ai-model defines a custom Alfresco content model to store summaries, terms, and prompts, to be deployed in the Alfresco Repository and Share app.
    • alfresco-ai-applier uses the Alfresco REST API to apply summaries or terms to documents in a populated Alfresco Repository.
    • alfresco-ai-listener listens to repository messages and generates summaries for created or updated nodes in the Alfresco Repository.
  • The compose.yaml file describes a deployment for the Alfresco and GenAI Stack services using the include directive.

Starting Docker GenAI service

The Docker GenAI Service for Alfresco, located in the genai-stack folder, is based on the Docker GenAI Stack project and exposes AI services (such as summarization) as REST endpoints to be consumed from the Alfresco integration.

cd genai-stack

Before running the service, modify the .env file to adjust available preferences:

# Choose any of the on-premises models supported by Ollama
LLM=mistral
LLM_VISION=llava
# Any language name supported by chosen LLM
SUMMARY_LANGUAGE=English
# Number of words for the summary
SUMMARY_SIZE=120
# Number of tags to be identified with the summary
TAGS_NUMBER=3

Start the Docker Stack using the standard command:

docker compose up --build --force-recreate

After the service is up and ready, the summary REST endpoint becomes accessible. You can test its functionality using a curl command.

Use a local PDF file (file.pdf in the following sample) to obtain a summary and a number of tags.

curl --location 'http://localhost:8506/summary' \
--form 'file=@"./file.pdf"'
{ 
  "summary": " The text discusses...", 
  "tags": " Golang, Merkle, Difficulty", 
  "model": "mistral"
}

Use a local PDF file (file.pdf in the following sample) and a list of terms (such as Japanese or Spanish) to obtain a classification of the document.

curl --location \
'http://localhost:8506/classify?termList=%22Japanese%2CSpanish%22' \
--form 'file=@"./file.pdf"'
{
    "term": " Japanese",
    "model": "mistral"
}

Use a local PDF file (file.pdf in the following sample) and a prompt (such as “What is the name of the son?”) to obtain a response regarding the document.

curl --location \
'http://localhost:8506/prompt?prompt=%22What%20is%20the%20name%20of%20the%20son%3F%22' \
--form 'file=@"./file.pdf"'
{
    "answer": " The name of the son is Musuko.",
    "model": "mistral"
}

Use a local picture file (picture.jpg in the following sample) to obtain a text description of the image.

curl --location 'http://localhost:8506/describe' \
--form 'image=@"./picture.jpg"'
{
    "description": " The image features a man standing... ",
    "model": "llava"
}

Note that, in this case, the LLaVA LLM is used instead of Mistral.

Make sure to stop Docker Compose before continuing to the next step.

Starting Alfresco

The Alfresco Platform, located in the alfresco folder, provides a sample deployment of the Alfresco Repository including a customized content model to store results obtained from the integration with the GenAI Service.

Because we want to run both Alfresco and GenAI together, we’ll use the compose.yaml file located in the project’s main folder.

include:
  - genai-stack/compose.yaml
  - alfresco/compose.yaml
#  - alfresco/compose-ai.yaml

In this step, we’re deploying only the GenAI Stack and Alfresco, so make sure to leave the alfresco/compose-ai.yaml line commented out.

Start the stack using the standard command:

docker compose up --build --force-recreate

After the service is up and ready, the Alfresco Repository becomes accessible. You can test the platform using default credentials (admin/admin) in the following URLs:

Enhancing existing documents within Alfresco 

The AI Applier application, located in the alfresco-ai/alfresco-ai-applier folder, contains a Spring Boot application that retrieves documents stored in an Alfresco folder, obtains the response from the GenAI Service and updates the original document in Alfresco.
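
Under the hood, the call to the GenAI Service is a plain multipart HTTP request. The following is a minimal sketch of that call using Spring's RestTemplate; it is not the applier's actual code. The endpoint URL and the file form field match the curl examples shown earlier, while the class name is purely illustrative.

import org.springframework.core.io.FileSystemResource;
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;
import org.springframework.web.client.RestTemplate;

public class GenAiSummaryClient {
    public static void main(String[] args) {
        RestTemplate restTemplate = new RestTemplate();

        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.MULTIPART_FORM_DATA);

        // Send the PDF rendition of the document as a multipart form field named "file"
        MultiValueMap<String, Object> body = new LinkedMultiValueMap<>();
        body.add("file", new FileSystemResource("./file.pdf"));

        // The GenAI Stack service answers with a JSON object containing
        // the summary, the tags, and the model used
        String response = restTemplate.postForObject(
                "http://localhost:8506/summary",
                new HttpEntity<>(body, headers),
                String.class);

        System.out.println(response);
    }
}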

Before running the application for the first time, you’ll need to build the source code using Maven.

cd alfresco-ai/alfresco-ai-applier
mvn clean package

With the GenAI Service and the Alfresco Platform up and running from the previous steps, we can upload documents to the Alfresco Shared Files/summary folder and run the program to update those documents with a summary.

java -jar target/alfresco-ai-applier-0.8.0.jar \
--applier.root.folder=/app:company_home/app:shared/cm:summary \
--applier.action=SUMMARY
...
Processing 2 documents of a total of 2
END: All documents have been processed. The app may need to be executed again for nodes without existing PDF rendition.

Once the process has been completed, every Alfresco document in the Shared Files/summary folder will include the information obtained by the GenAI Stack service: summary, tags, and LLM used (Figure 5).

Figure 5: The document has been updated in Alfresco Repository with summary, tags and model (LLM).

You can now upload documents to the Alfresco Shared Files/classify folder to prepare the repository for the next step.

The classify action can be applied to documents in the Alfresco Shared Files/classify folder using the following command. The GenAI Service will pick the term from the list (English, Spanish, Japanese) that best matches each document in the folder.

java -jar target/alfresco-ai-applier-0.8.0.jar \
--applier.root.folder=/app:company_home/app:shared/cm:classify \
--applier.action=CLASSIFY \
--applier.action.classify.term.list=English,Spanish,Japanese
...
Processing 2 documents of a total of 2
END: All documents have been processed. The app may need to be executed again for nodes without existing PDF rendition.

Upon completion, every Alfresco document in the Shared Files/classify folder will include the information obtained by the GenAI Stack service: a term from the list of terms and the LLM used (Figure 6).

Figure 6: The document has been updated in Alfresco Repository with term and model (LLM).

To obtain a text description of a picture, create a new folder named picture under the Shared Files folder, upload any image files to it, and run the following command:

java -jar target/alfresco-ai-applier-0.8.0.jar \
--applier.root.folder=/app:company_home/app:shared/cm:picture \
--applier.action=DESCRIBE
...
Processing 1 documents of a total of 1
END: All documents have been processed. The app may need to be executed again for nodes without existing PDF rendition.

Following this process, every Alfresco image in the picture folder will include the information obtained by the GenAI Stack service: a text description and the LLM used (Figure 7).

Figure 7: The document has been updated in Alfresco Repository with text description and model (LLM).

Enhancing new documents uploaded to Alfresco

The AI Listener application, located in the alfresco-ai/alfresco-ai-listener folder, contains a Spring Boot application that listens to Alfresco messages, obtains the response from the GenAI Service and updates the original document in Alfresco.

Before running the application for the first time, you’ll need to build the source code using Maven and build the Docker image.

cd alfresco-ai/alfresco-ai-listener
mvn clean package
docker build . -t alfresco-ai-listener

As we are using the AI Listener application as a container, stop the previous deployment and uncomment the alfresco/compose-ai.yaml line (which includes the alfresco-ai-listener service) in the compose.yaml file.

include:
  - genai-stack/compose.yaml
  - alfresco/compose.yaml
  - alfresco/compose-ai.yaml

Start the stack using the standard command:

docker compose up --build --force-recreate

After the service is again up and ready, the Alfresco Repository becomes accessible. You can verify that the platform is working by using default credentials (admin/admin) in the following URLs:

Summarization

Next, upload a new document and apply the “Summarizable with AI” aspect to the document. After a while, the document will include the information obtained by the GenAI Stack service: summary, tags, and LLM used.

Description

If you want to use AI enhancement, you might want to set up a folder that automatically applies the necessary aspect, instead of doing it manually.

Create a new folder named pictures in Alfresco Repository and create a rule with the following settings in it:

  • Name: description
  • When: Items are created or enter this folder
  • If all criteria are met: All Items
  • Perform Action: Add “Descriptable with AI” aspect

Upload a new picture to this folder. After a while, without manual setting of the aspect, the document will include the information obtained by the GenAI Stack service: description and LLM used.

Classification

Create a new folder named classifiable in the Alfresco Repository. Apply the “Classifiable with AI” aspect to this folder and add a comma-separated list of terms in the “Terms” property (such as English, Japanese, Spanish).

Create a new rule for the classifiable folder with the following settings:

  • Name: classifiable
  • When: Items are created or enter this folder
  • If all criteria are met: All Items
  • Perform Action: Add “Classified with AI” aspect

Upload a new document to this folder. After a while, the document will include the information obtained by the GenAI Stack service: term and LLM used.

A degree of automation can be achieved when using classification with AI. To do this, create a simple Alfresco Repository script named classify.js in the “Repository/Data Dictionary/Scripts” folder with the following content.

// Move the document into the sibling folder whose name matches the term
// assigned by the GenAI Service (stored in the genai:term property)
document.move(
  document.parent.childByNamePath(
    document.properties["genai:term"]));

Create a new rule for the classifiable folder to apply this script, with the following settings:

  • Name: move
  • When: Items are updated
  • If all criteria are met: All Items
  • Perform Action: Execute classify.js script

Create a child folder under the classifiable folder for each term defined in the “Terms” property, using the term as the folder name. 

When you set up this configuration, any documents uploaded to the folder will automatically be moved to a subfolder based on the identified term. This means that the documents are classified automatically.

Prompting

Finally, to use the prompting GenAI feature, apply the “Promptable with AI” aspect to an existing document. Type your question in the “Question” property.

After a while, the document will include the information obtained by the GenAI Stack service: answer and LLM used.

A new era of document management

By embracing this framework, you can not only unlock a new level of efficiency, productivity, and user experience but also lay the foundation for limitless innovation. With Alfresco and GenAI Stack, the possibilities are endless — from enhancing document analysis and automating content classification to revolutionizing search capabilities and beyond.

If you’re unsure about any part of this process, check out the following video, which demonstrates all the steps live:

Learn more

A Promising Methodology for Testing GenAI Applications in Java

In the vast universe of programming, the era of generative artificial intelligence (GenAI) has marked a turning point, opening up a plethora of possibilities for developers.

Tools such as LangChain4j and Spring AI have democratized access to the creation of GenAI applications in Java, allowing Java developers to dive into this fascinating world. With LangChain4j, for instance, setting up and interacting with large language models (LLMs) has become exceptionally straightforward. Consider the following Java code snippet:

public static void main(String[] args) {
    var llm = OpenAiChatModel.builder()
            .apiKey("demo")
            .modelName("gpt-3.5-turbo")
            .build();
    System.out.println(llm.generate("Hello, how are you?"));
}

This example illustrates how a developer can quickly instantiate an LLM within a Java application. By simply configuring the model with an API key and specifying the model name, developers can begin generating text responses immediately. This accessibility is pivotal for fostering innovation and exploration within the Java community. More than that, we have a wide range of models that can be run locally, and various vector databases for storing embeddings and performing semantic searches, among other technological marvels.
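
As a quick sketch of that flexibility, the same style of code can target a locally running model through LangChain4j's Ollama integration (it requires the langchain4j-ollama module). The base URL and model name below are assumptions that happen to match the GenAI Stack defaults: Ollama listening on port 11434 and serving the mistral model.

public static void main(String[] args) {
    // Point LangChain4j at a local Ollama instance instead of OpenAI
    var llm = OllamaChatModel.builder()
            .baseUrl("http://localhost:11434") // default Ollama endpoint (assumed)
            .modelName("mistral")              // any model already pulled into Ollama
            .build();
    System.out.println(llm.generate("Hello, how are you?"));
}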

Despite this progress, however, we are faced with a persistent challenge: the difficulty of testing applications that incorporate artificial intelligence. This aspect seems to be a field where there is still much to explore and develop.

In this article, I will share a methodology that I find promising for testing GenAI applications.


Project overview

The example project focuses on an application that provides an API for interacting with two AI agents capable of answering questions. 

An AI agent is a software entity designed to perform tasks autonomously, using artificial intelligence to simulate human-like interactions and responses. 

In this project, one agent uses direct knowledge already contained within the LLM, while the other leverages internal documentation to enrich the LLM through retrieval-augmented generation (RAG). This approach allows the agents to provide precise and contextually relevant answers based on the input they receive.

I prefer to omit the technical details about RAG, as ample information is available elsewhere. I’ll simply note that this example employs a particular variant of RAG, which simplifies the traditional process of generating and storing embeddings for information retrieval.

Instead of dividing documents into chunks and making embeddings of those chunks, in this project, we will use an LLM to generate a summary of the documents. The embedding is generated based on that summary.

When the user writes a question, an embedding of the question will be generated and a semantic search will be performed against the embeddings of the summaries. If a match is found, the user’s message will be augmented with the original document.

This way, there’s no need to deal with document-chunking configuration, tune the number of chunks to retrieve, or worry about whether the way the user’s message is augmented makes sense. If there is a document that covers what the user is asking about, it will be included in the message sent to the LLM.
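
To make the flow concrete, here is a minimal sketch of the idea using LangChain4j building blocks. The in-memory embedding store and the AllMiniLmL6V2 embedding model are illustrative choices, not necessarily the components used in the example project.

var chatModel = OpenAiChatModel.builder().apiKey("demo").modelName("gpt-3.5-turbo").build();
var embeddingModel = new AllMiniLmL6V2EmbeddingModel();
var store = new InMemoryEmbeddingStore<TextSegment>();

// Ingestion: summarize each document with the LLM and embed the summary,
// keeping the full document text as the stored payload
String document = "...full text of an internal documentation page...";
String summary = chatModel.generate("Summarize the following document in a few sentences:\n" + document);
store.add(embeddingModel.embed(summary).content(), TextSegment.from(document));

// Retrieval: embed the user's question and search against the summary embeddings
String question = "How can I install Testcontainers Desktop?";
var matches = store.findRelevant(embeddingModel.embed(question).content(), 1);
if (!matches.isEmpty()) {
    // Augment the user's message with the original document, not the summary
    String augmented = question
            + "\n\nAnswer using the following documentation:\n"
            + matches.get(0).embedded().text();
    System.out.println(chatModel.generate(augmented));
}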

Technical stack

The project is developed in Java and utilizes a Spring Boot application with Testcontainers and LangChain4j.

For setting up the project, I followed the steps outlined in Local Development Environment with Testcontainers and Spring Boot Application Testing and Development with Testcontainers.

I also use Testcontainers Desktop to facilitate database access, verify the generated embeddings, and review the container logs.

The challenge of testing

The real challenge arises when trying to test the responses generated by language models. Traditionally, we could settle for verifying that the response includes certain keywords, which is insufficient and prone to errors.

static String question = "How I can install Testcontainers Desktop?";

@Test
void verifyRaggedAgentSucceedToAnswerHowToInstallTCD() {
    String answer = restTemplate.getForObject("/chat/rag?question={question}",
            ChatController.ChatResponse.class, question).message();
    assertThat(answer).contains("https://testcontainers.com/desktop/");
}

This approach is not only fragile but also lacks the ability to assess the relevance or coherence of the response.

An alternative is to employ cosine similarity to compare the embeddings of a “reference” response and the actual response, providing a more semantic form of evaluation. 

This method measures the similarity between two vectors/embeddings by calculating the cosine of the angle between them. If both vectors point in the same direction, it means the “reference” response is semantically the same as the actual response.

static String question = "How I can install Testcontainers Desktop?";
static String reference = """
        - Answer must indicate to download Testcontainers Desktop from https://testcontainers.com/desktop/
        - Answer must indicate to use brew to install Testcontainers Desktop in MacOS
        - Answer must be less than 5 sentences
        """;

@Test
void verifyRaggedAgentSucceedToAnswerHowToInstallTCD() {
    String answer = restTemplate.getForObject("/chat/rag?question={question}",
            ChatController.ChatResponse.class, question).message();
    double cosineSimilarity = getCosineSimilarity(reference, answer);
    assertThat(cosineSimilarity).isGreaterThan(0.8);
}
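
The getCosineSimilarity helper is not shown here. One possible implementation, an illustration rather than the project's actual code, embeds both texts with an embedding model and computes the cosine of the angle between the resulting vectors:

static final EmbeddingModel embeddingModel = new AllMiniLmL6V2EmbeddingModel();

static double getCosineSimilarity(String reference, String answer) {
    float[] a = embeddingModel.embed(reference).content().vector();
    float[] b = embeddingModel.embed(answer).content().vector();
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.length; i++) {
        dot += a[i] * b[i];   // dot product of the two vectors
        normA += a[i] * a[i]; // squared magnitude of the reference vector
        normB += b[i] * b[i]; // squared magnitude of the answer vector
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}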

However, this method introduces the problem of selecting an appropriate threshold to determine the acceptability of the response, in addition to the opacity of the evaluation process.

Toward a more effective method

The real problem here arises from the fact that answers provided by the LLM are in natural language and non-deterministic. Because of this, using current testing methods to verify them is difficult, as these methods are better suited to testing predictable values. 

However, we already have a great tool for understanding non-deterministic answers in natural language: LLMs themselves. Thus, the key may lie in using one LLM to evaluate the adequacy of responses generated by another LLM. 

This proposal involves defining detailed validation criteria and using an LLM as a “Validator Agent” to determine whether the responses meet the specified requirements. This approach can be applied to validate answers to specific questions, drawing on both general knowledge and specialized information.

By incorporating detailed instructions and examples, the Validator Agent can provide accurate and justified evaluations, offering clarity on why a response is considered correct or incorrect.

static String question = "How I can install Testcontainers Desktop?";
static String reference = """
        - Answer must indicate to download Testcontainers Desktop from https://testcontainers.com/desktop/
        - Answer must indicate to use brew to install Testcontainers Desktop in MacOS
        - Answer must be less than 5 sentences
        """;

@Test
void verifyStraightAgentFailsToAnswerHowToInstallTCD() {
    String answer = restTemplate.getForObject("/chat/straight?question={question}",
            ChatController.ChatResponse.class, question).message();
    ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
    assertThat(validate.response()).isEqualTo("no");
}

@Test
void verifyRaggedAgentSucceedToAnswerHowToInstallTCD() {
    String answer = restTemplate.getForObject("/chat/rag?question={question}",
            ChatController.ChatResponse.class, question).message();
    ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
    assertThat(validate.response()).isEqualTo("yes");
}

We can even test more complex responses where the LLM should suggest a better alternative to the user’s question.

static String question = "How I can find the random port of a Testcontainer to connect to it?";
static String reference = """
        - Answer must not mention using getMappedPort() method to find the random port of a Testcontainer
        - Answer must mention that you don't need to find the random port of a Testcontainer to connect to it
        - Answer must indicate that you can use the Testcontainers Desktop app to configure fixed port
        - Answer must be less than 5 sentences
        """;

@Test
void verifyRaggedAgentSucceedToAnswerHowToDebugWithTCD() {
    String answer = restTemplate.getForObject("/chat/rag?question={question}",
            ChatController.ChatResponse.class, question).message();
    ValidatorAgent.ValidatorResponse validate = validatorAgent.validate(question, answer, reference);
    assertThat(validate.response()).isEqualTo("yes");
}

Validator Agent

The configuration for the Validator Agent doesn’t differ from that of other agents. It is built using the LangChain4j AI Service and a list of specific instructions:

public interface ValidatorAgent {
    @SystemMessage("""
                ### Instructions
                You are a strict validator.
                You will be provided with a question, an answer, and a reference.
                Your task is to validate whether the answer is correct for the given question, based on the reference.
                
                Follow these instructions:
                - Respond only 'yes', 'no' or 'unsure' and always include the reason for your response
                - Respond with 'yes' if the answer is correct
                - Respond with 'no' if the answer is incorrect
                - If you are unsure, simply respond with 'unsure'
                - Respond with 'no' if the answer is not clear or concise
                - Respond with 'no' if the answer is not based on the reference
                
                Your response must be a json object with the following structure:
                {
                    "response": "yes",
                    "reason": "The answer is correct because it is based on the reference provided."
                }
                
                ### Example
                Question: Is Madrid the capital of Spain?
                Answer: No, it's Barcelona.
                Reference: The capital of Spain is Madrid
                ###
                Response: {
                    "response": "no",
                    "reason": "The answer is incorrect because the reference states that the capital of Spain is Madrid."
                }
                """)
    @UserMessage("""
            ###
            Question: {{question}}
            ###
            Answer: {{answer}}
            ###
            Reference: {{reference}}
            ###
            """)
    ValidatorResponse validate(@V("question") String question, @V("answer") String answer, @V("reference") String reference);

    record ValidatorResponse(String response, String reason) {}
}

As you can see, I’m using Few-Shot Prompting to guide the LLM on the expected responses. I also request a JSON format for responses to facilitate parsing them into objects, and I specify that the reason for the answer must be included, to better understand the basis of its verdict.
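
Wiring the agent is then a one-liner with LangChain4j's AiServices. The snippet below is a minimal sketch: the model choice and temperature are assumptions, and in the Spring Boot project the agent would typically be exposed as a bean rather than created inline.

ChatLanguageModel model = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("gpt-3.5-turbo")
        .temperature(0.0) // keep the validator as deterministic as possible
        .build();

ValidatorAgent validatorAgent = AiServices.create(ValidatorAgent.class, model);

ValidatorAgent.ValidatorResponse verdict = validatorAgent.validate(
        "Is Madrid the capital of Spain?",
        "No, it's Barcelona.",
        "The capital of Spain is Madrid");
System.out.println(verdict.response() + " - " + verdict.reason());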

Conclusion

The evolution of GenAI applications brings with it the challenge of developing testing methods that can effectively evaluate the complexity and subtlety of responses generated by advanced artificial intelligences. 

The proposal to use an LLM as a Validator Agent represents a promising approach, paving the way towards a new era of software development and evaluation in the field of artificial intelligence. Over time, we hope to see more innovations that allow us to overcome the current challenges and maximize the potential of these transformative technologies.

Learn more

Building a Video Analysis and Transcription Chatbot with the GenAI Stack

Videos are full of valuable information, but tools are often needed to help find it. From educational institutions seeking to analyze lectures and tutorials to businesses aiming to understand customer sentiment in video reviews, transcribing and understanding video content is crucial for informed decision-making and innovation. Recently, advancements in AI/ML technologies have made this task more accessible than ever. 

Developing GenAI technologies with Docker opens up endless possibilities for unlocking insights from video content. By leveraging transcription, embeddings, and large language models (LLMs), organizations can gain deeper understanding and make informed decisions using diverse and raw data such as videos. 

In this article, we’ll dive into a video transcription and chat project that leverages the GenAI Stack, along with seamless integration provided by Docker, to streamline video content processing and understanding. 


High-level architecture 

The application’s architecture is designed to facilitate efficient processing and analysis of video content, leveraging cutting-edge AI technologies and containerization for scalability and flexibility. Figure 1 shows an overview of the architecture, which uses Pinecone to store and retrieve the embeddings of video transcriptions. 

Figure 1: Schematic diagram outlining a two-component system for processing and interacting with video data.

The application’s high-level service architecture includes the following:

  • yt-whisper: A local service, run by Docker Compose, that interacts with the remote OpenAI and Pinecone services. Whisper is an automatic speech recognition (ASR) system developed by OpenAI, representing a significant milestone in AI-driven speech processing. Trained on an extensive dataset of 680,000 hours of multilingual and multitask supervised data sourced from the web, Whisper demonstrates remarkable robustness and accuracy in English speech recognition. 
  • Dockerbot: A local service, run by Docker Compose, that interacts with the remote OpenAI and Pinecone services. The service takes the question of a user, computes a corresponding embedding, and then finds the most relevant transcriptions in the video knowledge database. The transcriptions are then presented to an LLM, which takes the transcriptions and the question and tries to provide an answer based on this information.
  • OpenAI: The OpenAI API provides an LLM service, which is known for its cutting-edge AI and machine learning technologies. In this application, OpenAI’s technology is used to generate transcriptions from audio (using the Whisper model) and to create embeddings for text data, as well as to generate responses to user queries (using GPT and chat completions).
  • Pinecone: A vector database service optimized for similarity search, used for building and deploying large-scale vector search applications. In this application, Pinecone is employed to store and retrieve the embeddings of video transcriptions, enabling efficient and relevant search functionality within the application based on user queries.

Getting started

To get started, complete the following steps:

The application is a chatbot that can answer questions from a video. Additionally, it provides timestamps from the video that can help you find the sources used to answer your question.

Clone the repository 

The next step is to clone the repository:

git clone https://github.com/dockersamples/docker-genai.git

The project contains the following directories and files:

├── docker-genai/
│ ├── docker-bot/
│ ├── yt-whisper/
│ ├── .env.example
│ ├── .gitignore
│ ├── LICENSE
│ ├── README.md
│ └── docker-compose.yaml

Specify your API keys

In the /docker-genai directory, create a text file called .env, and specify your API keys inside. The following snippet shows the contents of the .env.example file that you can refer to as an example.

#-------------------------------------------------------------
# OpenAI
#-------------------------------------------------------------
OPENAI_TOKEN=your-api-key # Replace your-api-key with your personal API key

#-------------------------------------------------------------
# Pinecone
#--------------------------------------------------------------
PINECONE_TOKEN=your-api-key # Replace your-api-key with your personal API key

Build and run the application

In a terminal, change directory to your docker-genai directory and run the following command:

docker compose up --build

Next, Docker Compose builds and runs the application based on the services defined in the docker-compose.yaml file. When the application is running, you’ll see the logs of two services in the terminal.

In the logs, you’ll see the services are exposed on ports 8503 and 8504. The two services are complementary to each other.

The yt-whisper service is running on port 8503. This service feeds the Pinecone database with videos that you want to archive in your knowledge database. The next section explores the yt-whisper service.

Using yt-whisper

The yt-whisper service is a YouTube video processing service that uses the OpenAI Whisper model to generate transcriptions of videos and stores them in a Pinecone database. The following steps outline how to use the service.

Open a browser and access the yt-whisper service at http://localhost:8503. Once the application appears, specify a YouTube video URL in the URL field and select Submit. The example shown in Figure 2 uses a video from David Cardozo.

Figure 2: A web interface showcasing processed video content with a feature to download transcriptions.

Submitting a video

The yt-whisper service downloads the audio of the video, then uses Whisper to transcribe it into a WebVTT (*.vtt) format (which you can download). Next, it uses the “text-embedding-3-small” model to create embeddings and finally uploads those embeddings into the Pinecone database.

After the video is processed, a video list appears in the web app that informs you which videos have been indexed in Pinecone. It also provides a button to download the transcript.

Accessing Dockerbot chat service

You can now access the Dockerbot chat service on port 8504 and ask questions about the videos as shown in Figure 3.

Figure 3: Example of a user asking Dockerbot about NVIDIA containers and the application giving a response with links to specific timestamps in the video.

Conclusion

In this article, we explored the exciting potential of GenAI technologies combined with Docker for unlocking valuable insights from video content. We saw how the integration of cutting-edge AI models like Whisper, coupled with efficient database solutions like Pinecone, empowers organizations to transform raw video data into actionable knowledge. 

Whether you’re an experienced developer or just starting to explore the world of AI, the provided resources and code make it simple to embark on your own video-understanding projects. 

Learn more

Introducing a New GenAI Stack: Streamlined AI/ML Integration Made Easy

At DockerCon 2023, with partners Neo4j, LangChain, and Ollama, we announced a new GenAI Stack. We have brought together the top technologies in the generative artificial intelligence (GenAI) space to build a solution that allows developers to deploy a full GenAI stack with only a few clicks.


Here’s what’s included in the new GenAI Stack:

1. Pre-configured LLMs: We provide preconfigured Large Language Models (LLMs), such as Llama2, GPT-3.5, and GPT-4, to jumpstart your AI projects.

2. Ollama management: Ollama simplifies the local management of open source LLMs, making your AI development process smoother.

3. Neo4j as the default database: Neo4j serves as the default database, offering graph and native vector search capabilities. This helps uncover data patterns and relationships, ultimately enhancing the speed and accuracy of AI/ML models. Neo4j also serves as a long-term memory for these models.

4. Neo4j knowledge graphs: Neo4j knowledge graphs to ground LLMs for more precise GenAI predictions and outcomes.

5. LangChain orchestration: LangChain facilitates communication between the LLM, your application, and the database, along with a robust vector index. LangChain serves as a framework for developing applications powered by LLMs. This includes LangSmith, an exciting new way to debug, test, evaluate, and monitor your LLM applications.

6. Comprehensive support: To support your GenAI journey, we provide a range of helpful tools, code templates, how-to guides, and GenAI best practices. These resources ensure you have the guidance you need.

Figure 1: The GenAI Stack guide and access to the GenAI Stack components.

Conclusion

The GenAI Stack simplifies AI/ML integration, making it accessible to developers. Docker’s commitment to fostering innovation and collaboration means we’re excited to see the practical applications and solutions that will emerge from this ecosystem. Join us as we make AI/ML more accessible and straightforward for developers everywhere.

The GenAI Stack is available in Early Access now and is accessible from the Docker Desktop Learning Center or on GitHub.

Participate in our Docker AI/ML Hackathon to show off your most creative AI/ML solutions built on Docker. Read our blog post “Announcing Docker AI/ML Hackathon” to learn more.

At DockerCon 2023, Docker also announced its first AI-powered product, Docker AI. Sign up now for early access to Docker AI. 

Learn more
