A Quick Guide to Containerizing Llamafile with Docker for AI Applications
https://www.docker.com/blog/a-quick-guide-to-containerizing-llamafile-with-docker-for-ai-applications/
Wed, 15 May 2024

This post was contributed by Sophia Parafina.

Keeping pace with the rapid advancements in artificial intelligence can be overwhelming. Every week, new Large Language Models (LLMs), vector databases, and innovative techniques emerge, potentially transforming the landscape of AI/ML development. Our extensive collaboration with developers has uncovered numerous creative and effective strategies to harness Docker in AI development. 

This quick guide shows how to use Docker to containerize llamafile, an executable that bundles everything needed to run an LLM chatbot into a single file. By the end, you'll have a containerized llamafile and a functioning chatbot to experiment with.

Llamafile's combination of packaged LLMs and local execution has sparked strong interest in the GenAI space because it simplifies getting a working LLM chatbot running locally.


Containerize llamafile

Llamafile is a Mozilla project that runs open source LLMs, such as Llama-2-7B, Mistral 7B, or any other model in the GGUF format. The Dockerfile builds and containerizes llamafile, then runs it in server mode. It uses debian:trixie as the base image to build llamafile, and the final output image uses debian:stable as its base.

To get started, copy, paste, and save the following in a file named Dockerfile.

# Use debian trixie for gcc13
FROM debian:trixie as builder

# Set work directory
WORKDIR /download

# Configure build container and build llamafile
RUN mkdir out && \
    apt-get update && \
    apt-get install -y curl git gcc make && \
    git clone https://github.com/Mozilla-Ocho/llamafile.git  && \
    curl -L -o ./unzip https://cosmo.zip/pub/cosmos/bin/unzip && \
    chmod 755 unzip && mv unzip /usr/local/bin && \
    cd llamafile && make -j8 LLAMA_DISABLE_LOGS=1 && \
    make install PREFIX=/download/out

# Create container
FROM debian:stable as out

# Create a non-root user
RUN addgroup --gid 1000 user && \
    adduser --uid 1000 --gid 1000 --disabled-password --gecos "" user

# Switch to user
USER user

# Set working directory
WORKDIR /usr/local

# Copy llamafile and man pages
COPY --from=builder /download/out/bin ./bin
COPY --from=builder /download/out/share/man ./share/man

# Expose 8080 port.
EXPOSE 8080

# Set entrypoint.
ENTRYPOINT ["/bin/sh", "/usr/local/bin/llamafile"]

# Set default command.
CMD ["--server", "--host", "0.0.0.0", "-m", "/model"]

To build the container, run:

docker build -t llamafile .

Running the llamafile container

To run the container, download a model such as Mistral-7B-v0.1. The example below assumes the model has been saved to a local model directory, and the model file is bind-mounted into the container.
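For example, you could fetch the quantized GGUF model from Hugging Face. A minimal sketch, assuming the TheBloke/Mistral-7B-v0.1-GGUF repository and file name are still current:

# Download the Mistral 7B GGUF model into ./model (URL is an assumption; adjust as needed)
mkdir -p model
curl -L -o ./model/mistral-7b-v0.1.Q5_K_M.gguf \
  https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/resolve/main/mistral-7b-v0.1.Q5_K_M.gguf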

$ docker run -d -v ./model/mistral-7b-v0.1.Q5_K_M.gguf:/model -p 8080:8080 llamafile

Once the container is running, open your browser to http://localhost:8080 to load the llama.cpp interface (Figure 1).

Figure 1: The llama.cpp interface, showing configuration options such as prompt, username, prompt template, chat history template, and predictions. Llama.cpp is a C/C++ port of Facebook's LLaMA model by Georgi Gerganov, optimized for efficient LLM inference across various devices, including Apple silicon, with a straightforward setup and advanced performance tuning features.
You can also query the model from the command line through llamafile's OpenAI-compatible chat completions endpoint:

$ curl -s http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
    },
    {
      "role": "user",
      "content": "Compose a poem that explains the concept of recursion in programming."
    }
  ]
}' | python3 -c '
import json
import sys
json.dump(json.load(sys.stdin), sys.stdout, indent=2)
print()
'

Llamafile has many parameters for tuning the model. You can see them with man llamafile or llamafile --help. Parameters can be set in the Dockerfile CMD directive.
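Because the server flags live in the CMD directive rather than the ENTRYPOINT, you can also override them at run time without rebuilding the image. A minimal sketch, assuming the context-size (-c) and thread (-t) flags behave as they do in llama.cpp's server:

# Override the default CMD with extra tuning parameters (flags assumed from llama.cpp)
docker run -d -v ./model/mistral-7b-v0.1.Q5_K_M.gguf:/model -p 8080:8080 \
  llamafile --server --host 0.0.0.0 -m /model -c 2048 -t 4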

Now that you have a containerized llamafile, you can run the container with the LLM of your choice and begin your testing and development journey. 

What’s next?

To continue your AI development journey, read the Docker GenAI guide, review the additional AI content on the blog, and check out our resources.

Improved Docker Container Integration with Java 10
https://www.docker.com/blog/improved-docker-container-integration-with-java-10/
Tue, 03 Apr 2018

Docker and Java

Many applications that run in a Java Virtual Machine (JVM), including data services such as Apache Spark and Kafka and traditional enterprise applications, are run in containers. Until recently, running the JVM in a container presented problems with memory and CPU sizing and usage that led to performance loss, because Java didn't recognize that it was running in a container. With the release of Java 10, the JVM now recognizes constraints set by container control groups (cgroups). Both memory and CPU constraints can be used to manage Java applications directly in containers. These include:

  • adhering to memory limits set in the container
  • setting available CPUs in the container
  • setting CPU constraints in the container

These Java 10 improvements are realized in both Docker for Mac or Windows and Docker Enterprise Edition environments.

Container Memory Limits

Until Java 9, the JVM did not recognize memory or CPU limits set on the container without the use of flags. In Java 10, memory limits are automatically recognized and enforced.

Java defines a server-class machine as one with at least 2 CPUs and 2GB of physical memory, and the default heap size is ¼ of the physical memory. For example, a Docker Enterprise Edition installation has 2GB of memory and 4 CPUs. Compare the difference between containers running Java 8 and Java 10. First, Java 8:

docker container run -it -m512M --entrypoint bash openjdk:latest

$ docker-java-home/bin/java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
    uintx MaxHeapSize                              := 524288000                          {product}
openjdk version "1.8.0_162"

The max heap size is 512MB, which is ¼ of the host's 2GB rather than ¼ of the 512MB limit set on the container. In comparison, running the same commands on Java 10 shows that the JVM respects the memory limit set on the container: the max heap is 134217728 bytes, exactly the expected 128MB (¼ of 512MB):

docker container run -it -m512M --entrypoint bash openjdk:10-jdk

$ docker-java-home/bin/java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
   size_t MaxHeapSize                              = 134217728                                {product} {ergonomic}
openjdk version "10" 2018-03-20

Setting Available CPUs

By default, each container’s access to the host machine’s CPU cycles is unlimited. Various constraints can be set to limit a given container’s access to the host machine’s CPU cycles. Java 10 recognizes these limits:

docker container run -it --cpus 2 openjdk:10-jdk
jshell> Runtime.getRuntime().availableProcessors()
$1 ==> 2

All CPUs allocated to Docker EE get the same proportion of CPU cycles. The proportion can be modified by changing the container's CPU share weighting relative to the weighting of all other running containers. The proportion only applies when CPU-intensive processes are running; when tasks in one container are idle, other containers can use the leftover CPU time. The actual amount of CPU time will vary depending on the number of containers running on the system. CPU shares are also recognized by Java 10:

docker container run -it --cpu-shares 2048 openjdk:10-jdk
jshell> Runtime.getRuntime().availableProcessors()
$1 ==> 2

The cpuset constraint sets which CPUs a container may execute on, and Java 10 recognizes it as well:

docker run -it --cpuset-cpus="1,2,3" openjdk:10-jdk
jshell> Runtime.getRuntime().availableProcessors()
$1 ==> 3

Allocating memory and CPU

With Java 10, container settings can be used to estimate the memory and CPU allocations needed to deploy an application. Let's assume that the memory heap and CPU requirements for each process running in a container have already been determined and JAVA_OPTS is set. For example, suppose an application is distributed across 10 nodes: five nodes require 512MB of memory with 1024 CPU-shares each, and the other five nodes require 256MB with 512 CPU-shares each. Note that one CPU's proportion is represented by 1024 CPU-shares.

For memory, the application would need at least about 4GB allocated:

512MB x 5 = 2.5GB

256MB x 5 = 1.25GB

The application would require 8 CPUs to run efficiently:

(1024/1024) x 5 = 5 CPUs

(512/1024) x 5 = 2.5, rounded up to 3 CPUs

Best practice suggests profiling the application to determine the memory and CPU allocations for each process running in the JVM. However, Java 10 removes the guesswork when sizing containers, preventing out-of-memory errors in Java applications as well as allocating sufficient CPU to process workloads.
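For example, one of the larger nodes above could be started with only the container limits set and no explicit -Xmx, letting Java 10's ergonomics size the heap. A minimal sketch; the image name is a placeholder for your own application image:

# Container limits drive the JVM's heap and CPU ergonomics (image name is hypothetical)
docker container run -d -m512M --cpu-shares 1024 myorg/my-java10-app:latest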




To learn more, check out Docker solutions for Java developers.

 

Multi-Stage Builds
https://www.docker.com/blog/multi-stage-builds/
Wed, 05 Jul 2017

This is part of a series of articles describing how the AtSea Shop application was built using enterprise development tools and Docker. In the previous post, I introduced the AtSea application and how I developed a REST application with the Eclipse IDE and Docker. Multi-stage builds, a Docker feature introduced in Docker 17.05 CE and stable as of 17.06 CE, let you orchestrate a complex build in a single Dockerfile. Before multi-stage builds, Docker users would use a script to compile the applications on the host machine, then use Dockerfiles to build the images. The AtSea application is the perfect use case for a multi-stage build because:

  • it uses Node.js to compile the React app into the storefront
  • it uses Spring Boot and Maven to make a standalone jar file
  • it is deployed to a standalone JDK container
  • the storefront build is then included in the final image

Let’s look at the Dockerfile.

The react-app is an extension of create-react-app. From within the react-app directory we run AtSea’s frontend in local development mode.

The first stage of the build uses a Node base image to create a production-ready frontend build directory consisting of static JavaScript and CSS files. A Docker best practice is to name stages, e.g., FROM node:latest AS storefront.

This stage first sets the image's working directory to /usr/src/atsea/app/react-app. We copy the contents of the react-app directory, which includes the React source and the package.json file, into the image's working directory. Then we use npm to install all of the react-app's node dependencies. Finally, npm run build bundles the React source and its dependencies into a build directory at the root.

FROM node:latest AS storefront
WORKDIR /usr/src/atsea/app/react-app
COPY react-app .
RUN npm install
RUN npm run build

Once this build stage is complete, the builder has an intermediate image named storefront. This temporary image will not show up in your list of images from a docker image ls. Yet the builder can access and choose artifacts from this stage in other stages of the build.
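As an aside, the builder can also stop at a named stage with the --target flag, which is handy when iterating on a single stage. A quick sketch; the tag name below is arbitrary:

# Build only the storefront stage and tag the result
docker build --target storefront -t atsea_storefront .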

To compile the AtSea REST application, we use a Maven image and copy the pom.xml file, which Maven uses to resolve the dependencies. We then copy the source files to the image and run Maven's package command to build the AtSea jar file. This creates another intermediate image called appserver.

FROM maven:latest AS appserver
WORKDIR /usr/src/atsea
COPY pom.xml .
RUN mvn -B -f pom.xml -s /usr/share/maven/ref/settings-docker.xml dependency:resolve
COPY . .
RUN mvn -B -s /usr/share/maven/ref/settings-docker.xml package -DskipTests

Putting it all together, we use a java image to build the final Docker image. The build directory in storefront, created during the first build stage, is copied to the /static directory, defined as an external directory in the AtSea REST application. We are choosing to leave behind all those node modules :).

We copy the AtSea jar file to the java image and set ENTRYPOINT to start the application and set the profile to use a PostgreSQL database. The final image is compact since it only contains the compiled applications in the JDK base image.

FROM java:8-jdk-alpine
WORKDIR /static
COPY --from=storefront /usr/src/atsea/app/react-app/build/ .
WORKDIR /app
COPY --from=appserver /usr/src/atsea/target/AtSea-0.0.1-SNAPSHOT.jar .
ENTRYPOINT ["java", "-jar", "/app/AtSea-0.0.1-SNAPSHOT.jar"]
CMD ["--spring.profiles.active=postgres"]

This step uses the COPY --from command to copy files from the intermediate images. Multi-stage builds can also use offsets instead of named stages, e.g. "COPY --from=0 /usr/src/atsea/app/react-app/build/ ."

Multi-stage builds facilitate the creation of small and significantly more efficient containers since the final image can be free of any build tools. External scripts are no longer needed to orchestrate a build. Instead, an application image is built and started by using docker-compose up --build. A stack is deployed using docker stack deploy -c docker-stack.yml.
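Putting those commands together, a minimal sketch of the workflow (the stack name atsea is an arbitrary choice):

# Build the images and run the application locally
docker-compose up --build

# Or deploy the stack to a swarm
docker stack deploy -c docker-stack.yml atsea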

Multi-Stage Builds in Docker Cloud

Docker Cloud now supports multi-stage builds for automated builds. Linking the GitHub repository to Docker Cloud ensures that your images will always be current. To enable automated builds, tag and push your image to your Docker Cloud repository.

docker tag atsea_app <your username>/atsea_app

docker push <your username>/atsea_app

Log into your Docker Cloud account.


Next, connect your GitHub account to give Docker Cloud access to the source code. Click on Cloud Settings, then click on Sources, and then the plug icon. Follow the directions to connect your GitHub account.


After your GitHub account is connected, click on Repositories in the side menu and then click your atsea_app repository.


Click on Builds, then click on Configure Automated Builds on the following screen.


In the Build Configurations form, complete:

  • the Source Repository with the GitHub account and repository
  • the Build Location; we'll use Docker Cloud with a medium node
  • the Docker Version, using Edge 17.05 CE, which supports multi-stage builds
  • leave Autotest off
  • create a Build Rule that specifies the dockerfile in the app directory of the repository

Click on Save and Build to build the image.


Docker Cloud will notify you if the build was successful.


For more information on multi-stage builds, read the documentation and Docker Captain Alex Ellis' Builder pattern vs. Multi-stage builds in Docker. To build compact and efficient images, watch Abby Fuller's DockerCon 2017 presentation, Creating Effective Images, and check out her slides.

Interested in more? Check out these developer resources and videos from DockerCon 2017.




Spring Boot Development with Docker
https://www.docker.com/blog/spring-boot-development-docker/
Wed, 24 May 2017

The AtSea Shop is an example storefront application that can be deployed on different operating systems and can be customized to both your enterprise development and operational environments. In my last post, I discussed the architecture of the app. In this post, I will cover how to set up your development environment to debug the Java REST backend that runs in a container.

Building the REST Application

I used the Spring Boot framework to rapidly develop the REST backend that manages the products, customers, and orders tables used in the AtSea Shop. The application takes advantage of Spring Boot's built-in application server, support for REST interfaces, and ability to define multiple data sources. Because it is written in Java, it is agnostic to the base operating system and runs in either Windows or Linux containers. This allows developers to build against a heterogeneous architecture.

Project setup

The AtSea project uses multi-stage builds, a new Docker feature, which allows me to use multiple images to build a single Docker image that includes all the components needed for the application. The multi-stage build uses a Maven container to build the application jar file. The jar file is then copied to a Java Development Kit image. This makes for a more compact and efficient image because Maven is not included with the application. Similarly, the React storefront client is built in a Node image, and the compiled application is also added to the final application image.

I used Eclipse to write the AtSea app. If you want info on configuring IntelliJ or NetBeans for remote debugging, you can check out the Docker Labs repository. You can also check out the code in the AtSea app GitHub repository.

I built the application by cloning the repository and importing the project into Eclipse (File > Import > Maven > Existing Maven Projects), setting the Root Directory to the project, and clicking Finish.

Since I used Spring Boot, I took advantage of spring-boot-devtools to do remote debugging in the application. I had to add the Spring Boot devtools dependency to the pom.xml file.

<dependency>
     <groupId>org.springframework.boot</groupId>
     <artifactId>spring-boot-devtools</artifactId>
</dependency>

Note that developer tools are automatically disabled when the application is fully packaged as a jar. To ensure that devtools are available during development, I set the <excludeDevtools> configuration to false in the spring-boot-maven build plugin:

<build>
    <plugins>
        <plugin>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-maven-plugin</artifactId>
            <configuration>
                <excludeDevtools>false</excludeDevtools>
            </configuration>
        </plugin>
    </plugins>
</build>

This example uses a Docker Compose file that creates a simplified build of the containers specifically needed for development and debugging.

version: "3.1"

services:
  database:
    build: 
       context: ./database
    image: atsea_db
    environment:
      POSTGRES_USER: gordonuser
      POSTGRES_DB: atsea
    ports:
      - "5432:5432" 
    networks:
      - back-tier
    secrets:
      - postgres_password

  appserver:
    build:
       context: .
       dockerfile: app/Dockerfile-dev
    image: atsea_app
    ports:
      - "8080:8080"
      - "5005:5005"
    networks:
      - front-tier
      - back-tier
    secrets:
      - postgres_password

secrets:
  postgres_password:
    file: ./devsecrets/postgres_password
    
networks:
  front-tier:
  back-tier:
  payment:
    driver: overlay

The Compose file uses secrets to provision passwords and other sensitive information such as certificates, without relying on environment variables. Although the example uses PostgreSQL, the application can use secrets to connect to any database defined as a Spring Boot datasource. From JpaConfiguration.java:

public DataSourceProperties dataSourceProperties() {
    DataSourceProperties dataSourceProperties = new DataSourceProperties();

    // Read the database password from the Docker secret mounted at /run/secrets.
    try (BufferedReader br = new BufferedReader(new FileReader("/run/secrets/postgres_password"))) {
        StringBuilder sb = new StringBuilder();
        String line = br.readLine();
        while (line != null) {
            sb.append(line);
            sb.append(System.lineSeparator());
            line = br.readLine();
        }
        dataSourceProperties.setDataPassword(sb.toString());
    } catch (IOException e) {
        System.err.println("Could not successfully load DB password file");
    }
    return dataSourceProperties;
}
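For local development, the postgres_password secret is just a file on disk, as declared at the bottom of the Compose file. A quick way to create one for testing (the password value here is only an example):

# Create the dev secret file; printf avoids a trailing newline in the password
mkdir -p devsecrets
printf 'gordonpass' > devsecrets/postgres_password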

Also note that the appserver opens port 5005 for remote debugging, and that the build uses the Dockerfile-dev file to create a container with remote debugging turned on. This is set in the ENTRYPOINT, which specifies the transport and address for the debugger:

ENTRYPOINT ["java", \
  "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005", \
  "-jar", "/app/AtSea-0.0.1-SNAPSHOT.jar"]

Remote Debugging

To start remote debugging on the application, run compose using the docker-compose-dev.yml file.

docker-compose -f docker-compose-dev.yml up --build

Docker will build the images and start the AtSea Shop database and appserver containers. However, the application will not fully load until Eclipse's remote debugger attaches to it. To start remote debugging, click on Run > Debug Configurations …

Select Remote Java Application, then press the New button to create a configuration. In the Debug Configurations panel, give the configuration a name, select the AtSea project, and set the connection properties to the host and port 5005. Click Apply > Debug.


The appserver will start up.

appserver_1 | 2017-05-09 03:22:23.095 INFO 1 --- [main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 8080 (http)

appserver_1 | 2017-05-09 03:22:23.118 INFO 1 --- [main] com.docker.atsea.AtSeaApp                : Started AtSeaApp in 38.923 seconds (JVM running for 109.984)

To test remote debugging, set a breakpoint in ProductController.java where it returns the list of products.


You can test it using curl or your preferred tool for making HTTP requests:

curl -H "Content-Type: application/json" -X GET http://localhost:8080/api/product/

Eclipse will switch to the debug perspective where you can step through the code.


The AtSea Shop example shows how easy it is to make containers part of your normal development environment, using tools that you and your team are already familiar with. Download the application to try out developing with containers, or use it as a basis for your own Spring Boot REST application.

Interested in more? Check out these developer resources and videos from DockerCon 2017.




Live Debugging Java in Docker – Just in time for JavaOne!
https://www.docker.com/blog/java-development-using-docker/
Tue, 20 Sep 2016

Developing Java web applications often requires that they be deployable on multiple technology stacks. These typically include an application server and a database, but the components can vary from deployment to deployment. Building and managing multiple development stacks in a development environment can be a time-consuming task, often requiring unique configurations for each stack.

Docker can simplify the process of building and maintaining development environments for Java web applications by providing custom images that developers can create on demand and use for development, testing, and debugging. We recently published a tutorial for building a Java web application using containers and three popular Java IDEs. Docker enables developers to debug their code as it runs in containers. The tutorial covers setting up a debug session with an application server in Docker using the IDEs developers typically work with, such as Eclipse, IntelliJ IDEA, and NetBeans. Developers can build the application, change code, and set breakpoints while the application is running in the container. The tutorial uses a simple Spring MVC application to illustrate how to use containers when developing Java applications.
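As a rough sketch of the idea, an application server container can be started with its debug port published and the IDE attached to it, for example using Tomcat's built-in JPDA mode (the image tag and JPDA address format are assumptions that may vary by Tomcat version):

# Run Tomcat with JPDA debugging enabled and publish the debugger port
docker run -it -p 8080:8080 -p 8000:8000 \
  -e JPDA_ADDRESS=0.0.0.0:8000 \
  tomcat:9 catalina.sh jpda run

Then attach your IDE's remote debugger to localhost:8000.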

The tutorial is available on GitHub in our Docker Labs repository. These tutorials show you how to:

  • Configure Eclipse, IntelliJ IDEA, and NetBeans
  • Set-up the project
  • Debug your application live in the container

You can go to the tutorials or follow along with the accompanying videos.

The tutorial uses common stack components, but Docker enables you to build development environments from components of different technology stacks. For most use cases, Docker provides a way to quickly create and deploy a consistent Java development environment.

Have any more tips or examples of using Docker with Java? Or other languages? Share them with the community by contributing to the Docker Labs repository.

Based in San Francisco? Join us this Wed Sept 21st at Docker HQ for a Docker for Java Developers meetup with Docker Captain Arun Gupta and Patrick Chanezon.
