Cloud

Scalable cloud-native applications can be built with txtai. The following cloud runtimes are supported.

  • Container Orchestration Systems (e.g. Kubernetes)
  • Docker Engine
  • Serverless Compute
  • Hosted txtai service (coming in 2024)

Images for txtai are available on Docker Hub for CPU and GPU installs. The CPU image is recommended when GPUs aren't available, since it is half the size of the GPU image.

The base txtai images have no models installed, so models are downloaded each time the container starts. Caching the models is recommended because it significantly reduces container start times. This can be done in a couple of ways.

  • Create a container with the models cached
  • Set the transformers cache environment variable and mount that volume when starting the image
    docker run -v <local dir>:/models -e TRANSFORMERS_CACHE=/models --rm -it <docker image>
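The volume-mount option above can be wrapped in a small launcher. The following is a minimal sketch, not part of txtai: the `build_run_command` helper and the script layout are hypothetical, and the only flags used are the ones from the command above. It creates the host cache directory before mounting it, since Docker typically creates a missing bind-mount directory owned by root.

```python
import subprocess
import sys
from pathlib import Path


def build_run_command(cache_dir, image):
    """Return the docker run argument list with a host model cache mounted.

    Hypothetical helper: mirrors the docker run command shown above.
    """
    # Create the host directory up front so it is owned by the current user
    host_dir = Path(cache_dir).expanduser().resolve()
    host_dir.mkdir(parents=True, exist_ok=True)

    return [
        "docker", "run",
        "-v", f"{host_dir}:/models",         # mount the host cache at /models
        "-e", "TRANSFORMERS_CACHE=/models",  # point the model cache at the mount
        "--rm", "-it", image,
    ]


# Usage: python run_cached.py <local dir> <docker image>
if __name__ == "__main__" and len(sys.argv) == 3:
    command = build_run_command(sys.argv[1], sys.argv[2])
    print(" ".join(command))
    subprocess.run(command, check=True)
```

The first start still downloads the models into the mounted directory. Later starts reuse them as long as the same directory is mounted.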

Build txtai images

The txtai images found on Docker Hub are configured to support most situations. These images can also be built locally with different options as desired.

Example build commands are shown below.

# Get Dockerfile

# Build Ubuntu 20.04 image running Python 3.8
docker build -t txtai --build-arg BASE_IMAGE=ubuntu:20.04 --build-arg PYTHON_VERSION=3.8 .

# Build image with GPU support
docker build -t txtai --build-arg GPU=1 .

# Build minimal image with the base txtai components
docker build -t txtai --build-arg COMPONENTS= .

Container image model caching

As mentioned previously, model caching is recommended to reduce container start times. The following commands demonstrate this. In all cases, it is assumed a config.yml file is present in the local directory with the desired configuration set.


This section builds an image that caches models and starts an API service. The config.yml file should be configured with the desired components to expose via the API.

The following is a sample config.yml file that creates an Embeddings API service.

# config.yml
writable: true

embeddings:
  path: sentence-transformers/nli-mpnet-base-v2
  content: true

The next section builds the image and starts an instance.

# Get Dockerfile

# CPU build
docker build -t txtai-api .

# GPU build
docker build -t txtai-api --build-arg BASE_IMAGE=neuml/txtai-gpu .

# Run
docker run -p 8000:8000 --rm -it txtai-api
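Once the container is running, the service can be exercised over HTTP. The following is a minimal sketch using only the Python standard library. It assumes the API is reachable at `http://localhost:8000` (matching the `-p 8000:8000` mapping above) and that the standard txtai API endpoints `/add`, `/index` and `/search` are exposed. The helper names and the sample document are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: matches the -p 8000:8000 port mapping used to start the container
BASE_URL = "http://localhost:8000"


def search_url(query, limit=3, base=BASE_URL):
    """Build the URL for a search request."""
    return f"{base}/search?{urlencode({'query': query, 'limit': limit})}"


def call(url, payload=None):
    """Send a GET request, or a JSON POST when a payload is given."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    with urlopen(Request(url, data=data, headers=headers), timeout=30) as response:
        body = response.read()
        return json.loads(body) if body else None


def demo():
    """Add one document, build the index and run a search."""
    # Adding and indexing require writable: true in config.yml
    call(f"{BASE_URL}/add", [{"id": "0", "text": "Maine man wins $1M from lottery ticket"}])
    call(f"{BASE_URL}/index")
    print(call(search_url("feel good story", limit=1)))
```

Calling `demo()` against a running container should print the best match for the query. With `content: true` in the configuration, results are expected to include the stored text alongside the id and score.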


This section builds a scheduled workflow service. See the workflow schedule documentation for more on scheduled workflows.

# Get Dockerfile

# CPU build
docker build -t txtai-service .

# GPU build
docker build -t txtai-service --build-arg BASE_IMAGE=neuml/txtai-gpu .

# Run
docker run --rm -it txtai-service


This section builds a single-run workflow. Example workflows can be found here.

# Get Dockerfile

# CPU build
docker build -t txtai-workflow .

# GPU build
docker build -t txtai-workflow --build-arg BASE_IMAGE=neuml/txtai-gpu .

# Run
docker run --rm -it txtai-workflow <workflow name> <workflow parameters>

Serverless Compute

One of the most powerful features of txtai is building YAML-configured applications with the "build once, run anywhere" approach. API instances and workflows can run locally, on a server, on a cluster or serverless.

Serverless instances of txtai are supported on frameworks such as AWS Lambda, Google Cloud Functions, Azure Functions and Kubernetes with Knative.

AWS Lambda

The following steps show a basic example of how to build a serverless API instance with AWS SAM.

  • Create config.yml and template.yml

# config.yml
writable: true

embeddings:
  path: sentence-transformers/nli-mpnet-base-v2
  content: true

# template.yml
Resources:
  # Logical resource ID, the name is arbitrary
  txtai:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image
      MemorySize: 3000
      Timeout: 20
      Events:
        # Event name, also arbitrary
        Api:
          Type: Api
          Properties:
            Path: "/{proxy+}"
            Method: ANY
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: ./
      DockerTag: api

# Get Dockerfile and application

# Build the docker image
sam build

# Start API gateway and Lambda instance locally
sam local start-api -p 8000 --warm-containers LAZY

# Verify instance running (should return 0)
curl http://localhost:8000/count

If successful, a local API instance is now running in a "serverless" fashion. This configuration can be deployed to AWS using SAM. See the AWS SAM documentation for more information.
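With `--warm-containers LAZY`, the container is only started on the first invocation, so the first request may be slow or fail while the image starts and models load. The following is a minimal sketch of a readiness check that retries the `/count` endpoint. The function names are hypothetical, and the base URL is an assumption that should match the port passed to `sam local start-api -p`.

```python
import json
import time
from urllib.request import urlopen


def fetch_count(base_url):
    """Call the /count endpoint and return the parsed JSON value."""
    with urlopen(f"{base_url}/count", timeout=60) as response:
        return json.loads(response.read())


def wait_for_count(base_url, fetch=fetch_count, attempts=10, delay=5.0):
    """Retry /count until it answers; return the count or raise TimeoutError."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch(base_url)
        except OSError as error:
            # Covers connection errors, HTTP errors and socket timeouts
            last_error = error
            time.sleep(delay)
    raise TimeoutError(f"no response from {base_url}/count") from last_error
```

For example, `wait_for_count("http://localhost:8000")` should return 0 for a freshly started instance with an empty index. The `fetch` parameter exists so the retry logic can be tested without a running service.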

Kubernetes with Knative

txtai scales with container orchestration systems. These systems can be self-hosted or run through a cloud provider's managed service, such as Amazon Elastic Kubernetes Service, Google Kubernetes Engine or Azure Kubernetes Service. Smaller providers also offer managed Kubernetes.

A full example covering how to build a serverless txtai application on Kubernetes with Knative can be found here.

A hosted txtai service is a planned effort that will offer an easy and secure way to run hosted txtai applications.