Skip to content

All-in-one embeddings database

Version GitHub last commit GitHub issues Join Slack Build Status Coverage Status

txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

architecture architecture

Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases. This enables vector search with SQL, topic modeling, retrieval augmented generation and more.

Embeddings databases can stand on their own and/or serve as a powerful knowledge source for large language model (LLM) prompts.

Summary of txtai features:

  • 🔎 Vector search with SQL, object storage, topic modeling, graph analysis and multimodal indexing
  • 📄 Create embeddings for text, documents, audio, images and video
  • 💡 Pipelines powered by language models that run LLM prompts, question-answering, labeling, transcription, translation, summarization and more
  • ↪️️ Workflows to join pipelines together and aggregate business logic. txtai processes can be simple microservices or multi-model workflows.
  • ⚙️ Build with Python or YAML. API bindings available for JavaScript, Java, Rust and Go.
  • ☁️ Run local or scale out with container orchestration

txtai is built with Python 3.8+, Hugging Face Transformers, Sentence Transformers and FastAPI. txtai is open-source under an Apache 2.0 license.

Interested in an easy and secure way to run hosted txtai applications? Then join the txtai.cloud preview to learn more.