FAQ
Below is a list of frequently asked questions and common issues encountered.
Questions
Question
What models are recommended?
Answer
See the model guide.
Question
What is the best way to track the progress of an embeddings.index call?
Answer
Wrap the list or generator passed to the index call with tqdm. See #478 for more.
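A minimal sketch of the tqdm approach follows. The helper name index_with_progress and its total parameter are hypothetical and only serve the illustration; the one assumption about txtai is that embeddings.index accepts any iterable of documents, as the answer above states.

```python
def index_with_progress(embeddings, documents, total=None):
    """Index documents while showing a tqdm progress bar.

    embeddings: any object with an index(iterable) method, such as a txtai Embeddings instance
    documents: list or generator of documents to index
    total: number of documents, needed for a percentage when documents is a generator
    """
    # Imported here so the helper can be defined even when tqdm is not installed
    from tqdm import tqdm

    # tqdm yields the same items unchanged and updates the bar as the index call consumes them
    embeddings.index(tqdm(documents, total=total))
```

With a list, tqdm infers the length on its own. With a generator, pass total if the count is known; otherwise the bar shows only a running count and rate.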
Question
What is the best way to analyze the content of a txtai index?
Answer
txtai has a console application that makes this easy. Read this article to learn more.
Question
How can models be externally loaded and passed to embeddings and pipelines?
Answer
Embeddings example.
from transformers import AutoModel, AutoTokenizer
from txtai import Embeddings
# Load model externally
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
# Pass to embeddings instance
embeddings = Embeddings(path=model, tokenizer=tokenizer)
LLM pipeline example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from txtai import LLM
# Load Phi 3.5-mini
path = "microsoft/Phi-3.5-mini-instruct"
model = AutoModelForCausalLM.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(path)
llm = LLM((model, tokenizer))
Common issues
Issue
Embeddings query errors like this:
SQLError: no such function: json_extract
Solution
Upgrade to a newer Python version. The SQLite library bundled with the current one does not support json_extract.
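To confirm whether the running interpreter is affected, a quick check along these lines may help. It uses only the standard library sqlite3 module; the function name has_json_extract is made up for this sketch.

```python
import sqlite3


def has_json_extract():
    """Return True if the SQLite library linked to this Python supports json_extract."""
    connection = sqlite3.connect(":memory:")
    try:
        # Raises sqlite3.OperationalError ("no such function") when JSON support is missing
        connection.execute("SELECT json_extract('{\"a\": 1}', '$.a')").fetchone()
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        connection.close()


print("SQLite library version:", sqlite3.sqlite_version)
print("json_extract available:", has_json_extract())
```

If the second line prints False, the interpreter is linked against a SQLite build without JSON functions, which matches the error above.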
Issue
Segmentation faults and similar errors on macOS
Solution
Set the following environment variables.
- OpenMP threading is handled internally on macOS platforms but it can be disabled via
export OMP_NUM_THREADS=1
- Disable PyTorch MPS device via
export PYTORCH_MPS_DISABLE=1
- Disable llama.cpp metal via
export LLAMA_NO_METAL=1
For more details, refer to this issue on GitHub.
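When exporting variables in the shell is inconvenient, for example in a notebook, the same settings can be applied from Python. This is a sketch under one assumption: the variables must be in place before the affected libraries are imported, so the code belongs at the very top of the script. The names MACOS_WORKAROUNDS and apply_macos_workarounds are hypothetical.

```python
import os

# Same variables as the shell exports listed above
MACOS_WORKAROUNDS = {
    "OMP_NUM_THREADS": "1",
    "PYTORCH_MPS_DISABLE": "1",
    "LLAMA_NO_METAL": "1",
}


def apply_macos_workarounds(environ=os.environ):
    """Set each workaround variable unless a value is already present."""
    for name, value in MACOS_WORKAROUNDS.items():
        # setdefault keeps any value the user exported explicitly
        environ.setdefault(name, value)
    return {name: environ[name] for name in MACOS_WORKAROUNDS}


# Assumption: call this before importing txtai, torch or llama.cpp bindings
apply_macos_workarounds()
```

Using setdefault means a value exported in the shell still wins over the defaults in the script.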
Issue
Error running SQLite ANN on macOS
AttributeError: 'sqlite3.Connection' object has no attribute 'enable_load_extension'
Solution
See this note for options on how to fix this.
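Before trying any of those options, it can be useful to verify that the current interpreter is the cause. The sketch below relies only on the standard library; supports_extensions is an illustrative name. Python builds compiled without loadable extension support do not define enable_load_extension on sqlite3.Connection, which is what triggers the AttributeError above.

```python
import sqlite3


def supports_extensions():
    """Return True if this Python build allows loading SQLite extensions."""
    # The method is absent from the class when Python was compiled without extension support
    return hasattr(sqlite3.Connection, "enable_load_extension")


print("SQLite extension loading supported:", supports_extensions())
```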
Issue
ContextualVersionConflict and/or package METADATA exception while running one of the example notebooks on Google Colab
Solution
Restart the kernel. See issue #409 for details.
Issue
Error installing optional/extra dependencies such as pipeline
Solution
The default macOS shell (zsh) and Windows PowerShell require quoting or escaping square brackets:
pip install 'txtai[pipeline]'