General

General configuration options.

keyword

keyword: boolean|string

Enables sparse keyword indexing for this embeddings instance.

When set to True, this parameter creates a BM25 index for full-text search. When set to a string, it expects the name of a keyword method.

It also implicitly disables the defaults setting for vector search.
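As a rough sketch, the two forms of the keyword setting can be written as plain Python dictionaries. The variable names and the "tfidf" method name are assumptions for illustration only; check the scoring documentation for the methods your version supports.

```python
# Boolean form: builds a BM25 index for full-text search
keyword_default = {"keyword": True}

# String form: names a keyword method explicitly
# ("tfidf" is an assumed method name, used only for illustration)
keyword_custom = {"keyword": "tfidf"}

# Typical usage, assuming txtai is installed:
#   from txtai import Embeddings
#   embeddings = Embeddings(**keyword_default)
```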

sparse

sparse: boolean|path

Enables sparse vector indexing for this embeddings instance.

When set to True, this parameter creates a sparse vector index using the default sparse index model. When set to a string, it expects a local or Hugging Face model path.

It also implicitly disables the defaults setting for vector search.

dense

dense: boolean|string

Alias for the vector model path. When set to True, the default transformers vector model is used.

hybrid

hybrid: boolean

Enables hybrid (sparse + dense) indexing for this embeddings instance.

When enabled, this parameter creates a BM25 index for full-text search alongside the dense vector index. It has no effect on the defaults or path settings.
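A minimal sketch of two hybrid configurations, written as plain dictionaries. Because hybrid leaves defaults and path untouched, an explicit path can be combined with it. The model name below is an assumption chosen for illustration, not a recommendation from this page.

```python
# Hybrid index that relies on the default dense vector model
hybrid_default = {"hybrid": True}

# Hybrid index with an explicit dense model; hybrid does not change path
hybrid_custom = {
    "path": "sentence-transformers/all-MiniLM-L6-v2",  # assumed model name
    "hybrid": True,
}

# Typical usage, assuming txtai is installed:
#   from txtai import Embeddings
#   embeddings = Embeddings(**hybrid_custom)
```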

defaults

defaults: boolean

Uses the default vector model path when enabled and path is not provided. This setting is enabled (True) by default. See this link for an example.
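The way defaults interacts with path, keyword, sparse and hybrid, as described on this page, can be summarized in a small helper. This is an illustrative reading of those rules, not txtai's implementation, and the function name is hypothetical.

```python
def uses_default_vector_model(config):
    """Return True when the default vector model path would apply."""
    # An explicit path means the default is not needed
    if config.get("path"):
        return False
    # keyword and sparse implicitly disable the defaults setting
    if config.get("keyword") or config.get("sparse"):
        return False
    # hybrid has no effect here; defaults is True unless set otherwise
    return config.get("defaults", True)


print(uses_default_vector_model({}))                   # True
print(uses_default_vector_model({"keyword": True}))    # False
print(uses_default_vector_model({"hybrid": True}))     # True
print(uses_default_vector_model({"defaults": False}))  # False
```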

indexes

indexes: dict

Key-value pairs defining subindexes for this embeddings instance. Each key is the index name and each value is the full configuration for that subindex, which can use any setting available to a standard embeddings instance.
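A sketch of a configuration with two subindexes, one keyword and one dense. The index names "keyword" and "dense" are arbitrary labels chosen for this example, and disabling defaults at the top level is an assumption about how one might avoid building an extra top-level vector index.

```python
config = {
    # Assumed here so that only the subindexes below are built
    "defaults": False,
    "indexes": {
        # Each key is the subindex name, each value a full configuration
        "keyword": {"keyword": True},
        "dense": {"dense": True},
    },
}

for name, subconfig in config["indexes"].items():
    print(name, subconfig)

# Typical usage, assuming txtai is installed:
#   from txtai import Embeddings
#   embeddings = Embeddings(**config)
```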

autoid

autoid: int|uuid function

Sets the auto id generation method. When this is not set, an autogenerated numeric sequence is used. UUID generation functions are also supported: setting this value to uuid4 generates random UUIDs, and setting it to uuid5 generates deterministic UUIDs for each input data row.
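The difference between random and deterministic ids can be seen with Python's standard uuid module. This sketch only illustrates the two UUID functions; the sample rows are made up, and the choice of NAMESPACE_DNS is an assumption for the example rather than a statement about how ids are derived internally.

```python
import uuid

rows = ["first row", "second row", "first row"]

# uuid4: random ids, so identical rows still get different ids
random_ids = [str(uuid.uuid4()) for _ in rows]

# uuid5: derived from a namespace plus the row content, so identical
# rows get identical ids (NAMESPACE_DNS is assumed for this sketch)
stable_ids = [str(uuid.uuid5(uuid.NAMESPACE_DNS, row)) for row in rows]

print(stable_ids[0] == stable_ids[2])  # True
print(stable_ids[0] == stable_ids[1])  # False

# Corresponding setting, written as a plain dictionary
autoid_config = {"autoid": "uuid5"}
```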

columns

columns:
    text: name of the text column
    object: name of the object column

Sets the text and object column names. Defaults to text and object if not provided.
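As a sketch, the mapping below renames both columns and shows how the configured names would pick fields out of a dictionary row. The column names "body" and "image" and the sample row are hypothetical, and the lookup is only an illustration of the idea, not txtai's own code.

```python
columns_config = {"columns": {"text": "body", "object": "image"}}

# Hypothetical input row using the custom column names
row = {"body": "text to index", "image": None}

# Fall back to the documented defaults when a name is not configured
text_column = columns_config["columns"].get("text", "text")
object_column = columns_config["columns"].get("object", "object")

print(row[text_column])  # text to index
```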

format

format: json|pickle

Sets the configuration storage format. Defaults to json.