General
General configuration options.
keyword
keyword: boolean|string
Enables sparse keyword indexing for this embeddings instance.
When set to a boolean, this parameter creates a BM25 index for full text search. When set to a string, it expects a keyword method.
It also implicitly disables the defaults setting for vector search.
sparse
sparse: boolean|path
Enables sparse vector indexing for this embeddings instance.
When set to True, this parameter creates a sparse vector index using the default sparse index model. When set to a string, it expects a local or Hugging Face model path.
It also implicitly disables the defaults setting for vector search.
dense
dense: boolean|string
Alias for the vector model path. When set to True, the default transformers vector model is used.
hybrid
hybrid: boolean
Enables hybrid (sparse + dense) indexing for this embeddings instance.
When enabled, this parameter creates a BM25 index for full text search. It has no effect on the defaults or path settings.
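The four index settings above can be compared side by side as plain configuration dictionaries. This is a minimal sketch: the `index_modes` name is only for illustration, and the commented `Embeddings(**config)` call assumes txtai is installed and that settings are passed as keyword arguments.

```python
# Each dict is one alternative configuration, not a combined one.
index_modes = {
    # BM25 keyword index for full text search
    "keyword": {"keyword": True},
    # Sparse vector index using the default sparse index model
    "sparse": {"sparse": True},
    # Dense vector index using the default transformers vector model
    "dense": {"dense": True},
    # Hybrid (sparse + dense) indexing
    "hybrid": {"hybrid": True},
}

for name, config in index_modes.items():
    print(f"{name}: {config}")

# With txtai installed, a config would typically be applied as:
#   from txtai import Embeddings
#   embeddings = Embeddings(**index_modes["hybrid"])
```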
defaults
defaults: boolean
Uses the default vector model path when enabled (the default setting is True) and path is not provided. See this link for an example.
indexes
indexes: dict
Key-value pairs defining subindexes for this embeddings instance. Each key is the index name and each value is that subindex's full configuration. A subindex configuration can use any of the settings available to a standard embeddings instance.
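As a rough sketch, a configuration with two subindexes might look like the following. The subindex names `keyword_index` and `dense_index` are hypothetical; each value reuses settings described earlier in this section.

```python
config = {
    "indexes": {
        # Hypothetical subindex names; each value is a full configuration
        "keyword_index": {"keyword": True},
        "dense_index": {"dense": True},
    }
}

# Every subindex value is itself a configuration dictionary
for name, subconfig in config["indexes"].items():
    print(name, "->", subconfig)
```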
autoid
autoid: int|uuid function
Sets the auto id generation method. When this is not set, an autogenerated numeric sequence is used. This also supports UUID generation functions. For example, setting this value to uuid4 will generate random UUIDs. Setting this to uuid5 will generate deterministic UUIDs for each input data row.
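The difference between the two UUID functions can be seen with Python's standard `uuid` module. This is an illustrative sketch, not the library's implementation: the `deterministic_id` helper and the choice of `uuid.NAMESPACE_DNS` as the namespace are assumptions.

```python
import uuid

rows = ["first document", "second document"]

# uuid4 is random: the same row would get a different id on every run
random_ids = [str(uuid.uuid4()) for _ in rows]

def deterministic_id(row):
    """Hypothetical helper: derive an id from the row content with uuid5."""
    # NAMESPACE_DNS is an assumed namespace chosen for this sketch
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, str(row)))

# uuid5 is deterministic: the same row always maps to the same id
stable_ids = [deterministic_id(row) for row in rows]

print(random_ids)
print(stable_ids)
```

Deterministic ids are useful when the same data may be indexed more than once, since a repeated row maps back to the same id.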
columns
columns:
    text: name of the text column
    object: name of the object column
Sets the text and object column names. Defaults to text and object if not provided.
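The defaulting rule can be sketched as follows. The `column_names` helper and the column names `body` and `image` are hypothetical and only illustrate how configured names would override the documented defaults.

```python
# Documented defaults when no column names are configured
DEFAULT_COLUMNS = {"text": "text", "object": "object"}

def column_names(config):
    """Hypothetical helper: merge configured column names over the defaults."""
    return {**DEFAULT_COLUMNS, **config.get("columns", {})}

# Hypothetical configuration that renames both columns
config = {"columns": {"text": "body", "object": "image"}}

row = {"body": "hello world", "image": None}
names = column_names(config)

# Look up the text field using the configured column name
text = row[names["text"]]
print(text)
```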
format
format: json|pickle
Sets the configuration storage format. Defaults to json.
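To illustrate what the two storage formats mean in practice, the sketch below serializes a configuration dictionary with Python's standard `json` and `pickle` modules. The `serialize` helper is hypothetical and is not the library's own storage code.

```python
import json
import pickle

def serialize(config):
    """Hypothetical helper: encode a configuration using its format setting."""
    fmt = config.get("format", "json")  # json is the documented default
    if fmt == "json":
        # Human-readable text, limited to JSON-compatible values
        return json.dumps(config).encode("utf-8")
    if fmt == "pickle":
        # Binary format that can hold arbitrary Python objects
        return pickle.dumps(config)
    raise ValueError(f"Unsupported format: {fmt}")

json_bytes = serialize({"keyword": True})
pickle_bytes = serialize({"keyword": True, "format": "pickle"})

print(json_bytes)
```

Pickle data should only be loaded from trusted sources, since unpickling can execute arbitrary code.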