Reranker

The Reranker pipeline runs an embeddings search and re-ranks the results using a Similarity pipeline.

Example

The following shows a simple example using this pipeline.

from txtai import Embeddings
from txtai.pipeline import Reranker, Similarity

# Embeddings instance
embeddings = Embeddings()
embeddings.load(provider="huggingface-hub", container="neuml/txtai-wikipedia")

# Similarity instance
similarity = Similarity(path="colbert-ir/colbertv2.0", lateencode=True)

# Reranking pipeline
reranker = Reranker(embeddings, similarity)
reranker("Tell me about AI")

Note: Content must be enabled on the embeddings instance for this to work properly, because the reranker scores the stored text of each search result.
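The note above can be illustrated without loading a model. The sketch below assumes the usual txtai result shapes: dictionaries with id, text and score keys when content is enabled, and (id, score) tuples when it is not. The sample values are made up for illustration, and extract_texts is a hypothetical helper that mirrors the expression the reranker uses to collect candidate texts.

```python
# Illustrative search results (made-up values, assumed result shapes)
with_content = [{"id": "0", "text": "AI is a field of computer science", "score": 0.71}]
without_content = [("0", 0.71)]


def extract_texts(result):
    """Hypothetical helper: collects the text of each row, as the reranker does."""
    return [row["text"] for row in result]


# Works when content is enabled
texts = extract_texts(with_content)

# Fails when content is disabled, since tuples cannot be indexed by "text"
try:
    extract_texts(without_content)
    failed = False
except TypeError:
    failed = True

print(texts, failed)
```

Under these assumptions, an index built without content gives the reranker nothing to score, which is why the requirement exists.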

Configuration-driven example

Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.

config.yml

embeddings:

similarity:

# Create pipeline using lower case class name
reranker:

# Run pipeline with workflow
workflow:
  reranker:
    tasks:
      - reranker

Run with Workflows

from txtai import Application

# Create and run pipeline with workflow
app = Application("config.yml")
list(app.workflow("reranker", ["Tell me about AI"]))

Run with API

CONFIG=config.yml uvicorn "txtai.api:app" &

curl \
  -X POST "http://localhost:8000/workflow" \
  -H "Content-Type: application/json" \
  -d '{"name":"reranker", "elements":["Tell me about AI"]}'
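The same request can be sent from Python. This is a minimal sketch using only the standard library. It assumes the API is listening on localhost:8000 as in the command above and that the workflow is registered as reranker, the name used in the Run with Workflows call. build_workflow_request and run_workflow are hypothetical helper names, not part of txtai.

```python
import json
from urllib.request import Request, urlopen


def build_workflow_request(name, elements, base_url="http://localhost:8000"):
    """Builds a POST request for the /workflow endpoint (hypothetical helper)."""
    payload = json.dumps({"name": name, "elements": elements}).encode("utf-8")
    return Request(
        f"{base_url}/workflow",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_workflow(name, elements, base_url="http://localhost:8000"):
    """Sends the request and decodes the JSON response. Requires a running API."""
    with urlopen(build_workflow_request(name, elements, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not need a server; sending it does
request = build_workflow_request("reranker", ["Tell me about AI"])
```

Calling run_workflow("reranker", ["Tell me about AI"]) sends the request once the API process is up.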

Methods

Python documentation for the pipeline.

__init__(embeddings, similarity)

Creates a Reranker pipeline.

Parameters:

Name        Type  Description                                    Default
embeddings        embeddings instance (content must be enabled)  required
similarity        similarity instance                            required

Source code in txtai/pipeline/text/reranker.py
def __init__(self, embeddings, similarity):
    """
    Creates a Reranker pipeline.

    Args:
        embeddings: embeddings instance (content must be enabled)
        similarity: similarity instance
    """

    self.embeddings, self.similarity = embeddings, similarity

__call__(query, limit=3, factor=10, **kwargs)

Runs an embeddings search and re-ranks the results using a Similarity pipeline.

Parameters:

Name    Type  Description                                                    Default
query         query text|list                                                required
limit         maximum results                                                3
factor        factor to multiply limit by for the initial embeddings search  10
kwargs        additional arguments to pass to embeddings search              {}
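The interaction between limit and factor is easiest to see with numbers. The sketch below is not txtai code. candidate_pool is a hypothetical helper that models only the size of the first-stage search, and the document ids are made up.

```python
def candidate_pool(first_stage_ids, limit=3, factor=10):
    """Returns the slice of first-stage results that the re-ranking stage gets to see."""
    return first_stage_ids[: limit * factor]


# Hypothetical first-stage ranking of 50 documents, best match first
first_stage = [f"doc{i}" for i in range(50)]

default_pool = candidate_pool(first_stage)           # limit=3, factor=10
narrow_pool = candidate_pool(first_stage, factor=2)  # limit=3, factor=2

print(len(default_pool), len(narrow_pool))  # prints: 30 6
```

With the defaults, a document ranked 30th by the embeddings search can still reach the final top 3. With factor=2, only the first 6 candidates are eligible. A larger factor widens the pool at the cost of more texts to score in the similarity step.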

Returns:

Type Description

list of query results rescored using a Similarity pipeline

Source code in txtai/pipeline/text/reranker.py
def __call__(self, query, limit=3, factor=10, **kwargs):
    """
    Runs an embeddings search and re-ranks the results using a Similarity pipeline.

    Args:
        query: query text|list
        limit: maximum results
        factor: factor to multiply limit by for the initial embeddings search
        kwargs: additional arguments to pass to embeddings search

    Returns:
        list of query results rescored using a Similarity pipeline
    """

    queries = [query] if not isinstance(query, list) else query

    # Run searches
    results = self.embeddings.batchsearch(queries, limit * factor, **kwargs)

    # Re-rank using similarity pipeline
    ranked = []
    for x, result in enumerate(results):
        texts = [row["text"] for row in result]

        # Score results and merge
        for uid, score in self.similarity(queries[x], texts):
            result[uid]["score"] = score

        # Sort and take top n sorted results
        ranked.append(sorted(result, key=lambda row: row["score"], reverse=True)[:limit])

    return ranked[0] if isinstance(query, str) else ranked
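The flow of the method above can be traced end to end with stand-ins, which avoids downloading an index or a model. In this sketch StubEmbeddings, overlap_similarity and rerank are hypothetical names. The stub returns a fixed first-stage ranking, the scorer counts shared words instead of running a real Similarity model, and rerank repeats the logic of __call__ as a plain function.

```python
class StubEmbeddings:
    """Stand-in for an embeddings index with content enabled (hypothetical)."""

    def __init__(self, rows):
        self.rows = rows

    def batchsearch(self, queries, limit, **kwargs):
        # Same fixed ranking for every query, truncated to limit. Rows are copied
        # so that overwriting scores does not change the stored data.
        return [[dict(row) for row in self.rows[:limit]] for _ in queries]


def overlap_similarity(query, texts):
    """Toy scorer: number of words shared with the query, as (index, score) pairs."""
    terms = set(query.lower().split())
    scores = [(i, float(len(terms & set(text.lower().split())))) for i, text in enumerate(texts)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def rerank(embeddings, similarity, query, limit=3, factor=10, **kwargs):
    """Same steps as the method above, written as a function for illustration."""
    queries = [query] if not isinstance(query, list) else query
    results = embeddings.batchsearch(queries, limit * factor, **kwargs)

    ranked = []
    for x, result in enumerate(results):
        texts = [row["text"] for row in result]
        for uid, score in similarity(queries[x], texts):
            result[uid]["score"] = score
        ranked.append(sorted(result, key=lambda row: row["score"], reverse=True)[:limit])

    return ranked[0] if isinstance(query, str) else ranked


# Made-up first-stage results, best embeddings score first
rows = [
    {"id": "a", "text": "weather report for today", "score": 0.9},
    {"id": "b", "text": "history of ai research", "score": 0.8},
    {"id": "c", "text": "tell me about ai systems", "score": 0.7},
]
embeddings = StubEmbeddings(rows)

top = rerank(embeddings, overlap_similarity, "tell me about ai", limit=2, factor=2)
batch = rerank(embeddings, overlap_similarity, ["tell me about ai"], limit=2, factor=2)

print([row["id"] for row in top])  # prints: ['c', 'b']
```

The row that ranked last in the first stage moves to the top once the scores are replaced. A string query returns a flat list, while a list of queries returns one list per query.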