Translation
The Translation pipeline translates text between languages. It supports more than 100 languages, with automatic source language detection built in. The pipeline detects the language of each input text row, loads a model for the source-target combination and translates the text to the target language.
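The detect, load and translate flow described above can be sketched in plain Python. This is an illustrative outline only, not the txtai implementation: `detect` and `load` are hypothetical stand-ins for a real language detector and model loader, and passing rows already in the target language through unchanged is an assumption.

```python
from collections import defaultdict

def detect(text):
    # Hypothetical detector: a toy heuristic, for illustration only
    return "es" if "ñ" in text or "¿" in text else "en"

def load(source, target):
    # Hypothetical model loader: returns a callable that tags each text
    # with the language pair instead of actually translating it
    return lambda texts: [f"[{source}->{target}] {text}" for text in texts]

def translate(texts, target):
    # Detect the language of each row and group row indices by language
    groups = defaultdict(list)
    for index, text in enumerate(texts):
        groups[detect(text)].append(index)

    results = [None] * len(texts)
    for source, indices in groups.items():
        batch = [texts[index] for index in indices]

        # Assumption: rows already in the target language are left as-is
        outputs = batch if source == target else load(source, target)(batch)

        # Write outputs back in the original row order
        for index, output in zip(indices, outputs):
            results[index] = output

    return results
```

Grouping rows by detected language means each source-target model is loaded and called once per batch rather than once per row.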
Example
The following shows a simple example using this pipeline.
from txtai.pipeline import Translation
# Create and run pipeline
translate = Translation()
translate("This is a test translation into Spanish", "es")
See the link below for a more detailed example.
Notebook | Description |
---|---|
Translate text between languages | Streamline machine translation and language detection |
Configuration-driven example
Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.
config.yml
# Create pipeline using lower case class name
translation:

# Run pipeline with workflow
workflow:
  translate:
    tasks:
      - action: translation
        args: ["es"]
Run with Workflows
from txtai import Application
# Create and run pipeline with workflow
app = Application("config.yml")
list(app.workflow("translate", ["This is a test translation into Spanish"]))
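The configuration can also be built in Python rather than read from a file. The sketch below assumes `Application` accepts a dictionary in place of a file path. The `build` and `run` helpers are hypothetical names used only for illustration.

```python
def build(target="es"):
    """Builds a dictionary that mirrors the config.yml shown above."""
    return {
        # No settings, same as the bare `translation:` key in YAML
        "translation": None,
        "workflow": {
            "translate": {
                "tasks": [{"action": "translation", "args": [target]}]
            }
        },
    }

def run(texts, target="es"):
    # Imported lazily so build() can be used without txtai installed
    from txtai import Application

    # Assumption: Application accepts a dict as well as a file path
    app = Application(build(target))
    return list(app.workflow("translate", texts))
```

Building the dictionary in code makes it straightforward to change the target language without editing the file.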
Run with API
CONFIG=config.yml uvicorn "txtai.api:app" &
curl \
-X POST "http://localhost:8000/workflow" \
-H "Content-Type: application/json" \
-d '{"name":"translate", "elements":["This is a test translation into Spanish"]}'
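The same request can be sent from Python. This is a minimal sketch using only the standard library, assuming the API server is running at the address used in the curl command. The `build` and `run` helpers are hypothetical names, not part of txtai.

```python
import json
from urllib.request import Request, urlopen

def build(name, elements, url="http://localhost:8000/workflow"):
    """Builds a POST request with the same JSON body as the curl command."""
    payload = json.dumps({"name": name, "elements": elements}).encode("utf-8")
    return Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def run(name, elements):
    # Requires the API server to be running
    with urlopen(build(name, elements)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run("translate", ["This is a test translation into Spanish"])` sends the request and returns the parsed JSON response.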
Methods
Python documentation for the pipeline.
__init__(path=None, quantize=False, gpu=True, batch=64, langdetect=None, findmodels=True)
Constructs a new language translation pipeline.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | | optional path to model, accepts Hugging Face model hub id or local path, uses default model for task if not provided | None |
quantize | | if model should be quantized, defaults to False | False |
gpu | | True/False if GPU should be enabled, also supports a GPU device id | True |
batch | | batch size used to incrementally process content | 64 |
langdetect | | set a custom language detection function, method must take a list of strings and return language codes for each, uses default language detector if not provided | None |
findmodels | | True/False if the Hugging Face Hub will be searched for source-target translation models | True |
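As a rough illustration of the `langdetect` parameter, the sketch below defines a function with the documented shape: it takes a list of strings and returns one language code per string. The heuristic itself is a deliberately naive placeholder. `detect` and `create` are hypothetical names.

```python
def detect(texts):
    """Hypothetical detector: returns one language code per input string."""
    codes = []
    for text in texts:
        # Toy heuristic, for illustration only: characters common in
        # Spanish mark the text as Spanish, everything else as English
        spanish = any(char in "ñ¿¡áéíóú" for char in text.lower())
        codes.append("es" if spanish else "en")
    return codes

def create():
    # Imported lazily so detect() can be used without txtai installed
    from txtai.pipeline import Translation

    # Pass the custom detector through the langdetect parameter
    return Translation(langdetect=detect)
```

A real detector would replace the heuristic, keeping the same list-in, list-out contract.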
Source code in txtai/pipeline/text/translation.py
__call__(texts, target='en', source=None, showmodels=False)
Translates text from source language into target language.
This method accepts texts as a string or a list. If the input is a string, the return type is a string. If the input is a list, the return type is a list.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
texts | | text or list | required |
target | | target language code, defaults to "en" | 'en' |
source | | source language code, detects language if not provided | None |
Returns:
Type | Description |
---|---|
string or list | translated text, a string for string input or a list for list input |
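Because the return type follows the input type, calling code that handles both cases may want to normalize the result. The helper below is a small sketch of one way to do that. `translate_list` is a hypothetical name, and `pipeline` is any callable with the `(texts, target, source)` signature shown above, such as a `Translation` instance.

```python
def translate_list(pipeline, texts, target="en", source=None):
    """Calls the pipeline and always returns a list of translated text."""
    results = pipeline(texts, target, source)

    # A string input yields a string result, so wrap it in a list
    return [results] if isinstance(texts, str) else list(results)
```

With this wrapper, downstream code can iterate over the results without checking whether a single string or a list was passed in.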
Source code in txtai/pipeline/text/translation.py