Translation
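This guide explains the overall process for translating speech transcripts and subtitles with Limecraft Flow.
The process consists of the following steps: get the supported languages, create a clip, retrieve the source annotations, start the translation workflow, follow up on its status, and retrieve the translation results.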
Get supported languages for translation
First let’s learn which languages the system can translate between. This information can be retrieved using this call:
GET /production/{prId}/translation/language
List the available languages for translation.
Details
Description
Parameters
Path Parameters
Name | Description | Required | Type |
---|---|---|---|
prId | ID of the production. | ✔ | Long |
Return Type
String
Content Type
application/json
Responses
Code | Description | Datatype |
---|---|---|
200 | The request was successful. | |
403 | The user needs START_TRANSLATION_WORKFLOW rights. | |
404 | The production was not found. | |
The response is a list of language code strings. E.g.:
[
"en-GB",
"en-US",
"nl"
]
In the remainder of this guide, we will talk about the 'source language' (of the annotations we want to translate from) and the 'target language'. Both languages have to be in this list for the translation to be supported.
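As a minimal sketch of this check, the Python snippet below fetches the language list and verifies that both languages are in it. Only the endpoint path comes from this guide; the base URL, the bearer-token header and the helper names are assumptions for illustration.

```python
import json
import urllib.request


def languages_url(base_url: str, production_id: int) -> str:
    """Build the URL of the supported-languages call for a production."""
    return f"{base_url.rstrip('/')}/production/{production_id}/translation/language"


def fetch_supported_languages(base_url: str, production_id: int, token: str) -> list[str]:
    """GET the list of language codes (the auth scheme is an assumption)."""
    request = urllib.request.Request(
        languages_url(base_url, production_id),
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def can_translate(supported: list[str], source: str, target: str) -> bool:
    """Both the source and the target language must be in the supported list."""
    return source in supported and target in supported
```

With the example response above, `can_translate(supported, "en-GB", "nl")` holds, while a target of `"de"` would be rejected because it is not in that list.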
Create a clip
We assume a clip has already been uploaded to the Limecraft platform. If not, refer to Create And Upload MediaObjects to learn how to do this.
Retrieve the source annotations
Limecraft Flow translation operates on Annotations. We need the id of each annotation we want to translate. The sections below describe how to obtain it for subtitles and for speech transcriptions, respectively.
Subtitle
We assume the clip already has subtitles in the source language available on the Limecraft platform. If not, refer to Subtitling for more info.
See Retrieve a subtitle to learn how to fetch the subtitle in the source language. Note that the language of the Subtitle annotation should be in the supported language list retrieved earlier.
The response is a single Subtitle annotation. We’ll need its id for the annotationIds list parameter when starting the translation workflow in the next section.
Transcription
We assume the clip already has a transcript in the source language available on the Limecraft platform. If not, refer to Transcription for more info.
See Retrieve the transcript to learn how to fetch the transcript in the source language. Note that the language of the TranscriptAnnotations should be in the supported language list retrieved earlier.
The response is a list of TranscriptAnnotations. We’ll need the id of each of them for the annotationIds list parameter when starting the translation workflow in the next section.
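Collecting the ids can be sketched as follows. The snippet assumes each annotation has been parsed from JSON into a dict carrying the id and language fields mentioned above; the helper name and the sample values (including the objectType strings) are hypothetical.

```python
def collect_annotation_ids(annotations: list[dict], source_language: str) -> list[int]:
    """Return the ids of the annotations written in the source language."""
    return [a["id"] for a in annotations if a.get("language") == source_language]


# A single Subtitle annotation is wrapped in a list ...
subtitle = {"id": 922076, "objectType": "Subtitle", "language": "en-GB"}
subtitle_ids = collect_annotation_ids([subtitle], "en-GB")

# ... while a transcript is already a list of TranscriptAnnotations.
transcript = [
    {"id": 101, "objectType": "TranscriptAnnotation", "language": "en-GB"},
    {"id": 102, "objectType": "TranscriptAnnotation", "language": "en-GB"},
]
transcript_ids = collect_annotation_ids(transcript, "en-GB")
```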
Start the translation workflow
Starting the translation workflow is done using this call:
POST /production/{prId}/mo/{moId}/translate
A call for starting a translation workflow.
Details
Description
Depending on the body of the query, you can choose between different translation engines.
Parameters
Path Parameters
Name | Description | Required | Type |
---|---|---|---|
prId | ID of the production. | ✔ | Long |
moId | ID of the media object. | ✔ | Long |
Body Parameters
Name | Description | Required | Type |
---|---|---|---|
| | | ✘ | TranslationRequest |
TranslationRequest
Field Name | Required | Type | Description | Format |
---|---|---|---|---|
annotationIds | ✘ | List of long | List of source annotations to translate. | int64 |
language | ✘ | String | The target language for translation. | |
redo | ✘ | Boolean | Run again, even if the workflow already ran in this context. | |
redoSingleTask | ✘ | Boolean | | |
skipActiveWorkflowTest | ✘ | Boolean | | |
sourceLanguage | ✘ | String | The source language for translation. Leave empty to use the language of the annotations themselves. | |
translationEngine | ✘ | String | The engine to use for translation. Omit to make an automatic smart choice based on the languages involved. | |
waitForWorkflow | ✘ | Boolean | | |
Return Type
MediaObjectWorkflowReport
Field Name | Required | Type | Description | Format |
---|---|---|---|---|
adminOnly | ✘ | Boolean | | |
audioAnalyzerCompleted | ✘ | Date | | date-time |
created | ✘ | Date | The time when this resource was created | date-time |
createdBy | ✘ | String | The request or process that created this resource | |
createdByShareId | ✘ | Long | | int64 |
createdBySharedUserId | ✘ | Long | | int64 |
creatorId | ✘ | Long | The id of the user who created this resource | int64 |
duration | ✘ | Double | | double |
errorReports | ✘ | List of TaskReport | | |
extra | ✘ | Object | | |
funnel | ✘ | String | | |
id | ✔ | Long | The id of this resource | int64 |
label | ✘ | String | User-friendly label of the workflow | |
lastUpdated | ✘ | Date | The time when this resource was last updated | date-time |
mediaAnalyzerCompleted | ✘ | Date | | date-time |
mediaObjectId | ✘ | Long | | int64 |
modifiedBy | ✔ | String | The request or process responsible for the last update of this resource | |
objectType | ✘ | String | The data model type or class name of this resource | |
productionId | ✘ | Long | | int64 |
publishedFiles | ✘ | List of object | Files generated by the workflow, which can be downloaded. | |
removeFromQuota | ✘ | Boolean | | |
requiredRights | ✘ | List of ProductionPermission | | |
size | ✘ | Long | | int64 |
startupParameters | ✘ | Object | | |
status | ✘ | String | | Enum: Inited, Started, Completed, Error, Cancelled, Paused, CompletedPending, ErrorPending, WaitForCallback, Scheduled |
successFul | ✘ | Boolean | | |
target | ✘ | String | | |
taskReports | ✘ | List of TaskReport | | |
transcoder1Completed | ✘ | Date | | date-time |
transcoder2Completed | ✘ | Date | | date-time |
variables | ✘ | Object | | |
version | ✔ | Long | The version of this resource, used for Optimistic Locking | int64 |
workflowCompleted | ✘ | Date | When did the workflow complete? | date-time |
workflowFailed | ✘ | Date | When did the workflow fail? | date-time |
workflowId | ✘ | String | The id of the workflow. This can be used to retrieve the workflow status. | |
workflowStarted | ✘ | Date | When was the workflow started? | date-time |
workflowTask | ✘ | String | | |
workflowType | ✘ | String | | Enum: INGEST, SPEECH, IPPAMEXPORT, IPPAMSYNC, MOIEXPORT, REMOTE_SPEECH, VOLDEMORT_SPEECH, TRANSCODE, AUDIOANALYZE, EXPORT_VWFLOW, FEATURE_EXTRACTION, BLACK_FRAME, STON_APPROVE, AAF_EXPORT, FCP_EXPORT, VOLDEMORT_SPEECH_2, KALDI_SPEECH, SUBTITLING, MIGRATE, INDEX, BACKUP, VOLDEMORT_SPEECH_3, VOLDEMORT_SPEECH_4, VOLDEMORT_SPEECH_5, TRANSLATION, INDEX_SWITCH, SIMPLEINGEST, CLONE, UPDATE_CATEGORY, WEBHOOK, SETKEEPER_ATTACH, PDF_EXPORT, SHOT_DETECTION, EXPORT, REMOTE_HELLO_WORLD, CUSTOM, UNKNOWN, CHANGE_AUDIO_LAYOUT, WORKSPACE_BOOTSTRAP, MEDIA_TRANSFER_COMPLETE, MEDIA_TRANSFER_FAILED, DELIVERY_REQUEST_SUBMISSION_CLIP_PROBED, DELIVERY_REQUEST_SUBMISSION, ADVANCED_SUBTITLE, TRANSCRIPTION_SUMMARIZE |
Content Type
application/json
Responses
Code | Description | Datatype |
---|---|---|
200 | The request was successful. | |
403 | The user needs START_TRANSLATION_WORKFLOW rights and additional rights depending on the type of the annotation. | |
404 | The production or one of the annotations was not found. | |
409 | The workflow was already in progress. | |
The body of this request is a TranslationRequest JSON object.
Its annotationIds field is a list of source annotation ids, so we put the id of each annotation we retrieved earlier in it.
In the language field, we put the desired target language, chosen from the list we retrieved earlier.
The workflow translates each annotation passed in annotationIds and creates a new annotation with the same objectType and funnel, but with its language set to the one passed to the workflow.
The example below starts a workflow that translates the annotation with id 922076 into German (language code "de"). Note that the redo flag should be set to true; otherwise the workflow will not run if a translation workflow already ran on that clip.
{
"redo": true,
"language": "de",
"annotationIds": [
922076
]
}
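The same request can be assembled and sent from Python. This is a sketch under assumptions: the base URL, the bearer-token header and both function names are illustrative, while the path and the body fields come from the tables above.

```python
import json
import urllib.request
from typing import Optional


def build_translation_request(
    annotation_ids: list[int],
    language: str,
    redo: bool = True,
    source_language: Optional[str] = None,
    translation_engine: Optional[str] = None,
) -> dict:
    """Build a TranslationRequest body; optional fields are omitted when unset."""
    body = {"redo": redo, "language": language, "annotationIds": list(annotation_ids)}
    if source_language is not None:
        body["sourceLanguage"] = source_language
    if translation_engine is not None:
        body["translationEngine"] = translation_engine
    return body


def start_translation(base_url: str, token: str, production_id: int,
                      media_object_id: int, body: dict) -> dict:
    """POST the body and return the parsed MediaObjectWorkflowReport."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/production/{production_id}/mo/{media_object_id}/translate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Reproduces the example body: translate annotation 922076 into German.
body = build_translation_request([922076], "de")
```

Leaving sourceLanguage and translationEngine out of the body relies on the defaults described in the TranslationRequest table.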
Follow up on the status of the translation workflow
The execution of the automated translation process is modeled like any other Limecraft Flow platform workflow, such as the enrichment and media processing workflows in our system. As such, the workflow API can be used to track its progress until completion or failure.
The call mentioned above returns a MediaObjectWorkflowReport. Its workflowId field gives you a reference to the workflow that was started. Once this workflow completes, the Annotations containing the translation will have been created. To learn how to wait for a workflow to complete, see this section.
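Waiting for completion usually comes down to polling. The sketch below keeps the actual status lookup out of scope: fetch_status is a caller-supplied function (hypothetical here) that returns the current status string for a workflow id. Treating Completed, Error and Cancelled as final is an assumption based on the status enum of MediaObjectWorkflowReport.

```python
import time
from typing import Callable

# Assumption: these values of the status enum mean the workflow is done.
TERMINAL_STATUSES = {"Completed", "Error", "Cancelled"}


def wait_for_workflow(
    fetch_status: Callable[[str], str],
    workflow_id: str,
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> str:
    """Poll until the workflow reaches a terminal status, then return it."""
    for attempt in range(max_polls):
        status = fetch_status(workflow_id)
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)  # no sleep after the final poll
    raise TimeoutError(f"workflow {workflow_id} did not finish after {max_polls} polls")
```

A caller would check the returned value and only go on to fetch the translated annotations when it equals "Completed".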
Retrieve the translation results
The translation results will be new Annotations, with the same objectType and funnel as the source annotations.
So, translating TranscriptAnnotations will result in just as many new TranscriptAnnotations, with the chosen language in their language field. See Retrieve the transcript to learn how to fetch the transcript in the target language.
Translating a Subtitle annotation will result in a new Subtitle annotation, with the chosen language in its language field. See Retrieve a subtitle to learn how to fetch the subtitle in the target language.
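Since a translation shares objectType and funnel with its source and differs in language, the results can be picked out of a list of annotations with a simple filter. The snippet is a sketch that assumes annotations are parsed into dicts; the helper name and the sample values are hypothetical.

```python
def find_translations(annotations: list[dict], source: dict, target_language: str) -> list[dict]:
    """Return annotations that match the source's objectType and funnel
    and carry the target language."""
    return [
        a for a in annotations
        if a.get("language") == target_language
        and a.get("objectType") == source.get("objectType")
        and a.get("funnel") == source.get("funnel")
    ]


# Hypothetical data: one source subtitle, its German translation, and a transcript part.
source = {"id": 922076, "objectType": "Subtitle", "funnel": "example-funnel", "language": "en-GB"}
annotations = [
    source,
    {"id": 922100, "objectType": "Subtitle", "funnel": "example-funnel", "language": "de"},
    {"id": 922101, "objectType": "TranscriptAnnotation", "funnel": "other-funnel", "language": "de"},
]
translated = find_translations(annotations, source, "de")
```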
Customize the translation process
Translate other annotations
While speech transcripts and subtitles are the primary candidates for translation, Limecraft Flow also supports translating other annotations, like a ClipAnnotation. In that case, the commonly used description field will be translated.
Use a different engine
The Limecraft Flow platform supports multiple translation engines. The default value (equivalent to omitting translationEngine) lets the platform pick the engine it considers best for the given source and target languages.
{ "translationEngine": "default", "language": "fr", "annotationIds": [ {{annotationId}} ] }
It is, however, possible to force a particular engine by choosing a different value for translationEngine:
translationEngine | Description |
---|---|
default | Lets the platform pick the engine it considers best for the given source and target languages |
googleTranslate | Google Translate |
deepl | DeepL |
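A small guard can catch typos in the engine name before the request is sent. This sketch uses a hypothetical helper and only knows the three values listed in the table above; the platform may accept others.

```python
# Engine names taken from the table above.
KNOWN_ENGINES = {"default", "googleTranslate", "deepl"}


def with_engine(body: dict, engine: str) -> dict:
    """Return a copy of a TranslationRequest body that forces the given engine."""
    if engine not in KNOWN_ENGINES:
        raise ValueError(f"unknown translation engine: {engine!r}")
    return {**body, "translationEngine": engine}


request_body = with_engine({"language": "fr", "annotationIds": [922076]}, "deepl")
```

The helper returns a new dict, so the original body can be reused to start the same translation with a different engine.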