Translation

This guide explains how to translate speech transcripts and subtitles with Limecraft Flow.

The outline of the process is as follows:

Get supported languages for translation
Create a clip
Retrieve the source annotations
Start the translation workflow
Follow up on the status of the translation workflow
Retrieve the translation results
Customize the translation process

Get supported languages for translation

First let’s learn which languages the system can translate between. This information can be retrieved using this call:

GET /production/{prId}/translation/language

List the available languages for translation.

Details

Description

Parameters

Path Parameters

Name	Description	Required	Type
`prId`	ID of the production.	✔	Long

Return Type

String

Content Type

application/json

Responses

Table 1. HTTP response codes
Code	Description	Datatype
200	The request was successful.	`String`
403	The user needs START_TRANSLATION_WORKFLOW rights.	`ForbiddenError`
404	The production was not found.	`NotFoundError`

The response is a list of language code strings. For example:

[
    "en-GB",
    "en-US",
    "nl"
]

In the remainder of this guide, we will talk about the 'source language' (of the annotations we want to translate from) and the 'target language'. Both languages have to be in this list for the translation to be supported.
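As a rough illustration of this step, the sketch below fetches the language list and checks that both languages are supported. The base URL and the bearer-token header are assumptions that are not taken from this guide; replace them with the values and authentication scheme of your own Limecraft Flow setup.

```python
import json
import urllib.request

# Assumption: placeholder base URL, replace with your Limecraft Flow API root.
BASE_URL = "https://example.invalid/api"


def languages_url(production_id: int, base_url: str = BASE_URL) -> str:
    """Build the URL of the supported-languages call for one production."""
    return f"{base_url}/production/{production_id}/translation/language"


def fetch_supported_languages(production_id: int, token: str) -> list[str]:
    """Fetch the list of language codes (performs a network request)."""
    request = urllib.request.Request(
        languages_url(production_id),
        # Assumption: bearer-token authentication; adapt to your setup.
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def can_translate(supported: list[str], source: str, target: str) -> bool:
    """Both the source and the target language must be in the list."""
    return source in supported and target in supported
```

With the example response above, `can_translate(["en-GB", "en-US", "nl"], "en-GB", "nl")` is true, while a target of `"de"` would be rejected.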

Create a clip

We assume a clip has already been uploaded to the Limecraft platform. If not, refer to Create And Upload MediaObjects to learn how to do this.

Retrieve the source annotations

Limecraft Flow translation operates on Annotations. We need the id of each annotation we want to translate. The sections below cover subtitles and speech transcripts in turn.

Subtitle

We assume the clip already has subtitles in the source language available on the Limecraft platform. If not, refer to Subtitling for more info.

See Retrieve a subtitle to learn how to fetch the subtitle in the source language. Note that the language of the Subtitle annotation should be in the supported language list retrieved earlier.

The response is a single Subtitle annotation. We’ll need its id to put in the annotationIds list parameter when starting the translation workflow in the next section.

Transcription

We assume the clip already has a transcript in the source language available on the Limecraft platform. If not, refer to Transcription for more info.

See Retrieve the transcript to learn how to fetch the transcript in the source language. Note that the language of the TranscriptAnnotations should be in the supported language list retrieved earlier. The response is a list of TranscriptAnnotations. We’ll need the id of each of them to put in the annotationIds list parameter when starting the translation workflow in the next section.

Start the translation workflow

Starting the translation workflow is done using this call:

POST /production/{prId}/mo/{moId}/translate

A call for starting a translation workflow.

Details

Description

Depending on the request body, you can choose between different translation engines.

Parameters

Path Parameters

Name	Description	Required	Type
`prId`	ID of the production.	✔	Long
`moId`	ID of the media object.	✔	Long

Body Parameters

Name	Description	Required	Type
`TranslationRequest`		✘	TranslationRequest

TranslationRequest

Field Name	Required	Type	Description	Format
annotationIds	✘	List of long	List of source annotations to translate.	int64
language	✘	String	The target language for translation.
redo	✘	Boolean	Run again, even if the workflow already ran in this context.
redoSingleTask	✘	Boolean
skipActiveWorkflowTest	✘	Boolean
sourceLanguage	✘	String	The source language for translation. Leave empty to use the language of the annotations themselves.
translationEngine	✘	String	The engine to use for translation. Omit to make an automatic smart choice based on the languages involved.
translationEngineFormality	✘	String	The formality parameter to use for translation when using DeepL. One of 'less', 'more', 'default', 'prefer_less', 'prefer_more'; defaults to 'default'.
translationEngineModel	✘	String	The model to use for translation when using DeepL. One of 'quality_optimized', 'latency_optimized', 'prefer_quality_optimized'; defaults to 'prefer_quality_optimized'.
waitForWorkflow	✘	Boolean
workflowLabel	✘	String

Return Type

MediaObjectWorkflowReport

Field Name	Required	Type	Description	Format
adminOnly	✘	Boolean
audioAnalyzerCompleted	✘	Date		date-time
created	✘	Date	The time when this resource was created	date-time
createdBy	✘	String	The request or process that created this resource
createdByShareId	✘	Long		int64
createdBySharedUserId	✘	Long		int64
creatorId	✘	Long	The id of the user who created this resource	int64
duration	✘	Double		double
errorReports	✘	List of TaskReport
extra	✘	Object
funnel	✘	String
id	✔	Long	The id of this resource	int64
label	✘	String	User-friendly label of the workflow
lastUpdated	✘	Date	The time when this resource was last updated	date-time
mediaAnalyzerCompleted	✘	Date		date-time
mediaObjectId	✘	Long		int64
modifiedBy	✔	String	The request or process responsible for the last update of this resource
objectType	✘	String	The data model type or class name of this resource
productionId	✘	Long		int64
publishedFiles	✘	List of object	Files generated by the workflow, which can be downloaded.
removeFromQuota	✘	Boolean
requiredRights	✘	List of ProductionPermission
size	✘	Long		int64
startupParameters	✘	Object
status	✘	String		Enum: Inited, Started, Completed, Error, Cancelled, Paused, CompletedPending, ErrorPending, WaitForCallback, Scheduled, Skipped
successFul	✘	Boolean
target	✘	String
taskReports	✘	List of TaskReport
transcoder1Completed	✘	Date		date-time
transcoder2Completed	✘	Date		date-time
variables	✘	Object
version	✔	Long	The version of this resource, used for Optimistic Locking	int64
workflowCompleted	✘	Date	When did the workflow complete?	date-time
workflowFailed	✘	Date	When did the workflow fail?	date-time
workflowId	✘	String	The id of the workflow. This can be used to retrieve the workflow status.
workflowStarted	✘	Date	When was the workflow started?	date-time
workflowTask	✘	String
workflowType	✘	String		Enum: INGEST, SPEECH, IPPAMEXPORT, IPPAMSYNC, MOIEXPORT, REMOTE_SPEECH, VOLDEMORT_SPEECH, TRANSCODE, AUDIOANALYZE, EXPORT_VWFLOW, FEATURE_EXTRACTION, BLACK_FRAME, STON_APPROVE, AAF_EXPORT, FCP_EXPORT, VOLDEMORT_SPEECH_2, KALDI_SPEECH, SUBTITLING, MIGRATE, INDEX, BACKUP, VOLDEMORT_SPEECH_3, VOLDEMORT_SPEECH_4, VOLDEMORT_SPEECH_5, TRANSLATION, INDEX_SWITCH, SIMPLEINGEST, CLONE, UPDATE_CATEGORY, WEBHOOK, SETKEEPER_ATTACH, PDF_EXPORT, SHOT_DETECTION, EXPORT, REMOTE_HELLO_WORLD, CUSTOM, UNKNOWN, CHANGE_AUDIO_LAYOUT, WORKSPACE_BOOTSTRAP, MEDIA_TRANSFER_COMPLETE, MEDIA_TRANSFER_FAILED, DELIVERY_REQUEST_SUBMISSION_CLIP_PROBED, DELIVERY_REQUEST_SUBMISSION, ADVANCED_SUBTITLE, TRANSCRIPTION_SUMMARIZE, VOLUME_TRANSFER_COMPLETE, VOLUME_TRANSFER_FAILED, VOLUME_TRANSFER_MONITOR

Content Type

application/json

Responses

Table 2. HTTP response codes
Code	Description	Datatype
200	The request was successful.	`MediaObjectWorkflowReport`
403	The user needs START_TRANSLATION_WORKFLOW rights and additional rights depending on the type of the annotation.	`ForbiddenError`
404	The production or one of the annotations was not found.	`NotFoundError`
409	The workflow was already in progress.	`AlreadyExistsError`

The body of this request is a TranslationRequest JSON object.

Its annotationIds field is a list of source annotations, so we put the id of each annotation we retrieved earlier in it.

In the language field, we put the desired target language, chosen from the list we retrieved earlier.

The workflow translates each annotation listed in annotationIds and creates a new annotation with the same objectType and funnel, whose language is the target language passed to the workflow.

The example below starts a workflow that translates the annotation with id 922076 into German (language code "de"). Note that the redo flag should be set to true; otherwise the workflow will not run if a translation workflow has already run on that clip.

{
    "redo": true,
    "language": "de",
    "annotationIds": [
        922076
    ]
}
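The same request can be sent from a script. The following is a minimal sketch, not an official client: the base URL and the bearer-token header are assumptions, and the helper names are made up for illustration. Only the path and the body fields come from the call described above.

```python
import json
import urllib.request

# Assumption: placeholder base URL, replace with your Limecraft Flow API root.
BASE_URL = "https://example.invalid/api"


def build_translation_request(annotation_ids, language, redo=True):
    """Build a TranslationRequest body with the fields used in this guide."""
    return {
        "redo": redo,
        "language": language,
        "annotationIds": list(annotation_ids),
    }


def translate_url(production_id, media_object_id, base_url=BASE_URL):
    """Build the URL of the call that starts the translation workflow."""
    return f"{base_url}/production/{production_id}/mo/{media_object_id}/translate"


def start_translation(production_id, media_object_id, body, token):
    """POST the body and return the MediaObjectWorkflowReport as a dict."""
    request = urllib.request.Request(
        translate_url(production_id, media_object_id),
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: bearer-token authentication; adapt to your setup.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `build_translation_request([922076], "de")` produces the same body as the JSON example above.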

Follow up on the status of the translation workflow

The automated translation process is modeled like any other Limecraft Flow platform workflow, such as the other enrichment and media processing workflows in our system. As such, the workflow API can be used to track its progress until completion or failure.

The call mentioned above will return a MediaObjectWorkflowReport. Its workflowId field gives you a reference to the workflow that was started. Once this workflow completes, the Annotations containing the translation will have been created. To learn how to wait for a workflow to complete, see this section.
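Waiting for the workflow usually comes down to a polling loop. The sketch below deliberately leaves out the status call itself, since it is documented elsewhere: `fetch_report` is a hypothetical callable that you supply and that returns the latest workflow report as a dict. Treating Completed, Error and Cancelled as final states is an assumption based on the status enum of MediaObjectWorkflowReport.

```python
import time
from typing import Callable

# Assumption: these status values mean the workflow will not progress further.
TERMINAL_STATUSES = {"Completed", "Error", "Cancelled"}


def wait_for_workflow(
    fetch_report: Callable[[], dict],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the report reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        report = fetch_report()
        if report.get("status") in TERMINAL_STATUSES:
            return report
        if time.monotonic() >= deadline:
            raise TimeoutError("workflow did not finish within the timeout")
        sleep(poll_seconds)
```

The caller should still inspect the returned status, because a report with status Error or Cancelled means no translated annotations should be expected.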

Retrieve the translation results

The translation results will be new Annotations, with the same objectType and funnel as the source annotations.

So, translating TranscriptAnnotations will result in just as many new TranscriptAnnotations, with the chosen language as their language field. See Retrieve the transcript to learn how to fetch the transcript in the target language.

Translating a Subtitle annotation will result in a new Subtitle annotation, with the chosen language as its language field. See Retrieve a subtitle to learn how to fetch the subtitle in the target language.
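If you already hold a list of annotations for the clip, the matching rule described here can be applied client-side. This is a small sketch that assumes each annotation is a dict exposing the id, objectType, funnel and language fields mentioned in this guide; the function name is hypothetical.

```python
def find_translations(annotations, source, target_language):
    """Return annotations that look like translations of `source`.

    A match has the same objectType and funnel as the source annotation,
    carries the target language, and is not the source annotation itself.
    """
    return [
        annotation
        for annotation in annotations
        if annotation.get("objectType") == source.get("objectType")
        and annotation.get("funnel") == source.get("funnel")
        and annotation.get("language") == target_language
        and annotation.get("id") != source.get("id")
    ]
```

Note that this only filters on metadata; it cannot tell which workflow run produced a given annotation.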

Customize the translation process

Translate other annotations

While speech transcripts and subtitles are the primary candidates for translation, Limecraft Flow also supports translating other annotations, like a ClipAnnotation. In that case, the commonly used description field will be translated.

Use a different engine

The Limecraft Flow platform supports multiple translation engines. With the value default (equivalent to omitting translationEngine), the platform picks the engine we consider best for the given source and target languages.

{
    "translationEngine": "default",
    "language": "fr",
    "annotationIds": [ {{annotationId}} ]
}

It is, however, possible to force a particular engine by choosing a different value for translationEngine:

translationEngine	Description
default	Picks the engine we consider best for the given source and target languages
googleTranslate	Google Translate
deepl	DeepL
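To catch typos in these values before sending a request, the options can be validated client-side. The sketch below is only an illustration: the helper name is hypothetical, and the allowed values are copied from the engine table above and from the TranslationRequest field descriptions, so they may lag behind the platform.

```python
# Allowed values as listed in this guide; the server remains the authority.
ENGINES = {"default", "googleTranslate", "deepl"}
FORMALITIES = {"less", "more", "default", "prefer_less", "prefer_more"}
MODELS = {"quality_optimized", "latency_optimized", "prefer_quality_optimized"}


def with_engine_options(body, engine=None, formality=None, model=None):
    """Return a copy of a TranslationRequest body with engine options added.

    Options left as None are omitted, so the platform defaults apply.
    """
    result = dict(body)
    for field, value, allowed in (
        ("translationEngine", engine, ENGINES),
        ("translationEngineFormality", formality, FORMALITIES),
        ("translationEngineModel", model, MODELS),
    ):
        if value is None:
            continue
        if value not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}, got {value!r}")
        result[field] = value
    return result
```

For example, `with_engine_options({"language": "fr", "annotationIds": [922076]}, engine="deepl", formality="prefer_more")` adds the two DeepL-related fields and leaves the original body untouched.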
