Translation

This guide explains the overall process for translating speech transcripts and subtitles with Limecraft Flow.

The outline of the process is as follows:

Get supported languages for translation

First, let’s find out which languages the system can translate between. This information can be retrieved using this call:

GET /production/{prId}/translation/language

List the available languages for translation.

Details
Description
Parameters
Path Parameters
Name | Description | Required | Type
prId | ID of the production. | | Long

Return Type

String

Content Type
  • application/json

Responses
Table 1. HTTP response codes
Code | Description | Datatype
200 | The request was successful. | String
403 | The user needs START_TRANSLATION_WORKFLOW rights. | ForbiddenError
404 | The production was not found. | NotFoundError

The response is a list of language code strings, for example:

[
    "en-GB",
    "en-US",
    "nl"
]

In the remainder of this guide, we will talk about the 'source language' (of the annotations we want to translate from) and the 'target language'. Both languages have to be in this list for the translation to be supported.
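As a minimal sketch of this check, the Python snippet below builds the URL for the call above and verifies that both languages appear in the returned list. The base URL, the `headers` argument and all helper names are assumptions made for illustration; authentication is not covered in this guide, so it is left to whatever headers your setup requires.

```python
import json
import urllib.request
from typing import Optional


def languages_url(base_url: str, pr_id: int) -> str:
    """Build the URL of the supported-languages call for one production."""
    return f"{base_url.rstrip('/')}/production/{pr_id}/translation/language"


def fetch_languages(base_url: str, pr_id: int, headers: Optional[dict] = None) -> list:
    """GET the list of language codes. `headers` is a placeholder for your auth."""
    request = urllib.request.Request(languages_url(base_url, pr_id), headers=headers or {})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def is_pair_supported(languages: list, source: str, target: str) -> bool:
    """Both the source and the target language must be in the list."""
    return source in languages and target in languages
```

With the example response shown above, `is_pair_supported(languages, "en-GB", "nl")` would hold, while a target of `"de"` would not.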

Create a clip

We assume a clip has already been uploaded to the Limecraft platform. If not, refer to Create And Upload MediaObjects to learn how to do this.

Retrieve the source annotations

Limecraft Flow translates Annotations, so we need the ids of the annotations we want to translate. The steps for subtitles and for speech transcripts are described in turn below.

Subtitle

We assume the clip already has subtitles in the source language available on the Limecraft platform. If not, refer to Subtitling for more info.

See Retrieve a subtitle to learn how to fetch the subtitle in the source language. Note that the language of the Subtitle annotation should be in the supported language list retrieved earlier.

The response is a single Subtitle annotation. We’ll need its id to put in the annotationIds list parameter when starting the translation workflow in the next section.

Transcription

We assume the clip already has a transcript in the source language available on the Limecraft platform. If not, refer to Transcription for more info.

See Retrieve the transcript to learn how to fetch the transcript in the source language. Note that the language of the TranscriptAnnotations should be in the supported language list retrieved earlier. The response is a list of TranscriptAnnotations. We’ll need the id of each of them to put in the annotationIds list parameter when starting the translation workflow in the next section.
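The id collection described above can be sketched as follows. This assumes the retrieved annotations have been parsed from JSON into Python dicts that expose `id` and `language` keys; the sample values in the usage note are made up for illustration.

```python
def collect_annotation_ids(annotations: list) -> list:
    """Return the id of every retrieved annotation, in the order received."""
    return [annotation["id"] for annotation in annotations]


def unsupported_languages(annotations: list, supported: list) -> list:
    """List the annotation languages that are missing from the supported list."""
    found = {annotation["language"] for annotation in annotations}
    return sorted(found - set(supported))
```

For a transcript, pass the whole list of TranscriptAnnotations. For a subtitle, wrap the single Subtitle annotation in a list, for example `collect_annotation_ids([subtitle])`.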

Start the translation workflow

Starting the translation workflow is done using this call:

POST /production/{prId}/mo/{moId}/translate

A call for starting a translation workflow.

Details
Description

Depending on the body of the query, you can choose between different translation engines.

Parameters
Path Parameters
Name | Description | Required | Type
prId | ID of the production. | | Long
moId | ID of the media object. | | Long

Body Parameters
Name | Description | Required | Type
TranslationRequest | | | TranslationRequest

TranslationRequest

Field Name | Required | Type | Description | Format
annotationIds | | List of long | List of source annotations to translate. | int64
language | | String | The target language for translation. |
redo | | Boolean | Run again, even if the workflow already ran in this context. |
redoSingleTask | | Boolean | |
skipActiveWorkflowTest | | Boolean | |
sourceLanguage | | String | The source language for translation. Leave empty to use the language of the annotations themselves. |
translationEngine | | String | The engine to use for translation. Omit to make an automatic smart choice based on the languages involved. |
waitForWorkflow | | Boolean | |

Return Type

MediaObjectWorkflowReport

Field Name | Required | Type | Description | Format
adminOnly | | Boolean | |
audioAnalyzerCompleted | | Date | | date-time
created | | Date | The time when this resource was created | date-time
createdBy | | String | The request or process that created this resource |
createdByShareId | | Long | | int64
createdBySharedUserId | | Long | | int64
creatorId | | Long | The id of the user who created this resource | int64
duration | | Double | | double
errorReports | | List of TaskReport | |
extra | | Object | |
funnel | | String | |
id | | Long | The id of this resource | int64
label | | String | User-friendly label of the workflow |
lastUpdated | | Date | The time when this resource was last updated | date-time
mediaAnalyzerCompleted | | Date | | date-time
mediaObjectId | | Long | | int64
modifiedBy | | String | The request or process responsible for the last update of this resource |
objectType | | String | The data model type or class name of this resource |
productionId | | Long | | int64
publishedFiles | | List of object | Files generated by the workflow, which can be downloaded. |
removeFromQuota | | Boolean | |
requiredRights | | List of ProductionPermission | |
size | | Long | | int64
startupParameters | | Object | |
status | | String | | Enum: Inited, Started, Completed, Error, Cancelled, Paused, CompletedPending, ErrorPending, WaitForCallback, Scheduled
successFul | | Boolean | |
target | | String | |
taskReports | | List of TaskReport | |
transcoder1Completed | | Date | | date-time
transcoder2Completed | | Date | | date-time
variables | | Object | |
version | | Long | The version of this resource, used for Optimistic Locking | int64
workflowCompleted | | Date | When did the workflow complete? | date-time
workflowFailed | | Date | When did the workflow fail? | date-time
workflowId | | String | The id of the workflow. This can be used to retrieve the workflow status. |
workflowStarted | | Date | When was the workflow started? | date-time
workflowTask | | String | |
workflowType | | String | | Enum: INGEST, SPEECH, IPPAMEXPORT, IPPAMSYNC, MOIEXPORT, REMOTE_SPEECH, VOLDEMORT_SPEECH, TRANSCODE, AUDIOANALYZE, EXPORT_VWFLOW, FEATURE_EXTRACTION, BLACK_FRAME, STON_APPROVE, AAF_EXPORT, FCP_EXPORT, VOLDEMORT_SPEECH_2, KALDI_SPEECH, SUBTITLING, MIGRATE, INDEX, BACKUP, VOLDEMORT_SPEECH_3, VOLDEMORT_SPEECH_4, VOLDEMORT_SPEECH_5, TRANSLATION, INDEX_SWITCH, SIMPLEINGEST, CLONE, UPDATE_CATEGORY, WEBHOOK, SETKEEPER_ATTACH, PDF_EXPORT, SHOT_DETECTION, EXPORT, REMOTE_HELLO_WORLD, CUSTOM, UNKNOWN, CHANGE_AUDIO_LAYOUT, WORKSPACE_BOOTSTRAP, MEDIA_TRANSFER_COMPLETE, MEDIA_TRANSFER_FAILED, DELIVERY_REQUEST_SUBMISSION_CLIP_PROBED, DELIVERY_REQUEST_SUBMISSION, ADVANCED_SUBTITLE, TRANSCRIPTION_SUMMARIZE

Content Type
  • application/json

Responses
Table 2. HTTP response codes
Code | Description | Datatype
200 | The request was successful. | MediaObjectWorkflowReport
403 | The user needs START_TRANSLATION_WORKFLOW rights and additional rights depending on the type of the annotation. | ForbiddenError
404 | The production or one of the annotations was not found. | NotFoundError
409 | The workflow was already in progress. | AlreadyExistsError

The body of this request is a TranslationRequest JSON object.

Its annotationIds field lists the source annotations, so we put the id of each annotation we retrieved earlier in it.

In the language field, we put the desired target language, chosen from the list we retrieved earlier.

The workflow will translate each annotation passed to it in annotationIds, and create a new annotation with the same objectType and funnel, but now with the language being the one passed to the workflow.

The example below starts a workflow that translates the annotation with id 922076 into German (language code "de"). Note that the redo flag should be set to true; otherwise, the workflow will not run if another translation workflow has already run on that clip.

{
    "redo": true,
    "language": "de",
    "annotationIds": [
        922076
    ]
}
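A request like the one above could be assembled and sent from Python roughly as follows. This is a sketch under stated assumptions: the base URL, the `headers` argument (a placeholder for whatever authentication your setup uses) and the helper names are hypothetical; only the path and the body fields come from this guide.

```python
import json
import urllib.request
from typing import Optional


def build_translation_request(annotation_ids: list, language: str, redo: bool = True) -> dict:
    """Assemble a TranslationRequest body from the fields described above."""
    return {
        "redo": redo,
        "language": language,
        "annotationIds": list(annotation_ids),
    }


def start_translation(base_url: str, pr_id: int, mo_id: int, body: dict,
                      headers: Optional[dict] = None) -> dict:
    """POST the body to the translate call and return the parsed workflow report."""
    url = f"{base_url.rstrip('/')}/production/{pr_id}/mo/{mo_id}/translate"
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for the 403, 404 and 409 responses.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `build_translation_request([922076], "de")` yields the same body as the JSON example above.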

Follow up on the status of the translation workflow

The automated translation process is modeled like any other Limecraft Flow platform workflow, such as the other enrichment and media processing workflows in our system. As such, the workflow API can be used to track its progress until completion or failure.

The call mentioned above will return a MediaObjectWorkflowReport. Its workflowId field gives you a reference to the workflow that was started. Once this workflow completes, the Annotations containing the translation will have been created. To learn how to wait for a workflow to complete, see this section.
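One way to wait for completion is a simple polling loop, sketched below. The call that retrieves the workflow report is not shown in this guide, so it is passed in as a hypothetical `fetch_report` callable that returns the report as a dict. Treating Completed as success and Error and Cancelled as failure is also an assumption based on the status names listed above.

```python
import time

SUCCESS_STATUSES = {"Completed"}          # assumed terminal success state
FAILURE_STATUSES = {"Error", "Cancelled"}  # assumed terminal failure states


def wait_for_workflow(fetch_report, poll_seconds=5.0, timeout_seconds=600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_report() until the workflow succeeds, fails, or times out."""
    deadline = clock() + timeout_seconds
    while True:
        report = fetch_report()
        status = report.get("status")
        if status in SUCCESS_STATUSES:
            return report
        if status in FAILURE_STATUSES:
            raise RuntimeError(f"Workflow ended with status {status}")
        if clock() >= deadline:
            raise TimeoutError(f"Workflow still in status {status} after {timeout_seconds}s")
        sleep(poll_seconds)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.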

Retrieve the translation results

The translation results will be new Annotations, with the same objectType and funnel as the source annotations.

So, translating TranscriptAnnotations will result in just as many new TranscriptAnnotations, with the chosen language as their language field. See Retrieve the transcript to learn how to fetch the transcript in the target language.

Translating a Subtitle annotation will result in a new Subtitle annotation, with the chosen language as its language field. See Retrieve a subtitle to learn how to fetch the subtitle in the target language.
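Once the annotations have been fetched again, the translated ones can be picked out by their language field. The sketch below assumes the annotations are parsed JSON dicts with a `language` key; the helper name is made up for illustration.

```python
def annotations_in_language(annotations: list, language: str) -> list:
    """Keep only the annotations whose language field matches the target language."""
    return [a for a in annotations if a.get("language") == language]
```

For a translated transcript, the number of annotations returned for the target language should match the number of source TranscriptAnnotations that were passed in annotationIds.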

Customize the translation process

Translate other annotations

While speech transcripts and subtitles are the primary candidates for translation, Limecraft Flow also supports translating other annotations, like a ClipAnnotation. In that case, the commonly used description field will be translated.

Use a different engine

The Limecraft Flow platform supports multiple translation engines. The default setting (equivalent to omitting translationEngine) automatically selects the engine we consider best for the given source and target languages.

{
    "translationEngine": "default",
    "language": "fr",
    "annotationIds": [ {{annotationId}} ]
}

It is, however, possible to force a particular engine by setting translationEngine to a different value:

translationEngine | Description
default | Automatically selects the engine we consider best for the given source and target languages
googleTranslate | Google Translate
deepl | DeepL
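As a small sketch, the helper below adds a translationEngine value to an existing request body and rejects values that are not in the table above. The helper name is hypothetical, and the set of engine values is limited to those listed in this guide, which may not be exhaustive.

```python
KNOWN_ENGINES = ("default", "googleTranslate", "deepl")  # values listed in this guide


def with_engine(request_body: dict, engine: str) -> dict:
    """Return a copy of the request body with translationEngine set."""
    if engine not in KNOWN_ENGINES:
        raise ValueError(f"Unknown translation engine: {engine!r}")
    updated = dict(request_body)  # leave the caller's dict untouched
    updated["translationEngine"] = engine
    return updated
```

For example, `with_engine({"language": "fr", "annotationIds": [922076]}, "deepl")` returns a body that forces DeepL for that request.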