Language and Subtitle Presets

A Subtitle annotation on a clip has a language, which indicates the language of the subtitles. This is not necessarily the same as the audio language of the clip.

It also has a subtitlePresetId, which points to the subtitle preset used to create these subtitles.

Subtitle Presets combine styling and spotting settings into a preset which can be used to generate subtitles.

Multiple Subtitle annotations can exist on a single clip, but the combination of language and subtitlePresetId should be unique within the clip. In other words, there can be at most one Subtitle annotation on a clip for a given language and subtitlePresetId.
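
The uniqueness rule can be checked with a small helper. The sketch below is illustrative only: it assumes each Subtitle annotation has already been reduced to a plain object with language and subtitlePresetId fields, and findDuplicateSubtitleKeys is a hypothetical name, not part of any API.

```javascript
// Hypothetical helper: returns the language + subtitlePresetId combinations
// that occur more than once among the Subtitle annotations of one clip.
function findDuplicateSubtitleKeys(subtitles) {
  const seen = new Set();
  const duplicates = new Set();
  for (const { language, subtitlePresetId } of subtitles) {
    // Serialize the pair so it can be used as a Set key.
    const key = JSON.stringify([language, subtitlePresetId]);
    if (seen.has(key)) duplicates.add(key);
    seen.add(key);
  }
  return [...duplicates].map((key) => {
    const [language, subtitlePresetId] = JSON.parse(key);
    return { language, subtitlePresetId };
  });
}

const duplicates = findDuplicateSubtitleKeys([
  { language: "en", subtitlePresetId: "sp_1" },
  { language: "nl", subtitlePresetId: "sp_1" }, // same preset, other language: allowed
  { language: "en", subtitlePresetId: "sp_1" }, // duplicate combination
]);
console.log(duplicates);
```

An empty result means the clip respects the rule.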

Subtitle Presets

A subtitle preset contains settings both for spotting the subtitles and for how they should be styled. The subtitle presets are stored in the production settings under the subtitlePresets key.

Use cases

  • A production containing a mix of videos in landscape and portrait orientation. A video in portrait orientation requires different subtitle settings than a video in landscape orientation (fewer characters per line, more lines, smaller font size relative to the video height).

  • Different types of content in a single production

  • Language-specific optimizations, for example for languages with very long words

  • Different distribution platforms might have different subtitle requirements

Structure

{
    "subtitlePresets": [
        {
            // the id should be unique within the production
            // One could generate the id using `sp_${new Date().getTime()}_${Math.round(10000 * Math.random())}`
            "id": "sp_20220112_12323",

            // A short label, shown to the user when selecting the preset
            "label": "Vertical Subtitle Preset",

            // Optional longer text describing the preset and what it should be used for.
            "description": "Some text describing the use of this subtitle preset",

            // At most one preset can be selected as the default preset
            "isDefault": false,

            // When true, the preset is disabled in the UI (but still exists to resolve existing subtitles that use it)
            "isDeactivated": false,

            // spotting rules settings
            "spottingRules": {
                // ... more settings. See Spotting Rules section
            },

            // Examples of properties influencing subtitle styling. See Subtitle Styling section
            "cellResolutionX": 32,
            "cellResolutionY": 15,
            "styles": {},
            "regions": {},
            "useSpeakerColors": []
        }
    ]
}

subtitlePresets is an array of objects. Each preset object contains the following keys:

| Key | Type | Description | Example |
| --- | --- | --- | --- |
| id | String | A unique id. In JavaScript, you could generate one like `sp_${new Date().getTime()}_${Math.round(10000 * Math.random())}` | "sp_20220112_12323" |
| label | String | A short label, shown to the user when selecting the preset | "Vertical Subtitle Preset" |
| description | String | Optional longer text describing the preset and what it should be used for. | "Long description" |
| isDefault | Boolean | Set to true to make this the default preset. | false |
| isDeactivated | Boolean | When true, the preset is disabled in the UI (but still exists to resolve existing subtitles that use it) | false |
| spottingRules | Object | See Spotting rules section | … |
| styling properties | various | See Subtitle Styling section | … |
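
The constraints on id and isDefault can be checked in code. The following is a minimal sketch, not an official validator: generatePresetId follows the id pattern suggested above, and validateSubtitlePresets is a hypothetical helper that reports duplicate ids and multiple default presets.

```javascript
// Generates an id following the pattern suggested in the documentation.
function generatePresetId() {
  return `sp_${new Date().getTime()}_${Math.round(10000 * Math.random())}`;
}

// Hypothetical helper: returns a list of problems found in the
// subtitlePresets array. An empty list means no problem was detected.
function validateSubtitlePresets(subtitlePresets) {
  const problems = [];
  const ids = new Set();
  for (const preset of subtitlePresets) {
    // The id should be unique within the production.
    if (ids.has(preset.id)) problems.push(`duplicate id: ${preset.id}`);
    ids.add(preset.id);
  }
  // At most one preset can be selected as the default preset.
  const defaults = subtitlePresets.filter((preset) => preset.isDefault === true);
  if (defaults.length > 1) {
    problems.push(`more than one default preset: ${defaults.map((p) => p.id).join(", ")}`);
  }
  return problems;
}

console.log(validateSubtitlePresets([
  { id: generatePresetId(), label: "Landscape", isDefault: true },
  { id: generatePresetId(), label: "Portrait", isDefault: false },
]));
```

Note that two ids generated within the same millisecond can still collide if the random parts happen to match, so checking for duplicates before saving remains useful.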

The id of the subtitlePreset is stored on the Subtitle annotation under structuredDescription.subtitlePresetId, so it is always possible to fetch the correct settings for a given Subtitle.
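
Resolving the preset for a given Subtitle could look like the sketch below. It assumes the production settings and the annotation are available as plain objects; findPresetForSubtitle is a hypothetical helper name.

```javascript
// Looks up the preset referenced by a Subtitle annotation.
// Returns undefined when the annotation has no subtitlePresetId or when
// no preset with that id exists in the production settings.
function findPresetForSubtitle(subtitleAnnotation, productionSettings) {
  const presetId = subtitleAnnotation.structuredDescription?.subtitlePresetId;
  if (presetId === undefined) return undefined;
  const presets = productionSettings.subtitlePresets ?? [];
  // Deactivated presets are deliberately not filtered out: they still
  // exist so that existing subtitles using them can be resolved.
  return presets.find((preset) => preset.id === presetId);
}

const productionSettings = {
  subtitlePresets: [
    { id: "sp_1", label: "Landscape" },
    { id: "sp_2", label: "Portrait", isDeactivated: true },
  ],
};
const preset = findPresetForSubtitle(
  { structuredDescription: { subtitlePresetId: "sp_2" } },
  productionSettings
);
console.log(preset.label); // Portrait
```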

Spotting rules

“Spotting” is the process of defining the in and out times of the individual subtitles. The spotting rules contain the main guidelines that the automatic subtitle workflow uses to go from a transcript to spotted subtitles. They contain things like “a subtitle should be between 2 and 3 seconds long” or “a subtitle should have 2 or 3 lines with at most 32 characters per line”. It is not always possible to respect all spotting rules automatically, in which case the automatic subtitler will spot the subtitles as well as possible. The subtitle editor UI marks spotting rule violations so the user can inspect them manually.

An example is given below.

{
    "spottingRules": {
      "model": {
        "wpm": {
          "targetMax": 180
        },
        "lines": {
          "max": 3,
          "targetMax": 2
        },
        "duration": {
          "targetMax": 5,
          "targetMin": 2
        },
        "charsLine": {
          "max": 40,
          "targetMax": 37
        }
      },
      "postProcess": {
        "minGapSeconds": 2,
        "fadeOutSeconds": 0.5,
        "shotSnapThreshold": 0.5
      },
      "numericReplacements": {
        "numericToWordsOptions": {
          "exclude": []
        }
      }
    }
}

Besides being used in a subtitlePreset, the spottingRules object above can also be set as the value of a spottingRules key in the production settings. It is then used as the fallback when no default subtitle preset is set.

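
The fallback can be sketched as follows. This is an assumption-laden illustration: resolveDefaultSpottingRules is a hypothetical name, and the sketch does not decide what should happen when the default preset is deactivated or has no spottingRules of its own.

```javascript
// Returns the spotting rules of the default preset if one is set,
// otherwise the production-level spottingRules fallback (may be undefined).
function resolveDefaultSpottingRules(productionSettings) {
  const presets = productionSettings.subtitlePresets ?? [];
  const defaultPreset = presets.find((preset) => preset.isDefault === true);
  if (defaultPreset) return defaultPreset.spottingRules;
  return productionSettings.spottingRules;
}

const fallbackRules = { model: { lines: { max: 3, targetMax: 2 } } };
const resolved = resolveDefaultSpottingRules({
  subtitlePresets: [{ id: "sp_1", isDefault: false, spottingRules: { model: {} } }],
  spottingRules: fallbackRules,
});
console.log(resolved === fallbackRules); // true
```
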
Character limits

Defines the target maximum number of characters per subtitle line, with an optional hard limit used to split subtitle lines.

| Key | Description |
| --- | --- |
| spottingRules.model.charsLine.targetMax | Max characters |
| spottingRules.model.charsLine.max | Hard character limit |

Line limits

Defines the target maximum number of lines per subtitle, with an optional hard limit used to split subtitles.

| Key | Description |
| --- | --- |
| spottingRules.model.lines.targetMax | Max lines |
| spottingRules.model.lines.max | Hard lines limit |

Subtitle duration and word rate

| Key | Description |
| --- | --- |
| spottingRules.model.duration.targetMin | Minimum duration of a subtitle, in seconds |
| spottingRules.model.duration.targetMax | Maximum duration of a subtitle, in seconds |
| spottingRules.model.wpm.targetMax | Maximum number of words per minute |
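
To make the model keys above concrete, the sketch below checks a single subtitle against them. It is not the actual subtitler or editor logic: the subtitle shape ({ lines, start, end }, times in seconds), the helper name findSpottingViolations, and the way characters and words are counted are all assumptions.

```javascript
// Hypothetical subtitle shape: { lines: string[], start: seconds, end: seconds }.
// Returns the keys (relative to spottingRules.model) of the violated rules.
function findSpottingViolations(subtitle, spottingRules) {
  const { charsLine = {}, lines = {}, duration = {}, wpm = {} } = spottingRules.model ?? {};
  const violations = [];

  // Character limits: report the hard limit if exceeded, otherwise the target.
  // Comparisons against an unset (undefined) limit are false, so it is skipped.
  const longest = Math.max(0, ...subtitle.lines.map((line) => line.length));
  if (longest > charsLine.max) violations.push("charsLine.max");
  else if (longest > charsLine.targetMax) violations.push("charsLine.targetMax");

  // Line limits, same pattern.
  if (subtitle.lines.length > lines.max) violations.push("lines.max");
  else if (subtitle.lines.length > lines.targetMax) violations.push("lines.targetMax");

  // Duration in seconds.
  const seconds = subtitle.end - subtitle.start;
  if (seconds < duration.targetMin) violations.push("duration.targetMin");
  if (seconds > duration.targetMax) violations.push("duration.targetMax");

  // Word rate in words per minute, counting whitespace-separated words.
  const wordCount = subtitle.lines.join(" ").split(/\s+/).filter(Boolean).length;
  if (seconds > 0 && (wordCount / seconds) * 60 > wpm.targetMax) violations.push("wpm.targetMax");

  return violations;
}

const rules = {
  model: {
    wpm: { targetMax: 180 },
    lines: { max: 3, targetMax: 2 },
    duration: { targetMax: 5, targetMin: 2 },
    charsLine: { max: 40, targetMax: 37 },
  },
};
const violations = findSpottingViolations(
  { lines: ["This line is short", "but this one is deliberately made far too long"], start: 10, end: 11 },
  rules
);
console.log(violations); // [ 'charsLine.max', 'duration.targetMin', 'wpm.targetMax' ]
```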

Text to digits and digits to text

The subtitler can automatically convert numbers that are written out as text into digits.

The subtitler can automatically convert digits to text.

Note that the subtitler will first perform the "wordsToNumeric" rule if it is activated, and then the "numericToWords" rule.

| Key | Description |
| --- | --- |
| spottingRules.numericReplacements.wordsToNumeric | Replace numbers written out as text with digits. E.g. one hundred → 100. |
| spottingRules.numericReplacements.numericToWords | Replace digits with text. E.g. 100 → one hundred. |
| spottingRules.numericReplacements.numericToWordsOptions.min | Replace when the number is >= this number |
| spottingRules.numericReplacements.numericToWordsOptions.max | Replace when the number is <= this number |
| spottingRules.numericReplacements.numericToWordsOptions.exclude | Do not apply numeric to words for a certain category of words. It is an array which is either empty (always apply the replacement), or contains the value "measurements" to exclude measurements. |
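
How the numericToWords options combine can be sketched as a simple predicate. The sketch assumes that numericToWords is a boolean switch and that the caller already knows whether a number is part of a measurement (the isMeasurement flag is hypothetical); the actual conversion to words is out of scope here.

```javascript
// Decides whether a number is eligible to be written out as words,
// based on the numericReplacements settings.
function shouldConvertToWords(value, numericReplacements, isMeasurement = false) {
  if (!numericReplacements.numericToWords) return false;
  const { min, max, exclude = [] } = numericReplacements.numericToWordsOptions ?? {};
  if (min !== undefined && value < min) return false; // replace when >= min
  if (max !== undefined && value > max) return false; // replace when <= max
  if (isMeasurement && exclude.includes("measurements")) return false;
  return true;
}

const numericReplacements = {
  numericToWords: true,
  numericToWordsOptions: { min: 0, max: 12, exclude: ["measurements"] },
};
console.log(shouldConvertToWords(7, numericReplacements)); // true
console.log(shouldConvertToWords(100, numericReplacements)); // false
console.log(shouldConvertToWords(7, numericReplacements, true)); // false
```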

Subtitle appearance timing

| Key | Description |
| --- | --- |
| spottingRules.postProcess.shotSnapThreshold | Snap the subtitle appearance timing to the shot change if it is close enough. In seconds. |
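
Snapping to a shot change could be implemented along the lines of the sketch below. The helper name snapToShotChange and its inputs (a list of shot change times in seconds) are hypothetical, and the sketch assumes that snapping works in both directions and that the threshold is inclusive.

```javascript
// Snaps a subtitle in-time to the nearest shot change when the distance is
// at most shotSnapThreshold seconds; otherwise the time is returned unchanged.
function snapToShotChange(inTime, shotChanges, shotSnapThreshold) {
  let best = inTime;
  let bestDistance = Infinity;
  for (const shotChange of shotChanges) {
    const distance = Math.abs(shotChange - inTime);
    if (distance <= shotSnapThreshold && distance < bestDistance) {
      best = shotChange;
      bestDistance = distance;
    }
  }
  return best;
}

console.log(snapToShotChange(10.3, [5, 10, 12], 0.5)); // 10
console.log(snapToShotChange(11, [5, 10, 12], 0.5)); // 11
```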

Subtitle disappearance timing

| Key | Description |
| --- | --- |
| spottingRules.postProcess.fadeOutSeconds | Define how long the subtitle remains on screen after the last spoken words (fade out). In seconds. |
| spottingRules.postProcess.minSpacing | Force a timing gap between consecutive subtitles (0 for no gap). |
| spottingRules.postProcess.minGapSeconds | Snap-to-next threshold, in seconds. Also respects minSpacing. |
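
The interaction between these three keys is not spelled out above, so the sketch below shows only one possible reading: the out time is the end of the last spoken word plus fadeOutSeconds; if the remaining gap to the next subtitle is smaller than minGapSeconds, the out time snaps to the start of the next subtitle minus minSpacing. The unit of minSpacing is assumed to be seconds, and computeOutTime is a hypothetical helper.

```javascript
// One possible interpretation of the disappearance timing settings.
// lastWordEnd and nextInTime are in seconds; nextInTime may be undefined
// when there is no following subtitle.
function computeOutTime(lastWordEnd, nextInTime, postProcess) {
  const { fadeOutSeconds = 0, minSpacing = 0, minGapSeconds = 0 } = postProcess;
  let outTime = lastWordEnd + fadeOutSeconds;
  if (nextInTime === undefined) return outTime;

  // Latest allowed out time that still leaves minSpacing before the next subtitle.
  const latestOut = nextInTime - minSpacing;
  // Gap too small: snap the out time to the next subtitle.
  if (nextInTime - outTime < minGapSeconds) outTime = latestOut;
  // Never run into the forced spacing.
  return Math.min(outTime, latestOut);
}

const postProcess = { fadeOutSeconds: 0.5, minSpacing: 0.25, minGapSeconds: 2 };
console.log(computeOutTime(10, 12, postProcess)); // 11.75
console.log(computeOutTime(10, 20, postProcess)); // 10.5
```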

Sentence splitting

When splitting a sentence, add the suffix at the end of the first part and the prefix at the beginning of the second part. Leave empty to have no indicating characters for a sentence split.

| Key | Description |
| --- | --- |
| spottingRules.sentenceSplitSuffix | If a sentence is split, this suffix is added at the end of the first part. |
| spottingRules.sentenceSplitPrefix | If a sentence is split, this prefix is added at the beginning of the second part. |
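
A minimal sketch of how the suffix and prefix are applied when a sentence is split in two. markSentenceSplit is a hypothetical helper, and the "..." values are only example settings.

```javascript
// Adds the configured suffix to the first part and the configured prefix
// to the second part of a split sentence. Empty or missing settings leave
// the parts unchanged.
function markSentenceSplit(firstPart, secondPart, spottingRules) {
  const suffix = spottingRules.sentenceSplitSuffix ?? "";
  const prefix = spottingRules.sentenceSplitPrefix ?? "";
  return [firstPart + suffix, prefix + secondPart];
}

const parts = markSentenceSplit("We walked to the", "old harbour.", {
  sentenceSplitSuffix: "...",
  sentenceSplitPrefix: "...",
});
console.log(parts); // [ 'We walked to the...', '...old harbour.' ]
```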

Styling

There are a lot of settings influencing the look of a subtitle. The commented example below lists them. Most of these can be set via Flow-UI.

All these settings can be set on a subtitlePreset (where e.g. cellResolutionX would be a sibling to the id property).

Besides being used in a subtitlePreset, the object below can also be set as the value of a subtitleSettings key in the production settings. It is then used as the fallback when no default subtitle preset is set.

{
    // The video frame is divided into cells. The default is 32 cells per line
    // and 15 cells per column.
    // This results in a cell width of 100%/32 = 3.125% and
    // a cell height of 100%/15 = 6.667%
    cellResolutionX: 32,
    cellResolutionY: 15,

    styles: {
        // The default style for a span (sequence of characters).
        // Identified by its defaultSpanStyle property
        spanStyle: {
            color: '#FFFFFF',
            // this can also be a color with an alpha channel,
            // e.g. #000000EE
            backgroundColor: '#000000',
            defaultSpanStyle: true
        },

        //
        // Predefined styles which can be chosen in the UI
        //
        cyanStyle: {
            color: '#00FFFF',
            backgroundColor: '#000000',
            isSpeakerColor: true,
            speakerColorIndex: 2,
            label: "Cyan"
        },
        greenStyle: {
            color: '#00FF00',
            backgroundColor: '#000000',
            isSpeakerColor: true,
            speakerColorIndex: 3
        },
        whiteStyle: {
            color: '#FFFFFF',
            backgroundColor: '#000000',
            isSpeakerColor: true,
            speakerColorIndex: 0
        },
        yellowStyle: {
            color: '#FFFF00',
            backgroundColor: '#000000',
            isSpeakerColor: true,
            speakerColorIndex: 1
        },

        // This is the default paragraph style (identified by its
        // defaultParagraphStyle: true property)
        paragraphStyle: {
            // relative to the cell height (this means,
            // 100% of cell height)
            fontSize: 1,

            // relative to the font size. Note that the
            // background expands to this size.
            // https://www.bbc.co.uk/accessibility/forproducts/guides/subtitles/#tts-lineHeight
            lineHeight: 1.2,

            // half a cell padding before and after each line
            linePadding: '0.5c',

            wrapOption: 'noWrap',
            textAlign: 'center',
            fontFamily: 'proportionalSansSerif',
            multiRowAlign: 'center',
            defaultParagraphStyle: true
        }
    },

    //
    // The UI will show a toggle between top and bottom region.
    // The position and size of these regions is defined here
    //
    regions: {
        topRegion: {
            // 70% of video frame width
            width: 0.7,

            // 24% of video frame height
            height: 0.24,

            // the left edge of the region is at 15% of the frame width
            // from the left side of the video frame
            originX: 0.15,

            // the top edge of the region is at 16% of the frame height
            // from the top of the video frame
            originY: 0.16,

            // advised when paragraphStyle.wrapOption is noWrap
            overflow: 'visible',
            writingMode: 'lrtb',

            // the first subtitle line will stick to the top
            // of the region, https://www.bbc.co.uk/accessibility/forproducts/guides/subtitles/#tts-displayAlign
            displayAlign: 'before'
        },
        bottomRegion: {
            width: 0.7,
            height: 0.24,
            default: true,
            originX: 0.15,
            originY: 0.6,
            overflow: 'visible',
            writingMode: 'lrtb',

            // the last subtitle line will stick to the bottom
            // of the region, https://www.bbc.co.uk/accessibility/forproducts/guides/subtitles/#tts-displayAlign
            displayAlign: 'after'
        }
    }
}
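
To see how these relative values translate into pixels, the sketch below converts the cell resolution, paragraph style and regions for a given video size. It is an illustration under assumptions: computeSubtitleGeometry is a hypothetical helper, and the 'c' unit of linePadding is assumed to refer to the cell width.

```javascript
// Converts the relative styling settings into pixel values for a video
// of videoWidth x videoHeight pixels.
function computeSubtitleGeometry(settings, videoWidth, videoHeight) {
  const cellWidth = videoWidth / settings.cellResolutionX;
  const cellHeight = videoHeight / settings.cellResolutionY;

  // The default paragraph style is identified by defaultParagraphStyle: true.
  const paragraph = Object.values(settings.styles).find(
    (style) => style.defaultParagraphStyle === true
  );
  const fontSize = paragraph.fontSize * cellHeight; // relative to the cell height
  const lineHeight = paragraph.lineHeight * fontSize; // relative to the font size
  // Assumption: '0.5c' means half a cell width.
  const linePadding = parseFloat(paragraph.linePadding) * cellWidth;

  // Regions are expressed as fractions of the video frame.
  const regions = {};
  for (const [name, region] of Object.entries(settings.regions)) {
    regions[name] = {
      left: region.originX * videoWidth,
      top: region.originY * videoHeight,
      width: region.width * videoWidth,
      height: region.height * videoHeight,
    };
  }
  return { cellWidth, cellHeight, fontSize, lineHeight, linePadding, regions };
}

const geometry = computeSubtitleGeometry(
  {
    cellResolutionX: 32,
    cellResolutionY: 15,
    styles: {
      paragraphStyle: { fontSize: 1, lineHeight: 1.2, linePadding: "0.5c", defaultParagraphStyle: true },
    },
    regions: { bottomRegion: { width: 0.7, height: 0.24, originX: 0.15, originY: 0.6 } },
  },
  1920,
  1080
);
console.log(geometry.cellHeight, geometry.fontSize); // 72 72
```

For a 1920x1080 video with the default 32x15 grid, a cell is 60 pixels wide and 72 pixels high, so a fontSize of 1 corresponds to 72 pixels.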

Some of the parameters influencing font size and positioning are illustrated in the sketch below.

[Figure: Example1]

These settings are influenced by the EBU-TT-D spec. The BBC subtitle guidelines contain a well-written explanation of the spec and of how it is used at the BBC.

Speaker styles

A subset of styles can be used to indicate speaker color. The automatic subtitler will rotate between these colors, and the UI will show the user the option to select one of these styles for a selection of subtitle text.

The isSpeakerColor property indicates that a style is a speaker color style. When it is not set, the default value is true if neither defaultParagraphStyle nor defaultSpanStyle is true. Otherwise the default is false.

The order of speaker colors (the order in which they are shown in the UI, and also the order in which the automatic subtitler rotates through them) is indicated by speakerColorIndex. To handle legacy styles where this property is not set, the code handling this should

  • first, order all styles that have a speakerColorIndex

  • next, find styles with the following keys and append them in this order: ['whiteStyle', 'yellowStyle', 'cyanStyle', 'greenStyle']

  • finally, append the other styles that have no speakerColorIndex; their order is not strictly defined, but they appear after the styles handled in the previous steps
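
The default for isSpeakerColor and the ordering steps above can be sketched as follows. The helper names are hypothetical, and the sketch assumes that styles with a speakerColorIndex are sorted in ascending order of that index.

```javascript
const LEGACY_ORDER = ["whiteStyle", "yellowStyle", "cyanStyle", "greenStyle"];

// Applies the documented default when isSpeakerColor is not set.
function isSpeakerColorStyle(style) {
  if (typeof style.isSpeakerColor === "boolean") return style.isSpeakerColor;
  return style.defaultParagraphStyle !== true && style.defaultSpanStyle !== true;
}

// Returns the keys of the speaker color styles in display/rotation order.
function orderedSpeakerStyleKeys(styles) {
  const keys = Object.keys(styles).filter((key) => isSpeakerColorStyle(styles[key]));
  const hasIndex = (key) => Number.isFinite(styles[key].speakerColorIndex);

  // 1. Styles with a speakerColorIndex, in ascending order of that index.
  const indexed = keys
    .filter(hasIndex)
    .sort((a, b) => styles[a].speakerColorIndex - styles[b].speakerColorIndex);
  // 2. Legacy styles without an index, in the fixed legacy order.
  const legacy = LEGACY_ORDER.filter((key) => keys.includes(key) && !hasIndex(key));
  // 3. Any remaining styles, in no strictly defined order.
  const rest = keys.filter((key) => !hasIndex(key) && !LEGACY_ORDER.includes(key));
  return [...indexed, ...legacy, ...rest];
}

const order = orderedSpeakerStyleKeys({
  spanStyle: { defaultSpanStyle: true },
  cyanStyle: { isSpeakerColor: true, speakerColorIndex: 2 },
  greenStyle: { isSpeakerColor: true, speakerColorIndex: 3 },
  whiteStyle: { isSpeakerColor: true, speakerColorIndex: 0 },
  yellowStyle: { isSpeakerColor: true, speakerColorIndex: 1 },
  paragraphStyle: { defaultParagraphStyle: true },
});
console.log(order); // [ 'whiteStyle', 'yellowStyle', 'cyanStyle', 'greenStyle' ]
```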

Fonts

If paragraphStyle.fontFamily is proportionalSansSerif, it is converted into Arial, Helvetica, sans-serif. Other values remain as-is. This value ends up in the font-family CSS property.
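
The font family mapping amounts to a single special case, as in the sketch below (toCssFontFamily is a hypothetical helper name, and "Open Sans" is just an example value).

```javascript
// Maps paragraphStyle.fontFamily to the value used for the CSS
// font-family property. Only proportionalSansSerif is translated.
function toCssFontFamily(fontFamily) {
  return fontFamily === "proportionalSansSerif"
    ? "Arial, Helvetica, sans-serif"
    : fontFamily;
}

console.log(toCssFontFamily("proportionalSansSerif")); // Arial, Helvetica, sans-serif
console.log(toCssFontFamily("Open Sans")); // Open Sans
```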

The code looks in productionSettings.fontDefinitions for matches. For each match, the appropriate font-face CSS is loaded. See Font Definitions.

The code does not fail if no match is found. In that case, the font is assumed to be installed on the system.