Language and Subtitle Presets
A Subtitle annotation on a clip has a language, which indicates the language of the subtitles. This is not necessarily the same as the audio language of the clip.
It also has a subtitlePresetId, which points to the subtitle preset used to create these subtitles.
Subtitle Presets combine styling and spotting settings into a preset which can be used to generate subtitles.
Multiple Subtitle annotations can exist on a single clip, but the combination of language and subtitlePresetId should be unique within the clip.
In other words, there can be at most one Subtitle annotation on a clip for a given language and subtitlePresetId.
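The uniqueness rule above can be checked with a small helper. This is a minimal sketch: the function name and the flat `{ language, subtitlePresetId }` input shape are assumptions made for illustration, not part of the documented API.

```javascript
// Returns the "language|subtitlePresetId" combinations that occur more than once.
// Input: an array of { language, subtitlePresetId } objects (hypothetical shape),
// one per Subtitle annotation on the clip.
function findDuplicateSubtitleKeys(subtitles) {
  const seen = new Set();
  const duplicates = new Set();
  for (const { language, subtitlePresetId } of subtitles) {
    const key = `${language}|${subtitlePresetId}`;
    if (seen.has(key)) {
      duplicates.add(key);
    }
    seen.add(key);
  }
  return [...duplicates];
}
```

An empty result means the clip respects the rule.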
Subtitle Presets
A subtitle preset contains settings both for spotting the subtitles and for how they should be styled.
The subtitle presets are stored in the production settings under the subtitlePresets key.
Use cases
- A production containing a mix of videos in landscape and portrait orientation. A video in portrait orientation requires different subtitle settings than a video in landscape orientation (fewer characters per line, more lines, smaller font size relative to the video height).
- Different types of content in a single production.
- Language-specific optimizations. Some languages have very long words, for example.
- Different distribution platforms might have different subtitle requirements.
Structure
{
  "subtitlePresets": [
    {
      // The id should be unique within the production.
      // One could generate the id using `sp_${new Date().getTime()}_${Math.round(10000 * Math.random())}`
      "id": "sp_20220112_12323",
      // A short label, shown to the user when selecting the preset
      "label": "Vertical Subtitle Preset",
      // Optional longer text describing the preset and what it should be used for.
      "description": "Some text describing the use of this subtitle preset",
      // At most one preset can be selected as the default preset
      "isDefault": false,
      // When true, the preset is disabled in the UI (but still exists to resolve existing subtitles using it)
      "isDeactivated": false,
      // Spotting rules settings
      "spottingRules": {
        // ... more settings. See Spotting Rules section
      },
      // Example of properties influencing subtitle styling. See Subtitle Styling section
      "cellResolutionX": 32,
      "cellResolutionY": 15,
      "styles": [],
      "regions": [],
      "useSpeakerColors": []
    }
  ]
}
subtitlePresets is an array of objects. Each preset object contains the following keys:
Key | Type | Description | Example |
---|---|---|---|
id | String | A unique id within the production. In JavaScript, you could generate one like `sp_${new Date().getTime()}_${Math.round(10000 * Math.random())}` | sp_20220112_12323 |
label | String | A short label, shown to the user when selecting the preset | Vertical Subtitle Preset |
description | String | Optional longer text describing the preset and what it should be used for. | Some text describing the use of this subtitle preset |
isDefault | Boolean | Whether this preset is the default. At most one preset can be selected as the default preset. | false |
isDeactivated | Boolean | When true, the preset is disabled in the UI (but still exists to resolve existing subtitles using it) | false |
spottingRules | Object | See Spotting rules section | … |
styling properties | various | See Subtitle Styling section | … |
The id of the subtitlePreset is stored on the Subtitle annotation under structuredDescription.subtitlePresetId, so it is always possible to fetch the correct settings for a given Subtitle.
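As an illustration, the sketch below generates an id with the pattern suggested in the structure example and looks up the preset that a Subtitle annotation points to. The function names are hypothetical, and the shapes of `productionSettings` and the annotation are assumed to follow the structure described in this section.

```javascript
// Generates a preset id using the pattern from the structure example.
function generatePresetId() {
  return `sp_${new Date().getTime()}_${Math.round(10000 * Math.random())}`;
}

// Finds the preset referenced by a Subtitle annotation.
// Deactivated presets are intentionally not filtered out, so existing
// subtitles can still resolve their settings. Returns undefined if no
// preset with that id exists.
function findPresetForSubtitle(productionSettings, subtitleAnnotation) {
  const presetId = subtitleAnnotation.structuredDescription.subtitlePresetId;
  const presets = productionSettings.subtitlePresets || [];
  return presets.find((preset) => preset.id === presetId);
}
```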
Spotting rules
By “spotting”, we mean the process of defining the in and out times of the individual subtitles. The spotting rules contain the main guidelines that the automatic subtitle workflow uses to go from a transcript to spotted subtitles. They contain things like “a subtitle should be between 2 and 3 seconds long” or “a subtitle should have 2 or 3 lines with at most 32 characters per line”. It is not always possible to respect all spotting rules fully automatically, in which case the automatic subtitler will spot the subtitles as well as possible. The subtitle editor UI marks spotting rule violations so the user can inspect them manually.
An example is given below.
{
  "spottingRules": {
    "model": {
      "wpm": {
        "targetMax": 180
      },
      "lines": {
        "max": 3,
        "targetMax": 2
      },
      "duration": {
        "targetMax": 5,
        "targetMin": 2
      },
      "charsLine": {
        "max": 40,
        "targetMax": 37
      }
    },
    "postProcess": {
      "minGapSeconds": 2,
      "fadeOutSeconds": 0.5,
      "shotSnapThreshold": 0.5
    },
    "numericReplacements": {
      "numericToWordsOptions": {
        "exclude": []
      }
    }
  }
}
Besides being used in a subtitlePreset, it is also possible to set the spottingRules object above as the value of a spottingRules key in the production settings. Then, it will be used as the fallback settings when no default subtitle preset is set.
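The fallback described above could be resolved along these lines. This is a sketch only: `resolveSpottingRules` is a hypothetical helper, and it assumes the default preset is the one whose isDefault is true.

```javascript
// Returns the spotting rules of the default preset, or the production-level
// spottingRules when no preset is marked as default.
function resolveSpottingRules(productionSettings) {
  const presets = productionSettings.subtitlePresets || [];
  const defaultPreset = presets.find((preset) => preset.isDefault === true);
  if (defaultPreset) {
    return defaultPreset.spottingRules;
  }
  return productionSettings.spottingRules;
}
```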
Character limits
Defines the maximum number of characters a subtitle line should have, with an optional hard limit used to divide subtitle lines.
Key | Description |
---|---|
| | Max characters |
| | Hard character limit |
Line limits
Defines the maximum number of lines a subtitle should have, with an optional hard limit used to split subtitles.
Key | Description |
---|---|
| | Max lines |
| | Hard lines limit |
Subtitle duration and word rate
Key | Description |
---|---|
| | Minimum duration of a subtitle, in seconds |
| | Maximum duration of a subtitle, in seconds |
| | Words per minute |
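To make the limits above concrete, here is a rough sketch of how a single subtitle could be checked against the `model` object from the example. It assumes that the `targetMax` and `targetMin` values are the soft targets described in these tables (an interpretation of the example, not a documented mapping), and the `{ start, end, lines }` subtitle shape and the function name are hypothetical.

```javascript
// Lists which soft targets of the spotting model a subtitle exceeds.
// subtitle: { start, end, lines } with times in seconds and lines as strings.
function findSpottingViolations(subtitle, model) {
  const violations = [];
  const duration = subtitle.end - subtitle.start;
  const wordCount = subtitle.lines.join(' ').split(/\s+/).filter(Boolean).length;
  const wordsPerMinute = wordCount / (duration / 60);

  if (subtitle.lines.length > model.lines.targetMax) {
    violations.push('lines');
  }
  if (subtitle.lines.some((line) => line.length > model.charsLine.targetMax)) {
    violations.push('charsLine');
  }
  if (duration < model.duration.targetMin || duration > model.duration.targetMax) {
    violations.push('duration');
  }
  if (wordsPerMinute > model.wpm.targetMax) {
    violations.push('wpm');
  }
  return violations;
}
```

A subtitle editor could use such a list to highlight the subtitles that need manual inspection.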
Text to digits and digits to text
The subtitler can automatically convert numbers that are written out as text into digits, and digits into text.
Note that the subtitler will first perform the "wordsToNumeric" rule if it is activated, and then the "numericToWords" rule.
Key | Description |
---|---|
| | Replace textual numbers with digits. E.g. one hundred → 100. |
| | Activate replacement. E.g. 100 → one hundred. |
| | Replace when number is >= this number |
| | Replace when number is <= this number |
| | Do not apply numeric to words for a certain category of words. It is an array which is either empty (always apply the replacement), or contains the value |
Subtitle appearance timing
Key | Description |
---|---|
| | Snap the subtitle appearance timing to the shot change if it is close enough. In seconds. |
Subtitle disappearance timing
Key | Description |
---|---|
| | Define how long the subtitle remains on screen after the last spoken words (fade out). In seconds. |
| | Force a timing gap between consecutive subtitles (0 for no gap). |
| | Snap to next threshold. Will also respect the |
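The appearance and disappearance rules could combine roughly as follows. This is a simplified sketch, not the subtitler's actual algorithm: it assumes the `postProcess` keys from the example (`shotSnapThreshold`, `fadeOutSeconds`, `minGapSeconds`) correspond to the shot snap, fade out and gap settings described here, and the function name and cue shape are hypothetical.

```javascript
// Computes in and out times for one subtitle.
// cue: { firstWordStart, lastWordEnd } in seconds.
// nextCueStart: start time of the following subtitle, or undefined.
// shotChanges: array of shot change times in seconds.
function applyPostProcess(cue, nextCueStart, shotChanges, postProcess) {
  const { minGapSeconds, fadeOutSeconds, shotSnapThreshold } = postProcess;

  // Snap the in time to the first shot change within the threshold, if any.
  const nearbyShot = shotChanges.find(
    (shot) => Math.abs(shot - cue.firstWordStart) <= shotSnapThreshold
  );
  const start = nearbyShot !== undefined ? nearbyShot : cue.firstWordStart;

  // Keep the subtitle on screen for the fade out period, but leave the
  // minimum gap before the next subtitle.
  let end = cue.lastWordEnd + fadeOutSeconds;
  if (nextCueStart !== undefined) {
    end = Math.min(end, nextCueStart - minGapSeconds);
  }
  return { start, end };
}
```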
Sentence splitting
When splitting a sentence, add the suffix at the end of the first part and the prefix at the beginning of the second part. Leave empty to have no indicating characters for a sentence split.
Key | Description |
---|---|
| | If a sentence is split, use this suffix on the parts. |
| | If a sentence is split, use this prefix on the parts. |
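A minimal illustration of the suffix and prefix behaviour described above. The function and parameter names are hypothetical; empty strings stand for "no indicating characters".

```javascript
// Marks the two parts of a split sentence: the suffix goes at the end of the
// first part, the prefix at the beginning of the second part.
function markSentenceSplit(firstPart, secondPart, suffix = '', prefix = '') {
  return [`${firstPart}${suffix}`, `${prefix}${secondPart}`];
}

const parts = markSentenceSplit('We went to the', 'market yesterday.', '...', '...');
// parts[0] is 'We went to the...' and parts[1] is '...market yesterday.'
```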
Styling
Many settings influence the look of a subtitle. The commented example below lists them. Most of these can be set via Flow-UI.
All these settings can be set on a subtitlePreset (where e.g. cellResolutionX would be a sibling to the id property).
Besides being used in a subtitlePreset, it is also possible to set the object below as the value of a subtitleSettings key in the production settings. Then, it will be used as the fallback settings when no default subtitle preset is set.
{
  // The video frame is divided in cells. The default is 32 cells per line
  // and 15 cells per column.
  // This results in a cell width of 100%/32 = 3.125% and
  // a cell height of 100%/15 = 6.667%
  cellResolutionX: 32,
  cellResolutionY: 15,
  styles: {
    // The default style for a span (sequence of characters).
    // Identified by its defaultSpanStyle property
    spanStyle: {
      color: '#FFFFFF',
      // this can also be a color with an alpha channel,
      // e.g. #000000EE
      backgroundColor: '#000000',
      defaultSpanStyle: true
    },
    //
    // Predefined styles which can be chosen in the UI
    //
    cyanStyle: {
      color: '#00FFFF',
      backgroundColor: '#000000',
      isSpeakerColor: true,
      speakerColorIndex: 2,
      label: 'Cyan'
    },
    greenStyle: {
      color: '#00FF00',
      backgroundColor: '#000000',
      isSpeakerColor: true,
      speakerColorIndex: 3
    },
    whiteStyle: {
      color: '#FFFFFF',
      backgroundColor: '#000000',
      isSpeakerColor: true,
      speakerColorIndex: 0
    },
    yellowStyle: {
      color: '#FFFF00',
      backgroundColor: '#000000',
      isSpeakerColor: true,
      speakerColorIndex: 1
    },
    // This is the default paragraph style (identified by its
    // defaultParagraphStyle: true property)
    paragraphStyle: {
      // relative to the cell height (this means,
      // 100% of cell height)
      fontSize: 1,
      // relative to the font size. Note that the
      // background expands to this size.
      // https://www.bbc.co.uk/accessibility/forproducts/guides/subtitles/#tts-lineHeight
      lineHeight: 1.2,
      // half a cell padding before and after each line
      linePadding: '0.5c',
      wrapOption: 'noWrap',
      textAlign: 'center',
      fontFamily: 'proportionalSansSerif',
      multiRowAlign: 'center',
      defaultParagraphStyle: true
    }
  },
  //
  // The UI will show a toggle between top and bottom region.
  // The position and size of these regions is defined here
  //
  regions: {
    topRegion: {
      // 70% of video frame width
      width: 0.7,
      // 24% of video frame height
      height: 0.24,
      // left of region is at 15% of left side of video frame
      originX: 0.15,
      // top of region is at 16% of top of video frame
      originY: 0.16,
      // advised when paragraphStyle.wrapOption is noWrap
      overflow: 'visible',
      writingMode: 'lrtb',
      // the first subtitle line will stick to the top
      // of the region, https://www.bbc.co.uk/accessibility/forproducts/guides/subtitles/#tts-displayAlign
      displayAlign: 'before'
    },
    bottomRegion: {
      width: 0.7,
      height: 0.24,
      default: true,
      originX: 0.15,
      originY: 0.6,
      overflow: 'visible',
      writingMode: 'lrtb',
      // the last subtitle line will stick to the bottom
      // of the region, https://www.bbc.co.uk/accessibility/forproducts/guides/subtitles/#tts-displayAlign
      displayAlign: 'after'
    }
  }
}
Some of the parameters influencing font size and positioning are illustrated in the sketch below.
These settings are influenced by the EBU-TT-D spec. The BBC subtitle guidelines contain a well-written explanation of the spec and how it is used at the BBC.
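The relative units in the commented example can be turned into pixels for a given video size. The sketch below follows the comments in that example (font size relative to the cell height, line height relative to the font size); the function name is hypothetical and it assumes exactly one style carries defaultParagraphStyle: true.

```javascript
// Derives pixel sizes from the cell resolution and the default paragraph style.
function computeSubtitleMetrics(settings, videoWidth, videoHeight) {
  const cellWidth = videoWidth / settings.cellResolutionX;
  const cellHeight = videoHeight / settings.cellResolutionY;
  const paragraphStyle = Object.values(settings.styles).find(
    (style) => style.defaultParagraphStyle === true
  );
  // fontSize is relative to the cell height, lineHeight to the font size.
  const fontSizePx = paragraphStyle.fontSize * cellHeight;
  const lineHeightPx = paragraphStyle.lineHeight * fontSizePx;
  return { cellWidth, cellHeight, fontSizePx, lineHeightPx };
}
```

For a 1920x1080 video with the default 32x15 cell resolution, this gives cells of 60 by 72 pixels, so a fontSize of 1 corresponds to 72 pixels.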
Speaker styles
A subset of styles can be used to indicate speaker color. The automatic subtitler will rotate between these colors, and the UI will offer the user the option to select one of these styles for a selection of subtitle text.
The isSpeakerColor property indicates that a style is a speaker color style. When it is not set, the default value is true if neither defaultParagraphStyle nor defaultSpanStyle is true. Otherwise the default is false.
The order of speaker colors (the order in which they are shown in the UI, and also the order in which the automatic subtitler rotates them) is indicated by speakerColorIndex. To handle legacy styles where this property is not set, the code handling this should
- first order all styles which have speakerColorIndex
- next find styles with the following keys and append them in the following order: ['whiteStyle', 'yellowStyle', 'cyanStyle', 'greenStyle']
- finally append other styles not having speakerColorIndex; their order is not strictly defined, but they will appear after the styles handled in the previous steps
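The default for isSpeakerColor and the ordering steps above can be sketched as follows. The function names are hypothetical, and `styles` is assumed to be an object keyed by style name, as in the styling example.

```javascript
const LEGACY_ORDER = ['whiteStyle', 'yellowStyle', 'cyanStyle', 'greenStyle'];

// Applies the documented default when isSpeakerColor is not set.
function isSpeakerColorStyle(style) {
  if (typeof style.isSpeakerColor === 'boolean') {
    return style.isSpeakerColor;
  }
  return style.defaultParagraphStyle !== true && style.defaultSpanStyle !== true;
}

// Returns the keys of the speaker color styles in display/rotation order.
function orderedSpeakerColorKeys(styles) {
  const keys = Object.keys(styles).filter((key) => isSpeakerColorStyle(styles[key]));
  const hasIndex = (key) => typeof styles[key].speakerColorIndex === 'number';

  // 1. styles with a speakerColorIndex, sorted by that index
  const indexed = keys
    .filter(hasIndex)
    .sort((a, b) => styles[a].speakerColorIndex - styles[b].speakerColorIndex);
  // 2. legacy keys without an index, in the fixed legacy order
  const legacy = LEGACY_ORDER.filter((key) => keys.includes(key) && !hasIndex(key));
  // 3. all remaining styles without an index, in no strictly defined order
  const rest = keys.filter((key) => !hasIndex(key) && !LEGACY_ORDER.includes(key));

  return [...indexed, ...legacy, ...rest];
}
```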
Fonts
If paragraphStyle.fontFamily is proportionalSansSerif, it is converted into Arial, Helvetica, sans-serif. Other values remain as-is. This value ends up in the font-family CSS property.
The code will look into productionSettings.fontDefinitions for matches. For matches, the appropriate font-face CSS is loaded. See Font Definitions.
The code will not fail if no match is found. In that case, the font is assumed to be installed on the system.
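The font family conversion can be expressed in a few lines. This sketch covers only the proportionalSansSerif mapping described above; the function name is hypothetical, and the lookup in fontDefinitions is left out because its structure is not described in this section.

```javascript
// Converts paragraphStyle.fontFamily into a value for the CSS font-family property.
function toCssFontFamily(fontFamily) {
  if (fontFamily === 'proportionalSansSerif') {
    return 'Arial, Helvetica, sans-serif';
  }
  // Other values are passed through unchanged.
  return fontFamily;
}
```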