Annotations

Annotations describe a temporal part (with a start and an end) of a MediaObject, and are the main way to attach metadata to clips of audiovisual material. Multiple types of annotations exist, but they all share some common characteristics.

Annotation Structure

Below is the structure of a generic Annotation. These properties are available on all annotations, as all annotation types extend from this base type. See the section on Common types of Annotation: about objectType and funnel below for more information on the particular types of annotations.

Field Name	Required	Type	Description	Format
annotationProductionId	✘	Long		int64
clipMetadata	✘	ClipMetadata
created	✘	Date	The time when this resource was created	date-time
createdBy	✘	String	The request or process that created this resource
createdByShareId	✘	Long		int64
createdBySharedUserId	✘	Long		int64
creatorId	✘	Long	The id of the user who created this resource	int64
crossProduction	✘	Boolean
customFields	✘	CustomFields
deleted	✘	Date		date-time
description	✘	String	Textual contents of the Annotation
end	✘	Long	The frame range described by the annotation runs up to end, but does not include it. Should be less than or equal to the number of frames the MediaObject has.	int64
funnel	✘	String	Describes how the Annotation should be interpreted by the client application. Can be thought of as a subtype.
id	✔	Long	The id of this resource	int64
includeTranslatedTo	✘	Boolean
includesFrom	✘	Set of string
keyframeFrames	✘	Long		int64
label	✘	String
language	✘	String
lastUpdated	✘	Date	The time when this resource was last updated	date-time
mediaObject	✘	MediaObject
mediaObjectId	✘	Long		int64
modifiedBy	✔	String	The request or process responsible for the last update of this resource
objectType	✘	String	The data model type or class name of this resource
origin	✘	String
productionId	✘	Long		int64
rating	✘	Double		double
relatedToId	✘	Long		int64
securityClasses	✘	Set of string		Enum:
source	✘	String
spatial	✘	String	Link the Annotation to a specific part of the video or image frame. A Media Fragments Spatial Dimension description string is expected.
start	✘	Long	First frame of the Annotation. 0 is the first frame of the clip. The start frame is included in the frame range the annotation describes.	int64
systemFields	✘	CustomFields
tags	✘	Set of string
translatedFromId	✘	Long		int64
translatedToIds	✘	Set of long		int64
version	✔	Long	The version of this resource, used for Optimistic Locking	int64
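
As an illustration of the table above, the following minimal Python sketch checks whether a decoded annotation carries the three fields marked as required. It assumes the annotation is available as a plain dict (for example, parsed from JSON). The helper name `missing_required_fields` and the sample values are hypothetical and not part of the API.

```python
# Fields marked as required (✔) in the table above.
REQUIRED_FIELDS = ("id", "modifiedBy", "version")


def missing_required_fields(annotation: dict) -> list[str]:
    """Return the required field names that are absent or None."""
    return [name for name in REQUIRED_FIELDS if annotation.get(name) is None]


# Hypothetical sample data, for illustration only.
sample = {"id": 1, "version": 3, "start": 25, "end": 50}
print(missing_required_fields(sample))  # ['modifiedBy']
```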

Understanding start and end

Perhaps the most important fields of an annotation are start and end. An Annotation usually describes a temporal part of the MediaObject. The start field holds the first frame of the Annotation. The end field holds the frame where the Annotation ends; that frame itself is not included, so end is the first frame number that is no longer part of the Annotation. In other words, the Annotation describes the half-open frame range [start, end[.

Using frames to describe an Annotation range lets us refer unambiguously to a specific frame range of a video. For audio-only content, a default frame rate of 25 FPS is used.

Note: The value -1 for the end field represents "up to the end of the clip". This is equivalent to setting end to the actual duration of the MediaObject.
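
The range semantics described above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation; the function names `resolve_end`, `contains_frame` and `frame_count` are hypothetical, and `duration` is assumed to be the MediaObject's length in frames.

```python
def resolve_end(end: int, duration: int) -> int:
    """Map the special value -1 to the clip duration (in frames)."""
    return duration if end == -1 else end


def contains_frame(start: int, end: int, duration: int, frame: int) -> bool:
    """True if `frame` lies in the range [start, end[ (end is excluded)."""
    return start <= frame < resolve_end(end, duration)


def frame_count(start: int, end: int, duration: int) -> int:
    """Number of frames covered by the annotation."""
    return resolve_end(end, duration) - start


print(contains_frame(25, 50, 250, 25))  # True: the start frame is included
print(contains_frame(25, 50, 250, 50))  # False: the end frame is excluded
print(frame_count(0, -1, 250))          # 250: -1 means "up to the end of the clip"
```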

Example

Say we have a MediaObject of type VIDEO with sampleRate equal to { numerator: 25, denumerator: 1 }, and a length of 10 seconds.

The MediaObject will have a duration of 10*25 = 250 frames. Valid values for an Annotation’s start field will be 0, 1, … 249. Valid values for an Annotation’s end field will be 1, 2, … 250.

An Annotation with { start: 25, end: 50 } covers frames 25 through 49, that is, the range from 1 second up to (but not including the first frame of) 2 seconds.
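
The arithmetic in this example can be reproduced with a short Python sketch. It assumes the sample rate is given as a fraction of frames per second, as in the MediaObject above; the function `frame_to_seconds` and its parameter names are hypothetical, and exact fractions are used to avoid rounding errors with non-integer frame rates.

```python
from fractions import Fraction


def frame_to_seconds(frame: int, numerator: int, denominator: int) -> Fraction:
    """Convert a frame number to seconds for a rate of numerator/denominator FPS."""
    return Fraction(frame * denominator, numerator)


rate_num, rate_den = 25, 1                       # 25 frames per second
duration_frames = 10 * rate_num // rate_den      # 10 seconds of video
print(duration_frames)                           # 250

print(frame_to_seconds(25, rate_num, rate_den))  # 1 (start of the annotation, in seconds)
print(frame_to_seconds(50, rate_num, rate_den))  # 2 (exclusive end, in seconds)
```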

Language

All annotations have a language field. This is not always used, but it can be important, e.g. for Subtitle and TranscriptAnnotation.

The language is a string containing a language tag as defined by the IETF BCP 47 specification, for example en or nl-BE.

Common types of Annotation: about objectType and funnel

Usually we don’t directly interact with a base Annotation object, but rather with a subclass of it. The objectType tells you what type of Annotation you are dealing with. The funnel describes how the Annotation should be interpreted by a client application. This allows using the same objectType for different applications (by using a different funnel) as long as the structure of the annotations is the same.

objectType	funnel	description
ClipAnnotation	`ClipAnnotation` and `ReviewComment`	Used by the Limecraft Flow UI Logger application to represent 'subclips' and 'comments'
StructuredAnnotation	Application-dependent value.	The StructuredAnnotation represents an annotation with structured information in it.
StructuredAnnotation	`Subtitle`	Represents the subtitles of a clip in a particular language, with a particular subtitle template. The Annotation always covers the entire range.
MediaObjectAnnotation	`MediaObjectAnnotation`	This contains metadata applicable to the entire clip. It automatically exists for all clips. Its `start` is always 0, and its `end` is always equal to the `duration` of the MediaObject (so, the entire range).
TranscriptAnnotation	`TranscriptAnnotation`	Used by the Limecraft Flow Transcriber application. It represents a single paragraph of timecoded transcript text in a particular language.
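
A client application might use the combination of objectType and funnel from the table above to decide how to handle an annotation. The following Python sketch shows one possible approach, assuming annotations are available as plain dicts; the lookup table `KNOWN_KINDS`, the function `describe` and the returned strings are hypothetical and only paraphrase the table.

```python
# (objectType, funnel) pairs taken from the table above.
KNOWN_KINDS = {
    ("ClipAnnotation", "ClipAnnotation"): "Logger subclip or comment",
    ("ClipAnnotation", "ReviewComment"): "Logger subclip or comment",
    ("StructuredAnnotation", "Subtitle"): "subtitles of a clip",
    ("MediaObjectAnnotation", "MediaObjectAnnotation"): "metadata for the entire clip",
    ("TranscriptAnnotation", "TranscriptAnnotation"): "transcript paragraph",
}


def describe(annotation: dict) -> str:
    """Return a short description based on objectType and funnel."""
    key = (annotation.get("objectType"), annotation.get("funnel"))
    if key in KNOWN_KINDS:
        return KNOWN_KINDS[key]
    if key[0] == "StructuredAnnotation":
        # Any other funnel is an application-dependent value.
        return "application-specific structured annotation"
    return "unknown annotation type"


print(describe({"objectType": "StructuredAnnotation", "funnel": "Subtitle"}))
print(describe({"objectType": "StructuredAnnotation", "funnel": "MyAppData"}))
```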
