Getting started

This page is the orientation an external agent needs before calling any mutating tool. Read it once; the conventions here apply to every tool.

Browse before mutating

The first call of any session is `describe_video`. It returns a cheap structural overview — canvas meta, the backdrop, and a z-ordered tree of every layer with its id, type, name, label (filename/clip/text/kind), geometry, and which properties are animated. It deliberately omits keyframe values and styles (the unbounded part). It is free and never mutates anything.

You need describe_video because you must not invent element ids or filenames. Read them from the tree. The single most common way an agent gets lost is skipping this step and constructing an id like image.title from intuition — the real id is whatever the overview says it is.

When you need a layer's full detail — keyframe values, colour tracks, track-loop modes, styles, every field — call `inspect_layers([elementId, …])` for just the layers you're about to change. Don't guess values from the overview; don't pull detail for layers you won't touch.
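One way to honour the "read ids from the tree" rule is to look layers up by their user-facing name instead of typing an id. The sketch below is illustrative only: the `OverviewNode` shape, including the `children` field, is an assumption about what `describe_video` returns, and the ids are made up.

```typescript
// Hypothetical shape of one node in the describe_video overview tree.
// `id`, `type`, and `name` are mentioned in the text; `children` is assumed.
interface OverviewNode {
  id: string;
  type: string;
  name: string;
  children?: OverviewNode[];
}

// Depth-first search for layers whose user-facing name matches.
// Returns every match so the caller can notice ambiguous names.
function findLayersByName(nodes: OverviewNode[], name: string): OverviewNode[] {
  const matches: OverviewNode[] = [];
  for (const node of nodes) {
    if (node.name === name) matches.push(node);
    if (node.children) matches.push(...findLayersByName(node.children, name));
  }
  return matches;
}

// Example overview with made-up ids, for illustration only.
const overview: OverviewNode[] = [
  {
    id: "group.a1b2c3",
    type: "group",
    name: "Header",
    children: [{ id: "text.7f3a2c", type: "text", name: "Title" }],
  },
  { id: "image.0d9e4f", type: "image", name: "Logo" },
];

// Collect the ids to pass on to inspect_layers.
const titleIds = findLayersByName(overview, "Title").map((n) => n.id);
```

Here `titleIds` ends up holding the single id `text.7f3a2c`, which is what would be passed to `inspect_layers`. Returning all matches rather than the first keeps duplicate layer names visible to the caller.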

A typical session is small:

  1. describe_video — see the overview tree.
  2. inspect_layers([elementId, …]) — full detail on the layers you'll change (skip if only adding new layers).
  3. Call one or more mutating tools.
  4. save_version once at the end.

With the SDK, each step is a callTool(projectId, name, args) call (here id holds the project id):

await morpha.callTool(id, "describe_video", {});
await morpha.callTool(id, "inspect_layers", { elementIds: ["text.7f3a2c"] });
await morpha.callTool(id, "add_keyframe", { elementId: "text.7f3a2c", property: "opacity", frame: 0, value: 0 });
await morpha.callTool(id, "save_version", { name: "add fade-in" });

The save-version bracket

Versions are user-visible: the editor's Versions panel lists every saved version, and the user flicks between them to compare states or roll back. An edit you make without saving a version is one the user can't easily revisit.

Wrap a session like this:

save_version(name="baseline before <task>")   ← rollback point
… your mutations …
save_version(name="<short description>")        ← end-of-task marker

Call save_version once per logical change-set, not once per tool call. A version named "add fade-in for stars" is useful; thirty versions each named "change" are noise. Use a short imperative-mood label — the user sees it verbatim.
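The bracket can be wrapped in a small helper so the baseline and the end-of-task marker are never forgotten. This is a sketch, not part of the SDK: `ToolClient` and `withVersionBracket` are hypothetical names, and only the `callTool(projectId, name, args)` shape and the `save_version` `name` argument come from this page.

```typescript
// Minimal view of the SDK client. The callTool shape follows the SDK
// example on this page; the interface name and return type are assumptions.
interface ToolClient {
  callTool(
    projectId: string,
    name: string,
    args: Record<string, unknown>,
  ): Promise<unknown>;
}

// Runs `mutate` between a baseline version and an end-of-task version.
// If `mutate` throws, the end marker is skipped and the baseline remains
// available as the rollback point.
async function withVersionBracket(
  client: ToolClient,
  projectId: string,
  task: string,
  mutate: () => Promise<void>,
): Promise<void> {
  await client.callTool(projectId, "save_version", {
    name: `baseline before ${task}`,
  });
  await mutate();
  await client.callTool(projectId, "save_version", { name: task });
}
```

A caller would pass the whole change-set as `mutate`, so one logical task produces exactly two versions regardless of how many tools it calls.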

Element ids

Every layer is addressed by a prefixed id. Six shapes appear across the catalog:

| Prefix | Element |
| --- | --- |
| video.<id> | A video layer. Carries a clip filename and renders its source mp4 into the layer box. |
| image.<id> | An image layer. Renders an uploaded bitmap. |
| text.<id> | A text layer. Renders live typeset text (multi-line, auto-fit). |
| shapes.<id> | A shape layer: rect, ellipse, triangle, or star. |
| group.<id> | A layer group. Holds an ordered children[] and composes a transform onto every descendant. |
| image.background | The pinned canvas-backdrop sentinel. Exactly one per project; always painted at the back. The legacy alias background.canvas is still accepted. |

<id> is a 6-char lowercase hex token ([a-f0-9]{6}) generated when the layer is created. It is opaque and is never derived from the layer's name, filename, or text content. This matches the convention in After Effects, Premiere, Final Cut, Motion, Figma, and Illustrator: ids are storage keys, and the user-facing handle is the layer's name. Never construct an id by hand or expect it to look meaningful; always pull ids from describe_video's output.
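A client can sanity-check an id's shape before sending it. The sketch below encodes only the prefixes and the [a-f0-9]{6} pattern described here, plus the two backdrop spellings; `parseElementId` and `LayerKind` are hypothetical names. A well-formed id can still refer to a layer that does not exist, so this does not replace reading ids from describe_video.

```typescript
type LayerKind = "video" | "image" | "text" | "shapes" | "group" | "backdrop";

// Prefix, a dot, then a 6-char lowercase hex token.
const LAYER_ID = /^(video|image|text|shapes|group)\.([a-f0-9]{6})$/;

// The backdrop sentinel and its legacy alias.
const BACKDROP_IDS = new Set(["image.background", "background.canvas"]);

// Returns the layer kind and token, or null if the id is malformed.
function parseElementId(
  id: string,
): { kind: LayerKind; token: string } | null {
  if (BACKDROP_IDS.has(id)) return { kind: "backdrop", token: "" };
  const m = LAYER_ID.exec(id);
  if (!m) return null;
  return { kind: m[1] as LayerKind, token: m[2] ?? "" };
}

const ok = parseElementId("text.7f3a2c");   // kind "text", token "7f3a2c"
const guessed = parseElementId("image.title"); // null: "title" is not a hex token
```

The backdrop check runs first because `image.background` shares the `image.` prefix but does not carry a hex token.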

Groups take a bare id in a few places. ungroup_layers, rename_group, and set_group_parent's parentGroupId argument take the bare id, without the group. prefix (e.g. "a1b2c3" rather than "group.a1b2c3"). Everywhere else (add_keyframe, move_layer, set_layer_fill, set_group_parent's elementId) use the full group.<id> form.
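Because the two forms are easy to mix up, it can help to normalise group ids in one place. A small sketch follows; the helper names are hypothetical, the ids are made up, and only set_group_parent's elementId and parentGroupId argument names are taken from the text.

```typescript
const GROUP_PREFIX = "group.";

// Strip the prefix if present: "group.a1b2c3" -> "a1b2c3".
function toBareGroupId(id: string): string {
  return id.startsWith(GROUP_PREFIX) ? id.slice(GROUP_PREFIX.length) : id;
}

// Add the prefix if missing: "a1b2c3" -> "group.a1b2c3".
function toFullGroupId(id: string): string {
  return id.startsWith(GROUP_PREFIX) ? id : GROUP_PREFIX + id;
}

// Build set_group_parent arguments: the child is addressed by its full
// id, the parent group by its bare id.
function setGroupParentArgs(
  childElementId: string,
  parentGroup: string,
): { elementId: string; parentGroupId: string } {
  return {
    elementId: childElementId,
    parentGroupId: toBareGroupId(parentGroup),
  };
}

const args = setGroupParentArgs("group.b2c3d4", "group.a1b2c3");
// args is { elementId: "group.b2c3d4", parentGroupId: "a1b2c3" }
```

Both helpers are idempotent, so it is safe to apply them to an id whose form is not known in advance.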

The layer tree

project.layer_order is the root-level z-ordering only. A group's children live under that group's children[], not in layer_order. The composition is a tree, not a flat list. describe_video surfaces groups[] and a top-first ordering so you don't have to re-derive the structure.

The canvas backdrop (image.background, or its legacy alias background.canvas) is not in layer_order; it is always painted first, behind everything.
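To make the tree structure concrete, the sketch below walks from the root-level layer_order down through each group's children[]. The `ProjectTree` shape, in particular the `groups` lookup keyed by full group id, is an assumption made for illustration; the real project document may store groups differently, and the ids are made up.

```typescript
// Assumed, simplified project shape. `layer_order` holds root-level ids
// only (from the text); the `groups` lookup is hypothetical.
interface ProjectTree {
  layer_order: string[];
  groups: Record<string, { children: string[] }>;
}

// Depth-first walk: each group is listed before its descendants.
function allLayerIds(project: ProjectTree): string[] {
  const out: string[] = [];
  const visit = (id: string): void => {
    out.push(id);
    const group = project.groups[id];
    if (group) group.children.forEach(visit);
  };
  project.layer_order.forEach(visit);
  return out;
}

const project: ProjectTree = {
  layer_order: ["group.a1b2c3", "image.0d9e4f"],
  groups: {
    "group.a1b2c3": { children: ["text.7f3a2c", "group.b2c3d4"] },
    "group.b2c3d4": { children: ["shapes.c3d4e5"] },
  },
};

const ids = allLayerIds(project);
// ids lists 5 layers, although layer_order itself has only 2 entries
```

The point of the example is the count: iterating layer_order alone would miss the three layers nested inside groups.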

The coordinate system

The canvas defaults to 1080 × 1920 (9:16) and can be resized per project — 1080×1080 square, 1920×1080 landscape, 1080×1350 (4:5), or a custom size.

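A layer's (x, y) is the CENTRE of its bounding box in canvas pixels, not the top-left. This matches Premiere / Final Cut / Motion. Consequences: a layer centred on the default 1080 × 1920 canvas sits at (540, 960), and an unrotated layer's top-left corner is at (x − width/2, y − height/2).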

width and height are pixel dimensions of the bounding box. rotation is in degrees, clockwise.
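If you are used to top-left coordinates, a pair of conversion helpers avoids off-by-half-a-box mistakes. This is a minimal sketch for unrotated layers; the `Box` type and the function names are hypothetical.

```typescript
// A layer box as described above: (x, y) is the centre, in canvas pixels.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Top-left corner of an unrotated box.
function topLeftOf(box: Box): { left: number; top: number } {
  return { left: box.x - box.width / 2, top: box.y - box.height / 2 };
}

// Inverse: build a centre-based box from a top-left corner and a size.
function boxFromTopLeft(
  left: number,
  top: number,
  width: number,
  height: number,
): Box {
  return { x: left + width / 2, y: top + height / 2, width, height };
}

// A 400 x 200 layer centred on the default 1080 x 1920 canvas.
const centred: Box = { x: 1080 / 2, y: 1920 / 2, width: 400, height: 200 };
const corner = topLeftOf(centred); // { left: 340, top: 860 }
```

Rotation is ignored here on purpose: once a layer is rotated, its visual corners no longer coincide with these values.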

Frames vs seconds

The timeline runs at 30 fps. Every frame: argument is an integer, 0-indexed frame number. Convert with frames = round(seconds × 30). For example, 1.5 s is frame 45 and 2 s is frame 60.
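The conversion is small enough to inline, but a named helper keeps the frame rate in one place. A minimal sketch, with hypothetical function names:

```typescript
// The timeline frame rate stated above.
const FPS = 30;

// Seconds -> integer, 0-indexed frame number.
function secondsToFrame(seconds: number): number {
  return Math.round(seconds * FPS);
}

// Frame number -> seconds, e.g. for a seconds-based argument.
function frameToSeconds(frame: number): number {
  return frame / FPS;
}

const fadeEnd = secondsToFrame(1.5); // 45
const twoSeconds = secondsToFrame(2); // 60
```

Rounding matters for values that do not land exactly on a frame boundary: the tools expect an integer, so round on the client rather than sending a fractional frame.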

set_duration is the exception — it takes a duration in seconds directly (it sets the composition length).

Animation tracks

Each video / image / text / shapes / group layer can carry animation tracks under project.animations[elementId], keyed by property: x, y, width, height, scale, rotation, opacity. Each property's track is a sorted array of { frame, value, easing? } keyframes.

The key rule, After Effects / Premiere / Final Cut style: when a track exists on a property, it overrides the layer's static value at every frame. So move_layer sets the un-animated default; if the layer has an x track, the track wins. To animate, use add_keyframe. To set a value that isn't animated, use move_layer.
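The override rule can be modelled in a few lines. The sketch below is a simplified illustration, not the renderer's actual code: it interpolates linearly, ignores the optional easing field, holds the end values outside the keyframe range, and treats a missing or empty track as "no track". All of those are assumptions made to keep the example short.

```typescript
interface Keyframe {
  frame: number;
  value: number;
}

// Sample a sorted track at `frame`. Linear interpolation and holding the
// end values are simplifying assumptions; easing is ignored.
function sampleTrack(track: Keyframe[], frame: number): number | undefined {
  let prev: Keyframe | undefined;
  for (const kf of track) {
    if (frame <= kf.frame) {
      if (!prev) return kf.value; // at or before the first keyframe
      const t = (frame - prev.frame) / (kf.frame - prev.frame);
      return prev.value + t * (kf.value - prev.value);
    }
    prev = kf;
  }
  return prev?.value; // after the last keyframe; undefined if empty
}

// The rule from the text: if a track exists, it wins over the static value.
function effectiveValue(
  staticValue: number,
  track: Keyframe[] | undefined,
  frame: number,
): number {
  const sampled = track ? sampleTrack(track, frame) : undefined;
  return sampled ?? staticValue;
}

// Static opacity 0.8, plus a one-second fade-in track from 0 to 1.
const fadeIn: Keyframe[] = [
  { frame: 0, value: 0 },
  { frame: 30, value: 1 },
];
const animated = effectiveValue(0.8, fadeIn, 15); // 0.5, the track wins
const plain = effectiveValue(0.8, undefined, 15); // 0.8, no track
```

This is why changing the static value of a property that already has a track appears to do nothing: the track keeps supplying the value at every frame.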

Track values are absolute for leaf layers: x/y are the layer centre's canvas-space pixel position, width/height are pixel dimensions, rotation is in degrees, scale is applied about the layer centre (1 = no change), and opacity is 0..1.

Groups have no static body — their x/y track values are translation offsets applied around the group's frozen pivot, and a group transform composes onto every descendant. A group rotating 30° rotates everything inside it 30° on top of each child's own rotation.
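The composition can be sketched as rotating each child's centre about the group pivot, adding the group's x/y offset, and summing rotations. This is an illustration under stated assumptions, not the engine's implementation: it assumes a y-down canvas, applies rotation before translation, and ignores group scale. The type and function names are hypothetical.

```typescript
interface Point {
  x: number;
  y: number;
}

// A group's animated transform: x/y offset and rotation (degrees,
// clockwise) about a fixed pivot.
interface GroupTransform {
  pivot: Point;
  offset: Point;
  rotation: number;
}

interface Placed {
  x: number;
  y: number;
  rotation: number;
}

// Compose a group transform onto a child. Assumes a y-down canvas, where
// a positive angle in the standard rotation matrix reads as clockwise.
function composeOntoChild(child: Placed, group: GroupTransform): Placed {
  const rad = (group.rotation * Math.PI) / 180;
  const dx = child.x - group.pivot.x;
  const dy = child.y - group.pivot.y;
  return {
    x: group.pivot.x + dx * Math.cos(rad) - dy * Math.sin(rad) + group.offset.x,
    y: group.pivot.y + dx * Math.sin(rad) + dy * Math.cos(rad) + group.offset.y,
    rotation: child.rotation + group.rotation,
  };
}

// A child tilted 10 degrees inside a group rotated 30 degrees.
const composed = composeOntoChild(
  { x: 640, y: 960, rotation: 10 },
  { pivot: { x: 540, y: 960 }, offset: { x: 0, y: 0 }, rotation: 30 },
);
// composed.rotation is 40: the group's 30 on top of the child's own 10
```

Note that the child's centre also moves: it orbits the pivot, which is why editing a group's rotation track shifts children that sit away from the pivot.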

Extrapolation past the ends

A separate per-property setting, the loop mode, controls what happens before the first keyframe and after the last. Set it with set_track_loop.

Tracks with fewer than two keyframes ignore the loop mode.

Assets must exist first

add_image_layer, set_image_filename, add_video_layer, set_video_clip, and add_audio_overlay all reference a filename that must already be uploaded. If you reference a filename that isn't uploaded, the tool fails. describe_video only lists layers, not the asset bucket — confirm the upload before referencing a new filename.

Video clips can be uploaded over MCP: upload_clip(url) fetches a direct http(s) video link server-side, while upload_clip_presign + upload_clip_finalize stream a local or large file straight to R2 (see the tool catalog). Both routes return the stored filename to pass to add_video_layer. Images and audio are still uploaded through the editor (drag-drop, or the /api/upload-asset HTTP route); there is no MCP tool for those yet.