X-ARC

A continuous educational-video pipeline operating a public channel.

Problem

3ED Productions is a content studio focused on educational video. The problem was not quality. It was volume. Producing a single video required four to five roles and one to two weeks of human effort. Growth was linear in headcount, which is to say growth was not growth.

Existing tools could write a script. They could generate images. None of them constituted a pipeline. Each one solved a single step; the team still owned every handoff.

First principles

A video is a sequence of bounded decisions: topic, hook, sentence-by-sentence script, voice timing, frame composition, thumbnail. Each decision is small. Each can be scored against an explicit criterion. Each failure is cheap to retry.

The bottleneck is not the quality of any individual step. It is the throughput cost of a human in the loop at every step. Removing the human from steps with cheap-retry failures unlocks the throughput. The human stays at the gates where retries are expensive or where taste is structurally human.

Quality gates have to be cheap to retry. A frame that misses the visual standard is regenerated, not surfaced for human review.
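The retry-against-a-gate idea can be sketched in a few lines. This is an illustration only, not the studio's code: `run_gate`, `generate`, `passes`, and the `max_attempts` cap are hypothetical names and assumptions introduced here.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_gate(
    generate: Callable[[int], T],
    passes: Callable[[T], bool],
    max_attempts: int = 5,
) -> Optional[T]:
    """Regenerate until the gate passes or the attempt budget runs out.

    Returns the first passing candidate, or None so the caller can
    escalate to the operator. The cap is an assumption of this sketch.
    """
    for attempt in range(max_attempts):
        candidate = generate(attempt)
        if passes(candidate):
            return candidate
    return None


# Toy usage: a "frame" is just a score, and the gate wants at least 0.8.
scores = [0.4, 0.6, 0.9]
frame = run_gate(lambda i: scores[i], lambda s: s >= 0.8, max_attempts=3)
```

The attempt cap is one possible design choice: it keeps a gate that never passes from retrying indefinitely and gives the human a defined point at which to step in.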

Pipeline

The operator triggers a new production from Telegram. The pipeline runs to completion and returns the cut for review.

01

Research

Trending topics in the studio's domain scored against attention triggers. Nine title-and-thumbnail combinations generated and evaluated. The strongest is selected.

02

Script

Full script with retention architecture. First ten seconds engineered to hold attention. Every sentence validated against the concreteness rules before the script is closed.

03

Visuals

Storyboard derived from the script audio. Grid-based batch generation, followed by validated selection. Frames that miss the visual standard are regenerated.

04

Production

Audio, visuals, and timing compiled through ffmpeg 8.0.1 into the final cut. Three thumbnail options ranked.

05

Publishing

YouTube-optimised metadata produced. Uploaded as a private draft for operator review and publication.
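The Production step (04) can be pictured as assembling one ffmpeg invocation. The sketch below only builds the argument list; it assumes the visuals arrive as a numbered image sequence and the narration as a single audio file, which the page does not state. The function name, file names, and frame rate are hypothetical, and `libx264` must be available in the local ffmpeg build.

```python
from pathlib import Path


def build_ffmpeg_command(
    frames_pattern: str,
    audio_path: Path,
    output_path: Path,
    fps: int = 30,
) -> list[str]:
    """Build an ffmpeg argv that muxes a numbered image sequence with audio."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output if it exists
        "-framerate", str(fps),   # input rate for the image sequence
        "-i", frames_pattern,     # e.g. frames/%04d.png
        "-i", str(audio_path),    # narration track
        "-c:v", "libx264",        # H.264 video
        "-pix_fmt", "yuv420p",    # widely compatible pixel format
        "-c:a", "aac",            # AAC audio
        "-shortest",              # stop at the shorter of the two inputs
        str(output_path),
    ]


cmd = build_ffmpeg_command("frames/%04d.png", Path("voice.wav"), Path("cut.mp4"))
# To actually render: subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a shell string avoids quoting problems when file names contain spaces.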

The motion track runs Seedance 2.0 plus GPT Image 2 with frame-chaining, with Gemini 3 Pro as the video judge. The identity drift that killed earlier face-based attempts is avoided by the chaining contract: chunk N's end frame becomes chunk N+1's start frame.
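The chaining contract can be sketched as a loop that threads each chunk's end frame into the next chunk's start. The `Chunk` type, `render_chained`, and the stand-in `fake_render` are hypothetical; a real run would replace `fake_render` with the actual video-model call, whose interface is not described here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Chunk:
    prompt: str
    start_frame: Optional[str]  # path or ID of the conditioning frame
    end_frame: str              # last frame produced for this chunk


def render_chained(
    prompts: list[str],
    render: Callable[[str, Optional[str]], str],
    first_frame: Optional[str] = None,
) -> list[Chunk]:
    """Render chunks in order, seeding each with the previous end frame."""
    chunks: list[Chunk] = []
    start = first_frame
    for prompt in prompts:
        end = render(prompt, start)  # hypothetical model call, returns end frame
        chunks.append(Chunk(prompt, start, end))
        start = end                  # the chaining contract
    return chunks


# Stand-in renderer so the sketch runs without any model access.
def fake_render(prompt: str, start: Optional[str]) -> str:
    return f"end({prompt})"


chunks = render_chained(["intro", "body", "outro"], fake_render, first_frame="keyframe-0")
```

Because every chunk is conditioned on the frame the previous chunk actually produced, the subject's appearance is carried forward rather than re-sampled from the prompt each time.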

Numbers

16 Videos produced
2.4K Views on breakout
3 mo Continuous operation
~$13 Cost per finished video
7 Quality gates per script
Hours Production time

Observation

Two patterns surfaced.

Quality gates have to be cheap to retry. A frame that misses the visual standard is regenerated, not surfaced for human review. The cost of an extra generation pass is small. The cost of a human-in-the-loop check at every stage is the original bottleneck. The pipeline retries against its own gates until the gate passes.

The pipeline communicates its reasoning. Output is returned with the decisions that produced it: title scoring, retention markers placed, regenerated frames. The operator overrides at the reasoning level, not the artefact level. The role shifts from reviewer to director.
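One way to picture output that travels with its reasoning is a small decision log attached to each production. This is a sketch under assumptions: the `Decision` and `ProductionReport` types, their field names, and the sample entries are invented for illustration and are not the pipeline's actual schema.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class Decision:
    stage: str   # e.g. "research", "script", "visuals"
    choice: str  # what the pipeline picked or did
    reason: str  # why, in terms the operator can override


@dataclass
class ProductionReport:
    video_id: str
    decisions: list[Decision] = field(default_factory=list)

    def record(self, stage: str, choice: str, reason: str) -> None:
        """Append one decision to the log."""
        self.decisions.append(Decision(stage, choice, reason))

    def to_json(self) -> str:
        """Serialise the report so it can be sent alongside the cut."""
        return json.dumps(asdict(self), indent=2)


# Illustrative entries only; these values are made up for the example.
report = ProductionReport(video_id="demo-001")
report.record("research", "title option 4", "highest attention-trigger score")
report.record("visuals", "regenerated frame 12", "missed the visual standard")
```

With a record like this, an override can target the `reason` (for example, a different scoring priority) rather than hand-editing the finished artefact.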

The breakout video reached 2,400 organic views in nine days on a channel with no paid promotion. The point is not the view count. The point is that the pipeline iterated through 15 videos before that one caught, and the iteration is the value.

Contact

If something on this page is relevant to work you are running, write to us. The form is on the landing page. We reply within two working days.
