Will AI Video Kill the Animation Star?
How generative video and AI agents could reshape the animation industry
Back in February of last year, I hopped on a Teams call with one of the animatic editors on my show, but instead of getting down to the cartoon work of the day, he just sat there with a wide-eyed stare. It looked like he’d seen the Ghost of Animation Future. After a beat, he managed to utter a few words, “Have you seen it?”
I knew exactly what he was talking about.
Earlier that day, OpenAI had dropped a demo of its new generative-video model, Sora. By simply typing in a few words, Sora promised to create a fully convincing video of anything you could imagine. No production teams, no giant budgets… no editors. Yep, it was scary.
Now, in the world of AI, February 2024 was practically a billion years ago. In the time since, Sora was finally released to the public, and the results weren’t quite as amazing as the demos made them seem. Meanwhile, tools like Runway, Kling, and Veo have only improved, and the chatter about so-called “Prompt-to-Hollywood” continues to grow.
But as flashy as these generative video models can be, I don’t believe they will be the AI technology that transforms the animation industry. I think the real disruption will come from something else entirely.
How Generative Video Works
Generative video begins with a process for creating images called diffusion. A model starts from a field of static and, bit by bit, removes the noise until the image matches your prompt. It learns to do this by practicing the opposite during training: it adds noise to real images so it knows how to subtract it later.
For video, this process plays out over a sequence of frames. And when these tools are trained on massive amounts of video, the result can be sort of mind-blowing. But here’s the key: tools like Sora and Kling output final shots in one go, skipping the traditional stages of blocking, posing, and incremental polish that animation artists know so well.
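If you want a feel for the shape of that loop, here’s a toy sketch in Python. This is not how Sora or Kling actually work (their denoiser is a huge trained network that handles the prompt and all the frames jointly); the stand-in denoiser and target frames below are made up purely to show the structure: start from static, remove a little noise at every step.

```python
import numpy as np

# Toy sketch of the diffusion idea described above -- NOT a real video model.
# In a real system the "denoiser" is a large trained neural network
# conditioned on your text prompt; here it is a stand-in that simply
# nudges each noisy frame toward a known target image.

HEIGHT, WIDTH, FRAMES, STEPS = 32, 32, 8, 50

def fake_denoiser(noisy_frame, target, step, total_steps):
    """Stand-in for a trained network: predicts a slightly cleaner frame."""
    blend = 1.0 / (total_steps - step)  # remove a little more noise each step
    return noisy_frame + blend * (target - noisy_frame)

def generate_clip(targets):
    """Start every frame as pure static and denoise it step by step."""
    clip = [np.random.randn(HEIGHT, WIDTH) for _ in range(FRAMES)]
    for step in range(STEPS):
        clip = [fake_denoiser(frame, target, step, STEPS)
                for frame, target in zip(clip, targets)]
    return clip

# These targets stand in for "whatever the prompt asked for."
targets = [np.full((HEIGHT, WIDTH), t / FRAMES) for t in range(FRAMES)]
clip = generate_clip(targets)
print(f"Generated {len(clip)} frames of shape {clip[0].shape} in {STEPS} denoising steps")
```

Everything interesting in a real model lives inside that denoiser; the loop around it is the part that stays the same.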
Why the Impact of AI Video Will Be Limited
To be clear, I’m not in the camp that dismisses generative video as a threat on the grounds that its output isn’t good enough. If you’ve been watching, you know how fast these models are improving.
The reason I don’t believe generative video will take over the animation pipeline isn’t about visual quality. It’s about process. These tools just aren’t controllable enough to fit the way animation is actually made.
Animation is a collaborative art form. At every stage, you have input from an art director, a storyboard artist, an animation supervisor, a director, a showrunner, and often multiple clients or broadcasters. Sometimes that can feel like too many cooks in the kitchen. But when it works, the result is a shared vision shaped by expert chefs in their own domain. (Do I need to continue this analogy? You’ve all seen Ratatouille).
Iteration and Collaboration
Our process depends on being able to get under the hood. We can’t just drop in a prompt and hope for the best. We block. We pose. We adjust the spacing of a blink. We nudge the tilt of a head. Every frame is shaped layer by layer through rounds of feedback and revision.
Generative video doesn’t allow for that kind of iterative work. Tools like Sora and Veo give you final frames right away, and the results can be unpredictable. This is sometimes called the slot machine effect: you pull the lever and hope for something good. There’s no way to move through rough passes and client feedback toward polish.
Yes, these tools are evolving. We’re starting to see better control over camera movement, character consistency, and even editing. But fundamentally they’re still working from the outside in, focused on the final output rather than the underlying bones. And in animation, the bones are where we adjust for performance, nuance, and meaning.
When you learn to draw for animation, you’re always told to “draw through” to understand the structure of what you are creating. Generative video doesn’t draw through; it just paints on top.
So… I’m saying we don’t need to worry?
Not so fast.
The Real Disruption: Agentic AI
Since the process for creating animation necessitates working from the inside out, the future of AI disruption in animation and VFX will come from AI operating the same animation tools humans use today. The call is coming from INSIDE the house!
This is where AI agents will likely have the biggest impact on our art form. Agentic AI lets systems move beyond the chatbot and operate other software (or robotics, but that’s a dystopia for another newsletter).
This is already starting to happen across a number of fields. AI agents are currently being used for research and report writing (OpenAI’s Deep Research), software engineering (Devin), and robo-taxis (Waymo). You can imagine that an intelligent AI with the ability to fully control animation software such as Maya, Unreal, or Harmony could work side by side with an animator to automate tasks. It could potentially set up scenes, do a first pass of blocking, or even produce a timing pass based on a style guide for a particular show.
This isn’t too far from the automation scripts that are commonly used in animation software today to access animation libraries, use smart tweening, or clean up unused nodes. You could think of those scripts as a kind of precursor to agentic AI. Many are already task-specific agents, just without intelligence. The next evolution will be linking these tasks, making decisions about when to run them, and adapting them based on the needs of the scene.
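To make that concrete, here’s a deliberately simplified Python sketch of the idea. The task functions are stand-ins for the kind of scripts studios already run inside Maya, Harmony, or Unreal; none of them are real APIs. The only “agentic” part is the planner that looks at the state of the scene and decides which tasks to run, and in what order.

```python
# Hypothetical sketch of an "agentic" layer over ordinary pipeline scripts.
# The task functions below stand in for automation scripts studios already
# run inside their animation software -- none of these are real APIs.

def clean_unused_nodes(scene):
    scene["unused_nodes"] = 0
    return "cleaned unused nodes"

def apply_library_poses(scene):
    scene["blocking_done"] = True
    return "applied first blocking pass from the pose library"

def rough_timing_pass(scene):
    scene["timing_done"] = True
    return "spaced keys to the show's style guide"

def plan(scene):
    """The 'agent' part: inspect scene state and decide which tasks to run."""
    tasks = []
    if scene.get("unused_nodes", 0) > 0:
        tasks.append(clean_unused_nodes)
    if not scene.get("blocking_done"):
        tasks.append(apply_library_poses)
    if scene.get("blocking_done") and not scene.get("timing_done"):
        tasks.append(rough_timing_pass)
    return tasks

scene = {"unused_nodes": 12, "blocking_done": False, "timing_done": False}

# Keep planning and acting until nothing is left to do,
# then hand the shot back to the animator for real polish.
while (tasks := plan(scene)):
    for task in tasks:
        print(task(scene))
```

Swap the hand-written planner for a model that can read the scene, the storyboard, and the show’s style guide, and you have the rough shape of the agents I’m describing.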
This kind of approach allows for the same iterative process that the animation pipeline is built for, but gives AI a much more prominent seat at the table.
For better or worse.
Agentic AI Animation Today
A handful of animation studios are already experimenting with this kind of AI.
One of the boldest examples is happening at Animaj Studios for the new Pocoyo series. Their “sketch-to-motion” pipeline involves AI systems that take drawings from Storyboard Pro and automatically pose the character assets in Unreal Engine to match. From there, they use an inbetweening system, trained on the unique pose-to-pose style of animation that Pocoyo is known for, to move the characters between those poses.
So why is this different from generative video? An agentic process isn’t about AI creating the final frames of animation whole cloth; it’s about letting AI drive the animation software itself, which leaves room for continued refinement by human artists.
Looking Ahead
Exactly how this will impact the artists working in the field is unclear. In an ideal world, this allows animation artists to expand their capabilities. Of course, the flip side is that this technology could be used to reduce headcount and increase quotas.
For me, the important thing is to keep a clear-eyed view of what’s happening out there so we can be as prepared as possible.
What do you think? Do these tools have the potential to be additive to the process, or will they simply be used as a cost-saving measure long term? Do you think there’s more room for generative video in production? What AI tools have you started to see pop up across the industry?
Until next time,
Matt Ferg.