New 'Stable Video Diffusion' AI Model Can Animate Any Still Image (arstechnica.com) 13

Posted by BeauHD on Monday November 27, 2023 @07:10PM from the bring-to-life dept.

An anonymous reader quotes a report from Ars Technica: On Tuesday, Stability AI released Stable Video Diffusion, a new free AI research tool that can turn any still image into a short video -- with mixed results. It's an open-weights preview of two AI models that use a technique called image-to-video, and it can run locally on a machine with an Nvidia GPU. [...] Right now, Stable Video Diffusion consists of two models: one that can produce image-to-video synthesis at 14 frames of length (called "SVD"), and another that generates 25 frames (called "SVD-XT"). They can operate at varying speeds from 3 to 30 frames per second, and they output short (typically 2-4 second-long) MP4 video clips at 576x1024 resolution.
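
The frame counts and frame rates quoted above fix how long a clip can be, since duration is simply frames divided by frames per second. A minimal sketch of that arithmetic follows; the helper name `clip_duration_seconds` and the sample frame rate of 6 fps are illustrative assumptions, not part of Stability AI's release.

```python
# Hypothetical helper (not from Stability AI): duration = frames / fps.
def clip_duration_seconds(frames: int, fps: float) -> float:
    """Return the playback length of a clip in seconds."""
    if frames < 0 or fps <= 0:
        raise ValueError("frames must be non-negative and fps must be positive")
    return frames / fps


# Frame counts quoted in the summary above.
MODEL_FRAMES = {"SVD": 14, "SVD-XT": 25}

for name, frames in MODEL_FRAMES.items():
    # 3 and 30 fps are the quoted extremes; 6 fps is an arbitrary sample in between.
    for fps in (3, 6, 30):
        seconds = clip_duration_seconds(frames, fps)
        print(f"{name}: {frames} frames at {fps} fps = {seconds:.2f} s")
```

At 30 fps even the 25-frame model yields less than a second of video, so clips of 2-4 seconds imply frame rates toward the low end of the quoted range.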

In our local testing, a 14-frame generation took about 30 minutes to create on an Nvidia RTX 3060 graphics card, but users can experiment with running the models much faster on the cloud through services like Hugging Face and Replicate (some of which you may need to pay for). In our experiments, the generated animation typically keeps a portion of the scene static and adds panning and zooming effects or animates smoke or fire. People depicted in photos often do not move, although we did get one Getty image of Steve Wozniak to slightly come to life.

Given these limitations, Stability emphasizes that the model is still early and is intended for research only. "While we eagerly update our models with the latest advancements and work to incorporate your feedback," the company writes on its website, "this model is not intended for real-world or commercial applications at this stage. Your insights and feedback on safety and quality are important to refining this model for its eventual release." Notably, but perhaps unsurprisingly, the Stable Video Diffusion research paper (PDF) does not reveal the source of the models' training datasets, only saying that the research team used "a large video dataset comprising roughly 600 million samples" that they curated into the Large Video Dataset (LVD), which consists of 580 million annotated video clips that span 212 years of content in duration.
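
The dataset figures quoted above imply an average clip length, which can be checked with a few lines of arithmetic. This is a rough sketch: the function name is hypothetical, and the 365.25-day year is an assumption, since the summary does not say how the 212 years were counted.

```python
# Assumption: one year = 365.25 days (the summary does not specify).
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def average_clip_seconds(total_years: float, clip_count: int) -> float:
    """Average clip length in seconds, given total duration and clip count."""
    if clip_count <= 0:
        raise ValueError("clip_count must be positive")
    return total_years * SECONDS_PER_YEAR / clip_count


# Figures quoted in the summary: 580 million clips spanning 212 years.
avg = average_clip_seconds(212, 580_000_000)
print(f"Average clip length: {avg:.1f} s")
```

Under that assumption, the average works out to roughly 11.5 seconds per clip.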

A picture of Chuck Norris set my RTX3060 on fire! (Score:5, Funny)

by thesjaakspoiler ( 4782965 ) writes: on Monday November 27, 2023 @07:20PM (#64036795)

Only Chuck Norris decides when he wants to move.
I guess some pictures are not to be animated.

- Pics or it didn't happen (Score:1)
  
  by davidwr ( 791652 ) writes:
  
  Animated, of course.

It's a beginning (Score:3)

by war4peace ( 1628283 ) writes: on Monday November 27, 2023 @07:57PM (#64036867)

People these days want perfect results from the get-go. If you expect to plonk an image of someone in there and get a video out of it, with bells and whistles, you will be sorely disappointed.
It's a hesitant beginning. Wait 10-15 years, then we'll see.

- Re: (Score:3)
  
  by Rei ( 128717 ) writes:
  
  It's also just bad work by the author of the article.
  You don't just publish raw outputs, you at least want to run them through frame tweening first. At least use ffmpeg's motion interpolation - don't release your article with, what, 4-frame-per-second videos?
  Also, SD doesn't limit you to only 14 frames. Put in more than half an hour of compute time on a 3060, come on. A rented 3060 on vast.ai costs ~$0.25/hr ATM.
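
The frame tweening suggested in this comment can be sketched with ffmpeg's `minterpolate` filter, which synthesizes intermediate frames by motion-compensated interpolation. The snippet below only builds the command line; the file names, the 24 fps target, and the helper name are illustrative assumptions, and running it requires ffmpeg to be installed.

```python
import shlex


def build_interpolation_command(src: str, dst: str, target_fps: int = 24) -> list[str]:
    """Build an ffmpeg command that raises the frame rate via motion interpolation."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    # mi_mode=mci selects motion-compensated interpolation in ffmpeg's minterpolate filter.
    video_filter = f"minterpolate=fps={target_fps}:mi_mode=mci"
    return ["ffmpeg", "-i", src, "-vf", video_filter, dst]


# Hypothetical file names, for illustration only.
cmd = build_interpolation_command("svd_output.mp4", "svd_smooth.mp4")
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Interpolating from very low frame rates tends to produce visible warping artifacts, so the result is a smoother clip rather than a more detailed one.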

And a new magazine was launched on that same day (Score:2)

by Provocateur ( 133110 ) writes:

And a new magazine was launched for the occasion; Hentaimes was not a declaration of civilization reaching the end times, that's for sure.

Can't wait for slashdot's famous.. (Score:2)

by StevenMaurer ( 115071 ) writes:

goatse image to be animated. I'm sure it'll become a new site favorite!

Don't... (Score:2)

by Motleypuss ( 10291831 ) writes:

...Rickroll us again. Especially not with a classic Disney princess. Please.

- Re: (Score:2)
  
  by codebase7 ( 9682010 ) writes:
  
  Well... I mean they could just bring back OMG Ponies! /s

Seems like a short step to something useful (Score:2)

by Shaitan ( 22585 ) writes:

If the tool converted 2D images into 3D models which could then be imported and manipulated in 3D tools then it would be quite useful.

How is this different from HeyGen? (Score:2)

by Micah NC ( 5616634 ) writes:

I've been making YouTube stuff from still images using HeyGen for a while.

This doesn't look newsworthy to me.

And, yes, I read the summary.
