"Midjourney in the video field"! AI video generation rookie Gen-2 closed beta works flowed out, netizens called it too realistic
Gen-2, the AI video generation tool that claims to shoot a blockbuster from a single sentence, is starting to show what it can really do.
A Twitter blogger got early access to the closed beta.
Here's what he generated with the prompt "a well-proportioned or symmetrical man being interviewed at a bar":
In it, a man in a dark shirt looks at his interviewer as he speaks, his eyes and expression serious and candid, while the person across from him nods along in agreement.
The video as a whole is coherent and the image quality is sharp; at first glance it feels like a real interview!
Another version generated from the same prompt isn't bad either:
This time the shot is much closer, the background looks more realistic, and the subject is still expressive.
After seeing this set of clips, netizens exclaimed:
It's unbelievable that results like this can be generated from text prompts alone!
Others put it bluntly:
Here comes the Midjourney of video.
A blogger's hands-on test of Gen-2
The blogger, Nick St. Pierre, focuses on sharing his AI-generated works on Twitter.
Besides the realistic style shown at the beginning, he also released a set of sci-fi works made with Gen-2.
For example, "Astronaut travels through space":
"An armed soldier runs down the hallway of the spaceship as a dark shadow destroys the wall behind him":
"A Robot Who Eats Microchips in a Human Zoo":
"Armies of humanoid robots colonize frozen flats":
(It has the feel of the army of the dead marching on the Wall in Game of Thrones...)
"Last Man on Earth Watches Invading Spaceship Landing Over Tokyo":
...
All of the above were generated from a single text prompt, with no reference images or videos needed.
Although this sci-fi set looks slightly less polished than "the man being interviewed at a bar", the remarkable thing is that it recalls the "chaotic" flavor of AI image generation models when they first appeared; you can almost see the shadow of early Stable Diffusion in it.
As Nick St. Pierre said:
Gen-2 is still in its infancy, and it's bound to get better.
We also found some results online from other people with closed-beta access:
It's fair to say that apart from the trickier sci-fi subjects, it handles everything else quite well.
AI video generation is taking off too
Gen-2, an AI tool dedicated to video generation, was released on March 20 (with its paper published on March 11).
As a newcomer in the AIGC field, it is iterating quickly and impressively:
Gen-1 was only born in February, and at that point it could only edit existing videos;
now Gen-2 can take text and images as prompts and generate videos directly.
Runway officially calls it "the next step of generative AI", and the slogan is suitably bold:
Say it, see it. (If you can say it, we can let you see it.)
The Gen-2 update brings eight features in one go:
Text to video, text + reference image to video, still image to video, video stylization, Storyboard, Mask (e.g. turning a walking white dog into a Dalmatian), Render, and Customization (e.g. turning a head-shaking boy into a sea turtle in seconds).
Like AI image generation before it, its arrival hints at the huge changes coming to film and TV, games, and marketing.
The company behind it, Runway, is also worth mentioning.
Founded in 2018, Runway provided technical support for the visual effects of Everything Everywhere All at Once, and also took part in the development of Stable Diffusion (clearly a company to watch).
Note: Gen-1 is already available to try (after the 125 free generations are used up, you can only continue with a monthly subscription), while Gen-2 has not yet been officially opened to the public.
Beyond the Gen series, Microsoft Research Asia also recently released NUWA-XL, an AI that can generate ultra-long videos from text.
From just 16 simple descriptions, it can produce an 11-minute animation:
A bit earlier still, on the same day Gen-2 was released, Alibaba DAMO Academy also open-sourced a text-to-video AI with 1.7 billion parameters:
The results look like this:
...
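For anyone who wants to try that open-sourced model themselves, here's a minimal sketch of calling it through the ModelScope pipeline interface, using the model ID from the ModelScope page linked in the references; the English prompt is just an illustrative example.

```python
# Minimal sketch: text-to-video with the open-sourced DAMO model via ModelScope.
# Assumes the `modelscope` package and its (heavy) model dependencies are installed;
# the prompt below is only an example.
from modelscope.pipelines import pipeline
from modelscope.outputs import OutputKeys

# Build the text-to-video-synthesis pipeline with the 1.7B-parameter model
p = pipeline('text-to-video-synthesis', 'damo/text-to-video-synthesis')

# A single English sentence is all the model takes as input
result = p({'text': 'A panda eating bamboo on a rock.'})

# The pipeline writes a short clip to disk and returns its file path
print(result[OutputKeys.OUTPUT_VIDEO])
```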
It's foreseeable that, following image generation, the video field is about to get lively too.
Ahem, so will this be the next wave of the public AI carnival?
Reference links:
[1] https://twitter.com/nickfloats/status/1644540896321437696
[2] https://twitter.com/nickfloats/status/1644081726141132800
[3] https://research.runwayml.com/gen2
[4] https://twitter.com/c_valenzuelab/status/1645158717766115328
[5] https://modelscope.cn/models/damo/text-to-video-synthesis/summary
[6] https://nuwa-infinity.microsoft.com