
OpenAI Shocks the AI Video World - Sora Changes Everything

Download the free ChatGPT at Work PDFs: https://clickhubspot.com/vkl

More from Futurepedia:
🖥️ Discover more AI tools: https://www.futurepedia.io/
🐦 Follow on Twitter: https://twitter.com/futurepedia_io

Links
Sora - https://openai.com/sora
Sora research and more examples - https://openai.com/research/video-generation-models-as-world-simulators

Summary
OpenAI unveiled their groundbreaking AI video model Sora. Its text to video, image to video, image generation, and video generation capabilities blow every other AI video generator out of the water. Sora's text-to-video creates incredibly photorealistic, fully generated AI videos in lengths up to 1 minute. This pushes the boundaries of AI content creation with their advanced diffusion transformer model. RunwayML, Pika Labs, Stable Diffusion and all the others are left behind by this release. Cinematic AI videos are here.

Chapters
0:00 Intro
1:02 Text to video
6:10 Sora research overview
6:39 Image to video
6:55 Extend videos
7:05 Video to video
7:42 Connect videos
8:27 Text to image
8:47 Emergent capabilities
9:30 Limitations
9:58 Thoughts
11:52 Futurepedia

Futurepedia

10 days ago

OpenAI just changed everything, again. All these videos you're seeing right now are fully AI generated. I don't usually try to push out news videos like this, but this is the most excited I've been about an update from OpenAI, or any AI tool, ever, and that's saying a lot. This is better than any other AI video platform, and it's not even close. It's called Sora. It was just released today, and it's not just text to video. A few hours after they released those demos, they also released a research page
that showed image to video and video to video, and actually my favorite part is this feature of connecting videos, like this drone flying through these ruins, then changing into a butterfly flying underwater for some reason, but it looks amazing. They're supposedly allowing select creators to test it right now to get feedback, and they said it's to give the public a sense of what AI capabilities are on the horizon. I mean, the possibilities with this are pretty intense, honestly. I'll get into some of my overall thoughts and potential consequences and opportunities later, but let's jump into some more examples first.

This movie trailer is from a single prompt, and it maintains the scene and character consistency across multiple shots. It has this dynamic camera movement and handheld shots, plus just how realistic the people are. And then there's a couple drone shots here that are just indistinguishable from real drone shots. Same with this longer one of an SUV driving on a dirt road. I mean, this scenery just looks so good. Or this cat walking through a garden is so true to the way a cat actually moves, with that, like, light-footed stealthiness, plus the physics of the fur and whiskers bouncing. And this is a 27-second shot. It can generate up to 1 minute, as opposed to the 4-second, morphing, kind of slow-motion clips we see from everywhere else.

But now let's look at some more complex scenes. This one in particular really blew me away. First off, just all these buildings being viewed coherently from all angles as it passes, but at the same time there's all these reflections that interact with them properly, with the lighting and refraction, especially if you go frame by frame right here as it crosses this building and see the scene it's actually reflecting. Honestly, other video generators would struggle just generating that scene on its own, with multiple faces and hands, let alone as a reflection interacting with an even more complex scene. I'll jump through a couple quick ones here. All
these people celebrating Chinese New Year is an unbelievable amount of people to be tracking, and they're all pretty coherent. And this flythrough of a historical town has all sorts of, like, horses and people walking and carrying things, all while viewing these buildings from different perspectives. Or the ability to generate so many completely unrelated videos across all these TVs at once. I mean, that's just crazy to me. Then most of the wildlife shots look essentially perfect. I really like this crab
interacting with an octopus. A lot of this is photorealistic, but it can do other styles too. So we have some 3D animation shots that are amazing, then this papercraft one is awesome, and this dancing one is a great spot for comparison, because just the other day I was generating videos in Runway. It was of a cult of barn owls performing secret disco rituals. Don't ask. But my point is, just look at the difference: these slow-motion dance moves that start to, like, morph even in just a couple-second clip, and some of these just have no faces. The comparison isn't even remotely close.

And then I want to focus in on this one for a second. So it has two just incredibly detailed pirate ships that are viewed from all angles, but simultaneously it has multiple difficult physics simulations, like the flags waving and then the liquid. Then it also knows to apply tilt shift to make it seem like it's miniature. That just highlights how much is going on in this and how time-consuming it would be to
do this in any other way. Now, those are all amazing, but let's look closer at the shots that involve people. That's where things get really difficult. So first, it can do a close-up of an eye incredibly well. It even captures the little micro starts and stops your eyes do when you move them from left to right. They're called saccades. But let's zoom out some more. Here's a gray-haired man pondering the universe. This is amazing, even just the subtleties in his wrinkles as his face makes, like, minor expressions, and, like, the different speeds he blinks, and then there's really accurate refraction in his glasses.

But let's zoom out even more. So this one's actually the generation Greg Brockman used in his announcement, and it does an incredible job with reflections and this realistic walk cycle, and then proper physics on the purse and clothing and her earrings. That's all with a pretty complex background too. This is miles apart from anything we've seen before, but there are some issues with this. Like, you're not going to be convinced this is a real video, if you're paying attention at least. You'll see the feet kind of sliding across the ground, or if you look at any of the people in the background, they're good, but some are a little weird. So there's certainly limitations.

So here's another one that has a lot of people in it while the camera is spinning, which is really pushing the limits. If you watch this bottom left corner, there's a kind of trippy perspective shift here, and then some
weirdness in the hands and movements of the people. Overall they look great, and I can't stress enough how far beyond anything else we've seen this is, but it is still, of course, not perfect, especially with people. But that's just when I'm analyzing them to find something weird. If I was just scrolling social media, I would think almost every one of these is real. And as the saying goes, this is the worst it will ever be.

All of those examples are at openai.com/sora, but they also released more examples of all the other stuff it can do under the research section of their site. I'll link to that as well.

Before I jump into that: I assume everyone watching this video uses ChatGPT to some extent, but one thing that can be a bit of a struggle is knowing how to implement it in your daily life, especially at work. That's why I've partnered with HubSpot to share a free resource bundle to help. It will be linked in the description. It covers topics across all different industries, with tips and guides on how to use ChatGPT in those industries, with specific examples and applications. I mean, the number one problem I hear from people regarding AI is they see these videos about ChatGPT or other AI tools and say, "That's really cool, but how can I actually use it to help me, you know, to save time or solve a problem?" One example section here is called 100 Ways to Try ChatGPT Today, with 100 sample prompts you can use or modify for whatever career you have. There will actually be five PDFs. That section's in the one called Supercharge Your Workday with ChatGPT, but go through all the other ones in there too. Again, they are completely free. Just use the link in the description to go download those. I've been happy to partner with HubSpot to bring these free resources to people who watch this channel.

Onto the Sora research. I'll link to that as well. They call it "Video generation models as world simulators", which, that title starts to feel a little weird. They say it's a promising path towards building general purpose simulations of the physical world, and it goes, to some extent, into how it was trained and how it works in these first sections, but let's look at some more examples. It does demonstrate a lot more text-to-video capabilities, with some really solid generations involving people, but since we've seen a lot of that, let's jump into the image to video. Just like with text to video, their image to video is far beyond any other video generator. So these are examples of animating DALL·E images, and they are all amazing, I think this one with the surfers being the most complex. And just like with the text to video, it does a really good job with the physics, even in a really difficult example like this.

It can also extend videos forward and backward in time. This is an example where all three videos were extended backwards, so they all lead to the same ending.

This video to video is really impressive. So it starts with the example of "change the setting to be in a lush jungle" and does an amazing job, but you can click on this and it will be a dropdown to select from, with all these different options. So we've got "make it underwater", and how about "change it to winter", and we can add dinosaurs, switch it to claymation. We've got a bunch of others here. So this is just next level when you compare it to something like Gen-2 or Kaiber. Kaiber can do a bunch of stuff that is pretty different from this, which I really like. We'll have to see once we can actually play with this, but either way, this is mind-blowing.

Then this next section is probably my favorite part. So it can connect videos, where it interpolates between the two input videos to create seamless transitions between them, even with entirely different subjects and scene compositions. So the center video is the one that combines them, and for each of these it finds just a really creative and seamless way to merge them, like this drone shot where it puts this snow diorama thing on the back side of the building. Then we've got this slow morphing between the chameleon and this cool bird. It looks just awesome while it's transitioning. And I'll just let these other two play. Got this SUV turning into, I think it's a cheetah, and that historical Gold Rush town transforming into this underwater scene. Both just super cool.

And it is also an image generator. Now I wonder if this will be replacing DALL·E at some point. I assume it will, because honestly just a screen grab from a lot of these videos looks better than if you ran that prompt through DALL·E. Right, this is what I got when I copied the prompt of the woman in autumn. I mean, this is just way better than DALL·E.

And if we move along, it talks about some of the emerging capabilities of training at scale. Similar to how it is with LLMs, there's just unexpected capabilities that emerge when you train a model like this. They say some of that has led to its ability to generate dynamic camera motion that shifts and rotates while people move consistently throughout 3D space, and the object permanence as these people walk past this dog. Like, it'll be completely blocked from the frame at some points, but it still persists and will be there after they walk past. Interacting with the world is another capability that no other model has been able to really do. So it can still only affect the state of the world in simple ways, but these examples of painting and actually leaving the brush strokes on the canvas, or leaving bite marks on a burger, this
is amazing. This is a huge step forward. They also do show some limitations, like glass shattering. They did show some more limitations on the main site where the physics was off, like this person running backwards on a treadmill, these wolf pups just kind of appearing out of nowhere, some weird interactions with this chair, and a few others. Again, not perfect.

And it also has just this mention of simulating digital worlds, where it can control a player in Minecraft while also rendering the world and its dynamics in high fidelity. The ways this will be applied in video games in the future is going to change that whole world too.

Honestly, I'm still processing a lot of this. It is such a monumental breakthrough. I wasn't expecting to see text-to-video like this until maybe the end of the year at the earliest. They're working on putting some guardrails around this, which they should, but I mean, there's only so much you can do. You know, people have complaints about the censorship within ChatGPT,
which I get, but when it comes to videos, which are generally made with the intent to share them, it just gets a lot more complicated. We already have fake videos all over the place, some of relatively harmless things like fake landscapes or fake weather phenomena. Then there is the more sinister side, fake videos spreading around of war victims or political figures. Up until this point they're usually pretty obvious to anyone that's close to this stuff, but maybe not to everyone else. Sometimes they seem to trick millions of people. But as these tools keep getting more sophisticated, that will just accelerate, and the fakes will also become impossible to detect.

As far as videographers and filmmakers and YouTubers, there will be a lot of possibilities that open up. I think creating just amazing B-roll really easily is going to be awesome. I personally avoid stock footage almost completely because it just feels generic, but if I can generate something that I kind of imagined for each occasion, that's just a
lot more interesting to me and engaging for the audience. Overall, this completely opens the door for anyone to create visual stories. In the same way one person can write a book, one person could be able to create an entire movie. This just provides more tools to create with, and new types of content will emerge. There will be bumps along the way, with AI content farms and deepfakes and misinformation. I'm not worried at all about being replaced. I think that no matter how good any of these tools get,
people will always be more interested in seeing things that people do themselves. There's a lot of uses for this to help people in the process, but not to replace people entirely. Also, I think as more generated content comes out, there will just be a pendulum swing in the other direction, with people craving more authentic human content. People are consuming more content than ever, so really there will be demand for both.

There's a lot of random thoughts at the end there. This is a much more off-the-cuff video than usual. This update was amazing, and if you want to stay up to date along the way with everything that happens in AI, make sure to subscribe here and check out futurepedia.io. We curate all the latest and greatest AI tools. You can find the best tool for any use case, use the chatbot to help guide the process, and save your favorites. Then you'll be given personalized tool recommendations every week. There's been a ton of updates there, so go check it out if you haven't for a while. Thank you so much for watching, and I'll see you in the next one.
