[Music] I am I am a Visionary Illuminating galaxies
to witness the birth of [Music] stars and sharpening our understanding of extreme weather [Music] events I am a helper
guiding the blind through a crowded world I was thinking about running
to the store and giving voice to those who cannot speak to not make me laugh I am a Transformer harnessing gravity to
store Renewable [Music] Power [Music] and Paving the way towards unlimited clean energy for us [Music] all I am a
[Music] trainer, teaching robots to assist, to watch out for
[Music] danger and help save lives I am a Healer providing a
new generation of cures and new levels of patient care doctor that I am
allergic to penicillin is it still okay to take the medications definitely these
antibiotics don't contain penicillin so it's perfectly safe for you to take them I
am a navigator [Music] generating virtual scenarios to let us safely explore the real world and understand every [Music] decision I even helped write the script
breathe life into the words [Music] I am AI brought to life by Nvidia
deep learning and Brilliant Minds everywhere please welcome to the stage Nvidia founder and CEO
Jensen Huang. [Music] [Applause] Welcome to GTC. I hope you realize this is not a concert. You have arrived at
a developers conference there will be a lot of science described
algorithms computer architecture mathematics I sensed a very heavy weight in the
room all of a sudden, almost like you were in the wrong place. At no
conference in the world is there a greater assembly of researchers from such diverse fields of science, from climate tech to
radio sciences, trying to figure out how to use AI to robotically control MIMOs for next-generation 6G
radios, robotic self-driving cars, even artificial intelligence. Even artificial intelligence.
First, I noticed a sense of relief there all of a sudden. Also, this conference
is represented by some amazing companies. This list, this is not the attendees. These are the
presenters. And what's amazing is this: if you take away all of my friends, close friends,
Michael Dell is sitting right there in the IT industry all of the friends I grew up with in
the industry if you take away that list this is what's amazing these are the presenters of the
non-IT industries, using accelerated computing to solve problems that normal computers
can't. It's represented in life sciences, healthcare, genomics, transportation, of
course retail, logistics, manufacturing, industrial. The gamut of industries represented
is truly amazing and you're not here to attend only you're here to present to talk about
your research. $100 trillion of the world's industries is represented in
this room today this is absolutely amazing there is absolutely something
happening there is something going on the industry is being transformed not just ours
because the computer industry the computer is the single most important instrument of society
today, fundamental transformations in computing affect every industry. But how did we start?
how did we get here I made a little cartoon for you literally I drew this in one page
this is nvidia's Journey started in 1993 this might be the rest of the talk 1993 this
is our journey we were founded in 1993 there are several important events that happen
along the way. I'll just highlight a few. In 2006, CUDA, which has turned out to have been
a revolutionary computing model. We thought it was revolutionary then, that it was going to be an
overnight success, and almost 20 years later it happened. We saw it coming, two decades later. In 2012, AlexNet: AI and CUDA made first
contact. In 2016, recognizing the importance of this computing model, we invented a brand
new type of computer. We called it the DGX-1: 170 teraflops in this supercomputer, eight
GPUs connected together for the very first time. I hand-delivered the very first DGX-1 to
a startup located in San Francisco called OpenAI. DGX-1 was the world's first AI supercomputer.
Remember, 170 teraflops. In 2017, the Transformer arrived. In 2022, ChatGPT captured the world's
imagination and had people realize the importance and the capabilities of artificial
intelligence. And in 2023, generative AI emerged, and a new industry begins. Why? Why is it a new industry?
because the software never existed before we are now producing software using computers to write
software producing software that never existed before it is a brand new category it took share
from nothing it's a brand new category and the
way you produce the software is unlike anything
we've ever done before in data centers generating tokens producing floating Point numbers at very
large scale as if in the beginning of this last Industrial Revolution when people realized
that you would set up factories apply energy to it and this invisible valuable thing called
electricity came out. AC generators. And 100 years later, 200 years later, we are now creating new
types of electrons, tokens, using infrastructure we call factories, AI factories, to generate this
new, incredibly valuable thing called artificial intelligence. A new industry has emerged. Well,
well we're going to talk about many things about this new industry we're going to talk
about how we're going to do Computing next we're going to talk about the type of software
that you build because of this new industry the new software how you would think about this
new software, what about applications in this new industry, and then maybe what's next and
how can we start preparing today for what is about to come next. Well, but before I start, I
want to show you the soul of Nvidia the soul of our company at the intersection of computer
Graphics physics and artificial intelligence all intersecting inside a computer in Omniverse
in a virtual world simulation everything we're going to show you today literally everything
we're going to show you today is a simulation not animation it's only beautiful because it's
physics, the world is beautiful. It's only amazing because it's being animated with robotics, it's
being animated with artificial intelligence. What you're about to see all day, it's completely
generated, completely simulated in Omniverse. And all of it, what you're about to enjoy, is the
world's first concert where everything is homemade everything is homemade you're about to watch
some home videos so sit back and enjoy [Music] [Music] yourself [Music] m [Music] what [Music]
[Music] God, I love it. NVIDIA accelerated computing has reached the
Tipping Point general purpose Computing has run out of steam we need another way of doing
Computing so that we can continue to scale so that we can continue to drive down the cost of
computing so that we can continue to consume more and more Computing while being sustainable
accelerated Computing is a dramatic speed up over general purpose Computing and in every single
industry we engage, and I'll show you many, the impact is dramatic. But in no industry is it
more important than our own: the industry of using simulation tools to create products. In this
industry, it is not about driving down the cost of computing, it's about driving up the scale of
computing. We would like to be able to simulate the entire product that we do completely
in full fidelity, completely digitally, in essentially what we call digital twins. We would
like to design it, build it, simulate it, operate it completely digitally. In order to do that, we need
to accelerate an entire industry, and today I would like to announce that we have some partners who
are joining us in this journey to accelerate their entire ecosystem so that we can bring the world
into accelerated computing. But there's a bonus: when you become accelerated, your infrastructure
is CUDA GPUs, and when that happens, it's exactly the same infrastructure as for generative AI. And
so I'm just delighted to announce several very important partnerships. These are some of the most
important companies in the world. Ansys does engineering simulation for what the world makes.
We're partnering with them to CUDA-accelerate the Ansys ecosystem, to connect Ansys to the Omniverse
digital twin. Incredible. The thing that's really great is that the installed base of NVIDIA
GPU-accelerated systems is all over the world, in every cloud, in every system, all over enterprises.
And so the applications they accelerate will have a giant installed base to go serve. End
users will have amazing applications, and of course system makers and CSPs will have great customer
demand. Synopsys. Synopsys is NVIDIA's literally first software partner. They were there on the very
first day of our company. Synopsys revolutionized the chip industry with high-level design. We
are going to CUDA-accelerate Synopsys. We're accelerating computational lithography, one of
the most important applications that nobody's ever known about. In order to make chips, we have
to push lithography to the limit. NVIDIA has created a library, a domain-specific library, that accelerates
computational lithography incredibly. Once we can accelerate and software-define all of TSMC, who
is announcing today that they're going to go into production with NVIDIA cuLitho, once it's
software-defined and accelerated, the next step is to apply generative AI to the future of semiconductor
manufacturing, pushing geometry even further. Cadence builds the world's essential EDA and SDA
tools. We also use Cadence. Between these three companies, Ansys, Synopsys, and Cadence, we
basically build NVIDIA together. We are CUDA-accelerating Cadence. They're also building a supercomputer out
of NVIDIA GPUs so that their customers could do fluid dynamics simulation at a 100, a thousand times
scale. Basically, a wind tunnel in real time. Cadence Millennium, a supercomputer with NVIDIA GPUs inside.
A software company building supercomputers, I love seeing that. Building Cadence copilots together:
imagine a day when Cadence, Synopsys, Ansys, tool providers, would offer you AI copilots so
that we have thousands and thousands of copilot assistants helping us design chips, design systems.
And we're also going to connect the Cadence digital twin platform to Omniverse. As you could see, the
trend here is that we're accelerating the world's CAE, EDA, and SDA so that we could create our future in
digital twins, and we're going to connect them all to Omniverse, the fundamental operating
system for future digital twins. One of the industries that benefited tremendously from scale,
and you all know this one very well, is large language models. Basically, after the Transformer
was invented, we were able to scale large language models at incredible rates, effectively doubling
every six months. Now, how is it possible that, by doubling every six months, we have grown
the industry, we have grown the computational requirements so far? The reason for that
is quite simply this: if you double the size of the model, you double the size of your brain, you
need twice as much information to go fill it. And so every time you double your parameter count, you
also have to appropriately increase your training token count. The combination of those two numbers
becomes the computation scale you have to support. The latest, the state-of-the-art OpenAI model is
approximately 1.8 trillion parameters. 1.8 trillion parameters required several trillion tokens to go
train. So a few trillion parameters, on the order of a few trillion tokens; when you
multiply the two of them together, approximately 30, 40, 50 billion quadrillion floating-point
operations per second. Now we just have to do some CEO math right now, just hang with me. So you
have 30 billion quadrillion. A quadrillion is like a peta, and so if you had a petaflop GPU, you would
need 30 billion seconds to go compute, to go train that model. 30 billion seconds is approximately
1,000 years. Well, 1,000 years, it's worth it. I'd like to do it sooner, but it's worth it. Which is usually my answer
when most people tell me, hey, how long is it going to take to
do something? 20 years? It's worth it. But can we do it next week? And so, 1,000 years. 1,000 years.
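
The "1,000 years" figure can be checked with a few lines of arithmetic. This is only a sketch under stated assumptions: it reads the "30 billion quadrillion" figure as a total count of floating-point operations, treats a "petaflop GPU" as sustaining 10^15 operations per second, and assumes perfect scaling across GPUs. The 1,000-GPU case is a hypothetical illustration, not a number from the talk.

```python
# Rough training-time estimate from the figures quoted in the talk.
# Assumptions: the quoted figure is a TOTAL operation count, a "petaflop GPU"
# sustains 1e15 operations per second, and scaling across GPUs is perfect.

PETA = 10**15
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def training_seconds(total_ops, ops_per_second_per_gpu, num_gpus=1):
    """Wall-clock seconds needed if every GPU runs at its full rate."""
    return total_ops / (ops_per_second_per_gpu * num_gpus)


total_ops = 30 * 10**9 * PETA  # "30 billion quadrillion" operations

# One petaflop GPU: 30 billion seconds, expressed in years.
one_gpu_years = training_seconds(total_ops, PETA) / SECONDS_PER_YEAR

# Hypothetical cluster of 1,000 such GPUs, expressed in days.
cluster_days = training_seconds(total_ops, PETA, num_gpus=1000) / SECONDS_PER_DAY

print(f"one GPU: about {one_gpu_years:.0f} years")
print(f"1,000 GPUs: about {cluster_days:.0f} days")
```

The single-GPU result lands a little under 1,000 years, which matches the rounding used on stage; the point of the cluster case is simply that time falls in proportion to the number of GPUs only if nothing else becomes the bottleneck.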
we need what we need are bigger gpus we need much much bigger gpus we recognized this early on
and we realized that the answer is to put a whole bunch of gpus together and of course innovate
a whole bunch of things along the way, like inventing Tensor Cores, advancing NVLink so
that we could create essentially virtually giant GPUs, and connecting them all together with amazing
networks from a company called Mellanox, InfiniBand, so that we could create these giant systems. And so
DGX-1 was our first version, but it wasn't the last. We built supercomputers all
along the way. In 2021, we had Selene, 4,500 GPUs or so, and then in 2023 we built one of the largest AI
supercomputers in the world. It's just come online: Eos. And as we're building these things, we're
trying to help the world build these things and in order to help the
world build these things we got
to build them first we build the chips the systems the networking all of the software necessary
to do this you should see these systems imagine writing a piece of software that runs across the
entire system Distributing the computation across thousands of gpus but inside are thousands of
smaller gpus millions of gpus to distribute work across all of that and to balance the workload so
that you can get the most Energy Efficiency the best computation time keep
your cost down and so
those fundamental innovations are what got us here. And here we are, as we see the miracle of ChatGPT
emerge in front of us, we also realize we have a long way to go. We need even larger models. We're
going to train it with multimodality data not just text on the internet but we're going to we're
going to train it on text and images and graphs and charts, just as we learn watching TV. And
so there's going to be a whole bunch of watching video, so that these models can be grounded in
physics, understanding that an arm doesn't go through a wall. And so these models would have common
sense by watching a lot of the world's video combined with a lot of the world's languages it'll
use things like synthetic data generation just as you and I do when we try to learn we might use
our imagination to simulate how it's going to end up just as I did when I Was preparing for
this keynote I was simulating it all along the way I hope it's going to turn
out as well as
I had it in my head as I was simulating how this keynote was
going to turn out somebody did say that another performer did her performance completely on
a treadmill so that she could be in shape to deliver it with full energy. I didn't do that. If
I get a little winded about 10 minutes into this, you know what happened. And so, where were we? We're
sitting here using synthetic data generation we're going to use reinforcement learning we're going
to practice it in our mind we're going to have ai
working with AI training each other just like
student teacher Debaters all of that is going to increase the size of our model it's going
to increase the amount of the amount of data that we have and we're going to have to build
even bigger gpus Hopper is fantastic but we need bigger gpus and so ladies and gentlemen I
would like to introduce you to a very, very big GPU. [Applause] Named after David Blackwell: mathematician,
game theorist, probability. We thought it was a perfect, perfect
name. Blackwell. Ladies and gentlemen, enjoy this. [Applause] Blackwell is not a chip. Blackwell
is the name of a platform uh people think we make gpus and and we do but gpus don't look
the way they used to. Here's the, if you will, the heart
of the Blackwell system. And this, inside the company, is not called Blackwell, it's
just the number. And this is Blackwell, sitting next to... oh, this is the
most advanced GPU in the world in production today. This is Hopper. This is Hopper.
Hopper changed the world. This is Blackwell. It's okay, Hopper, you're very good. Good boy,
well, good girl. 208 billion transistors. And so you can see, I can see, that there's a
small line between two dies. This is the first time two dies have abutted like this together
in such a way that the two dies think it's one chip. There's 10 terabytes of
data between it, 10 terabytes per second, so that these two sides of the Blackwell
chip have no clue which side they're on. There are no memory locality issues, no cache issues. It's
just one giant chip. And so when we were told that Blackwell's ambitions were beyond
the limits of physics uh the engineer said so what and so this is what what happened and
so this is the Blackwell chip and it goes into two types of systems the first one is form fit
function compatible to Hopper, and so you slide out Hopper and you push in Blackwell. That's the
reason why one of the challenges of ramping is
going to be so efficient there are installations
of Hoppers all over the world and they could be they could be you know the same infrastructure
same design the power the electricity The Thermals the software identical push it right
back and so this is a hopper version for the current hgx configuration and this is what
the other, the second Hopper, looks like. Now, this is a prototype board, and Janine,
could I just borrow... Ladies and gentlemen, Janine Paul. And so this is
a fully functioning board, and I'll just be careful here. This right here is, I don't know, $10 billion. The second one's five. It gets cheaper after that, so
any customers in the audience, it's okay. All right, but this one's quite
expensive. This is the bring-up board, and the way it's going to go to production
is like this one here. Okay, and so you're going to take this. It has two Blackwell
chips and four Blackwell dies connected to a Grace CPU. The Grace CPU has a super
fast chip-to-chip link. What's amazing is this computer is the first of its kind where this much
computation first of all fits into this small of a place second it's memory coherent they feel
like they're just one big happy family working on one application together and so everything
is coherent within it um the just the amount of you know you saw the numbers there's a lot
of terabytes this and terabytes that. But this is a miracle. Let's see,
what are some of the things on here? There's NVLink on top, PCI Express on the bottom, on
your... which one is mine and your left? One of them, it doesn't matter. One
of them is a CPU chip-to-chip link. It's my left or your... depending on which side. I was just
trying to sort that out, and it just kind of doesn't matter. Hopefully it comes plugged in. Okay, so this is the Grace Blackwell system. But there's more. So it turns out all of the
specs are fantastic, but we need a whole lot of
new features uh in order to push the limits
Beyond if you will the limits of physics we would like to always get a lot more X factors and
so one of the things that we did was we invented another Transformer Engine, another Transformer
Engine, the second generation. It has the ability to dynamically and automatically rescale and
recast numerical formats to a lower precision whenever it can. Remember, artificial intelligence
is about probability, and so you kind of have, you know, 1.7, approximately 1.7, times approximately
1.4 to be approximately something else. Does that make sense? And so the ability for
the mathematics to retain the precision and the range necessary in that particular stage of the
pipeline is super important. And so it's not just about the fact that we designed a smaller ALU.
The world's not quite that simple. You've got to figure out when you can use that
across a computation that is thousands of GPUs. It's running for weeks and weeks on weeks, and
you want to make sure that the training job is going to converge. And so there is this new
Transformer Engine. We have a fifth-generation NVLink. It's now twice as fast as Hopper, but very
importantly it has computation in the network and the reason for that is because when you
have so many different gpus working together we have to share our information with each other
we have to synchronize and update each other and every so often we have to reduce the partial
products and then
rebroadcast out the partial products the sum of the partial products back to
everybody else and so there's a lot of what is called all reduce and all to all and all gather
it's all part of this area of synchronization and collectives so that we can have gpus working
with each other having extraordinarily fast links and being able to do mathematics right in the
network allows us to essentially amplify even further so even though it's 1.8 terabytes per
second it's effectively higher than that
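
The "reduce the partial products and then rebroadcast" step described here is the all-reduce collective. The following is a minimal single-process sketch of what that operation computes, with plain Python lists standing in for per-GPU buffers. It is an illustration only: the function name `all_reduce_sum` is made up for this example, and it does not model NVLink, in-network computation, or any real communication library.

```python
# Single-process illustration of an all-reduce (sum) across simulated GPUs.
# Each inner list stands in for one GPU's partial result; after the call,
# every "GPU" holds the same element-wise sum. Hypothetical helper, not a
# real library API.


def all_reduce_sum(partials):
    """Return one copy of the element-wise sum per participant."""
    if not partials:
        return []
    length = len(partials[0])
    if any(len(p) != length for p in partials):
        raise ValueError("all participants must hold buffers of equal length")
    # Reduce: add up the partial results position by position.
    reduced = [sum(values) for values in zip(*partials)]
    # Broadcast: hand every participant its own copy of the reduced result.
    return [list(reduced) for _ in partials]


gpu_partials = [
    [1.0, 2.0],  # partial result held by simulated GPU 0
    [3.0, 4.0],  # simulated GPU 1
    [5.0, 6.0],  # simulated GPU 2
]
synced = all_reduce_sum(gpu_partials)
print(synced)
```

In a real system the reduce and broadcast phases move data over the interconnect, which is why performing the summation inside the network can save traffic compared with shipping every partial result to every GPU.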
And so it's many times that of Hopper. The likelihood
of a supercomputer running for weeks on end is approximately zero, and the reason for that
is because there are so many components working at the same time. Statistically, the probability of
them all working continuously is very low. And so we need to make sure that whenever there is a...
well, we checkpoint and restart as often as we can. But if we have the ability to detect a weak
chip or a weak node early, we could retire it and maybe swap in another processor. That ability
to keep the utilization of the supercomputer high, especially when you just spent $2 billion
building it, is super important. And so we put in a RAS engine, a reliability engine, that does a 100%
self-test, an in-system test, of every single gate, every single bit of memory on the Blackwell
chip and all the memory that's connected to it. It's almost as if we shipped with every
single chip its own advanced tester that we test our chips with. This is the first time
we're doing this. Super excited about it. Secure AI. Only at this conference do they clap for RAS.
Secure AI: obviously, you've just spent hundreds of millions of dollars creating a very
important Ai and the the code the intelligence of that AI is encoded in the parameters you
want to make sure that on the one hand you don't lose it on the other hand it doesn't get
contaminated and so we now have the ability to encrypt data of course at rest but also in transit
and while it's being computed. It's all encrypted. And so we now have the ability to encrypt in
transmission, and when we're computing it, it is in a trusted
engine environment. And the last thing is decompression. Moving data in and out of these
nodes when the compute is so fast becomes really essential, and so we've put in a high-line-speed
compression engine that effectively moves data 20 times faster in and out of these computers.
These computers are so powerful, and there's such a large investment, that the last thing we want
to do is have them be idle. And so all of these capabilities are intended to keep Blackwell
fed and as busy as possible. Overall, compared to Hopper, it is two and a half times, two and
a half times the FP8 performance for training per chip. It also has this new format
called FP6, so that even though the computation speed is the same, the bandwidth that's amplified
because of the memory the amount of parameters you can store in the memory is now Amplified
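
The memory effect of narrower formats is simple bit-counting. A sketch, assuming each parameter occupies exactly the nominal number of bits (real deployments also store scale factors, activations, and other state, which this ignores). The 100 GB memory budget is a hypothetical round number chosen for illustration, not a specification from the talk.

```python
# How many parameters fit in a fixed amount of memory at each precision.
# Assumption: each parameter takes exactly its nominal bit width; scale
# factors, activations, and caches are ignored.

FORMAT_BITS = {"fp16": 16, "fp8": 8, "fp6": 6, "fp4": 4}


def weight_bytes(num_params, fmt):
    """Bytes needed to store num_params weights in the given format."""
    return num_params * FORMAT_BITS[fmt] // 8


def params_that_fit(memory_bytes, fmt):
    """Number of weights that fit in memory_bytes in the given format."""
    return memory_bytes * 8 // FORMAT_BITS[fmt]


model_params = 1_800_000_000_000   # the 1.8 trillion parameter model from the talk
memory_budget = 100 * 10**9        # hypothetical 100 GB of GPU memory

for fmt in ("fp8", "fp6", "fp4"):
    terabytes = weight_bytes(model_params, fmt) / 10**12
    fit = params_that_fit(memory_budget, fmt)
    print(f"{fmt}: {terabytes:.2f} TB for the full model, "
          f"{fit / 10**9:.1f}B parameters per 100 GB")
```

Going from FP8 to FP6 stores a third more parameters in the same memory, and FP4 stores twice as many, which is the sense in which capacity and bandwidth are "amplified" even where the arithmetic rate is unchanged.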
fp4
effectively doubles the throughput this is vitally important for inference one of the
things that that um is becoming very clear is that whenever you use a computer with AI on the
other side when you're chatting with the chatbot when you're asking it to uh review or make an
image remember in the back is a GPU generating tokens some people call it inference but it's more
appropriately generation the way that Computing is done in the past was retrieval you would
grab your phone, you would touch something, some signals go off, basically an email goes off
to some storage somewhere there's pre-recorded content somebody wrote a story or somebody made
an image or somebody recorded a video that record pre-recorded content is then streamed back to
the phone and recomposed in a way based on a recommender system to present the information to
you you know that in the future the vast majority of that content will not be retrieved and the
reason for that is because that was pre-recorded by somebody who doesn't understand the context,
which is the reason why we have to retrieve so much content if you can be working with an AI
that understands the context who you are for what reason you're fetching this information and
produces the information for you just the way you like it the amount of energy we save the amount of
networking bandwidth we save the amount of waste of time we save will be tremendous the future
is generative, which is the reason why we call it generative AI, which is the reason why this
is a brand new industry the way we compute is fundamentally different we created a processor
for the generative AI era and one of the most important parts of it is content token generation
we call it. This format is FP4. Well, that's a lot of computation: 5x the token generation, 5x
the inference capability of Hopper seems like enough but why stop there the answer is it's
not enough and I'm going to show you why I'm going to show you why and so we would like to
have a bigger GPU, even bigger than this one. And so we decided to scale it. But first,
let me just tell you how we've scaled over the course of the last eight years we've increased
computation by 1,000 times. Eight years, 1,000 times. Remember back in the good old days of Moore's Law,
it was 2x, well, 5x every, what, 10x every 5 years. That's the easiest math: 10x every 5 years,
a 100 times every 10 years. 100 times every 10 years, in the middle of the heyday of the
PC revolution. 100 times every 10 years. In the last 8 years, we've gone 1,000 times. We have
two more years to go. And so that puts it in perspective. The rate at which we're advancing
computing is insane, and it's still not fast enough. So we built another chip. This chip
is just an incredible chip. We call it the NVLink Switch. It's 50 billion transistors.
It's almost the size of Hopper all by itself. This switch chip has four NVLinks in it,
each 1.8 terabytes per second, and it has computation in it, as I mentioned. What is
this chip for? If we were to build such a chip, we can have every single GPU talk to every
other GPU at full speed at the same time that's insane it doesn't even make sense but if you could
do that if you can find a way to do that and build a system to do that that's cost effective that's
cost effective how incredible would it be that we could have all these gpus connect over a coherent
link so that they effectively are one giant GPU well one of one of the Great Inventions in
order to make it cost effective is that this chip has to drive copper directly. The SerDes of
this chip is just a phenomenal invention, so that we could do direct drive to copper. And as
a result you can build a system that looks like this now this system this system is kind of
insane this is one dgx this is what a dgx looks like now remember just six years ago
it was pretty heavy, but I was able to lift it. I delivered the first DGX-1 to
OpenAI and the researchers there. The pictures are on the internet,
and we all autographed it. If you come to my office, it's autographed
there, it's really beautiful. But you could lift it. That DGX, by
the way, was 170 teraflops. If you're not familiar with the numbering system, that's
0.17 petaflops. So this is 720. The first one I delivered to OpenAI was 0.17. You
could round it up to 0.2, it won't make any difference. And back then it was like, wow, you
know, 30 more teraflops. And so this is now 720 petaflops, almost an exaflop for training, and
the world's first one-exaflops machine in one rack. Just so you know, there are only a
couple, two, three exaflops machines on the planet as we speak. And so this
is an exaflops AI system in one single rack. Well, let's take a look at the back of it. So this is what makes it possible.
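
For anyone not used to the prefixes, the conversion between the quoted figures can be written out as below. This is a sketch using only the two numbers from the talk, 170 teraflops for the first DGX-1 and 720 petaflops for the new rack. The two figures are likely quoted at different numeric precisions, so the ratio is a rough indication rather than a like-for-like benchmark.

```python
# Unit conversion between the two performance figures quoted in the talk.
# tera = 1e12, peta = 1e15, exa = 1e18 operations per second.

TERA = 10**12
PETA = 10**15
EXA = 10**18

dgx1_flops = 170 * TERA   # first DGX-1, as quoted
rack_flops = 720 * PETA   # new rack-scale DGX, as quoted for training

dgx1_petaflops = dgx1_flops / PETA   # 170 teraflops expressed in petaflops
rack_exaflops = rack_flops / EXA     # 720 petaflops expressed in exaflops
ratio = rack_flops / dgx1_flops      # how many DGX-1 units equal one rack

print(f"DGX-1: {dgx1_petaflops} petaflops")
print(f"Rack:  {rack_exaflops} exaflops")
print(f"Ratio: about {ratio:.0f}x")
```

The ratio comes out at a bit over four thousand, which is consistent with the "1,000 times in eight years" per-chip claim combined with the much larger number of chips in one rack.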
That's the back, the DGX NVLink spine. 130
terabytes per second goes through the back of that chassis. That is more
than the aggregate bandwidth of the internet, so we could basically send everything
to everybody within a second. And so we have 5,000 cables, 5,000 NVLink cables, in total 2 miles.
Now this is the amazing thing: if we had to use optics, we would have had to use transceivers
and retimers, and those transceivers and retimers alone would have cost 20,000 watts, 20 kilowatts,
of just transceivers alone, just to drive the NVLink spine. As a result, we did it completely
for free over NVLink Switch, and we were able to save the 20 kilowatts for computation. This entire
rack is 120 kilowatts, so that 20 kilowatts makes a huge difference. It's liquid cooled. What
goes in is 25°C, about room temperature. What comes out is 45°C, about your jacuzzi. So room
temperature goes in, jacuzzi comes out, 2 liters per second. We could sell a peripheral. 600,000 parts. Somebody used to say,
you know you guys make gpus and we do but this is what a GPU looks like to me when somebody
says GPU I see this two years ago when
I saw a GPU was the hgx it was 70 lb 35,000 Parts our
gpus now are 600,000 parts and 3,000 lb 3,000 lb 3,000 lb that's kind of like the weight of a you
know Carbon Fiber Ferrari I don't know if that's useful metric but everybody's going I feel it I
feel it I get it I get that now that you mention that I feel it I don't know what's 3,000 lb okay
so 3,000 lb ton and a half so it's not quite an elephant so this is what a dgx looks like now
let's see what it looks like in operation. Okay, let's imagine: how do we put this
to work, and what does that mean? Well, if you were to train a GPT model, a 1.8 trillion parameter
model, it took about, apparently, about, you know, 3 to 5 months or so with 25,000 Amperes.
If we were to do it with Hopper, it would probably take something like 8,000 GPUs, and
it would consume 15 megawatts. 8,000 GPUs on 15 megawatts, it would take 90 days, about 3 months,
and that would allow you to train something that is, you know, this groundbreaking AI model. And
this is obviously not as expensive as anybody would think, but it's 8,000
GPUs. It's still a lot of money. And so 8,000 GPUs, 15 megawatts. If you were to use Blackwell
to do this, it would only take 2,000 GPUs. 2,000 GPUs, same 90 days. But this is the amazing part:
only 4 megawatts of power. So from 15... yeah, that's right. And that's our goal. Our goal
is to continuously drive down the cost and the energy, they're directly proportional to each other,
cost and energy associated with the computing, so that we can continue to expand and scale up the
computation that we have to do to train the next-generation models. Well, this is training. Inference
or generation is vitally important going forward you know probably some half of the time that
Nvidia gpus are in the cloud these days it's being used for token generation you know they're
either doing co-pilot this or chat you know chat GPT that or um all these different models that
are being used when you're interacting with it, or generating images, or generating
videos, generating proteins, generating chemicals. There's a bunch of generation going on.
All of that is in the category of computing we call inference. But inference is extremely
hard for large language models because these large language models have several properties
one they're very large and so it doesn't fit on one GPU this is Imagine imagine Excel doesn't fit
on one GPU you know and imagine some application you're running on
a daily basis doesn't run
doesn't fit on one computer like a video game doesn't fit on one computer and most in fact
do. And many times in the past, in hyperscale computing, many applications for many
people fit on the same computer and now all of a sudden this one inference application where
you're interacting with this chatbot that chatbot requires a supercomputer in the back to run it
and that's the future the future is generative with these chatbots and these chatbots are
trillions of tokens trillions of parameters and they have to generate tokens at interactive
rates now what does that mean well uh three tokens is about a word you know the the uh
you know space the final frontier these are the adventures that's like that's like 80 tokens
okay I don't know if that's useful to you and so you know the art of communications is selecting good analogies
yeah this is this is not going well every I don't know what he's talking about
never seen Star Trek and so here we are we're trying to generate these tokens
when you're interacting with it you're hoping that the tokens come back to you as quickly as
possible and as quickly as you can read it and so the ability to generate tokens is really
important you have to parallelize the work of this model across many many GPUs so that you could
achieve several things one on the one hand you would like throughput because that throughput
reduces the cost the overall cost per token of uh generating so your throughput dictates the cost
of of uh delivering the service on the other hand you have another interactive rate which is another
tokens per second where it's about per user and that has everything to do with quality of service
and so these two things um uh compete against each other and we have to find a way to distribute
work across all of these different GPUs and parallelize it in a way that allows us to achieve
both and it turns out the search space is enormous you know I told you there's going to
be math involved and everybody's going oh dear I heard some gasp just now when I put up that slide
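The search can be pictured as two steps: enumerate every way to split the GPUs among the kinds of parallelism, then keep only the configurations that are not beaten on both axes at once. The sketch below assumes the four degrees the talk goes on to name (tensor, expert, pipeline, data) multiply to the GPU count and are powers of two. The measurements passed to `pareto_frontier` are made-up placeholder numbers, not benchmark results; in practice each point would come from actually running the model in that configuration.

```python
import math
from itertools import product
from typing import NamedTuple


class Config(NamedTuple):
    tp: int  # tensor-parallel degree
    ep: int  # expert-parallel degree
    pp: int  # pipeline-parallel degree
    dp: int  # data-parallel degree


def enumerate_configs(num_gpus: int) -> list[Config]:
    """All power-of-two splits whose degrees multiply to num_gpus (an assumption)."""
    degrees = []
    d = 1
    while d <= num_gpus:
        degrees.append(d)
        d *= 2
    return [Config(*c) for c in product(degrees, repeat=4) if math.prod(c) == num_gpus]


def pareto_frontier(points: dict[Config, tuple[float, float]]) -> list[Config]:
    """Keep configs that no other config matches or beats on both axes."""
    frontier = []
    for cfg, (thr, inter) in points.items():
        dominated = any(
            t >= thr and i >= inter and (t > thr or i > inter)
            for other, (t, i) in points.items()
            if other != cfg
        )
        if not dominated:
            frontier.append(cfg)
    return frontier


configs = enumerate_configs(64)

# Placeholder (data-center throughput, per-user interactivity) pairs, invented for illustration.
measured = {
    Config(2, 8, 1, 4): (100.0, 20.0),
    Config(4, 16, 1, 1): (60.0, 50.0),
    Config(8, 8, 1, 1): (50.0, 40.0),  # worse than the config above on both axes
}
best = pareto_frontier(measured)
print(len(configs), "candidate configurations;", len(best), "on the frontier")
```

Even this restricted enumeration yields dozens of candidates for one model on one machine size, and every candidate has to be measured before the frontier is known.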
you know so so this this right here the the y axis is tokens per second data center throughput the
x- axis is tokens per second interactivity of the person and notice the upper right is the best
you want interactivity to be very High number of tokens per second per user you want the tokens
per second per data center to be very high the upper right is terrific however it's very
hard to do that and in order for us to search for the best answer across every single one of those
intersections XY coordinates okay so you just look at every single XY coordinate all those blue dots
came from some repartitioning of the software some optimizing solution has to go and figure out
whether to use tensor parallel expert parallel pipeline parallel or data parallel and distribute
this enormous model across all these different GPUs and sustain the performance that you need this
exploration space would be impossible if not for the programmability of nvidia's gpus and so we
could because of CUDA because we have such a rich ecosystem we could explore this universe and find
that green roofline it turns out that green roofline notice you got TP2 EP8 DP4 it means
uh tensor parallel across two GPUs expert parallel across eight data
parallel across four notice on the other end you got tensor parallel across 4 and expert parallel
across 16 the configuration the distribution of that software it's a different um
runtime that would produce these different results and you have to go discover that roof
line well that's just one model and this is just one configuration of a computer imagine all of the
models being created around the world and all the different different um uh configurations
of of uh systems that are going to be available so now that you understand the
basics let's take a look at inference of Blackwell compared to Hopper and this is this
is the extraordinary thing in one generation because we created a system that's designed
for trillion parameter generative AI the inference capability of Blackwell is off the
charts and in fact it is some 30 times Hopper for large language models for large language
models like ChatGPT and others like it the blue line is Hopper I gave you imagine we didn't
change the architecture of Hopper we just made it a bigger chip we just used the latest you
know greatest uh 10 you know terabytes per second we connected the two chips together we got
this giant 208 billion transistor chip how would we have performed if nothing else changed and it
turns out quite wonderfully quite wonderfully and that's the purple line but not as great as it
could be and that's where the FP4 tensor core the new Transformer engine and very importantly
the NVLink switch and the reason for that is because all these GPUs have to share the results
partial products whenever they do all-to-all all-gather whenever they communicate with each
other that NVLink switch is communicating almost 10 times faster than what we could do in the
past using the fastest networks Okay so Blackwell is going to be just an amazing system for a
generative Ai and in the future in the future data centers are going to be thought of as I
mentioned earlier as an AI Factory an AI Factory's goal in life is to generate revenues generate
in this case intelligence in this facility not generating electricity as in the AC generators of
the last Industrial Revolution but in this Industrial Revolution the generation of intelligence and
so this ability is super super important the excitement of Blackwell is really off the charts
you know when we first when we first um uh you know this this is a year and a half ago two years
ago I guess two years ago when we first started to go to market with Hopper you know we had the
benefit of uh two CSPs uh joined us in a launch and we were you know delighted
um and so we had two customers uh we have more now unbelievable excitement for Blackwell
unbelievable excitement and there's a whole bunch of different configurations of course I
showed you the configurations that slide into the hopper form factor so that's easy to upgrade
I showed you examples that are liquid cooled that are the extreme versions of it one entire rack
that's uh connected by NVLink 72 uh Blackwell is going to be ramping to the
world's AI companies of which there are so many now doing amazing work in different modalities the
CSPs every CSP is geared up all the OEMs and ODMs regional clouds sovereign AIs and telcos all over
the world are signing up to launch with Blackwell this Blackwell would be the
most successful product launch in our history and so I can't wait to see that um I want
to thank some partners that are joining us in this uh AWS is gearing up for
Blackwell they're uh they're going to build the first uh GPU with secure AI they're uh building
out a 222 exaflops system you know just now when we animated uh just now the digital twin if
you saw the the all of those clusters are coming down by the way that is not just art that is a
digital twin of what we're building that's how big it's going to be besides infrastructure we're
doing a lot of things together with AWS we're CUDA accelerating SageMaker AI we're CUDA accelerating
Bedrock AI uh Amazon Robotics is working with us uh using Nvidia Omniverse and Isaac Sim AWS
Health has Nvidia Health integrated into it so AWS has really leaned into accelerated
computing uh Google is gearing up for Blackwell GCP already has A100s H100s T4s L4s a whole
fleet of Nvidia CUDA GPUs and they recently announced the Gemma model that runs across all
of it uh we're working to optimize uh and accelerate every aspect of GCP we're accelerating
Dataproc which is their data processing engine JAX XLA Vertex AI and MuJoCo
for robotics so we're working with uh Google and GCP across a whole bunch of initiatives uh Oracle
is gearing up for Blackwell Oracle is a great partner of ours for Nvidia DGX Cloud and we're
also working together to accelerate something that's really important to a lot of companies
Oracle Database Microsoft is accelerating and Microsoft is gearing up for Blackwell Microsoft
Nvidia has a wide-ranging partnership we're accelerating CUDA accelerating all kinds of
services when you when you chat obviously and uh AI services that are in Microsoft Azure uh
it's very very likely Nvidia is in the back uh doing the inference and the token generation uh
we built they built the largest Nvidia InfiniBand supercomputer basically a digital twin of ours
or a physical twin of ours uh we're bringing the Nvidia ecosystem to Azure Nvidia DGX Cloud
to Azure uh Nvidia Omniverse is now hosted in Azure Nvidia Healthcare is in Azure and all of
it is deeply integrated and deeply connected with Microsoft Fabric the whole industry is gearing up
for Blackwell this is what I'm about to show you most of the most of the the the uh uh uh scenes
that you've seen so far of Blackwell are the are the full Fidelity design of Blackwell everything
in our company has a digital twin and in fact this digital twin idea is it is really spreading and it
it helps companies build very complicated things perfectly the first time and what could
be more exciting than creating a digital twin to build a computer that was built in a digital
twin and so let me show you what Wistron is doing to meet the demand for Nvidia accelerated
computing Wistron one of our leading manufacturing partners is building digital twins of Nvidia
DGX and HGX factories using custom software developed with Omniverse SDKs and APIs for
their newest factory Wistron started with a digital twin to virtually integrate their
multi-CAD and process simulation data into a unified view testing and optimizing layouts
in this physically accurate digital environment increased worker efficiency by 51% during
construction the Omniverse digital twin was used to verify that the physical build matched
the digital plans identifying any discrepancies early has helped avoid costly change orders and
the results have been impressive using a digital twin helped bring Wistron's factory online in half
the time just 2 and 1/2 months instead of five in operation the Omniverse digital twin helps
Wistron rapidly test new layouts to accommodate new processes or improve operations in
the existing space and monitor real-time operations using live IoT data from every machine
on the production line which ultimately enabled Wistron to reduce end-to-end cycle times by 50%
and defect rates by 40% with Nvidia AI and Omniverse Nvidia's global ecosystem of partners
are building a new era of accelerated AI enabled [Music] digitalization
[Applause] that's how we that's the way it's going
to be in the future we're going to manufacture everything digitally first
and then we'll manufacture it physically people ask me how did it start what
got you guys so excited what was it that you saw that caused you to put it
all in on this incredible idea and it's this hang on a second guys that was going to be such a moment that's what happens when you don't rehearse this as you know was first
Contact 2012 AlexNet you put a cat into
this computer and it comes out and it says cat and we said oh my God this is going to change everything you take 1 million numbers you take
one million numbers across three channels RGB these numbers make no sense to anybody you
put it into this software and it compresses it dimensionally reduces it it reduces it from a
million dimensions a million dimensions it turns it into three letters one vector one number
and it's generalized you could have the cat be different cats and you could have it be the
front of the cat and the back of the cat and you look at this thing you say unbelievable you mean
any cats yeah any cat and it was able to recognize all these cats and we realized how it did it
systematically structurally it's scalable how big can you make it well how big do you want to make
it and so we imagine that this is a completely new way of writing software and now today as you
know you could have you type in the word c-a-t and what comes out is a cat it went the other way
am I right unbelievable how is it possible that's right how is it possible you took three letters
and you generated a million pixels from it and it made sense well that's the miracle and here we
are just literally 10 years later 10 years later where we recognize text we recognize images we
recognize videos and sounds and images not only do we recognize them we understand their meaning
we understand the meaning of the text that's the reason why it can chat with you it can summarize
for you it understands the text it understood not just recognizes the English it understood the
English it doesn't just recognize the pixels it understood the pixels and you can you can even
condition it between two modalities you can have language condition image and generate all kinds
of interesting things well if you can understand these things what else can you understand
that you've digitized the reason why we started with text and you know images is because
we digitized those but what
else have we digitized well it turns out we digitized a lot of things
proteins and genes and brain waves anything you can digitize so long as there's structure we can
probably learn some patterns from it and if we can learn the patterns from it we can understand its
meaning if we can understand its meaning we might be able to generate it as well and so therefore
the generative AI Revolution is here well what else can we generate what else can we learn well
one of the things that we would love to learn we would love to learn is we would love to learn
climate we would love to learn extreme weather we would love to learn uh what how we can predict
future weather at Regional scales at sufficiently high resolution such that we can keep people out
of Harm's Way before harm comes extreme weather cost the world $150 billion surely more than that
and it's not evenly distributed $150 billion is concentrated in some parts of the world and of
course to some people of the world we need to
adapt and we need to know what's coming and so
we are creating Earth-2 a digital twin of the Earth for predicting weather and we've made an
extraordinary invention called CorrDiff the ability to use generative AI to predict weather at extremely
high resolution let's take a look as the earth's climate changes AI powered weather forecasting
is allowing us to more accurately predict and track severe storms like Super Typhoon Chanthu
which caused widespread damage in Taiwan and the surrounding region in 2021 current AI forecast
models can accurately predict the track of storms but they are limited to 25 km resolution which
can miss important details Nvidia's CorrDiff is a revolutionary new generative AI model trained on
high resolution radar assimilated WRF weather forecasts and ERA5 reanalysis data using CorrDiff
extreme events like Chanthu can be super resolved from 25 km to 2 km resolution with 1,000 times
the speed and 3,000 times the energy efficiency of conventional weather models by combining the
speed and accuracy of Nvidia's weather forecasting model FourCastNet and generative AI models like
CorrDiff we can explore hundreds or even thousands of kilometer scale regional weather forecasts
to provide a clear picture of the best worst and most likely impacts of a storm this wealth
of information can help minimize loss of life and property damage today CorrDiff is optimized
for Taiwan but soon generative super sampling will be available as part of the Nvidia Earth-2
inference service for many regions across the [Music] globe The Weather Company is the trusted source
of global weather predictions we are working together to accelerate their weather simulation
first-principles-based simulation however they're also going to integrate Earth-2 CorrDiff so
that they could help businesses and countries do Regional high resolution weather prediction and
so if you have some weather prediction you'd like to do uh reach out to The Weather
Company really exciting really exciting work Nvidia Healthcare something we started 15 years
ago we're super super excited about this this is an area where we're very very proud whether
it's medical imaging or gene sequencing or computational chemistry it is very likely that
Nvidia is the computation behind it we've done so much work in this area today we're announcing
that we're going to do something really really cool imagine all of these AI models that are being
used to generate images and audio but instead of images and audio because it understood images
and audio all the digitization that we've done for genes and proteins and amino acids that
digitalization capability is now passed through machine learning so that we understand
the language of life the ability to understand the language of life of course we saw the
first evidence of it with AlphaFold this is really quite an extraordinary thing after
decades of painstaking work the world had only digitized and reconstructed using cryo-electron
microscopy or x-ray crystallography um these different techniques painstakingly
reconstructed the protein 200,000 of them in just what is it less than a year or so
AlphaFold has reconstructed 200 million proteins basically every protein of every
living thing that's ever been sequenced this is completely revolutionary well those models are
incredibly hard to use um incredibly hard for people to build and so what we're going to
do is we're going to build them we're going to build them for uh the researchers around
the world and it won't be the only one there'll be many other models that we create and so
let me show you what we're going to do with it virtual screening for new medicines
is a computationally intractable problem existing techniques can only scan billions
of compounds and require days on thousands of standard compute nodes to identify new drug
candidates Nvidia BioNeMo NIMs enable a new generative screening paradigm using NIMs
for protein structure prediction with AlphaFold molecule generation with MolMIM and docking
with DiffDock we can now generate and screen candidate molecules in a matter of minutes MolMIM
can connect to custom applications to steer the generative process iteratively optimizing for
desired properties these applications can be defined with BioNeMo microservices or built
from scratch here a physics based simulation optimizes for a molecule's ability to bind to
a target protein while optimizing for other favorable molecular properties in parallel MolMIM
generates high quality drug-like molecules that bind to the target and are synthesizable
translating to a higher probability of developing successful medicines faster BioNeMo
is enabling a new paradigm in drug discovery with NIMs providing on-demand microservices
that can be combined to build powerful drug discovery workflows like de novo protein
design or guided molecule generation for virtual screening BioNeMo NIMs are helping researchers
and developers reinvent computational drug design Nvidia MolMIM CorrDiff there's a whole
bunch of other models whole bunch of other models computer vision models robotics models and
even of course some really really terrific open source language models these models are
groundbreaking however it's hard for companies to use how would you use it how would you bring
it into your company and integrate it into your workflow how would you package it up and run it
remember earlier I just said that inference is an extraordinary computation problem how would
you do the optimization for each and every one of these models and put together the Computing
stack necessary to run that supercomputer so that you can run the models in your company and so we
have a great idea we're going to invent a new way invent a new way for you to receive and operate
software this software comes basically in a digital box we call it a container and we call it
the Nvidia inference microservice a NIM and let me explain to you what it is a NIM it's a pre-trained
model so it's pretty clever and it is packaged and optimized to run across Nvidia's install
base which is very very large what's inside it is incredible you have all these pre-trained
state-of-the-art open source models they could be open source they could be from one of our partners
it could be created by us like Nvidia mull it is packaged up with all of its dependencies so CUDA
the right version cuDNN the right version TensorRT-LLM distributing across the multiple GPUs Triton
Inference Server all completely packaged together it's optimized depending on whether you have a
single GPU multi-GPU or multi-node of GPUs it's optimized for that and it's connected up with APIs
that are simple to use now think about what an AI API is an AI API is an interface that you
just talk to and so this is a piece of software in the future that has a really simple API and that
API is called human and these packages incredible bodies of software will be optimized and packaged
and we'll put it on a website and you can download it you could take it with you you could run
it in any cloud you can run it in your own data center you can run it in workstations if it fits
and all you have to do is come to ai.nvidia.com we call it Nvidia inference microservice but
inside the company we all call it NIMs okay just imagine you know someday
there's going to be one of these chatbots and these chatbots are going to just be in a NIM and
you'll uh you'll assemble a whole bunch of chatbots and that's the way software is going
to be built someday how do we build software in the future it is unlikely that you'll write
it from scratch or write a whole bunch of python code or anything like that it is very likely that
you assemble a team of AIS there's probably going to be a super AI that you use that takes the
mission that you give it and breaks it down into an execution plan some of that execution plan
could be handed off to another NIM that NIM would maybe uh understand SAP the language of SAP is
ABAP it might understand ServiceNow and go retrieve some information from their platforms
it might then hand that result to another NIM that goes off and does some calculation
on it maybe it's an optimization software a combinatorial optimization algorithm maybe it's
uh you know some just some basic calculator maybe it's pandas to do some numerical analysis on it
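The hand-off pattern described here can be sketched with plain functions standing in for the specialised services. Everything below is hypothetical: the worker names, the fixed plan and the returned values are invented for illustration and do not correspond to any Nvidia API. A real planner would be a model that writes the plan itself rather than a hard-coded list.

```python
from typing import Callable

Step = Callable[[dict], dict]


def retrieve_records(context: dict) -> dict:
    """Hypothetical stand-in for a service that queries a business platform."""
    return {"open_bugs_per_team": [3, 5, 2]}  # invented sample values


def analyze(context: dict) -> dict:
    """Hypothetical stand-in for a numerical-analysis step on the retrieved data."""
    return {"total_open_bugs": sum(context["open_bugs_per_team"])}


def plan(mission: str) -> list[Step]:
    """Break the mission into ordered steps; fixed here, model-generated in practice."""
    return [retrieve_records, analyze]


def run(mission: str) -> dict:
    """Execute each step in order, merging every result into a shared context."""
    context: dict = {"mission": mission}
    for step in plan(mission):
        context.update(step(context))
    return context


report = run("daily bug report")
print(report)
```

Each step only sees the shared context, so workers can be swapped or reordered without touching the others, which is the property that lets independently packaged services be assembled into a team.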
and then it comes back with its answer and it gets combined with everybody else's and because
it's been presented with this is what the right answer should look like it knows what
right answers to produce and it presents it to you we can get a report every single day at
you know top of the hour uh that has something to do with a build plan or some forecast or uh some
customer alert or some bugs database or whatever it happens to be and we could assemble it using
all these NIMs and because these NIMs have been packaged up and ready to work on your systems so
long as you have Nvidia GPUs in your data center or in the cloud these NIMs will work together as a
team and do amazing things and so we decided this is such a great idea we're going to go do that and
so Nvidia has Nims running all over the company we have chatbots being created all over the place
and one of the most important chatbots of course is a chip designer chatbot you might not
be surprised we care a lot about building chips and so we want to build chatbots AI copilots that
are co-designers with our engineers and so this is the way we did it so we got ourselves a Llama
Llama 2 this is a 70B and it's you know packaged up in a NIM and we asked it you know uh what is a
CTL well turns out CTL is an internal uh program and it has an internal proprietary language but
it thought the CTL was a combinatorial timing logic and so it describes you know conventional
knowledge of CTL but that's not very useful to us and so we gave it a whole bunch of new examples
you know this is no different than onboarding an employee uh we say you know thanks
for that answer it's completely wrong um and and uh and then we present to them uh this is what
a CTL is okay and so this is what a CTL is at Nvidia and the CTL as you can see you know CTL
stands for compute Trace Library which makes sense you know we were tracing compute Cycles
all the time and it wrote the program isn't that amazing and so
the productivity of our chip
designers can go up this is what you can do with a NIM first thing you can do with it is customize
it we have a service called NeMo microservice that helps you curate the data preparing the
data so that you could teach this onboard this AI you fine-tune them and then you guardrail
it you can even evaluate the answer evaluate its performance against um other other examples and
so that's called the NeMo microservice now the thing that's emerging here is this
there are three elements three pillars of what we're doing the first pillar is of course inventing
the technology for um uh AI models and running AI models and packaging it up for you the second
is to create tools to help you modify it first is having the AI technology second is to help you
modify it and third is infrastructure for you to fine-tune it and if you like deploy it you could
deploy it on our infrastructure called DGX Cloud or you can deploy it on prem you can deploy
it anywhere you like once you develop it it's yours to take anywhere and so we are effectively
an AI foundry we will do for you and the industry on AI what TSMC does for us building chips and
so we go to TSMC with our big ideas they manufacture and we take it with us and
so exactly the same thing here AI foundry and the three pillars are the NIMs NeMo microservice
and DGX Cloud the other thing that you could teach the NIM to do is to understand your proprietary information
remember inside our company the vast majority of our data is not in the cloud it's
inside our company it's been sitting there you know being used all the time and and gosh it's
it's basically Nvidia's intelligence we would like to take that data learn its meaning like we
learned the meaning of almost anything else that we just talked about learn its meaning and then
reindex that knowledge into a new type of database called a vector database and so you essentially
take structured data or unstructured data you learn its meaning you encode its meaning so now
this becomes an AI database and that AI database in the future once you create it you can talk to
it and so let me give you an example of what you could do so suppose you create you get you got a
whole bunch of multi modality data and one good example of that is PDF so you take the PDF you
take all of your PDFs all the all your favorite you know the stuff that that is proprietary to
you critical to your company you can encode it just as we encoded pixels of a cat and it becomes
the word cat we can encode all of your PDF and it turns into vectors that are now stored inside
your vector database it becomes the proprietary information of your company and once you have
that proprietary information you can chat to it it's a smart database and so you just
chat with data and how much more enjoyable is that you know for our software team
you know they just chat with the bugs database you know how many bugs were there last night um
are we making any progress and then after you're done talking to this uh bugs database you need
therapy and so so we have another chatbot for you you can do it okay so we call this Nemo Retriever and the
reason for that is because ultimately its job is to go retrieve information as quickly as
possible and you just talk to it hey retrieve me this information it goes and brings it back to
you and do you mean this you go yeah perfect okay and so we call it the NeMo Retriever well the
Nemo service helps you create all these things and we have all all these different Nims we
even have Nims of digital humans I'm Rachel your AI care manager okay so so it's a really
short clip but there were so many videos to show you I guess so many other demos to show
you and so I I had to cut this one short but this is Diana she is a digital human Nim
and and uh you just talked to her and she's connected in this case to Hippocratic ai's large
language model for
healthcare and it's truly amazing she is just super smart about Healthcare
things you know and so after you're done after my Dwight my VP of software engineering talks to
the chatbot for bugs database then you come over here and talk to Diana and so uh Diana
is um completely animated with AI and she's a digital human uh there's so many companies that
would like to build they're sitting on gold mines the enterprise IT industry is sitting on a
gold mine it's a gold mine because they have so much understanding of uh the way work is
done they have all these amazing tools that have been created over the years and they're
sitting on a lot of data if they could take that gold mine and turn them into co-pilots
these co-pilots could help us do things and so just about every it franchise it platform in
the world that has valuable tools that people use is sitting on a gold mine for co-pilots and
they would like to build their own co-pilots and their own chatbots and
so we're announcing
that Nvidia AI Foundry is working with some of the world's great companies SAP generates
87% of the world's global commerce basically the world runs on SAP we run on SAP Nvidia and
SAP are building SAP Joule copilots uh using Nvidia NeMo and DGX Cloud ServiceNow they run
80 85% of the world's Fortune 500 companies run their people and customer service operations on
ServiceNow and they're using Nvidia AI Foundry to build ServiceNow uh Assist virtual assistants
Cohesity backs up the world's data they're sitting on a gold mine of data hundreds of exabytes of
data over 10,000 companies Nvidia AI Foundry is working with them helping them build their Gaia
generative AI agent Snowflake is a company that stores the world's uh digital warehouse in
the cloud and serves over 3 billion queries a day for 10,000 enterprise customers Snowflake is
working with Nvidia AI Foundry to build copilots with Nvidia NeMo and NIMs NetApp nearly half
of the files in the world are stored on prem on NetApp Nvidia AI Foundry is helping them uh
build chatbots and copilots like those vector databases and retrievers with Nvidia NeMo and
NIMs and we have a great partnership with Dell everybody who is building these
chatbots and generative AI when you're ready to run it you're going to need an AI factory and
nobody is better at building end-to-end systems of very large scale for the enterprise than
Dell and so anybody any company every company will need to build AI factories and it turns out
that Michael is here he's happy to take your order ladies and gentlemen Michael Dell okay let's talk about the next wave of
Robotics the next wave of AI robotics physical AI so far all of the AI that we've talked about is
one computer data comes into one computer lots of the world's if you will experience in digital
text form the AI imitates Us by reading a lot of the language to predict the next words it's
imitating You by studying all of the
patterns and all the other previous examples of course it
has to understand context and so on so forth but once it understands the context it's essentially
imitating you we take all of the data we put it into a system like dgx we compress it into a
large language model trillions and trillions of tokens become billions of parameters
these billions of parameters become your AI well
in order for us to go to the next wave of AI where the AI
understands the physical world we're
going to need three computers the first computer is still the same computer it's that AI computer
that now is going to be watching video and maybe it's doing synthetic data generation and maybe
there's a lot of human examples just as we have human examples in text form we're going to have
human examples in articulation form and the AIs will watch us understand what is happening and try
to adapt it for themselves into the context and because it can generalize with these Foundation
models maybe these robots can also perform in the physical world fairly generally so I just
described in very simple terms essentially what just happened in large language models except
the ChatGPT moment for robotics may be right around the corner and so we've been building
the end-to-end systems for robotics for some time I'm super super proud of the work we have
the AI system DGX we have the lower system which is called AGX for autonomous systems the world's
first robotics processor when we first built this thing people asked what are you guys building
it's an SoC it's one chip it's designed to be very low power but it's designed for high-speed
sensor processing and AI and so if you want to run Transformers in a car or you want to run
Transformers in anything that moves we have the perfect computer for you
it's called the Jetson and so the DGX on top for training the AI the Jetson is the autonomous
processor and in the middle we need another computer whereas large language models have the
benefit of you providing your examples and then doing reinforcement learning human feedback what
is the reinforcement learning human feedback of a robot well it's reinforcement learning physical
feedback that's how you align the robot that's how the robot knows that as
it's learning these articulation capabilities and manipulation capabilities it's going to adapt
properly into the laws of physics and so we need a simulation engine that represents the world
digitally for the robot so that the robot has a gym to go learn how to be a robot we call that
virtual world Omniverse and the computer that runs Omniverse is called OVX and OVX the computer
itself is hosted in the Azure Cloud okay and so basically we built these three things these
three systems on top of it we have algorithms for every single one now I'm going to show you one
super example of how AI and Omniverse are going to work together the example I'm going to show you
is kind of insane but it's going to be very very close to tomorrow it's a robotics building this
robotics building is called a warehouse inside the robotics building are going to be some autonomous
systems some of the autonomous systems are going to be called humans and some of the autonomous
systems are going to be called forklifts and these autonomous systems are going to interact
with each other of course autonomously and it's going to be overseen by this warehouse to
keep everybody out of harm's way the warehouse is essentially an air traffic controller and
whenever it sees something happening it will redirect traffic and give new waypoints
just new waypoints to the robots and the people and they'll know exactly what to do this warehouse
this building you can also talk to of course you could talk to it hey you know SAP Center how are
you feeling today for example and so you could ask the warehouse the same questions
basically the system I just described will have Omniverse Cloud that's hosting the virtual
simulation and AI running on DGX Cloud and all of this is running in real time let's take
a look the future of heavy industries starts as a digital twin the AI agents helping robots workers
and infrastructure navigate unpredictable events in complex industrial spaces will be built
and evaluated first in sophisticated digital twins this Omniverse digital twin of a 100,000 square foot
warehouse is operating as a simulation environment that integrates digital workers AMRs running the
Nvidia Isaac Perceptor stack centralized activity maps of the entire warehouse from 100 simulated
ceiling mount cameras using Nvidia Metropolis and AMR route planning with Nvidia cuOpt
software-in-loop testing of AI agents in this physically accurate simulated environment enables us to
evaluate and refine how the system adapts to real world unpredictability here an incident occurs
along this AMR's planned route blocking its path as it moves to pick up a pallet Nvidia Metropolis
updates and sends a real-time occupancy map to cuOpt where a new optimal route is calculated the AMR
is enabled to see around corners and improve its mission efficiency with generative AI powered
Metropolis Vision Foundation models operators can even ask questions using natural language the
visual model understands nuanced activity and can offer immediate insights to improve operations
all of the sensor data is created in simulation and passed to the real-time AI running as Nvidia
inference microservices or NIMs and when the AI is ready to be deployed in the physical twin the real
warehouse we connect Metropolis and Isaac NIMs to real sensors with the ability for continuous
improvement of both the digital twin and the AI models isn't that incredible and so remember
a future facility warehouse factory building will be software defined and so the
software is running how else would you test the software so you test the software for
building the warehouse the optimization system in the digital twin what about all the robots all of
those robots you are seeing just now they're all running their own autonomous robotic stack and so
the way you integrate software in the future CI/CD in the future for robotic systems is with digital
twins we've made Omniverse a lot easier to access we're going to create basically Omniverse Cloud
APIs four simple APIs and a channel and you can connect your application to it so this
is going to be as wonderfully beautifully simple in the future that Omniverse is going to be and
with these APIs you're going to have these magical digital twin capabilities we also have turned Omniverse
into an AI and integrated it with the ability to chat USD our language is
you know human and Omniverse's language as it turns out is Universal Scene Description and
so that language is rather complex and so we've taught our Omniverse that language and so
you can speak to it in English and it would directly generate USD and it would talk back
in USD but converse back to you in English you could also look for information in this world
semantically instead of the world being encoded semantically in language now it's encoded
semantically in scenes and so you could ask it of certain objects or certain conditions
and certain scenarios and it can go and find that scenario for you it also can collaborate with
you in generation you could design some things in 3D it could simulate some things in 3D or
you could use AI to generate something in 3D let's take a look at how this is all going to work
we have a great partnership with Siemens Siemens is the world's largest industrial engineering
and operations platform you've seen now so many different companies in the industrial space heavy
industries is one of the greatest final frontiers of IT and we finally now have the necessary
technology to go and make a real impact Siemens is building the industrial metaverse and today
we're announcing that Siemens is connecting their crown jewel Xcelerator to Nvidia Omniverse let's
take a look Siemens technology to transform the everyday for everyone Teamcenter X our leading
product lifecycle management software from the Siemens Xcelerator platform is used every day by
our customers to develop and deliver products at scale now we are bringing the real and the
digital worlds even closer by integrating Nvidia AI and Omniverse technologies into Teamcenter X
Omniverse APIs enable data interoperability and physics-based rendering to industrial scale design
and manufacturing projects our customer HD Hyundai market leader in sustainable ship manufacturing builds
ammonia and hydrogen powered ships often comprising over 7 million discrete parts with Omniverse APIs
Teamcenter X lets companies like HD Hyundai unify and visualize these massive engineering data
sets interactively and integrate generative AI to generate 3D objects or HDRI backgrounds
to see their projects in context the result an ultra-intuitive photoreal physics-based digital
twin that eliminates waste and errors delivering huge savings in cost and time and we are building
this for collaboration whether across more Siemens Xcelerator tools like Siemens NX or STAR-CCM+
or across teams working on their favorite devices in the same scene together and this is
just the beginning working with Nvidia we will bring accelerated computing generative AI and
Omniverse integration across the Siemens Xcelerator portfolio the professional
voice actor happens to be a good friend of mine Roland Busch
who happens to be the CEO of Siemens once you get Omniverse connected into your
workflow your ecosystem from the beginning of your design to engineering to manufacturing
planning all the way to digital twin operations once you connect everything together it's insane
how much productivity you can get and it's just really really
wonderful all of a sudden everybody
is operating on the same ground truth you don't have to exchange data and convert data make
mistakes everybody is working on the same ground truth from the design Department to the
art Department the architecture Department all the way to the engineering and even the marketing
department let's take a look at how Nissan has integrated Omniverse into their workflow
and it's all because it's connected by all these wonderful tools and these developers that
we're working with take a look [Music] that was not an animation that was Omniverse today we're announcing that Omniverse
Cloud streams to the Vision Pro and it is very very strange that you walk around
virtual doors when I was getting out of that car and everybody does it it is really really quite
amazing Vision Pro connected to Omniverse portals you into Omniverse and because all of these CAD
tools and all these different design tools are now integrated and connected to Omniverse you can
have this type of workflow really incredible let's talk about robotics everything that moves will be
robotic there's no question about that it's safer it's more convenient and one of the largest
industries is going to be automotive we build the robotic stack from top to bottom as I
mentioned from the computer system but in the case of self-driving cars including the self-driving
application at the end of this year or I guess beginning of next year we will be shipping in
Mercedes and then shortly after that JLR and so these autonomous robotic systems are software
defined they take a lot of work to do has computer vision has obviously artificial intelligence
control and planning all kinds of very complicated technology and takes years to refine we're
building the entire stack however we open up our entire stack for all of the automotive industry
this is just the way we work the way we work in every single industry we try to build as much of it
as we can so that we understand it but then we open it up so everybody can access it whether
you would like to buy just our computer which is the world's only fully functional safe ASIL-D system
that can run AI this functional safe ASIL-D quality computer or the operating system on top or of
course our data centers which is in basically every AV company in the world however you would
like to enjoy it we're delighted by it today we're announcing that BYD the world's largest EV company
is adopting our next generation it's called Thor Thor is designed for Transformer engines Thor
our next generation AV computer will be used by BYD you probably don't know this fact that we
have over a million robotics developers we created Jetson this robotics computer we're so proud of
it the amount of software that goes on top of it is insane but the reason why we can do it at all
is because it's 100% CUDA compatible everything that we do in our company
is in service of our developers and by us being able to maintain this rich ecosystem and make
it compatible with everything that you access from us we can bring all of that incredible
capability to this little tiny computer we call Jetson a robotics computer we also today
are announcing this incredibly advanced new SDK we call it Isaac Perceptor
most of the bots today are pre-programmed they're either following rails on the ground digital rails
or they'd be following AprilTags but in the future they're going to have perception and the reason
why you want that is so that you could easily program it you say would you like to go from
point A to point B and it will figure out a way to navigate its way there so by only programming
waypoints the entire route could be adaptive the entire environment could be reprogrammed just
as I showed you at the very beginning with the warehouse you can't do that with pre-programmed
AGVs if those boxes fall down they just all gum up and they just wait there for somebody to come
clear it and so now with the Isaac Perceptor we have incredible state-of-the-art visual
odometry 3D reconstruction and in addition to 3D reconstruction depth perception the reason
for that is so that you can have two modalities to keep an eye on what's happening in the world
Isaac Perceptor the most used robot today is the manipulator manufacturing arms and they are also
pre-programmed the computer vision algorithms the AI algorithms the control and path planning
algorithms that are geometry aware incredibly computationally intensive we have made these CUDA
accelerated so we have the world's first CUDA accelerated motion planner that is geometry aware
you put something in front of it it comes up with a new plan and articulates around it it has
excellent perception for pose estimation of a 3D object not just its pose in 2D but its pose
in 3D so it has to imagine what's around and how best to grab it so the foundation pose the grip
foundation and the articulation algorithms are now available we call it Isaac Manipulator and
they also just run on Nvidia's computers we are starting to do some really great work in the
next generation of robotics the next generation of robotics will likely be humanoid robotics
we now have the necessary technology and as I was describing earlier the necessary technology
to imagine generalized humanoid robotics in a way humanoid robotics is likely easier and the reason
for that is because we have a lot more imitation training data that we can provide the robots
because we are constructed in a very similar way it is very likely that the humanoid robotics
will be much more useful in our world because we created the world to be something that we can
interoperate in and work well in and the way that we set up our workstations and manufacturing
and logistics they were designed for humans they were designed for people and so these humanoid
robotics will likely be much more productive to deploy while we're creating just like we're doing
with the others the entire stack starting from the top a foundation model that learns from watching
video human examples it could be in video form it could be in virtual reality form we then
created a gym for it called Isaac reinforcement learning gym which allows the humanoid robot to
learn how to adapt to the physical world and then an incredible computer the same computer that's
going to go into a robotic car this computer will run inside a humanoid robot called Thor
it's designed for Transformer engines we've combined several of these into one video this is
something that you're going to really love take a look it's not enough for humans
to [Music] imagine we have to invent and explore real and push
Beyond what's been done fair amount of detail we create smarter and faster we push it to fail so it can learn we teach it then help it teach
itself we broaden its understanding to take on new challenges with absolute
precision and succeed we make it perceive and move and even reason so it can share our world with us [Music]
[Music] this is where inspiration leads us the
next frontier this is Nvidia Project GR00T a general purpose foundation model for
humanoid robot learning the GR00T model takes multimodal instructions and past interactions
as input and produces the next action for the robot to execute we developed Isaac Lab a
robot learning application to train GR00T on Omniverse Isaac Sim and we scale out with OSMO a
new compute orchestration service that coordinates workflows across DGX systems for training and
OVX systems for simulation with these tools we can train GR00T in physically based simulation
and transfer zero shot to the real world the GR00T model will enable a robot to learn from a
handful of human demonstrations so it can help with everyday tasks and emulate human movement
just by observing us this is made possible with Nvidia's technologies that can understand
humans from videos train models in simulation and ultimately deploy them directly to physical
robots connecting GR00T to a large language model even allows it to generate motions by following
natural language instructions hi go1 can you give me a high five sure thing let's high five can
you give us some cool moves sure check this out all this incredible intelligence is
powered by the new Jetson Thor robotics chips designed for GR00T built for the
future with Isaac Lab OSMO and GR00T we're providing the building blocks
for the next generation of AI powered robotics [Applause] [Music] about the same size the soul of Nvidia the intersection
of computer graphics physics artificial intelligence it all came to bear at this moment
the name of that project general robotics 003 I know super good super good well
I think we have some special guests do we [Music] hey guys so I understand you guys
are powered by Jetson they're powered by Jetson little Jetson robotics
computers inside they learn to walk in Isaac Sim ladies and gentlemen this is orange and this is the famous green they are the
BDX robots of Disney amazing Disney research come on you guys let's wrap
up let's go five things where you going I sit right here don't be afraid come here green hurry up what are you saying no it's
not time to eat I'll give you a snack in a moment let
me finish up real quick come on green hurry up stop wasting time five things five things
first a new Industrial Revolution every data center should be accelerated a trillion dollars worth
of installed data centers will become modernized over the next several years second because
of the computational capability we brought to bear a new way of doing software has emerged
generative AI which is going to create new infrastructure dedicated to doing one thing and
one thing only not for multi-user data centers but AI generators this AI generation will create
incredibly valuable software a new Industrial Revolution second the computer of this revolution
the computer of this generation generative AI trillion parameters Blackwell insane amounts of
computers and computing third I'm trying to concentrate good job third new computer new
computer creates new types of software new type of software should be distributed in a
new way so that it can on the one hand be an endpoint in the cloud and easy to use but still
allow you to take it with you because it is your intelligence your intelligence should be
packaged up in a way that allows you to take it with you we call them NIMs and third
these NIMs are going to help you create a new type of application for the future not
one that you wrote completely from scratch but you're going to integrate them like teams
create these applications we have a fantastic capability between NIMs the AI technology the
tools NeMo and the infrastructure DGX Cloud in our AI Foundry to help you create proprietary
applications proprietary chatbots and then lastly everything that moves in the future
will be robotic you're not going to be the only one and these robotic systems whether they
are humanoid AMRs self-driving cars forklifts manipulating arms they will all need one thing
giant stadiums warehouses factories there can be factories that are robotic orchestrating
factories manufacturing lines that are robotics building cars that are robotics these
systems all need one thing they need a platform a digital platform a digital twin platform and
we call that Omniverse the operating system of the robotics world these are the five things that
we talked about today what does Nvidia look like when we talk about
GPUs there's a very different image that I have when people ask me about GPUs first I see
a bunch of software stacks and things like that and second I see this this is what we announce
to you today this is Blackwell this is the platform amazing amazing processors NVLink switches
networking systems and the system design is a miracle this is Blackwell and this
to me is what a GPU looks like in my mind listen orange green I think we
have one more treat for everybody what do you think should we okay we
have one more thing to show you roll it [Music]
[Music] thank you thank you have a great
GTC thank you all for coming thank you