- Two things as I get started here. Number one, just a
sincere, sincere thank you for spending your time with us. We recognize you have a lot to do and we're so grateful. Number two, I want you to
pretend like it's you and me for the next 20 minutes, and we're gonna go and demo. I don't have any slides to show you, but we're gonna go through a demo. Now, I'm actually gonna demo on my data, like, my real data. Sometimes it's gonna mess up, and sometimes it might even show you stuff I'm not supposed to show you. If that happens, I'm gonna
tell you to avert your eyes over to some wall
(audience laughs) or something like that. So let's get started. I am gonna start in a
very, very basic place and take us all the way back to ChatGPT to get started. I recognize that in this
room, we have a lot of people with a lot of different
levels of experience with the technology itself, and I'm gonna start
from the very beginning. Think back to the first
time that you saw ChatGPT and you saw it do something,
perhaps a party trick. We're gonna start there.
(audience laughs) I also will go back and forth between a whole bunch of different prompts that I'm gonna be using. That's mostly so you don't
have to watch me type. So let's talk for a moment
about GPT, what's called 3.5, and I'm gonna ask a very simple question: What can you tell me about elephants? Now, the reason I do this is I just wanna explain conceptually, just to review for a moment, what are these large language models, what we today call foundational models? In essence, what they are is that they have taken the information that's been organized
into knowledge by people through the written word and they have processed that written word and created a conceptual
view of the world. So now, based on the tremendous
amount of training data that we've stuffed into these things, they can tell you about
just about any topic that an educated college university grad would be able to tell you about. So let's start with what can
you tell me about elephants? Now to be clear, we never trained this
specifically on elephants, we just gave it a lot
of different materials, and it turns out that
elephants were mentioned, and so as you can see here, it
knows a lot about elephants. It can tell me a bit
about species, about size, about lifespan, physical
features, social structure, and basically, you should think that the neural network that is an LLM has taken, again, the written word and tried to conceptually
create an understanding
of the world around us. It's based on that written word. Now, over the last 12 months, what we've learned about these LLMs is that they can not only spit out facts, they can actually reason. Sometimes, in the industry,
it's called inference. I like to think of it as simply thinking. They can not only tell you about a topic, but they can apply their
knowledge about many topics to solve problems. Let me give you an example. We'll move from a simple
prompt to this prompt: Could an elephant pull my
Tundra? How would it work? Well, I drive a Toyota Tundra, and I've always wondered if
an elephant could pull it, and here we will ask it. Now, one of the most amazing things is that first, it understands
that we're not talking about some large landmass, the tundra, but a Toyota Tundra, it understands the context
of what's happening, and then it gives me its reasoned answer: "It's highly unlikely that an elephant could
pull a Toyota Tundra or any other modern vehicle, on its own." And then it goes on: "Here are a few reasons why an elephant couldn't
effectively pull a vehicle like a Tundra," and it gives me its view, traction, weight distribution, vehicle resistance, practicality. It goes on and reasons
across what it knows about the various topics, puts those things together, and does its best to say: "Even if it were somehow possible for an elephant to move a car, it would not be practical or a safe method of transportation." Thank you.
(audience laughs) I've never thought of that before. As we've been working with these models, we have been understanding what they're good at reasoning
about and what they're not, and we'll talk a little bit
about that in just a moment. Okay, now let's ask another question. Here, we'll get to more of a
factual question for a moment. I'm gonna go to December, 2023, and I'm gonna ask a little
bit about the jobs report from the BLS, the Bureau
of Labor Statistics. Let's see how it answers. Hmm, that's interesting: "I apologize, but I don't have
access to real-time or future data." It turns out that its knowledge cutoff date, its training date, was January of 2022, so it can't tell me much about what happened in December of 2023. That was the state of the art
roughly about 12 months ago. We had the equivalent of a
college-educated thinker, but unfortunately, it didn't have access to current or recent information. Well, we addressed that with something that we call grounding, or sometimes it's called
retrieval augmented generation. To
show you that, I'm
gonna move over to Copilot, where I'm gonna issue
that same exact query: What can you tell me about
the December, 2023 jobs report from the Bureau of Labor Statistics? I'm just at copilot.microsoft.com, you can try this yourself, and we'll see something happen
here that's very interesting. It will say it's searching
for that particular report, it will find that report, and then it will give me its view of the report, and if you go look at the source here, it went to the actual bls.gov
website to find that source. You can see, for instance, that total nonfarm payroll
employment rose by 216,000, that the unemployment rate
remained unchanged at 3.7%. And it has references
which are very interesting that take us directly to some source data that we can look at. This pattern that you just saw is what is called retrieval
augmented generation. It's the idea that we can take what essentially, effectively, is a brain that can do some thinking, it's fairly well educated, we
can go out and get
the most recent data, we can pull that data together in what we now call in the
industry a context window, and we can take the question
and the context window and push it against the
LLM and ask it to reason, and that's exactly what we get here. Now, this opens up all sorts
of amazing possibilities. We can apply that basic
structure, that basic pattern to business. We certainly can apply it to your sector, to the work that you do, the missions that you pursue. It's an interesting thought exercise to think, "What would I now do with synthetic reasoning," we like to think of this as a general-purpose reasoning engine, "and my data, what problems could I solve?"
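Since this retrieval pattern comes up again and again in the rest of the demo, here is roughly what that loop looks like in code. This is a minimal sketch in Python, not Microsoft's implementation; retrieve and llm_complete are hypothetical stand-ins for a real search API and a real model client.

# A minimal sketch of retrieval augmented generation (RAG).
# retrieve() and llm_complete() are hypothetical stand-ins; a real
# system would call a search index and a hosted model instead.

def retrieve(query: str) -> list[str]:
    # Stand-in for a live search call (e.g. fetching bls.gov pages);
    # canned snippets keep the sketch runnable as-is.
    return [
        "[1] BLS, Dec 2023: total nonfarm payroll employment rose by 216,000.",
        "[2] BLS, Dec 2023: the unemployment rate held at 3.7%.",
    ]

def llm_complete(prompt: str) -> str:
    # Stand-in for the model call; a real client would send the prompt
    # to an LLM endpoint and return its text.
    return "Payrolls rose by 216,000 and unemployment held at 3.7% [1][2]."

def answer_with_rag(question: str) -> str:
    snippets = retrieve(question)   # 1. fetch data newer than the training cutoff
    context = "\n".join(snippets)   # 2. pack it into a context window
    prompt = (
        "Answer using only the sources below, with citations.\n"
        f"{context}\nQuestion: {question}"
    )
    return llm_complete(prompt)     # 3. ask the LLM to reason over question + context

print(answer_with_rag("What can you tell me about the December 2023 jobs report?"))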
All right, let's continue, then, over the course of the last few months and see what has started to become of all of this. So now I'm gonna give it a different question: Can you give me a graph of the jobs report from the Bureau of Labor Statistics over, I think I asked,
the last eight quarters. So we will ask
that question. The way it handles this is
actually changing right now, every day, it's a little bit different, but it will go out and
find that information that we're looking for, says, "I can provide you with a summary. However, I'm unable to generate
or display graphs directly. You can find detailed
graphical representations of employment data," it gives me a reference there. But you can push it, and here's how you push it. Wait a second.
(audience laughs) I thought you could code. Can't you generate the Python code that would take that data and graphically represent it? (audience member laughs) Please?
(audience laughs) Okay, see what it does here. So it goes out, RAG
pattern, grabs the data: "Certainly! To graphically represent
the jobs report data from the Bureau of Labor Statistics over the last" blah blah, blah, "you can code." And it turns out it just
remembered it could code. (audience laughs)
It can code. Sometimes you have to remind it of things. It has something that
is called code generator that you can think of as a new skill that has now been appended to this overall thinking, general-purpose reasoning engine that it can call out to, and it has generated here, and not only generated, but told you what you
need to do to run it, the Python code required to
actually create that plot. "This code assumes you have a DataFrame named data with the correct information. You'll need to replace the example data," et cetera, et cetera.
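For readers who can't see the screen, generated code in this situation typically looks something like the following. This is a reconstruction based on the on-screen description, not the verbatim output; every number is a placeholder except December's 216,000, which is quoted earlier, and you would swap in the real BLS figures.

import pandas as pd
import matplotlib.pyplot as plt

# Placeholder example data -- replace with the real BLS figures.
# Only the December value (216) comes from the demo itself.
data = pd.DataFrame({
    "month": ["Jul 2023", "Aug 2023", "Sep 2023", "Oct 2023", "Nov 2023", "Dec 2023"],
    "payrolls_thousands": [180, 190, 200, 170, 180, 216],
})

plt.plot(data["month"], data["payrolls_thousands"], marker="o")
plt.title("BLS jobs report: nonfarm payroll growth")
plt.xlabel("Month")
plt.ylabel("Jobs added (thousands)")
plt.tight_layout()
plt.show()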
Our next step is to actually automate that process. Right now, it's a little bit manual. Now, if you pause for a moment, that's a lot of progress
over the course of 12 months. We've discovered we have a
general-purpose reasoning engine, it started on kind of
essentially a fact base, we can send it, using this
RAG pattern, any information, and it can reason over
it and give us answers, and now, all of a sudden, we
can augment it with new skills. It can make a call, essentially, almost like a centralized brain, taking a question, taking an outcome and making a call to something that is more specialized to do the work.
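In industry terms, this augmentation pattern is usually called tool calling or function calling: the model emits a structured request, and an orchestrator routes it to the matching skill. Here is a simplified sketch with made-up tool names, not Copilot's actual internal interfaces.

# Simplified skill-dispatch loop. The tool names and implementations
# here are illustrative stand-ins, not Copilot's real interfaces.

def generate_code(task: str) -> str:
    return f"# Python code that would accomplish: {task}"

def create_image(description: str) -> str:
    return f"<image rendered from prompt: {description}>"

TOOLS = {"generate_code": generate_code, "create_image": create_image}

def dispatch(tool_name: str, argument: str) -> str:
    # The central model decides which skill to call and with what
    # argument; the orchestrator simply routes the request.
    return TOOLS[tool_name](argument)

print(dispatch("create_image", "the workplace of the future, open floor plan"))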
I'll show you just one last example of this. It is using a very specialized additional model that allows it to actually draw pictures. So here, I'm going to ask it
to create a picture for me: Create a picture of the
workplace of the future. I wanna show an open floor plan with individual workstations. I'll just get that started as we go. And what this is doing now is it's first parsing that information and it's
giving us a sense of, okay, I can try to create that. It will then... Don't worry, that's not the picture, that's not the picture.
(audience laughs) And everybody gets really worried
at this point in the demo. It's like, but that
doesn't look anything like the workplace of the future, what's going on? And actually creates a couple of workplaces of the future. Now, this is a little bit
dystopian for my taste, I'm not sure I love it, but it gives me a couple
of different options to choose from. Looks very futuristic, I
have to admit. Very nice. And you can continue to go back and forth in what we think of as
an iterative process to get to where you want to be. All right, that's the
basics of what we see, and if I just go through
the concepts we have, essentially, the underlying LLM model, we've learned that it can spit out facts, we learned that it can reason, we learned that with the RAG pattern, you can feed it additional data, we've learned that with
this augmentation pattern, you can
feed it additional skills. Wow, all of a sudden, you
have a little bit of a system that can do pretty amazing things as it relates to your
organization and your mission. Let's take you over, then,
into Copilot for Microsoft 365. Here, we're gonna see those
same types of concepts, but applied to your data. In this case, it'll be my data. Remember, if I say avert your eyes, half of you look that way,
half of you look that way. I'll just start very quickly
with a general prompt here: What's the latest from, let's think for a moment, Alexia, I did a forward slash so that it
knows who I'm talking about, Alexia Cambon, who's on my team, organized by email, let's do chat, and how about documents? Okay, Alexia works for me,
she works on a research team. She actually lives in the UK, and she's been hard at work looking at the quantitative
and qualitative measures of what Copilot can do. It is now going out to
use that basic pattern to grab- Well, there we go. "I'm sorry, I couldn't
find any recent emails, or I found a couple of recent documents authored by Alexia Cambon. One is an Excel file, and it looks like the other
is a PowerPoint presentation." That's interesting. Wait a second, (audience laughs) I thought you could find emails. Can you search my email again? Sometimes if you tell it to try harder and give it a little bit of encouragement, (audience laughs) and I'm not joking, it
actually does, it does, and we're still trying to figure out why that's the case, but it actually
does something different. (audience laughs) There you go. Wait a second, oh, "I found several recent
emails from Alexia," it says. "The most recent one was
sent yesterday evening." Now that's interesting, isn't it? You know, this is much more like a person than you'd think
(audience laughs) it turns out, uh-huh. All right, what you see here, however, is that it's going into your information, finding that information, giving you a summary of that information, then giving you references
that you can go to to find the actual information. This RAG-based approach, this pattern, doesn't entirely remove hallucination, but it does mean that it
hallucinates a lot less. We talk about it being
sometimes usefully wrong, which is it doesn't like
to just make up things in the context of what you send to it, but sometimes it does get things wrong, and so we tell people, continue to use your brains (chuckling). You have to continue to look
at what's coming to you. Okay, here's one of my favorites to do. I do this at my own peril: Can you summarize the last
five emails from my boss? (audience laughs) All right, to be specific, look for the last five emails from him over the last 24 hours, and stack rank them in order of most important
to least important based on your analysis of the contents. This is always a really good
view of how my boss is doing in sending me important things. All right, it should access my email here, we'll see how it does, and if and when it finds that email, if it understands that my
boss's name is Takeshi Numoto, he's the chief marketing
officer here at Microsoft. It says, "I did find five
recent emails from your boss over the past 24 hours. Here's a summary." And here you go. He sent an email with the
subject Video Gen Product. We're talking about video
generation right now. He sent an email about Copilot Pro and Copilot for Microsoft 365. That should not surprise you at all. He sent something about some deep immersion
sessions we're doing on AI for science, chemistry, and biology, and it looks like we
have some security work that we're doing. So you can get a sense
for your ability, now, to perform tasks that perhaps
you've never seen done before. For many people, this idea that you could query your information using Copilot like this is just totally foreign. I'll show you one last one here as we move on to one or two other things. Let's try this. Oh gosh, there are such good ones. Let's try this: Review my meetings this week and create five to seven
categories that describe how I spent my time. For each category, provide
a short description and give me an approximate percentage of time that I spent there. Again, people don't even think
about these types of prompts being how they would do their work. Now, this particular prompt,
we're still working on, it's not perfect. Copilot is learning how to think
about time in your calendar and how to do math, all sorts of things, but "Based on the information I have, you have over 50 meetings
scheduled for this week. After analyzing the subject
and content in your meetings, I've categorized them into the
following five categories." I do have to travel, it turns out, that's not so interesting. I do give speeches and presentations, I meet with my teams, I have customer engagements, and then let's see what the fifth one is, I'm kinda interested. Personal. Okay, personal activities,
such as Seattle Temple. There you go, I'm a religious man, I went to the Seattle Temple this morning, it turns out. But wow, pause for a second, pause for a second. General-purpose reasoning engine, any data. That's impressive if you think about its
ability to do things, very impressive. Okay, we're gonna keep going. I'm gonna now transition from just the generalized chat experience that you're seeing here over into some of the apps. My favorite app to show is Teams. I like to show this particular app. Oh, my gosh, it's gonna
blow up on me here. I like to show, let me see if I can just, okay, if that comes
back up,
we're gonna try and silence it. I like to show this particular app because we're looking now at
what we call a meeting recap. This is a fictitious meeting, so I really don't leak trade
secrets out to you here, a fictitious meeting
that my team recorded, it's 30 minutes long, where they were talking
about the launch of a product that we called Project Falcon. Now, this recap was
based on the recording, which I can access here. I can see the speakers who were involved, and get a view for
when they spoke. I can look at the topics
that were discussed. I can even get, if you're old enough to
remember the DVD chapterization, you used to be able to skip
between chapters in DVDs. Yes, I know, hard to
remember, but here it is, together with a timestamp
associated with those. It gives me meeting notes, it gives me a summary of follow-up tasks, but the best thing about this is your ability to query
Copilot about the meeting. Just think of the meetings you won't have to go to any longer.
I love it! I love this! Okay, so I have some saved queries here to show you what this is all about, and we're just gonna look
at a couple of them quickly. For instance, I can ask it simply: Can you list the actions
and owners in a table from this 30 minutes of discussion? It says, sure, you know, "Here is the table of actions and owners that I extracted from the text." I asked another interesting question: Was there any tension in the meeting? How did you detect that tension? (audience laughs) "Yes, there was some
tension in the meeting. I detected that tension
by the following clues: Jason's reaction to Marlene's suggestion to delay the launch by a month. He said, 'Are you kidding me?' (audience laughs) and 'We can't delay now,'
in a frustrated tone. And then it goes on, Callie's agreement saying it would throw
everything out of the window, you know, some really, let me see if I can, oh boy, sorry, that might come up a couple times, showing that it really understands human emotion. Now here's a really interesting one. The whole idea behind this meeting was my team talking about
when was the best time to actually announce the new product, so I can ask Copilot, based on the transcript, when would you recommend
that we announce the product, and why? And it says, "Based on the transcript, I'd recommend we announce
the product on September 3rd because it would be ahead
of the competitor's launch on September 5th. It would give us enough time to finish product testing
and certs by August 29th, it would allow us to have
a full day of rehearsals with Paul, our CMO, on September 2nd, it would avoid any potential
issues with Amy, our CEO, who's traveling to Tokyo and has a board meeting on September 4th." Wow, right? Wow! Sometimes we, and by the way, this is
stochastically statistical, it gives you different answers each time or different reasoning a little bit, but sometimes it actually will even say, "And I'm worried about Amy's jet lag." No joke,
(audience laughs) I
can't make this stuff up. Like, wow, I never thought
about my CEO's jet lag, it doesn't even occur to me. But interesting to see it reason across 30 messy minutes of
people talking to each other to get to a business outcome. All right, my time's
almost up here for demos, so now I'm gonna move
you over into my email. Yes, this is my email inbox. Here, we're gonna look at a thread that is fictitious, though. This is a thread like everybody gets, somebody sends you something, somebody replies, somebody else replies. It's only been 15 minutes, and you feel like you're
three days behind. That happens all the time. Well, my favorite thing to do is to hit this little
button, Summary by Copilot. You're gonna love this button. Summary by Copilot will
go in across this thread, it uses that same RAG pattern, retrieval augmented generation, and it will summarize what's
happening in the thread. Says Leslie's completed
the first draft of a blog, she's requested feedback from these folks. Theresa did this. A meeting was set up, Theresa did that. Leslie is now waiting for the
thumbs-up from Jon Friedman before sending to Jason for
posting on the day of the event. Not bad. Very good summary. Now when you want to reply,
you can simply hit Reply, and after reading the thread, it actually will give you a
couple of different options here on how to reply. You can approve it and praise them, you can suggest minor edits, you can request major changes, or you can do a custom prompt. I'll do a custom
prompt. Approve it, tell them great work, be warm, (audience laughs) be a little funny. (audience laughs) That's good enough for me, yep. Here we go, we'll ask it to generate, and it's gonna go away and generate. Now, here's the truth of what I'm experiencing as I use this, I'm finding that those
things I need to dash off really quickly, no reason for me to use
Copilot, I could do it quickly, but the places where I
have to engage brain, I need to be particularly sensitive, I need to think about, you know, how I would deal with an issue, hey, this is very useful. Here's a quick thing it came up with. I just looked, "I just reviewed the blog
and it's looking great. You all have done fantastic work and I give my final sign off. Keep up the amazing work and let's celebrate with
some virtual high fives." That's not as funny as I wish it were. (audience laughs) Can you make it a little longer and a little funnier, please? Try hard. (audience laughs)
Okay, I don't know what it's gonna
come up with now, we'll see. We'll see what happens. It's kind of a little bit of a game too, you know. "Wow, this blog is a masterpiece. (audience laughs) You all have done a phenomenal job of showcasing the Copilot design system," and blah, blah, blah. "This is a definite win for" blank. Hey, it knows my boss. "I'm sure Takeshi will be thrilled too. I hereby give my final sign off and a big round of applause." It still kept the high fives, I think I need to talk to it. "Some real treats. How about we order some pizza and cupcakes for everyone? You've earned it." Mm,
(audience laughs) I don't know, I don't know. What we are doing, however, is one of the newest features, something we call Sound Like Me. This allows it to read your previous emails that you have written and then to take your style and to write in your style so that it can sound like you.
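Microsoft hasn't said publicly how Sound Like Me is built, but one plausible, simplified way to get this behavior is few-shot style conditioning: put samples of the user's own sent mail into the context window and ask the model to match that voice. A hypothetical sketch, not the actual feature:

def sound_like_me_prompt(past_emails: list[str], request: str) -> str:
    # Few-shot style conditioning: show the model real examples of the
    # user's writing, then ask for a new draft in the same voice.
    examples = "\n---\n".join(past_emails)
    return (
        "Here are emails this user has written:\n"
        f"{examples}\n---\n"
        f"Draft a new email in the same voice. Task: {request}"
    )

print(sound_like_me_prompt(
    ["Thanks, team -- great progress this week. Let's ship it!"],
    "approve the blog draft and thank the team",
))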
After having done some tests with Sound Like Me, I'm not making this stuff up, we've actually run some qualitative tests with Sound Like Me, with human graders who don't know what's been written by the person and what's been written by the Copilot. It turns out that those human graders like the Sound Like Me versions of the people better than the people. (audience laughs) I don't know, that's what it looks like. All right, one very last
one, and then we'll be done. Here, I'm gonna show you very quickly, a long, frequently asked
questions document. It's some, it's longer than 34 pages. I think this is some 70-odd
pages long as it loads up here, and this literally is
something that we used in the early days of Copilot, where we were compiling
the questions people had about the product, and we were just asking them
to give us their questions and we would give them answers. It's interesting and long, but watch this. You can then use things like this, and I'm just gonna
reload this for a moment, sometimes it times out on me, to, in a blank document, ask Copilot to do some thinking for you. And so the first thing I'm going to do is find
that document, so I do a forward slash and just see if it will
pull up the document for me. There it is, Field Sales FAQ, so it gives, I give a link to it, and then, so you don't
have to watch me write, I will go down and grab this prompt, and we'll read it together: Referencing this document, create a 10-step deployment
guide for IT administrators for Microsoft 365 Copilot. For each step, provide an
overview of the work required, an explanation of why it's important, and a detailed set of tasks to complete. Please generate. Now, it is gonna take a moment. This is, like I said, a
70-plus-page document. Today, the context window can be as big as 300 typewritten pages, so it's not small, and you can take that context window and shove it into the LLM and ask it to do work for you.
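A quick back-of-envelope check shows why that matters; the per-page and per-word numbers below are my rough assumptions, not Microsoft figures.

def estimate_tokens(pages: int, words_per_page: int = 500, tokens_per_word: float = 1.3) -> int:
    # Very rough heuristic; real counts depend on the tokenizer.
    return int(pages * words_per_page * tokens_per_word)

doc_tokens = estimate_tokens(75)       # the ~75-page FAQ: ~48,750 tokens
window_tokens = estimate_tokens(300)   # a "300-page" context window: ~195,000 tokens
print(doc_tokens <= window_tokens)     # True: the whole FAQ fits in one prompt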
It will take just a moment here for it to go through the entire document, but I'm asking it to take a source document that had something to do with my request, but you actually have to apply some brain power to it and come up with something else, and here you go. Deployment Guide, a 10-step
guide for IT administrators to enable and manage Microsoft 365 Copilot in their organization. It writes me a real nice introduction. Step 1, review the early access program terms and conditions. We'll see it go down to these steps here. Step 2, I have no idea what
it'll be, but it'll pop out. Assign Microsoft 365 licenses to users. It's really thinking, right now, but essentially, what you can see it doing is that
it's taking that information, it really is processing it, and it is going to do the work. If I had someone on my team do this, there is no doubt it would
take them an afternoon to do, it's 75 pages, and if we waited, typically, this will take about two to
three minutes to be done. Wow! Wow, wow, wow! We are entering a new era. Now, as a reminder, it is not perfect, it does make mistakes. It's really interesting to me that you have to encourage it right now. We'll try and do something about that. So we're still learning about
the technology ourselves, but as you can see, this is
powerful, powerful stuff. When you put it into
practice in real places, we think that it's gonna change the world. We're excited to get it into your hands so that you can change the world with it. Thank you very much.