Main

CppCon 2018: Stoyan Nikolov “OOP Is Dead, Long Live Data-oriented Design”

http://CppCon.org — Presentation Slides, PDFs, Source Code and other presenter materials are available at: https://github.com/CppCon/CppCon2018 — For decades C++ developers have built software around OOP concepts that ultimately failed us - we didn’t see the promises of code reuse, maintenance or simplicity fulfilled, and performance suffers significantly. Data-oriented design can be a better paradigm in fields where C++ is most important - game development, high-performance computing, and real-time systems. The talk will briefly introduce data-oriented design and focus on practical real-world examples of applying DoD where previously OOP constructs were widely employed. Examples will be shown from modern web browsers. They are overwhelmingly written in C++ with OOP - that’s why most of them are slow memory hogs. In the talk I’ll draw parallels between the design of systems in Chrome and their counterparts in the HTML renderer Hummingbird. As we’ll see, Hummingbird is multiple times faster because it ditches OOP for good in all performance-critical areas. We will see how real-world C++ OOP systems can be re-designed in a C++ data-oriented way for better performance, scalability, maintainability and testability. — Stoyan Nikolov, Coherent Labs AD Chief Software Architect Stoyan Nikolov is the Chief Software Architect and Co-Founder of Coherent Labs. He designed the architecture of all products of the company. Stoyan has more than 10 years experience in games. Currently he heads the development of Hummingbird - the fastest HTML rendering engine in the industry and of LensVR, the first VR-centric web browser. Previously he worked on multiple graphics & core engine systems and on large-scale ERP solutions. Stoyan has degrees in Applied Mathematics and Computer Graphics. He is interested in high-performance computing, graphics, multithreading, VR and browser development. Coherent Labs AD Coherent Labs is a leading game middleware company that develops cross-platform game user interface products. It aims to solve complex problems for major gaming companies such as Arena Net, NCSoft, Bluehole, and hundreds of others, and to help them create stunning and high-performance UI. Using its experience in web, game technologies, and user interface, the company is developing a Virtual Reality browser. — Videos Filmed & Edited by Bash Films: http://www.BashFilms.com *-----* Register Now For CppCon 2022: https://cppcon.org/registration/ *-----*

CppCon

5 years ago

good morning my name is Tian and first of all thank you for waking up early to come to this talk today we'll be chatting about data oriented design and I decided to do this talk because when looking at all the resources all the talks all the blogs on data oriented design I was unable to find a good example of people showing a real-world production system made with data oriented design comparing it with object-oriented programming and outside of games so I've been in the video games industry for
around 10 years now and I've been doing games I've been doing mostly however game technology I work at a company called coherent labs where we create the technology for games so it's very likely that if you're playing and if you're a gamer your machine has executed some of the code we'll be looking at for the past six and a half years we've been working on products based on chromium WebKit and now our in-house proprietary game UI and browser engine if you're not familiar with the terms chromium
is the open source project that is at the heart of the Chrome browser of slack of Skype and many other applications that are probably running on your machines and WebKit is the original project that the core of chromium is forked from and works in in Safari on iOS on Mac so if you're an Apple user you're using it as well so we had this successful project and around two and a half years ago when I was working on optimizing and improving the software based on these technologies the whole time I fe
lt like I was in a cage and that cage was the architecture of WebKit which is also the architecture in many ways of chromium of that of those layout engines so we gather up with my co-founders and did the only sensible thing of course which is to start from scratch leave all that behind and build our own in-house browser engine for games and there was a very big question for us can a browser engine be successful with date oriented design we didn't know that we had successful projects with date o
riented design our graphics engine was built with that we knew about a lot of games that have been successful by employing the design but we didn't know of anything with with the scale we were looking at and we didn't know if it scaled to a software that is very object oriented in its core if you look at all those open source projects they're all object oriented and this has many reasons it's part legacy it's part the way that the standard of HTML rendering requires you to do some things so to s
ee how that went let's do a very quick demo ok let's run a test page that I did with chromium this is a build that I have here ok it's a very simple example these are three thousand rectangles all that I do is they move left and right and they changed al-qura their color and at the center there's a big red rectangle whose only idea is for us to gain an insight on the performance so if you're a keen gamer or somebody that works a lot on performance you already see that the frame rate is not so gr
eat and the exact number is it's a bit difficult to see it's around 13 15 ish frames per second so to zoom that a bit I'll use a very old-school method of doing this I'll paste it in paint and zoom it and the reason is that the windows magnifier actually reduces the performance of the windows presenters so we will skew our test if we did it with the magnifier so we get fifteen point seven frames per second it's very far from the sixty frames that we would like to have for a smooth animation okay
let's see what we did if you look down on the on the right we got around 50 frames actually in my hotel room it was 70 but who knows what's happening so we have a significant improvement on the performance and the answer to that question happens to be yes we're successful and we'll see how how that how that was implemented so today's agenda is we'll be looking at the basic issues of object-oriented programming now just glance at the things that we believe can be done better we look at the basic
s of data oriented design there are a lot of resources I have a final slide with references that you can look up later the idea for today will be to design a specific system to look at a design made with object-oriented programming then to redo it with data oriented design and see the difference between those two what's so wrong with object-oriented programming why people have a problem with it I believe that a picture speaks a thousand words so this is an inheritance hierarchy it's a bit diffic
ult to follow right so you get diamond inheritance and whatever so object-oriented programming at its gist marries the operations with the data they're done on and it's not a really happy marriage because in the end you end up with those giant objects with a lot of unrelated data in them that you poked in different parts of your software if we imagine a game and you have an enemy you might have a data about its physics some data about its AI some data about its rendering and so on but every time
you access that object for instance to draw it you're actually polluting your cache and getting a lot of data from the other stuff that you're not using right now and object-oriented programming also has the feature of hiding State in many places and by hiding I mean that you have all these boolean all these hidden ifs in your in your methods that in the end increased substantially the complexity of your code I know those things have a real impact on performance scalability modify ability and t
estability and this will be the for quality attributes will be analyzing data oriented design takes a different approach it puts the data first you can you can imagine it sits in the title so if we look at the lower side of our slide in object-oriented programming you you usually have this logical entity with all its fields all the stuff that logically belongs to that object in data oriented design we do a bit of a different approach we separate the data according to its usage so we might have d
ata a to see data a with fields a to see that it used that it that is used in system alpha we get data B from fields with from D to F we use that in alpha and in beta and we transform that data to a new collection that it you that is used down the pipeline so the gist of data oriented design is to separate the data from the logic we try to use the data where necessary without polluting our caches and trying to keep our logic simpler so we embrace the data we don't try to hide it we avoid hidden
usage who see how it's an interesting technique we try to avoid virtual calls because very often we just don't need them and data-oriented design is one of those things that at its core are simple but are very difficult to master a bit like the game of Go it promotes domain knowledge you cannot separate well your data and build a good system if you don't know that data if you don't know the problem that you're solving and the system you're designing and as I mentioned there references at the end
of the talk for for additional information so let's design a system and the system I have selected is a system for animations from the browsers browser engines that I showed earlier and to give you an idea about what is possible with with such a system this is a very small example from from one of our samples but it gives you the idea that with animations you can do a lot of things in CSS so the example that we showed earlier for the performance was just moving rectangles and changing their col
or but you can do more complex things like transformations moving text opacity and so on so you can modify different properties the details of the of the system are not that important it's more important for us to to gain an insight on how to create it the definition of the animation is again very simple we have a textual definition we have our keyframes everybody who's doing animations know what the keyframe is it's I want the animation to start here and then here and has this property here and
end here and we have the duration of the animation in a real world system that there are other properties obviously but these are the basics and the others don't change the design that much we already saw that we can modify different types of properties and this will have a real impact on our design this means that when we in Clemente's in c++ will have to interpolate different types of c++ structures or objects or all error we can have numbers we can have colors we can have transformations and
so on and another important design is that this definition is not static we want to allow the user at runtime to modify the animations so let's try to implement this with object-oriented programming and for the example today I've selected chromium and I don't know how many people are familiar with the chromium code base but I really really encourage you to go and study it it's an incredible piece of software done by one of the best engineer ones of the best engineers in the world so there's a l
ot to learn of course it's a giant code base many millions lines of code but when you get the gist of it is very cool so chromium actually has two animation systems we'll be looking at just one of them the other is similar in its in its design so all the things that I'm talking apply to that as well and it very closely follows the HTML standard so that's why probably they have this object-oriented design in the first place as its object oriented and we are creating animations what do we expect t
o have a class called animation right that's that's exactly what I looked up in the code there is a class called animation that's great but wow it inherits six other non-trivial classes well at least its final so we we know that there are no others down the line and you remember the the picture that I showed at the beginning about the inheritance hierarchy it actually it's actually the inheritance hierarchy of this class so up you get a lot of a lot of inheritance it's already difficult to follo
w right okay so to gain a better understanding of the system we'll be looking at the most common operation which is picking or updating an animation that is from the time and the definition to generate the new value for this frame of the property that you're that we are changing the new color or the new position of our elements and we'll delve into the code of chromium it starts very as we would expect there is a method called service animations it takes time at the second line it takes the curr
ent time and it calls update on every animation pointer that is in our collection there is a collection of animations needing update so clear so far however we already see some some strange things happening they are not walking over the animations needing update vector they are copying all the pointers and walking over and iterating over another temporary vector and the reason for that is that in this call stack but also probably in other call stacks they are deleting from the original vector so
we have some unclear lifetime semantics it's very difficult for us now to know when those animations get created and most importantly when they get deleted if they're writing this code there is a reason for that so let's look at the update method what what's happening right there fairly simple it takes the reason but the very first line is the hidden state I was talking you about if there is no time line which is some point or whatever they return false so the animation will do nothing in some
condition but we're already paying a big cost just for this if because the animations are allocated on the heap so we are just touching that field just to see if it if it's set or not and we are paying for a cache miss and we might be having a lot of branch mispredictions as well because if we have a good mix of active and inactive animations the branch predictor will probably get confused so we're paying some performance cost and we're exponentially increasing the states and the complexity of o
ur system okay so if we have this content which is another condition what do we do we delegate actually the animation to another class which is held in the Declaration of our animation as a member so chromium this garbage collection in C++ but you can think of this member type as a shared pointer so they are we are again paying the cost of a possible cache miss and this animation effect read-only is actually the definition of our animation all right so I'll be skipping some of the cost tags I ha
d a lot more slides but had to have to remove them for time constraints will be skipping the cost tags and looking just the interesting parts next up we're going to update our value so we skipped a bit and we see some interesting stuff that happens a lot in object-oriented programming in some cases the animations have to call an event happens so if there is an event delegate we'll call an event condition and here we have brought coupling between our animation system and the event system moreover
we have no idea what the on event condition will actually do so it might call JavaScript which will destroy all the hot data that we have in our caches it will have to load all the cold data for the JavaScript execute whatever it's doing and then we have to reload all the data that we just flew away so a possible very heavy performance hit there are probably also some instruction cache misses because we don't know where this code is going in our giant source so we have calculated all these and
comes an interesting part that I mentioned at beginning we want to be able to interpolate different C++ types because our animation could be on a color could be on an on a number whatever so however we still want to keep that in our system somehow so how do we do that in in classic C++ classic object-oriented programming we want a method that is different depending on on on what is on the type that we want to interpolate and they have different sizes anyone yep virtual functions an abstract clas
s that's exactly how they do it so there is a class interpolation which is abstract and it is inherited for every possible type that can be interpolated and the virtual interpolate method does the magic unfortunately this type of dynamic type eurasia in c++ is very costly to do you have the cost of allocating all those interpolations and in the in our example earlier that's six thousands animation so six thousands of those and you have the very likely cache miss when you call the called virtual
and a very likely instruction cache miss depending on the types of properties you are animating finally we have calculated the new value the interpolation that is magic we have to apply that new value to the properties to the objects that are being animated and again we do it in a very straightforward manner we have a member a member variable target which is the element that will receive the new value and again we bring some coupling between the system's because now our animation system whose on
ly job was to calculate new values according to time knows about the document object model about the styling system so that it can call the appropriate method of the element in our case this method is called set needs animation style recalc and if we look at the implementation we'll see something very very scary this is again a symptom of object-oriented programming I believe where you come a method but as it is a black box you have no idea what's going on so you might be you might end paying a
very big cost on that and we are indeed paying a because because Seth needs style recalc which is called there actually walks up the document object model tree for every element and sets a boolean and on each call up this tree will very likely be getting a cache miss so we were doing simple animations now we're walking up the dom tree we're again changing the context of our program so to recap very quickly to do our animation we've used more than six non-trivial classes the objects contain smart
pointers to to the other systems they are coupled with the other systems from the interpolation we use abstract classes and we're basically drinking the poison of object-oriented programming all the way all right now let's try do it in with date oriented design let's clear our mind forget what we saw and return to the drawing board so let's design our system around the most common operation and this is the operation of ticking of updating the animation it happens like 99% of the time we have ot
her operations obviously like adding removing pausing an animation whatever but the most important one the one that defines our performance is the peak for this tick we have two very simple data as pieces of data is input we have the definition obviously of the animation and we have time and as output if you think about it we also have some very simple piece of data we need the properties that have changed we need the new property values and we need a mapping elements that we receive those prope
rties and again as a cornerstone of data oriented design we will work on this system like there are many animations with object-oriented programming and and the thing that we saw earlier it works as if there is just one animation so you call update on the animation it does what it does but this is not what's actually happening you don't have one animation you have dozens hundreds even thousands so how will our pic work oh it can be done very simply let's create a class called animation controlle
r or whatever name we find interesting it will contain an array of our animation States it will take time as input those states are generated from the definitions and it will output exactly what we need an array a vector of the changed properties an array of the elements that have changed with the links back to to the new properties and here the element pointers are just dumb pointers for our system it doesn't know what they are it doesn't directly call to the document object model with that dat
a as output we can leave it to the next systems in our pipeline to decide what to do so the events system can walk these changed properties these changed elements and decide to call the events where necessary the styling system can walk over these elements and decide what to do knowing their new properties so we've reached a lot of decoupling a lot of separation of concerns how do we build those animation States well it's very easy again go flat forget about methods just have plain old data styl
e struct with all the data that we need the details are not important this is the runtime data that we need so the start time the post time of the animation and here's a bit of a twist we also have all the animation definition copied for each animation state and we did this for two reasons the first one is that empirically we found out that the performance is better this way and second it makes some operations very easy imagine you want to change the duration of a certain animation if the defini
tions were shared pointers you would have to implement some sort of copy-on-write or other idiom - - to do that but now that they are separate you change just the property for that animation all the others are completely unaffected so think about how that might help us for instance doing this in multiple threads okay so we have these properties however we haven't solved the problem of having to interpolate different types of different C++ types remember they were doing it with the abstract class
but we don't want to pay the cost for that so to avoid type erasure C++ has a very good way to do it statically we can use a template we can use a template because we know exactly all the types of properties that we'll be modifying there are no new ones so we now have different different types templated by the animation keyframe and we would like to iterate over them to run the animation however obviously they have different sizes in memory so we cannot put them in in a nice neat vector and wal
k walk it over but as I said we know each one of them so what we can do is very simple let's have a vector for each type now they are the same size so we have as many vectors as types as needed and when we want to interpolate them it's very simple we go over all the types a all of the vectors of that type then we go to the second one then we go to the third one so we are now having an order of magnitude less cache misses than we had with the object-oriented case the reason is that we'll be havin
g a cache miss when we change the type of the vector we were iterating on however and this is again a bit of the main knowledge but we know that we usually have a handful handful of properties that are really animated so we'll end up with a bunch of very large vectors that we can linearly iterate and the prefecture of the CPU will do its magic and will will be will not be paying the cost of cache misses and we'll end up with a lot of vectors that are simply antion and we'll skip over them so ins
tead of having one cache miss for each animation as it was in the object-oriented case so oh when we're actually having one cache miss for each type of animation which is an order of magnitude less actually in the performance example I showed earlier there were 6,000 animations because there were 3,000 rectangles but they were changing the color and the position so these are different animations so in the object-oriented case we got 6,000 cash misses each frame in this case we're just walking ov
er the color and the position so we'll be getting just 2 cache misses probably and on the implementation level it's very easy we can template our tick animation function according to the property and just have the same code running I mentioned that object oriented programming tends to hide a lot of state we saw that with the activeness of the animations so you can have some that are enabled some that are disabled and in the easiest way of implementation you can have a boolean is active or in the
case of chromium it was if there is a timeline which is basically the same thing and do if this do that and whatever and in our system the activeness in activeness is the most important state the most important boolean that we have so we can employ something that we call existence based predication we just have two arrays one array is for the active animations one is for the inactive animations the act of enabling an animation means just moving data from one to the other or vice versa in this w
ay when we iterate and have to apply an operation on our active animations we know that there there is no hidden state we know we can apply the exact same code the exact same transformation on all of them this is a bit difficult to do for every possible state that you have so you can prioritize by the most important by the one that has the most volatility or you can try to reduce the number of states so to look a bit at the code that that's final the details are not that important but if you loo
k at it it's very simple and and in many ways beautiful because there are no ifs there are no branches and you can read it like like a book so I mean we interpolate the keyframes we interpolate the values everything works as we would expect and this get interpolated value function that I have highlighted is just a template that will do the right thing and call the right function for the property we are changing and all this code can be safely now in our C++ file it won't pollute with templates a
ll over the place and will also keep the code together very likely in our final executable so branch so instruction cache misses are very unlikely okay however our system is now working it's very high performance we're very happy with it but we want to add an API to the user and we want it to be a convenient API and here object-oriented style of classes with with methods and so on is really really convenient you can give the user just a name object and she can call play pause and and whatever so
we want to give that to the user so let's create our animation object it's now is the time it has all the methods that we would like but contains only one simple piece of data and that's animation handle just an ID and the first all the operations to other functions the animation controller so our animation class is very done but it gives this convenient API to the user while we can still rework our internal data and make our system as performant as possible and we're using the simple handle to
simplify the lifetime of those objects because now we know that the lifetime is completely controlled by our controller there is no shared ownership over this data and to implement the Dom API and the controller is very simple you just have all the methods that take the ID so to quickly recap the concepts between our P and data oriented design that we saw in the chromium case we were using an inheritance of six classes and all the logic was cramped in this animation object and it's it's links i
n DoD we did a very simple flat animation state struct templated and we had other methods that worked on that we used a list of dynamically allocated interpolation so abstract classes but in C++ you can do better you can use templates and if you know all the types that you're having you can just have an array for each one of them I will be used boolean flags for activeness and again by Flags it's not just a boolean it can be in the presence or not of a pointer whatever and in data oriented desig
n we used different arrays for the different states and to give an API to the user with object-oriented programming we directly gives the the API of the class however our user just wants to play on animation and immediately gets like a hundred methods that doesn't know what to do we solve that by by giving a very clear API only with the stuff that is needed and finally with object orientation you reach out to the other systems in the program directly with data oriented design we just output tabl
es with the new data and let the other guys in the in the pipeline decide what to do with them so the key points of data oriented design try to keep your data flat try to use existence based predication to get rid of this of this state all over the place try to use ID based handles and to reduce the pointers and to simplify the lifetime of your object and try to use table based output because this table based output will actually be the input for the next thing down the pipeline let's analyze ho
w we did okay the first thing we'll be looking at is performance and and I'll be looking at the performance of this system it's six times faster so this is this is a hidden treasure for us we did not change the algorithm we did not invent some very clever way to do animations we're executing and doing pretty much the same thing only by reorganizing how the data is used how the code is structured we've gained six times better performance on this system and think about it it's very likely that you
have such systems in your code bases as well and the treasure is waiting to be found next up let's let's talk about scalability how could we hypothetically scale those systems and here by scaling I mean a very simple thing making them work on multiple threads well in the object-oriented case it will be very difficult because we'll have to account for all those dependencies for the data races that are very probable in the system so we don't know when we say to the target to to change its style i
f some other tread is doing that we don't know if all the methods we're calling our thread safe and we have to make sure they are another problem with the hidden state is that if we hypothetically solve these issues and run let's say half the animation on Corre half the animation corby if they have a very different runtime profile so if core a is lucky and get a lot of inactive animations then it will be idle for a lot of time we won't have an even distribution of work with data oriented design
things look simpler our animation states are completely independent of each other we duplicated some data to achieve that so what we can do is just let's cut our our vectors in half half course here half goes on the other core every one of them outputs its data and in the end in a classic for conjoined fashion we can merge that data together and leave it to the next system what about testability Wow again with the object-oriented case it looks well difficult I wouldn't be the one that writes uni
t tests for all this because I would have to mock a lot of things for instance the events the document object model just to test my my animation system moreover I have a lot of hidden states so a lot of ifs and else's and so on so if I want to cover all of them I have this combinatorial explosion of tests that I would have to do in the data oriented case looks again a bit more simpler just because we have really isolated the animation system from everything else we have our very clear input our
very clear output so as long as that contract is not breached and the right input gives you the right output you are fine the system works and the other systems down on the line have the responsibility to use that output in the correct way for the modify ability such large and complex object-oriented hierarchies tend to harden and by that I mean that they become so complicated that the user that the programmer is very much inclined not to modify them so it's very difficult for us to for instance
to experiment with changing the layout of the object or moving some data from one class to the other just because everything will break and we have to fix a lot of things so those things are usually left to big refactorings that in the end never happen in the data oriented case as we are completely owning our data and it's very clear we have a lot more wiggle room we could separate our animation stating in to erase our animation could continue to use both of those but we can send part of that d
ata to another system and we can try and reorder some of the functions it's a bit clearer however there is one thing that object oriented programming excels at and this is what I think what I call quick modifications imagine that we add a new state like if it's if it's Tuesday the animation runs twice as fast with object-oriented programming is very easy to do if choose they do whatever else do the other thing so we add a new state and it just works if we are to keep the pure date oriented desig
n this is a lot more difficult to do because we would have to reanalyze our data we would have to look for ways to keep this state out of our main kernel of the function that is that is running the animation and it will probably take us a lot more time but in the end it will probably pay off obviously the data oriented design is a tool like object-oriented programming is like everything we do in programming is and as every - it has it has its it it's place but it also has some downsides and the
biggest downside I believe is that the correct data separation is actually very hard to do I mentioned that earlier but data oriented design is easy to explain but very difficult to master and very difficult in the end to really achieve what you're looking for and here comes a bit the the domain knowledge that is good for you to have if you don't know very well the problem that you're solving it's very it's very difficult for you to design the data in in the way that it will be optimal so know y
our problem as existence based predication is also a bit a bit difficult to achieve if you have a lot of those states because you start having this multi-dimensional arrays with this Indies and whatever and my advice there is try to reduce the state if you've done GPU programming there are great examples there you can try and reduce the state by executing the same operation on multiple data or or other techniques just get rid of those ifs and else's quick modifications can be tough that that I m
entioned already and we as programmers might have to unlearn a thing of two or two so we have been wired with this object oriented type of thinking since since forever actually so in the beginning especially it's difficult to start and rethink everything in this new framework and unfortunately the language C++ is a bit difficult to to use in in some cases is is not always your friend so should we get rid of object-oriented programming altogether I don't think so it is a two as well it has its it
s place in in our code bases and yeah it's it's the antithesis of my title but sometimes we have no choice you might be using third-party libraries they use object-oriented programming so you're out of luck you might have API requirements that make you need to use object-oriented programming you might have seen that but object-oriented programming does not mean is not a synonym for classes and methods and so on with bait oriented design I still had classes I still had structs it was that I put t
hem first I put the data first and not the operations not the code polymorphism and interfaces have to be kept under control I don't believe that their their role is in a very low-level system like interpolating a couple of numbers in the animations but they have their place in high level systems in client-facing api's they are amazing if you do a type of plug-in and you real really need some dynamic dynamic polymorphism and remember that simple task has amazing facilities for static polymorphis
m you can go a long way with templates or just when you know all the types that you have you can you can generate everything actually it works surprisingly well finally are there some changes in C++ that will make make it easier for us to write data oriented design code I have a dream which will probably never come true but there are languages that allow you to change the memory layout of objects very easily and to experiment for instance with array of structures structure of arrays and not havi
ng to rewrite like half of your code for that I don't see it coming but we can always hope I didn't show that but in games very often date oriented design goes hand in hand with entity component systems and even if you don't need the whole flexibility of an entity component system it would be great if it was possible for us with out paying a big cost to reorder and to split our classes into components and this is actually doable in C++ and I'm talking about doing it without a pointer in directio
n because if you do it with pointer in direction it's very easy but it requires a lot of custom code that it's it's a bit of a mess for me the ranges proposal looks very exciting I believe it will be a great addition and it will help a lot with some of the constructs especially to make the code more clear more readable and this is a bit of a pet peeve of mine but unordered maps and adored unordered sets are not that great because they do allocations I think that we need in the standard a new ano
ther let's say another hash table that has some relaxed requirements and would allow us to use another scheme of addressing and reduce those allocations so obviously there are a lot of open source solutions we're our own hash table and actually reduced the allocation count in one of our version versions by 10% just by changing the unordered maps and sets with a scheme that usually uses open addressing and this again brings you back to the know your machine and design software for the machine tha
t will run it don't design for a hypothetical abstract machine like it's in the standard because it doesn't exist so to conclude object-oriented programming is not a silver bullet but date oriented design is not one either the thing is that you must use your best judgment they are just tools in your toolbox so they have their places you it's your job to find them thank you we have around 10 minutes for questions so please yeah you're educating sorry you're advocating compile-time polymorphism an
d I liked it as well and then there are always people who said to me a button the compiler will stamp out a lot of code and from your experience in also knowing the Machine when do you hit problems in that area like there's too much codes so that you get instruction okay yes sir and do you know if - so how I can see whether those problems occur yes this is this is a concern so what we try to do is keep the templates inside implementation files in the animation system this is doable if we see an
explosion of templates this is this is a problem that we try to cope - case-by-case so you have two problems that you might explode your build time which is obviously very bad for productivity and the other thing is what you mentioned and we have seen in some cases the compiler producing a lot of binary and increasing our instruction cache misses there is no formula to solve that to be perfectly honest either you have to reduce the templates or you can rely on some black magic and and this is wh
at we have done in some places honestly so if you have the fortune of working on a platform that very easily shows you where the instruction cache misses happen which which we have access to so on game consoles for instance it's very easy to see you can go there and try to reorder your functions or you can try to use link time optimization actually in our experience link time optimization helps a lot with some of those cases but sometimes you're out of luck and you have version 1 where everythin
g is fine and then you change like a line somewhere in code and you see like 5% slower just because clonk has reordered some stuff and when you're when you're working really close to to the maximum possible formas it's a thing of life I don't have a formula to solve this all the time unfortunately thank you thank you hi hi you've described an architecture for this system that includes data and operations on that data and I'm not familiar with this domain very much but I'll take your word for it
that it's a good choice but with those structures you may want to maintain certain invariants in the data for example and you could do that with constructors or methods yes and you've also described that for certain parts of your system like the user facing part of the client facing part you may want to have those those abstractions yes and so I guess my biggest question is why what exactly do you consider to be object-oriented programming and why is why is what you've described and not that you
guys I think really who's talking about choosing different abstractions for different parts of your system and I don't really think that that contradicts this idea of opie unless you have a different idea of what that is got it yes it's a different way of of solving the same problem so it's the same data it's the same invariants that's true however we have turned a bit around the way it is done because with object-oriented programming what I believe it means and what usually people end up doing
is having this object that we saw animation that contains all the data all the invariants and all the operations for for one animation right that's what we saw however this does not map well to the hardware it also does not map well to the complexity of our system because we end up with these giant giant clouds that knows everything and can do everything let me try to refine my question so the title of your talk is opie is dead yeah and I know you yourself sort of we can that stance at the end
a little bit but what exactly is that is what I'm asking what what precisely is is you want to die the notion of having these giant illogical classes that contain unrelated data and I believe they have to be reworked in different collections of data used by the different systems did that answer your question that's just sort of sense to me like that engineering versus good engineering well a lot of code base unfortunately end up with with the thing that I showed because when you have the class a
nimation every developer usually goes and adds a new boolean or adds a new data or adds a new method and you can call it better engineering you can call it however you want but it's it's a fact of life no matter how much you try and definitely the chromium and WebKit are done by some of the best engineers but in the end you end up with this giant inheritance hierarchies and so on hi so if I understand correctly there was a difference of about 15 to 20 milliseconds per frame in your in your initi
al demo versus to chromium demo yeah do you have a sense of where those 15 to 20 milliseconds were going like we talked a lot about virtual and directions and cache misses but we didn't really talk about how many milliseconds per frame those were taking up just that they were occurring so like for example Chrome's GPU process is heavily sandboxed and it has to actually communicate via IPC just to even read the frame buffer and so that's a potentially dominating source of cycles that obviously yo
ur program doesn't have absolutely so my demo ran a bit slower than I was expecting it to be honest but it's probably because the PC is not plugged what I have an idea and this is definitely it has some effect on the performance that's why when I do the final comparison I comparison just I compare just this system the first one was a way to engage in and to see that the overall system works better and there are a lot of reasons for it work better the GPU process honestly is probably not the bigg
est one the biggest one is usually the last cache misses the simpler call stacks I don't know if you're familiar with with the code with the code bases of chromium etc but they have very long call stacks and there are reasons for them to do that I get that but for instance the animation system could be made it just does the same thing and it's a lot faster and the milliseconds are here so six eight and this is on these PC versus one one like obviously the simple demo is simple and yeah do you th
ink that the data oriented design scales well with actual required design complexity like for example you've got probably 20 other subsystems that need to hook into the animation system and modify it or be notified when things happen or run JavaScript code and like I mean maybe your stuffs out I don't know but yeah okay does that yeah it does that and one one other point on the only performance here we usually see on PC around four to five times better performance and this was an artificial test
but I'm talking about real you eyes and so on but this is because this PC has eight megabytes of l3 cache I don't know what that is okay okay if you if you go to a platform that has a lot less cash like a mobile phone or console the difference in performance goes way up so yeah continuing on the line up with the previous person on this microphone was saying I don't think you're being entirely fair to object-oriented programming and not all implementations of object-oriented programming feature
all of those features and also yeah this is a case where obviously the classic approach with object or programming is not applicable actually I want you to ask a bit about this 6.12 type of CPU is this on this one yeah because honestly I think that it would be even more on an arm oh yeah yeah tolerance for cash yeah yeah I mentioned that but well the problem was that I had to build the chromium for arm and I didn't want to do that but definitely if you have less cash on your processor this numbe
r goes up and to answer your the first part yes I'm sure that there are in the world well-written object-oriented systems that they have their place that's that's what I say at the at the end I can see some challenges in debugging system like this if you had some challenges you know did you manage to solve them yes there are some challenges the biggest challenge is we are accustomed to while we we see a bug we go into the burger we have this object we can inspect all the data but when you're doi
ng data enter design you might know which data belongs to which logical entity and what we usually do is that in the bug we annotate our data so it's easy for us to relate it to other stuff in the systems that's what we do usually I don't understand this sure I haven't haven't seen seen generic tools I have seen a lot of tools not hoc like the ones that we have just to solve this so we annotate the data we have some some extensions for the debugger to be a bit easier for us to find it I've seen
a lot of game companies especially when when things go multi-threaded it's even more difficult to find it a lot of companies have some type of graph that they visualize or or try to do it that way but I'm not aware and I cannot really think of a generic tool that that would really help it's very domain specific thank you thank you what is the experience is a long-term support and maintenance using this approach my biggest concern is that as time goes you have more data to the structures mm-hm an
d well yeah you keep multiple copies so the data there's will be like data bloat so in a few years you pretty much has to reorganize all the layers and have to adjust them to like attack sure of the CPUs using to the caches and pretty much redoing the whole thing yes well we pay attention to what we do in the first place and second I believe that these goes a bit on the on the slide that I said about the things that the systems that tend to harden I believe that it's easier to rework and eventua
lly split the data if it is done in this way then it if it is done in a giant hierarchy of classes so as time goes by requirements change and the system has to change we've had so far very good experience so we've worked for years on chromium's base and for some years on this stuff and actually the maintenance time and solving a bug and implementing a new feature time is an order of magnitude faster and of course it might be related to the fact that chromium is millions of lines of code ours is
very big but it's not that big so it depends I believe that in time this is easier to maintain thank you thank you you talked about that there was read-only common state between all of these classes that you needed to duplicate yes for us all of this yes could you talk about the challenge as to why that couldn't be simply pulled out and read-only States separated from mutable State sure because it's not redone the fact is that the user can modify that definition at any time so one way to do it i
s to do a copy on write because you have might have multiple animation that refer that the other is to copy it so as we have little data it's it's an empiric finding that we did that it's easier and faster for us to copy the data as it changes that fact of life might change and and you saw in the chromium base you might have noticed that the class was called animation effect read-only it's not read-only they have cost methods that do cost castes remove the cost and change them so that's not true
so it's mutable in their state as well unfortunately we are out of time I'll be here so everybody who has questions is welcome and thank you again for your time

Comments

@Condog64

References for easy googling: "Data-Oriented Design and C++", Mike Acton, CppCon 2014 "Pitfalls of Object Oriented Programming", Tony Albrecht "Introduction to Data-Oriented Design", Daniel Collin "Data-Oriented Design", Richard Fabian "Data-Oriented Design (Or Why You Might Be Shooting Yourself in The Foot With OOP)", Noel Llopis "OOP != classes, but may == DOD", Lane Roathe "Data Oriented Design Resources", Daniele Bartolini

@nexovec

A sad thing is web people think 13 fps from rendering a few quads is completely normal

@TanelTagavali

Good talk. I hope that some viewers notice that the real benefit of this approach is how this makes testing, scaling, and modifying the code a lot easier, because of the loose coupling, reduced complexity, and freedom to reorder the algorithm. Performance is just the cherry on the cake.

@SasLuca

Great to see another data oriented talk at CppCon

@onerimeuse

Whoever lined the addbreak up for the end of the talk before the questions, well done and thank you. It was wonderful to get through this talk uninterupted.

@tiagocaseiro

The more I research about Data-Oriented Design, the more I believe Object-Oriented projects have just been saved by great hardware. It's just a luxurious way of programming that's completely detached from the actual behavior of the machine. Your program might be object-oriented but your machine isn't.

@obiwanus

Stoyan: "Study Chromium, it's made by the best engineers in the world, there's a lot to learn!" Also Stoyan: Shows how the best engineers in the world designed an overengineered system with poor performance.

@duminicad

fantastic talk, the questions were very provocative, the speaker handled them excellent

@Chalisque

A breath of fresh air. Every time I try to read the source of an open source project written in C++, I find that, for even the simplest thing, I end up having to hunt down and read a few dozen different member functions in quite a few distinct classes (in distinct files). Object-orientitis I call it. Though I find the arguments here are more against excessive abstraction and splitting things into an excessive number of objects pointed to.

@Gargantupimp

Better hide that PUBG picture if you want to talk about high performance...

@SLiV9

I find it funny that for every one of these talks, there is always someone trying to make a jab along the lines of "if you use data-oriented design, then after ten years of data bloat you have to do a lot of hard work to keep the code running as fast", as if that is a downside of data-oriented design. Well, duh. Of course it's hard work. The same bloat happens in object-oriented code as well, but there it's too difficult to see past all the classes and indirection, so most people just give up and accept the slowdown as "inevitable".

@TheLavaBlock

Great talk. Thanks Stoyan!

@fhajji

Lots of OOP fans are being triggered here, lol. Basically, it boils down to memory access patterns. If you can arrange your computations to simply "flow", best without hiccups, i.e. without branching and chasing pointers all over the place, and if you can put data being accessed by hot code in the same place, so they can be prefetched in the cache, you win big, performance-wise. That is what DOD is all about: to better match program data to the way hardware operates. Now, in cold code, OOP is just fine: it dramatically increases programmer efficiency at the cost of runtime efficiency, a cost we're willing to pay. But in hot code, HPC code, and real-time code, DOD is vastly superior, as it much better matches the problem to the hardware.

@sqaxomonophonen5998

"[OOP and DOD] are just tools in your toolbox" — this is good advice! Unfortunately, many developers treat OOP (and TDD, and...) almost like a religion; not as a tool, but as a rigid set of beliefs that must be adhered to at all times, lest you evoke the anger of The Prophets; Fowler and Uncle Bob, hallowed be their names. But OOP is just a tool. Go explore, have fun, learn new things, and expand your toolbox. You'll see that hammer-oriented carpentry is limited :-)

@jatvarthur

I believe GoF's Flyweight is exactly what this talk is about. The part of the problem with OOP lies in how we approach OO design and how we teach it. Beeing a teacher myself, I always encounter students who have this Animal and Dog and Cat style of OO design, so starting with very beginning we have this very naive mindset how to model the world in the software. Like Animation class in Chromium. My take is to see OO more like system/API level thing, more like modules in Oberon or Service in Spring. Here is where OOP is really shines

@panakap2186

That's nice feeling when I notice, I reinvented DOD just after watching some videos about caches.

@nextlifeonearth

I'd like to see the code and cache hit data from the guy who asked the question at 51:05. I'm 99% sure his code is a prime example of what Nikolov is talking about.

@maluramichael

Really great talk. Well understandable.

@llothar68

I'm now trying to use Data Oriented Design in a business apps where different aspects are moved into the components (in the sense of Game Entity Component Systems). For the reason that it is much easier to synchronize with remote systems and avoid fatal sync failures, lets see how it goes. While it is a nice talk i have not seen Data Oriented Programming outside the Games Industry. But on the other hand. A normalized inmemory relational database system is doing exactly what an ECS is supposed to do.

@andreanobile

In HPC data oriented design is the norm, it comes from the way people used to write programs in FORTRAN where not even structs were available. Scientific codes with a long history have high likelihood of being written by people who care about performance and know about the hardware.