In these first few videos, I want to lay the
foundation of some core concepts you will need throughout these video tutorials. Let's
get started with the basics. So what is a data structure? One definition that I really
like is: a data structure is a way of organizing data so that it can be used efficiently. And
that's all a data structure really is. It is a way of organizing data in some fashion so
that later on it can be accessed, queried, or perhaps even updated quickly and easily.
So why data structures? Why are they important? Well, they are essential ingredients in creating
fast and powerful algorithms. Another good reason might be that they help us manage and
organize our data in a very natural way. And this last point is more of my own making:
it's that they make code cleaner and easier to understand. As a side note, one of the
major distinctions that I have noticed from bad, to mediocre, to excellent programmers is that
the ones who really excel are the ones who fundamentally understand how and when to use
the appropriate data structure for the task they're trying to finish. Data structures
can make the difference between having an okay product and an outstanding one. It's
no wonder that every computer science undergraduate student is required to take a course in data
structures. It may seem strange that before we even begin talking about data structures, we
need to talk about the abstraction of data structures. What I'm talking about is the
concept of an abstract data type.
What is an abstract data type, and how does it differ
from a data structure? Well, the answer is that an abstract data type is an abstraction
of a data structure, which provides only the interface to which that data structure must
adhere. The interface does not give any specific details on how a specific data structure should be implemented,
or in what programming language. One particular example I like to give is to suppose that
your abstract data type is for a mode of transportation to get from point A to point B. Well, as we
both know, there are many modes of transportation to get from one place to another. So which
one do you choose? Some specific modes of transportation might be walking or biking,
perhaps even taking a train, and so on. These specific modes of transportation would be
analogous to the data structures themselves. We want to get from one place to another through
some mode of transportation; that was our abstract data type. How exactly did we do that?
Well, that is the data structure. Here are some examples of abstract data types on the
left, and some possible underlying implementations on the right-hand side. As you can see, a
list can be implemented in two ways. You can have a dynamic array or a linked list. They
both provide ways of adding, removing and indexing elements in the list. Next, we have
a queue and the map abstract data types, which themselves can be implemented in a variety
of ways. Notice that under the implementations for a queue, I put a stack-based queue, because
yes, you can have a queue which is implemented using only stacks. This may not be the most
efficient way of implementing a queue, but it does work and it is possible. The point
here is that the abstract data type only defines how a data structure should behave, and what
methods it should have, but not the details surrounding how those methods are implemented.
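As an aside, the stack-based queue mentioned above can be sketched in a few lines. This is a minimal illustrative Java sketch, not code from the course's source; the class and method names are my own. One stack receives enqueued items and a second stack serves dequeues:

```java
import java.util.Stack;

// A queue ADT implemented using only stacks. Not the most efficient
// queue, but it satisfies the queue interface: first in, first out.
public class StackQueue<T> {
  private final Stack<T> in = new Stack<>();   // receives enqueued items
  private final Stack<T> out = new Stack<>();  // serves dequeues

  public void enqueue(T item) {
    in.push(item);
  }

  public T dequeue() {
    // Refill "out" only when it is empty; popping "in" into "out"
    // reverses the order, exposing the oldest element on top.
    if (out.isEmpty())
      while (!in.isEmpty()) out.push(in.pop());
    return out.pop();
  }

  public boolean isEmpty() {
    return in.isEmpty() && out.isEmpty();
  }
}
```

Each element is pushed and popped at most twice overall, so the behavior is a correct queue even though an individual dequeue is not always constant time.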
Alright, now that we're done with abstract data types, we need to have a quick look at
the wild world of computational complexity, to understand the performance that our data
structures are providing. So as programmers, we often find ourselves asking the same two
questions over and over and over again. That is, how much time does this algorithm need
to finish, and also how much space does this algorithm need for my computation? So, if
your program takes a lifetime of the universe to finish, then it's no good. Similarly, if
your program runs in constant time, but requires a space equal to the sum of all the
bytes of all the files on the internet, your algorithm is also useless. So to standardize
a way of talking about how much time and how much space is required for an algorithm to
run, theoretical computer scientists have invented Big O notation, amongst other things
like big theta, big omega, and so on. But we're interested in Big O because it tells
us about the worst case. Big O notation only cares about the worst case. So if your algorithm
sorts numbers, imagine the input is the worst possible arrangement of numbers for your particular
sorting algorithm. Or, as a concrete example, suppose you have an unordered list of unique
numbers, and you're searching for the number seven, or the position where seven occurs
in your list. The worst case isn't when seven is at the beginning or in the
middle of the list; it's when seven is the very last element of the list. For that particular
example, the time complexity would be linear with respect to the number of elements in,
that is, the size of, your list, because you have to traverse every single element until you find seven.
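That worst case is easy to see in code. Here is a small illustrative Java sketch of the search just described; this is my own example, not code from the video:

```java
// Linear search for a target value in an unordered array.
// Worst case: the target is the very last element, or absent,
// so the loop must visit all n elements -- O(n).
public class LinearSearch {
  public static int indexOf(int[] arr, int target) {
    for (int i = 0; i < arr.length; i++) {
      if (arr[i] == target) return i; // found: return its position
    }
    return -1; // not found after traversing everything
  }
}
```

If seven sits at index n minus one, the loop runs n times, which is exactly where the linear time complexity comes from.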
The same concept applies to space, you can just consider the worst possible amount of
space your algorithm is going to need for that particular input. There's also the fact
that big O really only cares about what happens when your input becomes arbitrarily large.
We're not interested in what happens when the input is small. For this reason, you'll
see that we ignore things like constants and multiplicative factors. So in our Big O notation,
you'll often see these coming up again and again. So to clarify one thing: when I say n, n is usually going to
be the input size coming into your algorithm; there's always going to be some input
of size n. So constant time is referred to as a one wrapped around a Big O. If your algorithm
takes a logarithmic amount of time to finish, we say that's Big O of log n. If it's
linear, then we just say n. If it takes quadratic time or cubic time, then we say
that's n raised to that power. If it's exponential, usually this is going to be something like
two to the n, three to the n, or any number b greater than one raised to the n. And then we also have n
factorial. But we also have things in between these, like square root of n, log log of n,
n to the fifth, and so on. Actually, almost any mathematical expression containing n can
be wrapped around a Big O and is valid Big O notation. Now, we want to talk about some properties
of Big O. So to reiterate what we just saw in the last two slides, Big O only really
cares about what happens when the input becomes really big. So we're not interested in when n
is small, only what happens when n goes to infinity. So this
is how and why we get the first two properties. The first is that we can simply remove constant
values added to our Big O notation, because if you're adding a constant to infinity, well,
that's just infinity; or if you're multiplying a constant by infinity, yeah, that's still
infinity. So we can ignore constants. But of course, this is all theoretical. In the
real world, if your constant is of size 2 billion, that's probably going to have a
substantial impact on the running time of your algorithm. However, let us look at a function f, which
I've defined over some input size. If f of n is 7 log(n)^3 + 15n^2 + 2n^3 + 8,
well, Big O of f(n) is just n cubed, because n cubed is the
biggest, most dominant term in this function. All right, now let's look
at some concrete examples of how big
O is used. So both of the following run in constant time with respect to the input size,
because they don't depend on n at all. So on the left, when we're just adding or
doing some mathematical formula, yes, that's constant time. And on the right, okay, we're
doing a loop, but the loop itself doesn't depend on n, so it also runs in a constant
amount of time. So as our input size gets arbitrarily large, that loop is still
going to run in the same amount of time regardless. Now let's look at a linear example. So both
of the following actually run in linear time with respect to the input size,
because we do a constant amount of work n times. So on the left, we're incrementing the counter
i by one each time. So f(n) is n, and clearly, when we wrap this in a Big O, we get
a Big O of n. On the right, it's a little bit more complicated: we're not incrementing by one,
we're incrementing by three, so we're going to finish that loop three times faster. So
f(n) is n over three, which is still O(n). Our next example is two algorithms that run in quadratic time.
The first one seems obvious: we do n work, n times, so n times n is Big O of n squared.
But observe the second one: I changed the zero to an i. So pause the video and try
to figure out why that's Big O of n squared. Okay, let's go over the solution.
So first, just focus on the second loop; the first loop isn't as important. Since i
goes from zero to n, the amount of looping done is going to be directly related to what i
is, and remember, i is changing from the outer loop. So if we fix i to be zero, we do n
work. If we fix it at one, we do n minus one work. If we fix it at two, we do n minus two work,
et cetera. So the question then becomes: what is n, plus n minus one, plus n minus two,
plus n minus three, and so on? Well, this is a well-known identity, and it turns out to be
n times n plus one, divided by two. So if we wrap this in a Big O and simplify,
we can see that this is Big O of n squared. Now let's look at perhaps a more complicated
example. Earlier, you may have been wondering, how do we ever get these logarithmic or
linearithmic time complexities? So here's a classic algorithm, binary search,
which actually yields a logarithmic time complexity. What this algorithm does is it starts by
making two pointers, one at the very start and one at the very end of the array. Then
it selects a midpoint between the two and checks if the value we're looking for was found at
the midpoint; it has either found it or not. If it has found it, it stops. Otherwise,
it discards one half of the array, and adjusts either the high or the low pointer. Remark
that even in the worst case, we're still discarding half of the array each iteration. So very,
very quickly, we're going to run out of a range to check. So if you do the math, the worst
case is you will do exactly log base two of n iterations, meaning that binary search
runs in logarithmic time. Very cool algorithm, a very powerful algorithm. Here's a slightly
different example worth going over. So first, notice that there is an outer loop with a
counter i that does n work. Then notice that there are two inner loops, one that does
three n work and another that does two n work. So the rule we use to determine the
complexity of this algorithm is to multiply loops on different levels and add those that
are on the same level, generally speaking. So, using the rule above, we can see that it
takes n work to do the outer loop, multiplied by three n plus two n for both inner loops, which gives
us five n squared, which is Big O of n squared. All right, so this next one looks very similar,
but it's actually quite different. So on the outer loop, we have i going from
zero to three n, so there's three n work done on the outside. But we have to multiply that
by what's going on on the inside. The inside's first loop has j going from ten to fifty, so it does forty loops
exactly, every time; that's a constant amount of work. Plus, however, there's the second
loop: j is less than n cubed, but j is incremented as j equals j plus two, so it's accelerated a
little, and we're going to finish that loop a little faster. So we're going to get, on the
inside, forty plus n cubed divided by two, but we have to multiply that by three n. So if we
wrap that in a Big O, Big O of f(n) is going to give us Big O of n to the fourth,
because n to the fourth is the dominant term in our function f(n). For some other classic
examples of Big O: if we have to find all the subsets of a set, that takes an exponential
amount of time, because there are two to the n subsets. Finding all permutations of a string
takes Big O of n factorial. Another classic one is merge sort, where we have n times log
n for merge sort. And if we have to iterate over all the cells of an array of
size n by m, we have Big O of n m time. All right, let's talk about arrays, probably the
most used data structure. This is part one of two in the array
videos. The reason the array is used so much is because it forms
a fundamental building block for all other data structures, so we end up seeing it everywhere. With arrays and pointers alone, I'm pretty
sure we can construct just about any data structure. So, an outline for today's video.
First, we're going to begin by having a discussion about arrays and answer some fundamental questions,
such as what, where, and how arrays are used. Next, I will explain the basic structure of
an array and the common operations we are able to perform on them. Lastly, we will go
over some complexity analysis and look at some source code on how to construct
a dynamic array using only static arrays. Discussion and examples. So what is a static
array? A static array is a fixed-length container containing elements which are indexable,
usually on the range of zero, inclusive, to n minus one, also inclusive. So a follow-up
question is: what is meant by being indexable? The answer is that each slot
or index in the array can be referenced with a number. Furthermore, I would like to add
that static arrays are given as contiguous chunks of memory, meaning that your chunk
of memory doesn't look like a piece of Swiss cheese with a bunch of holes and gaps. It's
contiguous: all the addresses are adjacent in your static array. Okay, so when and where
is a static array used? Well, they're used everywhere, absolutely everywhere. It's hard
to make a program that doesn't use them. In fact, here are a few places you may or may
not know that do use arrays. So of course, the first simple example is to temporarily
store objects; the most common use of arrays, and one you're probably familiar with. Next is that
we use arrays as buffers to store information from an input or an output stream. Suppose
you have a really large file, for instance, that you need to process, but that file is
so big it doesn't all fit in memory. We use a buffer to read small chunks of the
file into the buffer, or the array, one at a time, and so eventually we're able to read
the entire file. I also like to use arrays as lookup tables because of their indexing
property. This is a really easy way to retrieve data from the lookup table if you
know where everything is supposed to be and at what offset. Next, we also use arrays
as a workaround in a programming language that only allows one return value. The hack we use then is to return a pointer
or a reference to an array, which then contains all the return values that we want. This last
example is a bit more advanced, but arrays are heavily used in the programming technique
called dynamic programming, with tabulation, to cache already computed subproblems. A
classic example of this might be the knapsack problem, or the coin change problem. All right,
now let's talk about some complexity. So the access time for a static array and a dynamic
array is constant, because of the property that arrays are indexable. Searching, however,
can take up to linear time, because we potentially have to traverse all the elements
in the array in the worst case, such as if the element you're looking for does not exist.
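To make the contrast concrete, here is a hypothetical Java sketch of both operations on a static array; the class and method names are my own, for illustration only:

```java
// Access versus search on a static array.
public class ArrayOps {
  // Access by index is constant time: one address computation.
  public static int access(int[] arr, int i) {
    return arr[i];
  }

  // Searching an unordered array is linear in the worst case:
  // if the value is absent, every element gets inspected.
  public static boolean contains(int[] arr, int value) {
    for (int x : arr) {
      if (x == value) return true;
    }
    return false;
  }
}
```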
Inserting, appending, and deleting from a static array doesn't really make sense, because a static
array is a fixed-size container: it cannot grow larger or smaller. When inserting with
a dynamic array, this operation can cost up to linear time, because you potentially have
to shift all the elements to the right and recopy all the elements into the new static
array. This is assuming we're implementing a dynamic array using static arrays. Appending,
though, is constant. Doesn't that seem a little strange? Well, when we append
elements to a dynamic array, we have to resize the internal static array containing all those
elements, but this happens so rarely that appending becomes constant time. Deletions
are linear, for the same reasons that insertions are linear: you have to shift all of the elements
over and potentially recopy everything into your static array. Okay, so we have a
static array A I've defined at the top. A contains the values 44, 12, -5, 17, 6, 0, 3, 9,
and 100. Currently, all the elements are distinct, but this is not a requirement of an array.
Also remark that the very first element, 44, has index position zero in the array, not
one. This confuses many, many intro computer science students, you have no idea. The confusing
part is that most, if not all, mathematics is one-based, while computer science is zero-based.
This is what causes the confusion. But
worst of all is quantum computing. I
did research one summer in quantum computing during my undergrad, and the field is a mess.
It tries to please mathematicians, computer scientists and physicists all at the same
time. And indexing just doesn't work well. Anyways, back to arrays. I should also note
that elements can be iterated over using a for-each loop, something that's offered in
some programming languages. It doesn't require you to explicitly reference the indices of
your array, although the indexing is done internally, behind the scenes. The notation
of the square brackets denotes indexing. So A, square bracket zero, close square
bracket, is equal to 44, meaning A at position zero is equal to the value 44. Similarly, A at position
one is 12, A at four is 6, and A at seven is 9. But A at index nine is out of bounds, and our
program will throw an exception; unless you're in C, which doesn't always throw an exception,
unfortunately. Okay. Now we can also assign values: we can assign position zero to be minus one,
assign index five to be 2, or assign some index to be 25. Now let's look at operations on dynamic
arrays. So dynamic arrays can grow and shrink in size as needed. The dynamic array can
do all the similar get and set operations a static array can do, but unlike the static array, it grows
in size dynamically as needed. So if we have A containing 34 and 4, then if
we add minus seven, it gets appended to the end. If we add 34 again, it will also be added
to the end. And we can also remove elements, so you see here our dynamic array shrink
in size. Pretty cool, right? Okay, so we already talked about this a little bit. But how do
we formally implement a dynamic array? Well, the answer is typically this is done with
a static array, but this is not the only way, of course. So first, we create a static array
with some initial capacity, usually non-zero. As we add elements, we add them
to the underlying static array, keeping track of the number of elements added. Once we
have to add an element which exceeds the capacity of our internal static array, what we can
do is double the size, copy all the elements into this new static array, and add
the new element we need to add. Let's see an example. So suppose we create a dynamic
array with an initial capacity of two, then we begin adding elements to it. So the little
circle with the slash through it is a placeholder for an empty position. Okay, so we add seven,
everything's fine; we add nine, everything's fine.
But once we add three, it doesn't fit
in our internal static array. So we double the size of the array, copy all the elements
in, and now we can add three. Now, if we add 12, everything's still okay, we're doing good.
And if we add five, okay, we have to do a resize again. So we double the size of the container,
copy all the old elements into this new, larger array, and then finish off by adding five.
And similarly, we can add six without any issues. All right, time to look at the dynamic
array source code. This is part two of two in the array series. So the source code for
this video can be found at the following link: github.com, slash my username, slash data-structures.
Also, make sure you saw the last video, so you know what's going on with this
array implementation. Alright, here we are in the array class. So I've designed the array
class to support generics of type T, so whatever type of data we want to put in this array, that's
fine. So here are three instance variables that we care
about: arr, which is our internal
static array; len, which is the length the user thinks the array
is; and capacity, which is the length of the actual array. Sometimes our array might have
more free slots, and we don't want to tell the user that there are extra free slots
available; that's just our internal knowledge. So there are two constructors. The first one
will initialize the array to be of size 16. For the other one, you give it a capacity; the
capacity, of course, has to be greater than or equal to zero. And then we initialize
our array and cast it to a type T. Also notice I put @SuppressWarnings("unchecked"),
just because of the annoying generics in Java. So here are two simple methods: size, to get the
size of the array, and isEmpty, to check if the array is empty. Pretty self-explanatory. Similarly
for get and set: we just index into the array if we want the value, or set it if we want
to set a value. Technically, I should be doing a bounds check for both of these,
but I'm not, apparently.
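For illustration, here is what get and set might look like with the bounds check included. This is a hedged sketch using the variable names described above (arr, len), not the video's actual source code:

```java
// A minimal generic array wrapper whose get/set do bounds checking.
public class CheckedArray<T> {
  private final T[] arr;
  private final int len;

  @SuppressWarnings("unchecked")
  public CheckedArray(int capacity) {
    if (capacity < 0) throw new IllegalArgumentException();
    arr = (T[]) new Object[capacity]; // generic array creation workaround
    len = capacity;
  }

  public T get(int index) {
    if (index < 0 || index >= len) throw new IndexOutOfBoundsException();
    return arr[index];
  }

  public void set(int index, T elem) {
    if (index < 0 || index >= len) throw new IndexOutOfBoundsException();
    arr[index] = elem;
  }
}
```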
Okay, clear. So here, we just remove all the data in our array and
reset the length. Simple. This next one is the add method, where things actually get a
little bit more complicated. So this condition says: hey, if the length plus one is greater than or equal to the capacity,
then it's time to resize my array. The way I'm resizing is I'm doubling the size of the
static array. You don't have to double the size, but I've decided that doubling the size
is convenient. So when I double the size, I have to create a new array with the new
capacity, and copy everything in; that's what these lines are doing, copying
the old values back into this new array. Then it sets the old array to be the new array.
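In isolation, the doubling-and-copying step just described might look something like this simplified sketch (an int-array version for illustration, not the generic source code itself):

```java
// Grow a full static array by doubling its capacity and copying
// the first `len` elements across. The caller then swaps the
// returned array in for the old one.
public class Resize {
  public static int[] grow(int[] old, int len) {
    int newCapacity = Math.max(1, old.length * 2); // double (guard zero)
    int[] bigger = new int[newCapacity];
    for (int i = 0; i < len; i++) {
      bigger[i] = old[i]; // copy the old values into the new array
    }
    return bigger;
  }
}
```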
And lastly, here, we just copy the new value into our array. So this removeAt method will
remove a particular value at a given index. First, we check if the index is valid; if
it's not, throw an IndexOutOfBoundsException. Otherwise, grab the data at the remove index,
and initialize a new array of size length minus one. Now copy everything into the new array, except for when it's at that remove index. Now I'm
sure there's easier ways to do this, but I decided to do the following: maintain two indices,
i and j, and increment i and j as you go. But when i is equal to the remove index, we
skip over the remove index by fixing j temporarily, using j to lag behind, if
you will, while i is still moving through the original array. Pretty clever overall. And then we
set the array to be the new array we just generated, reset the capacity, and return the
data we have just removed. So for this next one, remove, we scan through the array; if we find
the object we're looking for, we remove it at that index and return true; otherwise we return
false. indexOf is very similar: if you find it, return i; otherwise return minus one. contains
just checks if indexOf is not equal to minus one. All right, for this next one, I return an
iterator for the array. So an iterator allows us to iterate over the array, providing an
abstraction for it. So I need to override two methods, and the first is hasNext: there
exists a next element in our array if the index is less than the length of the array. I should
also be checking for concurrent modification here, just in case someone decides to change
the array for some reason while we're iterating over it; I might add that later, might not. There's also the next method, which just returns
the next element in the array relative to the iterator. Okay, and lastly is the toString,
to get a string representation of this array. Nothing too complicated. Alright, so this
is a very simple data structure. It's just a dynamic array; if you look at Java's ArrayList,
it looks very similar to this. Today we're going to talk about singly and doubly
linked lists, one of the most useful data structures out there. This is part one of two; in the
second part we will be looking at some source code on how to implement a doubly linked list.
So for today's outline: in the first section, we're going to answer some basic questions
concerning singly and doubly linked lists, namely, what are they and where are they used.
Next, we'll cover some terminology concerning linked lists, so everyone knows what I mean
when I say the head of a linked list versus the tail of a linked list. Then, last in
the discussion, we'll talk about the pros and cons of using singly and doubly linked
lists, then how to insert and remove elements from both singly and doubly linked lists, as
well as some source code. So stay tuned. Alright, discussion. So what is a linked
list? A linked list is a sequential list of nodes that hold data, and which point to other nodes also containing
data. So below is an example of a singly linked list containing some arbitrary data. Notice
that every node has a pointer to the next node. Also notice that the last node points
to null, meaning that there are no more nodes at this point. The last node always has a null reference to the next node;
for simplicity, I will omit this in the following slides. Okay, so where are linked lists used?
One of the places they get used the most is actually in the abstract data type implementations
of lists, stacks, and queues, because of their great time complexity for adding and removing
elements. You also see linked lists in things like circular lists, where you make the pointer of
the last node point to the first node. So circular linked lists are used to model repeating event
cycles, maybe like having a round-robin ordering on a bunch of elements, or representing the corners
of a polygon; so there are definitely a lot of uses there. Linked lists can also be used to model real-world
objects, such as a line of train carts. That could be useful. Moving on to some more
advanced examples: we also heavily use linked lists in hash table separate chaining, and
for adjacency lists in graphs, so we'll get to those in a later video. Okay, a bit of
terminology surrounding linked lists. The first thing you need to know when creating a linked
list is that we always maintain a reference to the head of the linked list. This is because
we need somewhere to start when traversing our list. We also give a name to the last element
of the linked list; this is called the tail of the list. Then there are also the
nodes themselves, which contain pointers; pointers are also sometimes called references,
and these pointers always point to the next node. You should also know that the nodes themselves
are usually represented as structs or classes when actually implemented. We'll get to this
when we look at the source code. Okay, singly versus doubly linked lists. Concerning
the kinds of linked lists that there are, there are two types: singly linked and doubly linked.
Singly linked lists only contain a pointer to the next node, while doubly linked lists
not only contain a pointer to the next node, but also to the previous node, which makes
them quite useful sometimes. This is not to say we cannot have triply or quadruply
linked lists; I just wouldn't know where to place the additional pointers. Now, the pros and cons of
singly and doubly linked lists. So there are trade-offs between picking a singly and a doubly linked
list. What are they? If we look at the singly linked list, we observe that it uses less memory.
Why? Well, pointers to nodes can actually take up a lot of memory. If you're running on a
64-bit machine, references use eight bytes each; on a 32-bit machine, four bytes each. So having
a singly linked list means you only need one pointer for each node; hence, twice as much
memory is saved. A downside, however, is that you cannot access previous elements, because
you don't have access to them. You would need to traverse from the head of the linked list
all the way up to the previous node to find it. Now, concerning doubly linked lists:
a great pro is that, having access to the tail, we can easily traverse the list backwards,
something we can't do with a singly linked list. Also, having a reference to a node you
want to remove, you can remove it in constant time and patch the hole you just created.
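A tiny sketch shows why this is constant time: once you hold the node, removal is just re-linking its two neighbours. The Node class and remove helper here are my own illustration, not the course's source code:

```java
// Removing a node from a doubly linked list in O(1),
// given a reference to the node itself.
public class DoublyLinked {
  public static class Node {
    public int data;
    public Node prev, next;
    public Node(int data) { this.data = data; }
  }

  public static void remove(Node node) {
    // Patch the hole: the neighbours now point past the removed node.
    if (node.prev != null) node.prev.next = node.next;
    if (node.next != null) node.next.prev = node.prev;
    node.prev = node.next = null; // detach, helping garbage collection
  }
}
```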
Because you again have access to both the previous and the next nodes, this is something
you cannot do with a singly linked list; you would leave the list severed in two. A downside,
however, to the doubly linked list is that it does use twice as much memory. Okay, let's go into
some implementation details on how we can create linked lists and remove elements from
linked lists, starting with a singly linked list. So here is a singly linked list; I've
outlined where the head and the tail are. Now we want to insert 11 at the third position,
where seven currently is. Let's walk through an example. So the first thing we do is we
create a new pointer which points to the head. This is almost always the first step in all
linked list operations. Now what we're going to do is seek up to, but not including, the
node at the position where we want to insert. So we seek ahead, advancing our traverser pointer, setting
it equal to five's next node, which is now 23. And now we're already where
we need to be to insert the next node. So we create the new node, that's the green
node 11, and we make eleven's next pointer point
to seven. The next step is to change 23's next pointer to be 11. Remember, we have
access to 23's next pointer because we have a reference to it with the traverser. Okay,
and now if we flatten out the list, we can see that we've correctly inserted 11 at the right
position. And we're done. All right, time to insert with a doubly linked
list.
This is much trickier because of all the pointers flying around, but it's the exact same concept.
Notice that the doubly linked list not only has pointers to the next node, but also
to the previous node; we also have to change those in the insertion phase. Okay: create
a traversal pointer, which points to where the head is, and advance it until you are
just before the insertion position. So we advance the traverser by one, and now we're
just before the node, so we stop traversing. Let's create the new node, which is node 11.
Point eleven's next pointer to equal seven. Also point eleven's previous pointer to be
23, which we have a handle on because of the traverser. Now we make seven's previous pointer
equal to 11, so we can go backwards from seven to 11. And the last step: make 23's next
pointer equal to 11. This is so that we can go forwards from 23 to 11. So in total, remark
that we changed exactly four pointers. Once we've flattened out the list, you can see
that 11 has
been inserted in the correct position. All right, now how to remove elements from
a singly linked list. So suppose we want to remove the node with a value nine. How do
we do this? Well, the trick we're going to use is to use not one pointer but two. You
can use one, but for visual effect, it's easier to show you how it is done using two. So
we create two pointers, trav1 and trav2, for traverser one and traverser two respectively.
So trav1 points to the head, and trav2 points
to the head's next node. Now what we're going to do is advance trav2 until we find the node we want to remove, while also
advancing trav1. Okay, we have found node nine, so this is the stopping point. I'm going
to create another pointer to the node we wish to remove, so we can deallocate its memory
later. Okay, so now I advance trav2 to the next node. Node nine has
turned red; I've done this to indicate that at this point, node nine is
ready to be removed. I'm
just keeping it around for the visual effect. Okay, so now set trav
once next pointer to be equal to trap two. And now is an appropriate time to remove the
temporary pointer because doing nothing with it and their temp has been deallocated. Make
sure you always clean up your memory to avoid memory leaks. This is especially important
in C and c++ and other programming languages where you manage your memory. Now you can
see that nine is gone, and our singly linked list is shorter. Okay, now for th
e last bit
of implementation, let's look at how to remove nodes from a doubly linked list, which is
actually easier in my opinion to removing from singly linked lists. The idea is the
same, we seek up to the node we wish to remove. But this time, we only need one pointer. I
mean one traverse or pointer. Because each node is singly linked list has a reference
to the last node, so we don't need to manually maintain it like we did with the singly linked
list. So let's start travel at the very begin
ning
and seek until we hit nine. Okay, we have reached nine. And we want to remove it from
the list. To do this set for us pointer to be equal to 15 with access to four and 15
because they are traves, previous and next pointer respectively. Similarly set 15 previous
pointer to equal four. Notice that drive is now read, meaning it is ready to be removed.
So we get rid of travel. And now if we flatten out the doubly linked lists, we see that it
no longer contains nine. Let's do a bit of complexity
analysis on linked lists. How good, actually, are linked lists? On the left column we have singly linked lists, and on the right, doubly linked lists. The complexity for searching in a linked list is linear in the worst case, because if the element we're looking for is not there, we have to traverse all of the n elements. Inserting at the head, though, is constant time, because we always maintain a pointer to the head of the linked list, and hence we can add in constant time. Similarly for the tail. Removing the head of a singly linked list or a doubly linked list is also constant time, again because we have a reference to it, so we can just remove it and advance the head by one. However, removing from the tail is another story. It takes linear time to remove the tail of a singly linked list. Can you think of why? Well, even if we do have a reference to the tail in a singly linked list, we can remove it, but only once, because we can't reset what the tail's value is. So we would have to seek to the end of the list to find out what the new tail is. Doubly linked lists, however, do not have this problem, since each node has a pointer to the previous node, so we can continually remove nodes from the tail. And finally, removing somewhere in the middle is also linear time, because in the worst case we would need to seek through n minus one elements, which is linear. Alright, time to look at some doubly linked list source
code. This is part two of two in the linked list series. The link for the source code can be found on GitHub at github.com/williamfiset/data-structures. Please star this repository if you find the source code helpful, so that others may also find it. And also make sure you watch the first part of the linked list series before continuing.
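Before walking through the repository code, here is a minimal sketch of the shape it will have: a private node class carrying the data plus previous and next pointers, and add methods at both ends. The names here are my own and may not match the repository exactly:

```java
import java.util.StringJoiner;

// Minimal doubly linked list skeleton (my own naming, not necessarily
// the repository's): each node tracks data plus prev/next pointers.
public class SimpleDoublyLinkedList<T> {
    private int size = 0;
    private Node<T> head = null, tail = null;

    private static class Node<T> {
        T data;
        Node<T> prev, next;
        Node(T data, Node<T> prev, Node<T> next) {
            this.data = data; this.prev = prev; this.next = next;
        }
    }

    public void addFirst(T elem) {
        if (size == 0) {
            head = tail = new Node<>(elem, null, null);
        } else {
            head.prev = new Node<>(elem, null, head); // new node points forward to old head
            head = head.prev;                         // back the head pointer up
        }
        size++;
    }

    public void addLast(T elem) {
        if (size == 0) {
            head = tail = new Node<>(elem, null, null);
        } else {
            tail.next = new Node<>(elem, tail, null); // hook new node after old tail
            tail = tail.next;                         // push the tail pointer forward
        }
        size++;
    }

    public int size() { return size; }

    @Override
    public String toString() {
        StringJoiner sj = new StringJoiner(", ", "[", "]");
        for (Node<T> trav = head; trav != null; trav = trav.next)
            sj.add(String.valueOf(trav.data));
        return sj.toString();
    }

    public static void main(String[] args) {
        SimpleDoublyLinkedList<Integer> list = new SimpleDoublyLinkedList<>();
        list.addLast(4); list.addLast(9); list.addFirst(1);
        System.out.println(list); // prints: [1, 4, 9]
    }
}
```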
Here we are in the source code, looking at the implementation of a doubly linked list in Java. So first, notice that I have declared a few instance variables. We are keeping track of the size of the linked list, as well as what the head and the tail currently are. To begin with, the head and the tail are null, meaning the linked list is empty. Furthermore, we will be using this internal node class quite extensively, because it contains the data for our nodes, and it also contains the previous and next pointers for each node, since this is a doubly linked list. So we have one constructor for the node, taking the data and the previous and next pointers themselves; we need both, otherwise we can't do much. The first method I have declared here is to clear the list. It does so in linear time by going through all the elements and deallocating them; it deallocates them by setting them equal to null. So we start a traverser at the head, we loop while the traverser is not equal to null, meaning there are still elements left, and then we do our deallocation business.
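A sketch of how that linear-time clear might look follows. This is a toy standalone version with my own field names, not necessarily the repository's exact code:

```java
// Walk the list, unlink each node so the garbage collector can reclaim
// it, then reset head, tail, and size.
public class ClearDemo {
    static class Node { int data; Node prev, next; Node(int d) { data = d; } }

    Node head, tail;
    int size;

    void clear() {
        Node trav = head;
        while (trav != null) {
            Node next = trav.next;
            trav.prev = trav.next = null; // "deallocate" by dropping references
            trav = next;
        }
        head = tail = null;
        size = 0;
    }

    public static void main(String[] args) {
        ClearDemo list = new ClearDemo();
        Node a = new Node(1), b = new Node(2);
        a.next = b; b.prev = a;
        list.head = a; list.tail = b; list.size = 2;
        list.clear();
        System.out.println(list.size + " " + (list.head == null)); // prints: 0 true
    }
}
```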
And at the end, we set the size equal to zero and reset the head and tail. Perfect. The size and isEmpty methods are pretty self-explanatory: get the size, and check if the size of our linked list is zero. Okay, so here I have a public method to add an element. By default, we're going to append to the end of the linked list, at the tail, but I also support adding to the beginning of the linked list. So how do we do this? If this is the first element, meaning the list is empty, then we set the head and the tail to be equal to the new node, which has, as you can see here, both previous and next pointers set to null. Otherwise, if the list is not empty, then we set the head's previous pointer to equal this new node, and then we set the head pointer to be whatever the head's previous is. So we back up the head pointer, in a sense. And we also don't forget to increment the size. A very similar thing is done when we want to add to the tail of the linked list, except we're moving the tail pointer around. Okay, let's move on to peek. Peeking is just looking at either the element at the beginning of the linked list or at the end of the linked list.
And we throw a runtime exception if the list is empty, because it doesn't make sense to peek an empty list. Okay, now we get to the first more complex method, which is removeFirst, for when we want to remove the head of the linked list. We can't do much if the list is empty. Otherwise, we extract the data at the head and then move the head pointer forward, and we decrease the size by one. Then, if the list has become empty, we set the tail to be null as well, so both the head and the tail are now null. Otherwise, we deallocate the memory of the node that we just removed. This is especially important if you're in C or C++: make sure to free or delete pointers. Then at the end, we return the data. A very similar thing is done for removeLast, except we're using the tail this time to remove from the tail of the linked list and not the head. Okay, and here's a generic method to remove an arbitrary node. Remark that I set this to private, because the node class itself is private, so the user shouldn't have access to it; it's just something we're using internally inside the linked list data structure to manage the list. So if the node that we're removing is either the head or the tail, we detect that and call our methods removeFirst or removeLast. Otherwise, we know we're somewhere in the middle of the linked list, and if we are, we make the pointers adjacent to our current node equal to each other, so we're effectively skipping over the current node. And of course, don't forget to clean up your memory and return the data. We have to temporarily store the data before we delete the node, of course; otherwise, we've deleted the node and the data is already gone.
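The pointer surgery just described, removing an interior node by making its neighbours point at each other, can be sketched like this (standalone toy code with assumed names, handling only the middle-of-list case):

```java
// Remove an interior node from a doubly linked list: point the
// neighbours at each other, save the data, then unlink the node.
public class RemoveDemo {
    static class Node { int data; Node prev, next; Node(int d) { data = d; } }

    static int remove(Node node) {
        // Assumes node is strictly in the middle (head/tail handled elsewhere).
        node.prev.next = node.next;  // skip forwards over `node`
        node.next.prev = node.prev;  // skip backwards over `node`
        int data = node.data;        // grab the data before cleanup
        node.prev = node.next = null;
        return data;
    }

    public static void main(String[] args) {
        Node a = new Node(4), b = new Node(9), c = new Node(15);
        a.next = b; b.prev = a; b.next = c; c.prev = b;
        System.out.println(remove(b));                  // prints: 9
        System.out.println(a.next == c && c.prev == a); // prints: true
    }
}
```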
Now, suppose we want to remove a node at a particular index in our linked list. Yes, we can do this: even though our nodes are not explicitly indexed, we can pretend that they are. So first check that the index is valid, otherwise throw an IllegalArgumentException. Here we are trying to be a bit smarter than just naively traversing the whole linked list: we either start searching from the front of the linked list to find our index, or from the back, depending on whether the index is closer to the front or to the back, although this method remains linear. Next, for the remove method, we want to be able to remove an arbitrary value from our linked list, given as an Object. We're also going to support searching for null values, in case someone decided that the value of a node should be null; that's fine, so we check that special case: we traverse through the linked list until we find a null element, then remove that node and return true. We return true if we actually found the element we wanted to remove; otherwise, we return false. Down here in this else statement, we search for the element we want to remove, and we use the .equals method to check if we found it. If so, remove that node and return true. Okay, here we have a related method, which is indexOf. So this is not remove at an index, or remove a value, but get whatever index this object is at. Again, this supports searching for null: even if our value is null, we'll just return the first place we find a null element. Otherwise, we search for a non-null element that equals our value, incrementing the index as we go. We can use indexOf for our contains method, to check if an element is contained within the linked list, because indexOf returns minus one if the element is not found. Something that's useful sometimes is to have an iterator for the linked list. This is also trivial to implement: just start a traversal pointer at the head and traverse until you reach the end. Notice I'm not checking for concurrent modification errors, but if you want to, it's pretty easy to do that.
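Such a trivial iterator might look like the following minimal sketch (a toy singly linked class of my own, and, as in the video, without the concurrent-modification check):

```java
import java.util.Iterator;

// Start a traversal pointer at the head and walk forward until null.
public class IterableList implements Iterable<Integer> {
    static class Node { int data; Node next; Node(int d, Node n) { data = d; next = n; } }
    Node head;

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            Node trav = head;
            @Override public boolean hasNext() { return trav != null; }
            @Override public Integer next() { int d = trav.data; trav = trav.next; return d; }
        };
    }

    public static void main(String[] args) {
        IterableList list = new IterableList();
        list.head = new Node(1, new Node(2, new Node(3, null)));
        for (int x : list) System.out.print(x + " "); // prints: 1 2 3
    }
}
```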
And lastly, at the very bottom, we have the toString method to get a string representation of our linked list. May I begin by saying that the
stack is a remarkable, absolutely remarkable data structure, one of my favorites. In fact, this is part one of three in the stack videos. Part two will consist of looking at a stack implementation, and part three will be some source code for how a stack is implemented using a linked list. So here are the topics that we'll be covering in this video as well as the others. First, we will answer the question: what is a stack and where is it used? Then we will see some really, really cool examples of how to solve problems using stacks. Afterwards, we will briefly look at how stacks are implemented internally and the time complexity associated with a stack's operations. And lastly, of course, some source code. Moving on to the discussion about stacks. So what is a stack? A stack is a one-ended linear data structure which models a real-world stack by having two primary operations, namely push and pop.
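Those two operations are easy to try with a stack from the Java standard library; here I'm using java.util.ArrayDeque rather than the course's own stack class, but the behavior is the same:

```java
import java.util.ArrayDeque;

// push adds to the top of the stack, pop removes from the top.
public class StackDemo {
    public static void main(String[] args) {
        ArrayDeque<String> stack = new ArrayDeque<>();
        stack.push("apple");
        stack.push("onion");
        stack.push("celery");
        System.out.println(stack.pop());  // prints: celery  (last in, first out)
        System.out.println(stack.pop());  // prints: onion
        System.out.println(stack.peek()); // prints: apple   (look without removing)
    }
}
```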
Below you can see an image of a stack I have constructed. There is one data member getting popped off the top of the stack and another data member getting added to the stack. Notice that there is a top pointer pointing to the block at the top of the stack. This is because elements in a stack always get removed from and added to the top of the pile. This behavior is commonly known as LIFO, or last in, first out. Let's look at a more detailed example of how a stack works and how things are added to and removed from one. On the left side, I have some instructions on what we need to add to and remove from the stack. The first operation is a pop, so we remove the top element from the stack, which is apple. So boom, apple is gone. Now we push onion onto the stack, so we add onion to the top of the stack. The next instruction says to push celery onto the stack. Next is watermelon, which we put on top of the stack. The next operation says to pop, so we remove the element at the top of the stack; this is the watermelon we just added. The next operation is also a pop, so we remove from the top of the stack; this is celery. And the last operation pushes lettuce onto the stack, so we add lettuce on the very top of the stack. So as you can see, everything operates on the top of the stack; we don't have access to anything else but the top of the stack. This is critical in our understanding of how a stack works. So when and where is a stack
used? Stacks are, surprisingly, used absolutely everywhere. They are used in text editors to undo text you've written, in browsers to navigate backwards and forwards, and in some compilers to make sure you have the correct number of matching braces, in the right order. Stacks are used to model real-world stacks such as books and plates, and even games like the Tower of Hanoi, which we'll get to shortly. Stacks are also used behind the scenes to support recursion, by keeping track of previous function calls. When a function returns, it pops the current stack frame off the stack and rewinds to the next function that is on the stack. It's rather remarkable that we use stacks all the time in programming and never even notice it. Something else you can use stacks for is to perform a depth first search on a graph. A depth first search can be done manually by maintaining your own stack, or by using recursion. Well, guess what: both use stacks, as we have just discussed. Now for complexity analysis. The following complexity table assumes that you implemented a stack using a linked list. Pushing takes constant time, because we have a reference to the top of the stack at all times; the same argument goes for popping and peeking. Searching, however, still takes linear time: the element we're searching for isn't necessarily at the top of the stack, so we may need to scan all the elements in the stack, hence requiring a linear
amount of work. Now here's a cool example of a problem we can solve using a stack: the bracket-matching problem. Given a string made up of the following brackets, round brackets, square brackets, and curly brackets, determine whether the brackets properly match. Analyze the examples below to understand which types of bracket sequences are valid and which ones are invalid. Before I show you the solution, try and come up with a solution yourself. Hint: use a stack. Okay, in this first example, consider the following bracket sequence. As we are processing the string from left to right, I will be displaying the current bracket and the associated reversed bracket. So let's begin. For every left bracket we encounter, we simply push it onto the stack. So this is the left square bracket that I have highlighted in light blue; we push this one onto the stack. Same goes for the next left square bracket, and for this left curly bracket. Okay, now we have encountered a right square bracket, so we need to do two checks. First, we check if the stack is empty; if so, the bracket sequence is invalid. But if there are still things in the stack, then we pop the top element and check if its value is equal to the reversed current bracket.
Right now the top element is equal to the reversed bracket, so we are good. Next is a right square bracket again. Is the stack empty? No, it isn't, so we're good. Is the top element of the stack equal to the reversed bracket? Yes, it is. So let's keep going. Here's a left bracket; let's push it onto the stack. A right bracket: is the stack empty? No, so we're good. Is the top element of the stack equal to the reversed bracket? Yes, okay, so we're good. Another right bracket: is the stack empty? No, good. And is the top element of the stack equal to the reversed bracket? Yes, okay, good. Now that we're done processing the string, we need to make sure the stack is empty. Why is that? Well, in case the last few characters in the bracket sequence were left brackets, they would still be in the stack, right? But our stack is empty, so we can conclude that this bracket sequence is indeed valid. All right, let's do another example with another bracket sequence; try to work this one out. The first bracket is a left bracket, so we push it onto the stack. The second bracket is also a left bracket, so we push it onto the stack. This next bracket is a right bracket, so let's check: is the stack empty? No, so we're good. And is the top element of the stack equal to the reversed bracket? Yes, it is. This next bracket is a right bracket. Is the stack empty? No, so we're good. And is the reversed bracket equal to the bracket at the top of the stack? No, it isn't. So this bracket sequence is
invalid. So here's the pseudocode of the algorithm we just ran through. If we let S be a stack, then for every bracket in our bracket string, we can easily get the reversed bracket for the current bracket. If our bracket is a left bracket, we push it onto the stack. Otherwise, we check whether the stack is empty, or whether the element at the top of the stack is not equal to the reversed bracket; if either of those conditions is true, then we return false. Otherwise, once the string is processed, we return whether the stack is empty or not: if it is empty, then we have a valid bracket sequence, and otherwise we do not. I want to take a moment and look
at the Tower of Hanoi, a very popular game amongst mathematicians and computer scientists.
To see how it relates to stacks, the game is played as follows: you start with a pile of discs on the first peg on the left, and the objective of the game is to move all the discs to the rightmost pile. In each move, you can move the top disc of a pile to any other pile, with the restriction that no disc may be placed on top of a smaller disc. So we can think of each peg as a stack, because we're always moving the top element of a peg and placing it on another peg. So shall we play? I will let the animation run, and you will see how each peg acts like a stack; it's pretty cool. So you just saw how transferring elements from one peg to another is exactly the same as popping a disc from one stack and pushing that same disc onto another stack, given that the disc you're placing on top is smaller.
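Circling back to the bracket-matching pseudocode from a moment ago, a direct Java translation might look like this. This is my own sketch, not the repository's code:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// For every left bracket, push; for every right bracket, pop and
// compare against the reversed bracket; valid iff the stack ends empty.
public class BracketMatcher {
    private static final Map<Character, Character> REVERSE =
        Map.of(')', '(', ']', '[', '}', '{');

    public static boolean isValid(String brackets) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : brackets.toCharArray()) {
            if (c == '(' || c == '[' || c == '{') {
                stack.push(c); // left brackets always go on the stack
            } else {
                // Right bracket: fail if nothing to match, or a wrong match.
                if (stack.isEmpty() || stack.pop() != REVERSE.get(c).charValue())
                    return false;
            }
        }
        return stack.isEmpty(); // leftover left brackets mean invalid
    }

    public static void main(String[] args) {
        System.out.println(isValid("[[{}]]()")); // prints: true
        System.out.println(isValid("[{]"));      // prints: false
    }
}
```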
So that's really cool. Welcome to part two of three in the stack series. This is going to be a short video on one way to implement a stack. Stacks are often implemented as either arrays, singly linked lists, or sometimes even doubly linked lists. Here we'll cover how to push nodes onto a stack with a singly linked list; later on, we will look at the source code, which is actually written using a doubly linked list. Okay, to begin with, we need somewhere to start in our linked list, so we're going to point the head to a null node. This means that the stack is initially empty. Then the trick to creating a stack using a singly linked list is to insert new elements before the head, and not at the tail of the list. This way, we have pointers pointing in the correct direction when we need to pop elements off of the stack, as we will soon see. The next element we need to push onto the stack is a two, so let's do that: create a new node, adjust the head pointer to point at the newest node, and then hook the new node's next pointer onto where the head was before. And we use the same idea for five and also thirteen. Now let's have a look at popping elements. This isn't too hard either: just move the head pointer to the next node and deallocate the last node. So here we pop the first node off the stack and set the node's reference to null, so it will be picked up by the garbage collector if you're coding in Java. And it will be, since there are no other references pointing to it. If you're in another programming language that requires you to explicitly deallocate or free memory yourself, like C or C++, now is the time to do that, or you will get memory leaks. Getting a memory leak in a data structure is one of the worst kinds of memory leaks, especially if it's a custom data structure that you intend on reusing. So keep watching out for that, not only in this video, but in all the data structures that we will be covering. If you see an implementation of mine not correctly cleaning up memory, please, please point it out to me, or send a pull request to the GitHub repository so we can patch it. Okay, so we keep proceeding by removing the head and bouncing the head pointer down to the next node. Pop again, and pop again. There we go: we've stopped popping, we've reached the last node, and the stack is now empty. Welcome to part three of three in the stack series videos. Today we'll be
looking at some source code for a very simple stack. The source code can be found at github.com/williamfiset/data-structures. Make sure you understood parts one and two of the stack series before continuing, so you actually know how we implement a stack using a linked list. Also, if you like this video series and the implementation of the stack data structure I'm about to present, then please star this repository on GitHub so that others can have an easier time finding it as well. Here we are in the stack source code, presented as an implementation in the Java programming language. So the first thing
you may notice is that I have an instance variable of a linked list. This is the linked list provided by Java; in fact, it's a doubly linked list. It's the linked list they provide in their package java.util, and it's what I'll be using today instead of the one I created in the linked list videos. This is just for portability, in case you want to use this stack for whatever reason. So we have two constructors: we can create an empty stack, or we can create a stack with one initial element, which is occasionally useful. With the first method, we can get the size of the stack; to do that, we return the size of the internal linked list holding all the elements of our stack. Easy. We also check if the stack is empty by checking if the size is zero. This next one is just push: to push an element onto the stack, we append that element as the last element in our internal linked list. Easy enough. We can also pop an element off the stack. To do this, we check if the stack is empty; if it is, then we throw an empty stack exception, because we can't pop an element off of an empty stack; that doesn't make sense. Similarly, the same thing with peek: we can't observe what the top element of the stack is if the stack is empty, and if it's not empty, we can peek at the last element of our list. And lastly, we can return an iterator to allow the user to iterate through our stack. This iterator returns an iterator for our internal linked list, which supports checking for concurrent modification errors. So this was a really, really quick implementation of a stack; it's only like 50 lines of
code. Let's talk about queues, probably one of the most useful data structures in computer science. This is going to be part one of three in the queue series. So here's the outline of things we'll be looking at. First, we're going to begin by talking about queues and what they are. Then we're going to go into some complexity analysis concerning queues. Then we'll discuss the implementation details of enqueuing and dequeuing elements from a queue, followed by some source code at the very end, in the last video. So, a discussion about queues: what exactly is a queue? Below you can see an image of one. A queue is just a linear data structure that models a real-world queue, having two primary operations we can do: enqueuing and dequeuing. Every queue has a front and a back end; we insert elements through the back and remove through the front. Adding elements to the back of the queue is called enqueuing, and removing elements from the front of the queue is called dequeuing. However, there's a bit of terminology surrounding queues, because there isn't really any consistency in what we refer to as enqueuing and dequeuing; many people use multiple different terms. So enqueuing is also called adding, but also offering. A similar type of thing happens when we're talking about dequeuing, when we remove things from the front of the queue: this is also called polling. However, some people will also refer to this as removing an element from the queue, but the problem with saying that is it can cause some ambiguity: did they mean removing from the front of the queue specifically, or from anywhere in the queue? Make note that if I say removing, I'm going to be referring to removing from the front of the queue unless I say otherwise.
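That terminology maps directly onto Java's built-in Queue interface, which makes for a quick illustration:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// offer() enqueues at the back, poll() dequeues from the front,
// peek() looks at the front without removing it.
public class QueueDemo {
    public static void main(String[] args) {
        Queue<Integer> queue = new ArrayDeque<>();
        queue.offer(55);
        queue.offer(-1);
        queue.offer(33);
        System.out.println(queue.poll()); // prints: 55  (first in, first out)
        System.out.println(queue.peek()); // prints: -1  (front, not removed)
        System.out.println(queue.size()); // prints: 2
    }
}
```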
So let's look at an example of how a queue works in detail. First, notice I have labeled the queue's front and back ends, so you know where I'm going to be enqueuing and dequeuing from, respectively, to avoid confusion. The first instruction says enqueue 12, so we add 12 to the end of the queue. Then dequeue: we remove the first element from the front of the queue, which is 55. Another dequeue operation: this time we remove minus one from the front of the queue. Next, enqueue 7, so add 7 to the back of the queue. Dequeue: remove the front element, being 33. And lastly, enqueue minus six at the back of the queue, just like that. So now that we know what a queue is, where does this data structure actually get used? Well, a classic example of where queues get used is to model a real-world queue, where you're waiting in line at a movie theater or at a restaurant. For instance, have you ever been to, say, McDonald's, where all the cashiers are busy? As soon as one of them gets freed, the next person in line gets to order food
? Well, that's a queue. Queues can also be really useful if you have a sequence of elements coming in but only need to keep track of, say, the x most recent elements: you can add those elements to the queue, and once your queue gets larger than x elements, just dequeue. Queues are also often used in server management. So suppose for a moment that you have a web server that's idly waiting for requests from people to use your website, and that at any given moment you can simultaneously serve up to five people, but no more. If 12 requests come in in a short amount of time, you're not going to be able to process all of them, so what you do is process the five that you're able to, and the remaining seven get to chill in a queue, waiting to be served. Whenever you finish processing a request, you dequeue the next request and start processing it, and you do this until the queue is empty. While you're doing this, more requests come in to access your web page, and you just add them to the end of the queue. Queues are also used in graph theory, to perform a breadth first search traversal on a graph, which is actually really useful; we're going to see this example in the next video. All right, now concerning complexity analysis of a queue. As we've seen, it's pretty obvious that enqueuing and dequeuing operations are constant time. There's also another operation on a queue I have not mentioned yet, and this is peeking. Peeking means looking at the value at the front of the queue without removing it; this is also constant time. However, checking if an element is contained within the queue is linear time, since we would potentially need to scan through all the elements. There's also element removal, not in the sense of dequeuing or polling, but in actually removing an element from the internals of the queue. This also requires linear time, since we would have to scan through all n elements in the worst case
. In this video, we're going to have a look at how queues are used to do a breadth first search, and then we're going to look at the actual implementation details of how enqueuing and dequeuing elements works. Okay, on to the breadth first search example. Breadth first search is an operation we can do on a graph to perform a graph traversal. If you don't know what I mean when I say graph, I mean a network, not a bar graph or a line graph or anything like that. First, I should explain the idea behind the breadth first search. In a breadth first search, the objective is to start at a node and traverse the entire graph, first by visiting all the neighbors of the starting node, then visiting all the neighbors of the first node you visited, then all the neighbors of the second node you visited, and so on and so forth, expanding through all the neighbors as you go. So you can think of each iteration of the breadth first search as expanding the frontier outwards from one node at each iteration as you go on. Let's begin our breadth first search at node zero. I'm going to label node zero as yellow and put it in the frontier, or the visiting group. Now I'm going to visit all the neighbors of zero, being one and nine, and add those to the frontier. Then we visit the neighbors of one and nine, being only eight. Similarly for eight, so seven; then we visit all the neighbors of seven. So I added a bunch of elements to my frontier here, and now we visit all the neighbors of the yellow nodes. And now our breadth first search is done, because we no longer have any elements in the frontier. Notice that twelve is left unvisited, because twelve is a lone node on an island all by itself, so we are not able to reach it with the breadth first search, which is fine. Suppose you want to actually code a breadth first search; how would that be done? Well, the idea is to use a queue. First, we add the starting node to our queue and mark the starting node as visited. Then, while our queue is not empty, we poll an element from our queue, that is, we dequeue it, and for every neighbor of the node we just dequeued, if the neighbor has not been visited yet, we mark the neighbor as visited and add it to the queue. So now we have a way of processing all the nodes in our graph in breadth first search order: a really, really useful and very popular graph traversal algorithm. So now let's look at the implementation of queues.
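Before moving on, the breadth first search routine just described can be sketched in Java like so, over an adjacency-list graph (the graph here is a small example of my own):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// BFS: enqueue the start node, then repeatedly dequeue a node and
// enqueue each of its unvisited neighbours.
public class Bfs {
    public static List<Integer> bfs(List<List<Integer>> graph, int start) {
        boolean[] visited = new boolean[graph.size()];
        List<Integer> order = new ArrayList<>();
        Queue<Integer> queue = new ArrayDeque<>();
        queue.offer(start);
        visited[start] = true;
        while (!queue.isEmpty()) {
            int node = queue.poll(); // dequeue the next node to process
            order.add(node);
            for (int neighbour : graph.get(node)) {
                if (!visited[neighbour]) { // enqueue unvisited neighbours
                    visited[neighbour] = true;
                    queue.offer(neighbour);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Edges: 0-1, 0-2, 1-3; node 4 is on an island by itself.
        List<List<Integer>> graph = List.of(
            List.of(1, 2), List.of(0, 3), List.of(0), List.of(1), List.of());
        System.out.println(bfs(graph, 0)); // prints: [0, 1, 2, 3]
    }
}
```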
So let's begin with enqueuing. It turns out that you can implement the queue abstract data type in multiple different ways, but the most popular methods are to use either arrays, singly linked lists, or doubly linked lists. If you're using a static array, you have to make sure your array is going to be big enough; if it's a dynamic array, then you'll be fine. Right here I'm going to show you how to do it with a singly linked list, and in the source code we're using a doubly linked list, so stay tuned for that. In a singly linked list, we're going to have a head pointer and a tail pointer; initially, they're both null. As we enqueue, we just add the first node, so nothing really interesting is going on right now. But as we add more elements, you can see that we're pushing the tail pointer forward: we add a node and then point the tail pointer at that new node. Now, dequeuing is a bit of the opposite: instead of pushing the tail forward, we're going to be pushing the head forward. So push the head forward one, and the element that was left behind is the one we want to dequeue and return to the user. Once we push the head pointer forward, we set the last node to null so that it can be picked up by the garbage collector if you're in Java; and if you're in another programming language which requires you to explicitly deallocate or free memory yourself, like C or C++, now's the time to do that. So you see, as we keep dequeuing, we're just pushing the head forward and forward again. At the very end, if we add a bunch of elements and then remove them all, the head and the tail again point to null, which is where we started. All right, now it's time to have a look at some source code for a queue. I implemented a queue, and you can find the source code at the following link: github.com, slash my username, slash data-structures.
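As a warm-up, here is a toy sketch of the singly linked queue from the previous section (my own names, not the repository class): enqueue at the tail, dequeue at the head, both in constant time:

```java
// Singly linked queue: the tail pointer makes enqueue O(1), and the
// head pointer makes dequeue O(1).
public class LinkedQueue {
    private static class Node { int data; Node next; Node(int d) { data = d; } }
    private Node head, tail;
    private int size;

    public void enqueue(int value) {
        Node node = new Node(value);
        if (tail == null) head = tail = node;   // first element: head and tail coincide
        else { tail.next = node; tail = node; } // push the tail pointer forward
        size++;
    }

    public int dequeue() {
        if (head == null) throw new RuntimeException("Queue empty");
        int data = head.data;
        Node old = head;
        head = head.next;              // push the head pointer forward
        old.next = null;               // drop the reference for the garbage collector
        if (head == null) tail = null; // queue became empty
        size--;
        return data;
    }

    public int size() { return size; }

    public static void main(String[] args) {
        LinkedQueue q = new LinkedQueue();
        q.enqueue(12); q.enqueue(7); q.enqueue(-6);
        System.out.println(q.dequeue()); // prints: 12
        System.out.println(q.dequeue()); // prints: 7
        System.out.println(q.size());    // prints: 1
    }
}
```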
Also make sure you have watched and understood parts one and two from the Q series before
continuing. Alright, here we are looking at some source code for a queue. So
this source
code is in the Java programming language, although you can probably translate it into
any programming language that you need. So the first thing to remark is I have an instance
variable here of a linked list. So this is Java's implementation of a doubly linked list. We also use this in the stack implementation; as you'll see, the queue and the stack implementations are very, very similar. So here I have two constructors: one creates just an empty queue, and another optional one which will create a queue with a first element. In fact, I should probably check if this is null, but we might want to allow null elements, so let's just leave it like that. The next method is size; it just gets the size of the linked list. And similarly, this checks if the linked list is empty. Those are both pretty self-explanatory. Okay, the next interesting method is the peek method. The peek method returns the element that's at the front of the queue, but it will throw an error if your queue is empty, because you can't peek anything when your queue is empty. Similarly for poll: this will return the element at the front of the queue, but unlike peek, it will actually remove
the element from the queue. So next, we scroll down a little bit, I have offer which adds
an element to the back of the queue. I guess I am allowing for null elements, so if you don't want null elements, just put an if statement and throw an error or something. So poll removes from the front, and you can see offer adds to the back. So remove first and add last.
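As a sketch of the implementation being described (class and method names are my own, not necessarily those in the repository), a linked-list-backed queue might look like this:

```java
import java.util.LinkedList;

// A queue backed by Java's doubly linked list, mirroring the walkthrough:
// poll removes from the front, offer adds to the back.
public class SimpleQueue<T> {
    private final LinkedList<T> list = new LinkedList<>();

    public SimpleQueue() {}           // create an empty queue
    public SimpleQueue(T firstElem) { // optionally start with one element
        offer(firstElem);
    }

    public int size() { return list.size(); }
    public boolean isEmpty() { return size() == 0; }

    // Look at the front of the queue without removing anything.
    public T peek() {
        if (isEmpty()) throw new RuntimeException("Queue empty");
        return list.peekFirst();
    }

    // Remove and return the element at the front of the queue.
    public T poll() {
        if (isEmpty()) throw new RuntimeException("Queue empty");
        return list.removeFirst();    // remove first
    }

    // Add an element to the back of the queue.
    public void offer(T elem) {
        list.addLast(elem);           // add last
    }
}
```

This is only a sketch of the idea, but the remove-first/add-last pairing is exactly the behavior described above.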
And the last method here is an iterator, in case you want to be able to iterate through all the elements in your queue. A very short and very simple implementation: in just under 50 lines of code, you can create a queue. Although there are faster ways of creating
queues, especially with arrays. The idea with arrays, especially static arrays, is that if you know the maximum number of elements that will be in your queue at any given time, then you can create an array of that maximum size and keep pointers to the front and back positions in your queue, and add and remove elements based on the relative positions of those pointers. If you ever get to a position where you're running off the edge of your array, then you can loop around to the front of the array and keep processing elements. This is a lot faster than having to maintain references to the next node, such as in a linked list. So I guess I'll say that's your homework: to create a static array-based queue. Today, we're going to talk about everything to do with priority queues, from
where they're used to how they're implemented. And towards the end, we'll also have a look at some source code. Along with all the priority queue stuff, we're also going to talk about heaps, since both topics are closely related, although not the same. So the outline for the priority queue series is: we're going to start with the basics, talking about what priority queues are and why they're useful. Then we'll move on to some common operations we do on priority queues, and also discuss how we can turn min priority queues into max priority queues, followed by some complexity analysis. And we'll talk about common ways to implement priority queues. Most people think heaps are the only way we can implement a priority queue, or that priority queues somehow are heaps; I want to dispel that confusion. Next, we're going to go into some great detail about how to implement a priority queue using a binary heap. There we'll look at methods of sinking and swimming nodes up and down our heap; these terms are used to describe how we shuffle around elements in a binary heap. As part of the implementation explanation, I also go over how to poll and add elements. So there's a lot to cover, so let's get started with a discussion and examples.
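(As a quick callback before the priority queue material: the static array-based queue suggested as homework in the last section might be sketched like this in Java. Names, the fixed capacity, and the error behavior are my own choices, not something from the video.)

```java
// A static array-based ("ring buffer") queue: front/back indices wrap
// around when they run off the edge of the array.
public class IntArrayQueue {
    private final int[] data;
    private int front = 0, back = 0, size = 0;

    public IntArrayQueue(int maxSize) { // maximum elements at any one time
        data = new int[maxSize];
    }

    public boolean isEmpty() { return size == 0; }

    // Add to the back position, looping around to index 0 if needed.
    public void enqueue(int value) {
        if (size == data.length) throw new RuntimeException("Queue full");
        data[back] = value;
        back = (back + 1) % data.length;
        size++;
    }

    // Remove from the front position, also looping around.
    public int dequeue() {
        if (isEmpty()) throw new RuntimeException("Queue empty");
        int value = data[front];
        front = (front + 1) % data.length;
        size--;
        return value;
    }
}
```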
This is going to be part one of five in the priority queue series. So what is a priority queue? A priority queue is an abstract data type that operates similar to a normal queue, except for the fact that every element has a certain priority, so elements with a higher priority come out of the priority queue first. As a side note, I'd like to remark that priority queues only support elements that are comparable, meaning that the data we insert into the priority queue must be ordered in some way, either from least to greatest or greatest to least. This is so we can assign relative priorities between elements. Okay, let's go into an example.
So suppose we have all these values that we inserted into a priority queue on the right, and that we impose an ordering on the numbers such that we want to order them from least to greatest. So the smaller numbers have a higher priority than the bigger ones, and they will be removed from the priority queue first. Suppose we now have a list of instructions. What the poll operation means is: remove the element that has the highest priority in our priority queue. So let's see how this works. If I say poll, then I remove the element with the highest priority, which happened to be one. Now I say add two, so we add two to our priority queue. And poll: the next smallest element in our priority queue happened to be the two we just added. Next we add four; poll (the smallest is three); add five; poll; also add nine, and then poll the rest. So as I poll the rest, I'm continuously grabbing the smallest element in the priority queue. So it turns out that as we added and polled numbers, we got an ordered sequence. This is a coincidence. Actually, as we add and poll numbers from the priority queue, we do not necessarily end up getting an ordered sequence; we are only guaranteed that the next number removed from the priority queue is the smallest number that was currently in the priority queue.
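This behavior matches Java's built-in java.util.PriorityQueue, which is a min priority queue by default:

```java
import java.util.PriorityQueue;

// Java's built-in PriorityQueue is min-ordered: poll() always removes the
// smallest element currently in the queue, regardless of insertion order.
PriorityQueue<Integer> pq = new PriorityQueue<>();
pq.add(5);
pq.add(1);
pq.add(9);
pq.add(3);

System.out.println(pq.poll()); // 1 (smallest so far)
pq.add(2);                     // adding between polls is fine
System.out.println(pq.poll()); // 2
System.out.println(pq.poll()); // 3
```

Note that only the polled sequence is ordered; iterating the queue directly gives no particular order.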
So how does the priority queue know which is the next smallest number to remove? As humans, we can see the numbers visually inside the priority queue and know which one the poll operation is going to return. But fundamentally, how does the machine know this? Does it re-sort all the elements inside the priority queue before each poll operation? No; in fact, that would be highly ineffective. Instead, it uses what is called a heap. So the next big question then is: what is a heap? Usually I make up my own definitions, but I really liked this one from wiki: a heap is a tree-based data structure that satisfies the heap invariant, also called the heap property: if A is a parent node of B, then A is ordered with respect to B, for all nodes A and B in the heap. What this means is that the value of the parent node is always greater than or equal to the value of the child node, for all nodes; or the other way around, the value of the parent node is less than or equal to the value of the child node, for all nodes. This means we end up getting two types of heaps, max heaps and min heaps. Max heaps are the ones where the parent node is always greater than its children, and the min heap is the opposite.
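As a sketch, the min heap invariant can be checked in a few lines using the standard array representation of a binary heap (introduced later in this series, where a node at index i has children at indices 2i+1 and 2i+2); the helper name is my own:

```java
// Returns true if the array satisfies the min heap invariant: every
// parent at index i is <= its children at indices 2i+1 and 2i+2.
public class HeapCheck {
    public static boolean isMinHeap(int[] heap) {
        for (int i = 0; i < heap.length; i++) {
            int left = 2 * i + 1, right = 2 * i + 2;
            if (left < heap.length && heap[i] > heap[left]) return false;
            if (right < heap.length && heap[i] > heap[right]) return false;
        }
        return true;
    }
}
```

Flipping the comparisons gives the corresponding max heap check.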
So both of the following are actually binary heaps, binary because every node has exactly two children. The children I have not drawn in are null values. So why are we interested in heaps? Heaps form the canonical underlying data structure for priority queues. So much so that priority queues are sometimes called heaps, although this isn't technically correct, since the priority queue, again, is an abstract data type, meaning it can be implemented with other data structures also. Okay, we're going to play a little game: I'm
going to give you some structures, and you need to tell me whether each is a heap or not. So inspect the following structure and try to determine whether it's a heap or not; you can pause the video if you'd like, but I'm just going to give you a short moment here. No: we have a violation of the heap invariant in this tree, so it's not a heap. Is this structure a heap? Yes, it is a heap, because it satisfies the heap invariant and it is a tree. We often see heaps like these; they're called binomial heaps. Note that heaps aren't necessarily binary heaps; they can have any number of branches. On to our next one. So is this a valid heap? Yeah, it's a valid heap, because even though this one is strangely structured, we're free to move around the visual representation of the nodes as we please. So yeah, it's a valid heap. How about this one? No, this structure is actually not even a tree, because it contains a cycle. All heaps must be trees. What about this one? Yeah, also a heap. What about this one? Also a heap, because it satisfies the heap invariant that all nodes are less than or equal to, or greater than or equal to, their parent. How about this one? No, it's not a heap, because it does not satisfy the heap invariant. However, if we do change the root to be 10, then we can satisfy the heap invariant via a max heap. So when and where is a priority queue used? Probably one of the most popular places we see priority queues is in Dijkstra's shortest path algorithm, to fetch the next node we explore. Priority
queues are really handy anytime you need a behavior in which you dynamically fetch the next best or the next worst element. They're also used in Huffman coding, which is often used for lossless data compression. Many best-first search algorithms use priority queues in their implementation to continuously grab the next most promising node in the graph as it's being traversed. And finally, we also see priority queues in Prim's minimum spanning tree algorithm on undirected graphs. So it seems priority queues are really important, especially for graph theory algorithms, which is where we see them most often. Okay, on to some
complexity with priority queues implemented as a binary heap. To begin with, there exists a method to construct a binary heap from an unordered array in linear time. We're not going to cover this algorithm, but it's pretty cool, and it forms the basis for the sorting algorithm heapsort. Next is polling, or rather removing, an element from the root of the heap. This takes logarithmic time, because you need to restore the heap invariant, which can take up to logarithmic time. Peeking, or seeing the value at the top of our heap, takes constant time, which is really nice. Adding an element to our heap takes logarithmic time, again because we possibly have to reshuffle the heap by bubbling up the value, as we will see in a later video. Then there are a few more operations we can
do on priority queues. So, removing an element which is not the root element: the naive way of removing an element in a heap is to do a linear scan to find the item's position and then remove it. The problem with this is that it can be extremely slow in some situations, especially if you're removing a lot. But generally you don't do this often, so it's not a problem, which is why most implementations are just lazy and do the linear scan solution. However, there does exist a way to reduce the removal time complexity, which I will go over in great detail later in this video series, so stay tuned for that. This method uses a hash table to reduce the removal time complexity to be logarithmic, which is super critical, actually, if you're removing as much as you are adding. Now, the naive method to check containment within a binary heap is linear; again, you just scan through all the elements. But with the help of a hash table, we can reduce this to constant time, which is super neat; if we use the hash table implementation for the removal speedup, we get this as a free bonus. The downside, however, to using the hash table implementation is that it does require an extra linear space factor, and it does add some constant overhead, because you're accessing your table a lot during swaps. Today, we're
going to talk about turning min priority queues into max priority queues. This is part two of five in the priority queue series. So you may already be asking yourself: why is it important that I know how to convert a min priority queue into a max priority queue? Well, here's the problem. Often, the standard library of most programming languages will only provide you with either a max priority queue or a min priority queue. Usually it's a min priority queue, which sorts elements by the smallest element first, but sometimes we need the max priority queue, depending on what we're programming. So how do we do this? How do we convert one type of priority queue to another type? Well, a hack we can use is to abuse the fact that all elements in a priority queue must implement some sort of comparable interface, which we can simply negate, or invert, to get the other type of heap. Let's look at some examples. Suppose for a moment that we have a priority queue consisting of elements that are on the right side of this screen, and that these are in a min priority queue. So if x and y are
numbers in the priority queue, and x is less than or equal to y, then x will come out of the priority queue before y. The negation of this is x is greater than or equal to y, and so y then comes out before x. Wait a moment, you say: isn't the negation of x less than or equal to y just x greater than y, and not x greater than or equal to y? Well, not for comparators. You see, if x is equal to y, whether or not the comparator is negated, x should still equal y. So now let's see what happens when we poll all these elements out of the priority queue with our negated comparator. So first we will get 13, because it's the greatest. Next comes 11, 7, 5, 3, and
two. An alternative method for numbers is to negate the number before you insert it, and negate it again when it comes out. This is a hack specific to numbers, but it's pretty nice. So let's see how that works. Let's negate all the numbers inside the priority queue. Now we can see that minus 13 is definitely the smallest, so it should come out first, and indeed minus 13 comes out first. But now we have to re-negate it to get back 13. Everything is good so far. Next is minus 11, so really positive 11, and so on: we just keep polling and re-negating each value as it comes out of the priority queue. It's a pretty neat hack. Okay, now let's look at another example, using strings. So suppose Lex is a comparator for strings which sorts strings in lexicographic order; this is the default for most programming languages. Then let's call nLex the negation of Lex, and also let's assign s1 and s2 to be some non-null strings. Below, you can see that our comparator assigns minus one if s1 is less than s2 lexicographically, zero if s1 equals s2 lexicographically, and plus one if s1 is greater than s2 lexicographically. And then nLex is just the negation of
that. So just to break it down: Lex sorts strings lexicographically, but we're interested in negating Lex so that longer strings appear before shorter strings, and also so that strings with letters at the end of the alphabet appear before those containing letters at the beginning of the alphabet. This will in effect turn a min heap into a max heap. Let's look at a concrete example. So by adding all the strings on the right to a priority queue with the lexicographic comparator, here's the ordering we should expect. First we get 'a', because it's the shortest string with characters appearing closest to the start of the alphabet; then comes 'ab', then 'fz', then 'x', then 'xr' and 'xx'. So now let's do the same thing with nLex, and we should hopefully obtain the same sequence in reverse order. And indeed we get 'xx', 'xr', 'x', 'fz', 'ab' and 'a'. So it did exactly what we intended it to do. Today we're going to talk about adding elements
to a binary heap. This is part three of five in the priority queue series. We'll get to adding elements to our binary heap shortly, but first there is some important terminology and there are some concepts we need to go over prior to adding elements effectively to our priority queue. Okay, so a very popular way to implement a priority queue is to use some kind of heap. This is because heaps are the data structure which gives us the best possible time complexity for the operations we need to perform with a priority queue. However, I want to make
this clear, a priority queue is not a heap. A priority queue is an abstract data type
that defines the behavior a priority queue should have. The heap just lets us actually
implement that behavior. As an example, we could use an unordered list to achieve the
same behavior we want for a priority queue. But this would not give us the best possible
time complexity. So concerning heaps, there are many different types of heaps including
binary heaps, Fibonacci heaps, binomial heaps, pairing heaps, and so on. But for simplicity, we're just going to use the binary heap. A binary heap is a binary tree that satisfies the heap invariant; in a binary tree, every node has exactly two children. So the following structure is a binary heap which satisfies the heap property that every parent's value is greater than or equal to its children's, and in which every node has exactly two children. "Well, no," you may be thinking, "the bottom nodes, also known as leaves, don't have children." Well, actually, yes, they do. Here they are: they're the null children, in gray. But for simplicity, I won't draw those, because they're annoying to draw. Okay. The next important bit of terminology
we need to understand is the complete binary tree property. The complete binary tree property means that every level, except possibly the last, is completely filled, and that all the nodes are as far left as possible in the binary tree. As you will see, when we insert nodes, we always insert them at the bottom row, as far left as possible, to meet this complete binary tree property. Maintaining the complete binary tree property is very, very important, because it gives us an insertion point, no matter what the heap looks like or what values are in it. The next node we insert will go into the hollow circle, and the next one will go to the right of it, and so on, until eventually we fill up the row, at which point we need to start a new row. So the insertion point is very important. One last thing before
we actually get into how to add values to a binary heap is that we need to understand how to represent one of these binary heaps. And there is a canonical way of doing this, which is to use an array, actually. Using an array is very convenient, because when we're maintaining this complete tree property, the insertion position is just the last position in our array. However, this isn't the only way we can represent the heap; we can also represent the heap using objects and pointers, and recursively add and remove nodes as needed. But the array construction is very elegant, and also very, very fast. So on the left is the index tree, just to help you visualize the position of each node in the array, and on the right is the actual tree. Remark that as you read elements in the array from left to right, it's as though you're pacing through the heap one layer at a time. So if we're at node nine, which is index zero, we're at the top; for node eight, we're at position one. And as I keep moving along, you can see that we're just pacing through the array, going from left to right. So it's very convenient that way. Now, another interesting property of storing a binary heap in an array is that you can easily access all the children and parent nodes. So suppose
i is the index of a parent node. Then the left child is going to be at index two times i plus one, and the right child of that node is going to be at two i plus two. This is zero-based; if it's one-based, then you would just subtract one. So suppose we have the node seven, which is highlighted in purple. Well, its index is two, so by our formula, the left child of seven should be located at index two times two plus one, or five. If we look at index five, we expect to get one. And if we look at the right child, we should expect it at two times two plus two, or six; if we look in our array, this gives us the value of two. So using this technique, we have all we need to manipulate the nodes in our array. In the source code I will be presenting in part five of the series, we will see that I use this representation for simplicity. All right. So now we want to know: how do
I add nodes to a binary heap and maintain the heap invariant? Because if we add nodes to our binary tree and don't maintain the heap property, well, the binary heap is useless. We'll do some examples. On the left, I have some instructions which tell us what values we need to insert into the heap. The first value is a one, which, as we can see, should actually appear at the root, since we're dealing with a min heap. But instead of inserting one at the root directly, what we do is put one at the bottom left of the tree, in the insertion position I was talking about, and perform what's called bubbling up, as my undergrad prof loved to say. This is sometimes called swimming, or even sifting up; all really cool names for a really neat operation. So we insert one in the insertion position. But now we're in violation of the heap invariant, since one is less than seven, but one is found below
seven. So what do we do? We swap one and seven, like so. But now we're still in violation of the heap property, since one is a child of six, but one is less than six. So what do we do? We swap them. But yet again we violate the property, so we swap, and now one is at the root, where it should be. Now the heap invariant is satisfied, and we can stop swimming, or bubbling up, or whatever you want to call it. So the next value is 13. As usual, we begin by putting it in the insertion position, and now we need to satisfy the heap invariant, so bubble up 13. Notice that we're no longer in violation of the heap property, since 13 is less than 14, and 12 is less than 13, so 13 is actually in its correct position. Sometimes we don't have to bubble up our elements that much.
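A sketch of this insertion-plus-bubble-up on an array-backed min heap, using the zero-based parent formula from earlier (class and method names are my own):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal array-backed min heap supporting add: place the new value at
// the insertion position (end of the list), then "swim" it up while it
// is smaller than its parent.
public class MinHeapAdd {
    public final List<Integer> heap = new ArrayList<>();

    public void add(int value) {
        heap.add(value);                  // bottom row, as far left as possible
        int i = heap.size() - 1;
        while (i > 0) {
            int parent = (i - 1) / 2;     // zero-based parent index
            if (heap.get(i) >= heap.get(parent)) break; // invariant holds, stop
            // Swap child and parent to fix the heap invariant.
            int tmp = heap.get(parent);
            heap.set(parent, heap.get(i));
            heap.set(i, tmp);
            i = parent;                   // keep bubbling up from the parent
        }
    }
}
```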
The next values we need to insert are four, zero and 10. Try seeing where these end up; pause the video if you need to, it's a good little exercise. But I will keep going for now. So four goes in the insertion spot, left of all the nodes there, and we bubble it up until we can't anymore, stopping once the property is satisfied. Next, zero, my favorite number. Of course, I've arranged for zero to be at the top of the tree; as you can see, we're right now in violation of the heap invariant, so let us bubble up, and like magic, zero is at the top of the tree. The next number is 10, so we put it in the insertion position. However, this does not violate the heap invariant, so we do nothing. Today we're going to look at how to remove
elements from a binary heap. This is part four of five in the priority queue series. Make sure you watch the last video so you understand the underlying structure of the binary heap. So let's get started. In general, with heaps we always want to remove the root value, because it's the node of interest: it's the one with the highest priority, the smallest or the largest value. When we remove the root, we call it polling. A special thing about removing the root is that we don't need to search for its index, because in an array implementation it has position, or index, zero. So when I say poll: in red we have the node we want to remove, and in purple is the one we're going to swap it with. The node in purple is always going to be the one at the end of our array, for which we also have its index. So we swap them and get rid of the one. And now, since 10 is at the top, well, we're not satisfying the heap invariant, so we need to make sure that the heap invariant is satisfied. So we need to do what's called bubbling down now, instead of bubbling up.
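This bubbling-down (sinking) step can be sketched on the array representation; the class and method names are my own, but the left-child default on ties matches what's described next:

```java
// "Bubble down" (sink) the value at index i of an array-backed min heap:
// repeatedly swap it with its smallest child until the invariant holds.
// Ties are broken by defaulting to the left child, because the right
// child only wins with a strictly smaller value.
public class HeapOps {
    public static void sink(int[] heap, int i) {
        while (true) {
            int left = 2 * i + 1, right = 2 * i + 2;
            int smallest = i;
            if (left < heap.length && heap[left] < heap[smallest]) smallest = left;
            if (right < heap.length && heap[right] < heap[smallest]) smallest = right;
            if (smallest == i) break; // heap invariant satisfied, stop sinking
            int tmp = heap[i];
            heap[i] = heap[smallest];
            heap[smallest] = tmp;
            i = smallest;
        }
    }
}
```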
So what we do is look at 10's children, five and one, and we select the smallest and swap with it. So 10 goes to where one was. Make sure you default to selecting the left node in case there is a tie. So, as you can see, 10's children are two and two; they're both equal, so we select the left node to break the tie. And now we bubble down 10 again, and now the heap invariant is satisfied. Now we want to remove
the value 12. Polling was removing the element at the root, but 12 is not at the root. However, we still want to be able to remove 12. So what we do is search for 12 in our tree, even though we don't know its position yet. We start at one, and we do a linear scan through all our elements until we find 12. So five is not 12, two is not 12, and so on, until we find 12. Now we've found 12, and we know where its position is, so we can mark it as the node we want to remove, and also swap it with the purple node, being the last node in our tree. Swap them, remove the 12, and now we're in violation of the heap invariant. So now we want to bubble up three, until the heap invariant is satisfied. And now we've satisfied the heap invariant, so we can stop. Now we want to remove three: same thing as last time, search for three in the tree. Three wasn't far; it was just two nodes away. So now, to remove an
element, again swap it with the last node in the tree, and drop it. But now the question is: do we bubble up or bubble down the value? Because you don't really know what the value of the node in the last position is when you're swapping it in. So do we bubble up or bubble down? Well, we already satisfy the heap invariant from above, so we need to bubble down 15. Five was smaller, so we swapped it with five; now eight is smaller, so we swap it with eight. And again, the heap invariant is satisfied. Now we want to poll, so mark the root node red, swap it, remove the one, and
now we want to bubble down, and the heap invariant is satisfied. Now we want to remove six, so search for six in our tree. Okay, we have found six; do the swap and remove six. Now, do we bubble up or bubble down? The answer is neither: the heap invariant is already satisfied, so we don't need to touch our node. We got lucky. So from all this polling and removing, we can conclude the following: polling takes logarithmic time, since we're
removing the root and we already know where to find it; and removing a random node can take up to linear time, since we have to actually find the index of that node before removing it. However, if you're as dissatisfied with this linear removal as I am, you'd figure there has to be a better way. And indeed there is: I'm about to show you a hack to improve this complexity to be logarithmic in the general case. So stay tuned. Okay, so
now we're looking at how to remove nodes from our heap with an improved time complexity. To do this, we'll need to make use of a hash table, a data structure I have not yet covered. So buckle up, things are about to get a little wild. I promise I'll cover the hash table thoroughly in a later video, but right now it's going to look like magic. Back to the central issue: we have a bunch of nodes scattered across our heap at various positions, and instead of doing a linear scan to find out where the node we want to remove is, we just want to be able to do a lookup and figure that out. The way we're going to do this is that every node is going to be mapped to the index it's found at. So when we want to remove a particular node, we just look up its index instead of doing a linear scan. Sounds good, right? That sounds great, except for one caveat: what if the heap has multiple nodes with the same value? What problems will that cause? Well, just a few, but nothing we can't handle. To begin with, let's talk about how we can
deal with the multiple-value problem. Instead of mapping one value to one position, we will map one value to multiple positions, and we can do this by maintaining a set, or tree set, of indices for which a particular node value (or key, if you want) maps to. So here's an example. Observe the blue heap, and remark that it has repeated values: namely, we can see that the two is there three times, seven is there twice, and 11 and 13 once each. Below I have drawn the index tree, a tree which can help us determine the index position of a node in the tree; 11, for example, is at index three, 13 at index five, and the first two at index zero. On the left is the hash table with the key-value pairs. Notice that two is found at three positions, zero, two and six, while seven is found at two positions, one and four, and so on. So this is how we're going to keep track of the positions of the values in the tree. If nodes start moving in the tree, we also need to keep track of that. For example, if a
bubble up or a bubble down occurs, we need to track all those movements, and where the swaps go, so we can update the index positions in our map. If we swap 13 and the last seven, for example, the following should happen. We look at where seven and 13 are in our table; I've marked those respective positions as red for the seven and yellow for the 13. And this is what happens when we do a swap: we do the swap in the tree, but also in the table.
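A sketch of that swap, keeping the array-backed heap and the value-to-indices table in sync. Class and method names are my own, and `add` here just appends without restoring the heap invariant, purely to set up an example:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Heap positions plus a map from each value to the set of indices at
// which that value appears; every swap updates both structures.
public class IndexedSwap {
    public final List<Integer> heap = new ArrayList<>();
    public final Map<Integer, TreeSet<Integer>> indexMap = new HashMap<>();

    // Append a value and record its index (no heapifying, for demo only).
    public void add(int value) {
        heap.add(value);
        indexMap.computeIfAbsent(value, k -> new TreeSet<>())
                .add(heap.size() - 1);
    }

    // Swap nodes i and j in the tree AND in the table.
    public void swap(int i, int j) {
        int vi = heap.get(i), vj = heap.get(j);
        indexMap.get(vi).remove(i);
        indexMap.get(vj).remove(j);
        indexMap.get(vi).add(j);
        indexMap.get(vj).add(i);
        heap.set(i, vj);
        heap.set(j, vi);
    }
}
```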
And that's it. Okay, so this all sounds great: we keep track of repeated values by maintaining a set of the indices at which a particular node value is found. But now let's ask a further question: if we want to remove a repeated node in our heap, which node do we remove, and does it matter? Because if we look in our heap right here, there are three possible twos we can remove; does it matter which one we remove? The answer is no, it does not matter, as long as in the end we satisfy the heap invariant, and that's the most important thing. So let's
look at an example, not just of removing, but also of adding and polling elements with this new scheme I just proposed. It's not that hard, trust me. So first, we want to insert three. We place three at the bottom of the heap, in the insertion position. We also need to track where the new node is, so we add three to our table, along with its position, which happens to be index seven. Look at the index tree and you can confirm this. Now that three has been inserted, we need to make sure the heap invariant is satisfied; currently it is not. So what we do is bubble up three. The parent of three is 11, which is greater than three, so we need to swap those two nodes. I have highlighted index seven, because it maps to the three in the heap, and index three, because it maps to the value 11. Now swap those, both in the tree and in the table. Awesome. Now the heap invariant is still not satisfied, so we do a similar thing for the node above: we grab the parent and do a swap, in the tree and in the table. And now the heap invariant is
satisfied. So three has been inserted successfully. The next instruction is to remove two from the heap. So which two should we remove? Well, as I said, it doesn't matter, as long as we satisfy the heap invariant in the end. If we remove the last two, we can immediately satisfy the heap invariant with one swap. But for learning purposes, I will simply remove the first one found in our set, which happens to be located at index zero. So we want to
remove the two at position zero. So how do we remove a note again, so we did a look up.
So we didn't have to do that linear scan, which was nice. And now we swap it with the
end, the last node, which happens to be 11, we remove the last node. Now we need to satisfy
the heap invariant. So we need to bubble down 11. So we look at 11 children, which happens
to be three and two, two is smaller. So it's the one we're going to swap with. So swap
it in the table and, and the t
ree. Now, the heap invariant is still not satisfied,
so we look at 11's children, 13 and two; two is smaller, so swap it with two. And
that's it, we're done. The last instruction says to poll, so we get the value at the
root, which is two, swap it with 11, get rid of the two, and bubble down 11. So as you can
see, we're constantly updating our table, but still doing the same operations.
This is part five of five in the priority queue series, and we're going to have a look at
some source code for the priority queue today. So if you want the source code, here's the
GitHub link; it has all the data structures in the series, and the priority queue is one of
them. Also make sure you've watched parts one through four, so you can actually understand what's
going on. Alright, let's dive into the code. Alright, here we are inside the source code.
So notice that inside my priority queue, the types of elements I'm allowing have to be
comparable elements, as we talked about. So if they implement the Comparable
interface, then we are going to allow them inside our queue. This is anything like
strings, integers, anything with a Comparable interface. So let's have a look at some of
the instance variables. I have the heap size, which is the number of elements currently
inside the heap. But then I also have another instance variable, which is the heap capacity.
This is going to be the size of the list that we have for elements, which may be larger
than the heap size is. And this is the actual heap, which we're going to be maintaining
as a dynamic list of elements using Java's List. Next, for our fast removals, I'm
also going to keep track of a map which maps an element to a tree set of integers.
These will be all the positions inside our heap at which we can find this element T.
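As a rough sketch, the state just described might be declared like this (my paraphrase; the repository's actual fields may differ):

```java
import java.util.*;

// A trimmed-down view of the indexed priority queue's state: a dynamic
// list acting as the heap, plus a map from each element to every index
// at which it appears inside the heap.
public class PQueueFields<T extends Comparable<T>> {
    private int heapSize = 0;      // number of elements currently in the heap
    private int heapCapacity = 0;  // size of the backing list (may exceed heapSize)
    private List<T> heap;          // the heap, maintained as a dynamic list
    private Map<T, TreeSet<Integer>> map = new HashMap<>();

    public PQueueFields(int capacity) {
        heap = new ArrayList<>(capacity);
        heapCapacity = capacity;
    }

    public int size() { return heapSize; }
    public boolean isEmpty() { return heapSize == 0; }

    public static void main(String[] args) {
        PQueueFields<String> pq = new PQueueFields<>(16);
        System.out.println(pq.isEmpty()); // true
    }
}
```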
All right. Next, I've got a few different constructors for our priority queue. We can
just create a priority queue, and by default I'm creating an initially empty priority
queue with a capacity of one. But I also allow you to create a priority queue with a defined
initial capacity, which is actually more useful, because then we don't have to keep expanding
our dynamic list every time, so I would recommend this. But even better, if you know
all the elements that are going to be inside your heap, you can actually construct the
priority queue in linear time, using an operation called heapify, which I didn't talk about
in the slides but which can be very, very useful. So this constructor just has all the usual
setup stuff. Here, I'm adding all the elements to the map, but also to the heap. And
then here's the heapify process: we start at halfway through the heap size, work
downwards, and sink all the elements. And you're like, wait a second, isn't
a sink a logarithmic operation? Well, yes, it is in the general case, but
not if we're doing it in this fashion. I put a link to a paper up here because
it isn't quite intuitive why the heapify operation has this linear complexity. If you
look at the analysis in the paper, you will see that the complexity boils down
to a convergent series, and that's why we can say it's linear time. In general,
this isn't what you might be tempted to do: given a collection of elements,
you might initialize the heap and then just use the add method to add the elements
one at a time, but that will give you an N log N bound. So definitely use the heapify
whenever possible. Okay, now some pretty simple methods. We have isEmpty, which just returns true
or false depending on whether the heap is empty. Then clear: when we clear the heap, we remove
all the elements inside our heap array, but also inside our map, and that's why I call
map.clear. Size returns the size, which is pretty simple. Peek, the first really useful method,
just looks at the top of our priority queue; if it's empty it returns null, otherwise
we do a lookup at the first index inside our heap and return it, because that's the root
node. Poll is similar to peek, except that we're going
to remove the very first element, and we're also going to return it, because
you want that information. Next, contains: because we have a map with our elements, we
can do a map lookup to see if an element is inside the heap. This reduces our complexity
from linear time, where we would have to do a linear scan through all the elements
to check containment, to constant time, which is remarkable. But heads up:
people don't usually maintain this map. I just wanted to do it, to show you
that it is possible, although the map does add a lot of constant overhead which you may or
may not want. I personally find that the overhead is quite a lot, and I usually don't
remove things very frequently, so it might not be entirely worth it. But it's
up to you: if you're doing as many additions as removals, then it's definitely worth
it. Alright, now let's have a look at the add method. This
method adds an element to the priority queue, and that element cannot be null. So what we
do is first check whether our heap size is less than the capacity; otherwise, we have to expand
the capacity of the heap. Next, we make sure we add the element to the map, so we keep track of
it. And then we swim it up. Remember that we have to swim the node up because we add
it to the very end of our list, and so we have to adjust where it goes inside our
heap by swimming upwards. This next method, called less, is a helper method which helps
me check if node i is less than or equal to node j. It uses the fact that both elements
are comparable, so we can invoke the compareTo method which, if we go
back up, comes from the Comparable interface we needed. So,
let's go back down. Alright, so it returns true if i is less than or equal to j. Awesome.
So next, this is the bottom-up
node swim. We are going to try to swim node k. So first,
we grab the parent of node k; we can do that by solving for the parent. Remember
that I'm starting at zero; some people like to start their heap's index
at one, but I like everything zero-based. So I get the parent, which is at position
(k minus one) divided by two, because we're going upwards. And while k is still greater
than zero, so we're not at the root, and we're less than our parent, we want to swim
upwards. So we swap the node's parent and k, and then k becomes
the new parent, and we get the parent of k once more. We keep
doing this, going up our heap and swapping nodes. So that's how you do the swim. The
sink is similar, but a little different: it's a top-down node sink, and here we want to sink
node k. What we do is grab the left child node, but also the right child node. Remember,
we're working zero-based, so the children are at two k plus one and two k plus two, instead of
two k and two k plus one. Then
I need to figure out which is less: it's going to be either the left one or the right one.
I assume to start with that the left one is going to be smaller than the right
one, and here I correct that hypothesis in case it was false. I check that the right
node is within the size of the heap, and if the right node is less than the left node,
then the smallest one is going to be the right node. Our stopping condition is that we are
outside the bounds of the tree, or we can't sink any more. Then we do a similar thing:
we swap, and k becomes the next smallest, like we did in the last method.
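The swim and sink logic just described might be sketched like this, using a plain int array for brevity (my own illustration; the real class works on a list of Comparable elements):

```java
import java.util.*;

public class MinHeapOps {
    static int[] heap = {1, 5, 2, 8, 6, 9, 3};
    static int heapSize = heap.length;

    static boolean less(int i, int j) { return heap[i] <= heap[j]; }

    static void swap(int i, int j) {
        int tmp = heap[i]; heap[i] = heap[j]; heap[j] = tmp;
    }

    // Bottom-up node swim: zero-based, so the parent of k is (k - 1) / 2.
    static void swim(int k) {
        int parent = (k - 1) / 2;
        while (k > 0 && less(k, parent)) {
            swap(parent, k);
            k = parent;
            parent = (k - 1) / 2;
        }
    }

    // Top-down node sink: children of k sit at 2k + 1 and 2k + 2.
    static void sink(int k) {
        while (true) {
            int left = 2 * k + 1, right = 2 * k + 2;
            int smallest = left; // assume left is smaller, then correct it
            if (right < heapSize && less(right, left)) smallest = right;
            if (left >= heapSize || less(k, smallest)) break; // done sinking
            swap(smallest, k);
            k = smallest;
        }
    }

    public static void main(String[] args) {
        heap[0] = 7;   // break the invariant at the root...
        sink(0);       // ...and restore it by sinking
        System.out.println(Arrays.toString(heap)); // [2, 5, 3, 8, 6, 9, 7]
    }
}
```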
So the swap method: I made an explicit method to swap, because I also have to swap things
inside the map and then set the values. This is really what adds a lot of overhead
for the map: every time we call the swap method, we also have to swap
things inside the map, which can be a lot of overhead. Technically maps
have constant lookup times, but the fact that you're doing all this internal hashing and
collisions and whatnot can get costly. So, remove: if the element is null, return
false, since we don't have any null elements inside our heap. This is how you would
do a linear removal in linear time; I commented it out in case you want to revert and remove
the map and whatnot. You scan through all the elements, and once you find the element
you were looking for, you just remove it at that index and return true. Otherwise, we're going
to use our map: get an index where one of the occurrences of the element is, and if
it exists, remove it at that index. Okay, now let's have a look at the removeAt method.
So this is what I covered in the last video. If our heap is empty, well, we can't really
remove anything. Otherwise, we're going to swap the index i we remove with the last element,
which is going to be at heap size, and then we're going to kill off that node and also
remove it from our map. If it so happened that i was equal to the heap size, meaning
we just removed the very last element in our heap, we just return the removed data.
Otherwise, after the swap, we have to either sink or swim that node, and I'm
too lazy to check which one I need, so I just try both. First
I try sinking, and if nothing happened, then I try swimming. In either
case, return the removed data. Perfect. So this just readjusts where the swapped
node goes, either bubbling up or bubbling down. The next method is just a method I use in
my testing framework to make sure everything is good. It checks, essentially, the integrity
of the minimum heap. Initially you call this method with k equals zero, and that starts
at the root; then we recursively go down the tree and check: are we maintaining
the heap invariant property, which we need? Our base case is that k is outside
the heap size, and if so, we just return true. Otherwise, get our children,
left and right, and make sure that k is less than both of them;
if that's not the case, return false. And if we ever return false, then because we
have an AND statement where we're recursing right here, that gets propagated throughout the recursion,
and this whole method will return false. Otherwise, if everything returns true and hits the base
case, then we know for sure it's a minimum heap. Okay, these last few methods are just
map helper methods: how to add things into the map, how to remove elements
from the map, and so on. What I'm doing here is using a tree set to add and remove
elements, because I know Java's tree set implementation is a balanced binary search tree, so
all operations on tree sets are logarithmic, which is really nice. You guys can have
a look at those in more detail: they just get values, remove values, and lastly do a map
swap, which swaps values in the map rather than the heap. So yes, have a look
at those in more detail, but I've pretty much covered everything about the priority queue.
All right, time to talk about the union find, also sometimes called the disjoint set. This
is my favorite data structure. So let's get started with an outline of the things we'll be
covering about the union find. First, I'll be going over a motivating example, magnets,
just to illustrate how useful the union find can be. Then we'll go over a classic example of
an algorithm which uses the union find, specifically Kruskal's minimum spanning tree algorithm,
which is very elegant, and you'll see why it needs the union find to get the complexity
it has. Then we're going to go into some detail concerning the find and the union operations,
the two core operations the union find uses. And finally, we'll have a look at path compression,
which gives us the really nice amortized constant time the union find provides. Ok, let's dive
into some discussion and examples concerning the union find. So what is the union find?
So, the union find is a data structure that tracks elements which are split into one or more
disjoint sets. The union find has two primary operations: find and union. What
find does is, given an element, tell you what group that element belongs
to; union merges two groups together. So if we have this example with magnets, suppose
all these gray rectangles you see on the screen are magnets, and also suppose that the magnets
have a very high attraction to each other, meaning they want to merge together to form
some sort of configuration. So, if I label all the magnets, give them numbers, and
we start merging the magnets with the highest attraction first, we're going to merge six
and eight together, since they're the closest. So in our union find, we would say union six and
eight, and when we do a lookup to find out which group six and eight belong to,
they will belong to the blue group. Now, perhaps two and three, and three and four, are highly
attracted to each other, so they would form a group, the yellow group.
And perhaps 10, 13 and 14 would also form a group. This keeps on going, and we unify magnets
into groups. And perhaps we merge some magnets onto already existing groups: we unify
a grey magnet, which is just a magnet in its own group, into an already existing
group. But we can also unify groups of magnets which are different colors, and we assign
the result an arbitrary color, say blue; so suddenly, everything in the yellow group went into the
blue group. And now, when we do a lookup in our union find to determine which group,
say, two, three or four is in, we'd say you're in the blue group. The union find does
all this merging and finding in a very efficient manner, which is why it's so handy to have
around. As for exactly how that works, we'll
get into that in a later video; this is just a motivating example. So where are other
places that the union find is used? Well, we'll see the union find again in
Kruskal's minimum spanning tree algorithm, and in another problem called grid percolation,
where there's a bunch of dots on a grid, and we're trying to see if there's a path from
the bottom of the grid to the top of the grid, or vice versa; the union find lets us
do that efficiently by merging together paths. A similar kind of problem shows up in network
connectivity: are two vertices in a graph connected to each other through a series of edges? And
there are also some more advanced examples, like the least common ancestor in a tree,
and image processing. So what kind of complexity can we attribute to the union find?
The complexity is excellent. Its construction is linear time, which isn't bad at
all. Then the union, find, get component, and check if connected operations all happen
in what's called amortized constant time: almost constant time, although not quite
constant time. And finally, count components, where we can determine how many components
there are, as in our magnet example, how many different groups of magnets we have; we can
do that in constant time, which is really, really great. Okay, let's talk about a really
neat application of the union find, which is Kruskal's minimum spanning tree algorithm.
So you might be asking yourself, what is a minimum spanning tree? If we're given some
graph with some vertices and some edges, the minimum spanning tree is a subset of the edges
which connects all the vertices, and does so at a minimal cost. So, if this is our graph,
with some edges and vertices, then a possible minimum spanning tree is the following, and it
has total edge weight 14. Note that the minimum spanning tree is not
necessarily unique; if there is another minimum spanning tree, it will also have a
total weight of 14. So how does the algorithm work? We can break it up into three steps, essentially.
The first step is easy: just take our edges and sort them by ascending edge weight. Next,
we want to walk through the sorted edges and look at the two nodes
that the edge connects. If the nodes already belong to the same group, then we
want to ignore the edge, because it would create a cycle in our minimum spanning tree, which
we don't want. Otherwise, we want to unify the two groups those nodes belong to.
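To make these steps concrete, here is a compact sketch in Java (my own illustration, not code from the video; the embedded union find is a minimal array version of the structure discussed in this series):

```java
import java.util.*;

public class KruskalSketch {
    // A minimal union find: find with path compression, union by size.
    static int[] id, sz;

    static int find(int p) {
        int root = p;
        while (root != id[root]) root = id[root];
        while (p != root) { int next = id[p]; id[p] = root; p = next; }
        return root;
    }

    static boolean union(int p, int q) {
        int rp = find(p), rq = find(q);
        if (rp == rq) return false;           // same group: would form a cycle
        if (sz[rp] < sz[rq]) { int t = rp; rp = rq; rq = t; }
        id[rq] = rp;                          // merge smaller into larger
        sz[rp] += sz[rq];
        return true;
    }

    // Edges given as {u, v, weight}. Returns the MST's total weight.
    static int mstWeight(int n, int[][] edges) {
        id = new int[n]; sz = new int[n];
        for (int i = 0; i < n; i++) { id[i] = i; sz[i] = 1; }
        Arrays.sort(edges, Comparator.comparingInt(e -> e[2])); // step 1: sort
        int total = 0;
        for (int[] e : edges)                  // step 2: walk sorted edges
            if (union(e[0], e[1]))             // step 3: unify groups, skip cycles
                total += e[2];
        return total;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1, 5}, {1, 2, 4}, {0, 2, 1}, {2, 3, 2}};
        System.out.println(mstWeight(4, edges)); // 7
    }
}
```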
We keep doing this process until either we run out of edges, or all the
vertices have been unified into one big group. And you'll soon see what I mean by a group,
which is where our union find data structure is going to come into play. So if this is
our graph, then to run Kruskal's algorithm on it, first let's take the edges and sort
them. On the left side, you see I have all the edges and their edge weights sorted
in ascending order. Next, we're going to start processing the edges one at a time,
starting at the top, so I to J. I've highlighted the edge I to J in orange, and
you can see that it connects nodes I and J. I and J currently don't belong to any group,
so I'm going to unify them together into group orange. Next is edge A to E; A and E don't
belong to any group, so I'm going to unify them together into group purple. Next is C to I.
I belongs to group orange, but C doesn't have a group yet, so C can go into group
orange. All right, E to F: F doesn't belong to a group, so F can go into group purple. Next, H and G: neither H nor G belongs
to a group, so I'm going to say they belong to the red group. And next we have D and B.
They also don't belong to a group, so give them their own group, let's say group green.
Alright, and now, I believe, is when things get interesting. Now, we're trying to connect
C to J. But notice that C and J both belong to group orange, so we don't want to
include that edge, because it's going to create a cycle; ignore it. And to check that they
belong to the same group, we would use the find operation in our union find to check
what group each belongs to. This is where the union find really comes into play. Next is edge
D to E. Note that E belongs to group purple, and D belongs to group green. So now we want
to merge them together, because they don't belong to the same group. Either the purple
group becomes the green group, or the green group becomes the purple group; it
doesn't really matter, as long as we merge them. And this is where the union operation in
our union find becomes useful. It allows us to merge groups of colors together very
efficiently, and that's the important note. The next edge would be D to H. H belongs to group
red, and D to group purple. So merge the groups together; let's say they both become group purple. Next
up, we want to add edge A to D, but A and D already belong to the same group. So that would create
a cycle, and we don't want to include that edge; skip it. The next round includes edge B to C.
B and C belong to different groups, so merge the two groups into one larger group. And with that, we have
found the minimum spanning tree using Kruskal's minimum spanning tree algorithm. Pretty neat,
right? The underlying data structure which allows us to do this is the union find. It
allows us to merge groups of colors together efficiently, but also to find out which group
a node belongs to, so that we don't create a cycle. So that's Kruskal's algorithm; it's
super simple, given that you know how the union find works. I'm going to go into
some detail in the next video, explaining how the find and the union operations work internally,
and how we can actually implement them in a useful way. Okay, so now we're going to talk
about the union and find operations we can do on the union find, or the disjoint set.
This is the video where I demystify how those actually work internally. So to create our
union find, the first thing we're going to do is construct a bijection,
or simply a mapping, between our objects and the integers in the range zero inclusive to
n non-inclusive, assuming we have n elements. This step in general is actually not necessary,
but it's going to allow us to create an array-based union find, which is very efficient
and also very easy to work with. So if we have some random objects, and we want to assign
a mapping to them, we can do so arbitrarily, as long as each element maps to exactly one
number. That is my random bijection, and we want to store these mappings, perhaps
in a hash table, so we can do a lookup on them and determine what everything is mapped
to. Next, we're going to construct an array, and each index is going to have an associated
object; this is possible through our mapping. So for instance, in the last slide, A was
mapped to five, so slot five, or index five, is going to be A's slot. What you see in this picture at the
top is the array we just created, which contains our mapping, and in the center is just a visual
representation of what's going on. The value in the array for each position is currently
the index it is at. This is because originally every node is a root node, meaning
it maps to itself. But as we perform the instructions on the left, unifying groups together, or
objects together into groups, we're going to find that we're going to change the values
in our array to map to other letters. And specifically, the way we're going to do it
is: for some index i in our array, index i's parent is going to be whatever value is at
position i. So for instance, if we want to unify C and K, we look at C and K, and
we discover that C has a mapping of four, and K of nine. So either C becomes K's
parent, or K becomes C's parent, and I chose that K's parent is going to be
C. So now at index nine, which is K's position, I'm going to put a four, because
I know that C is a four. Next, F and E do a similar type of thing:
I'm going to say that F's parent is going to be E. So at F's position, which
is one, I'm going to put a zero, because E is zero. Similar thing for A and J. But here's
where things get a bit more interesting. Now we want to unify A and B. If I look
at A, A has a six in its cell, and six is J, so I know that A's root node for group green is
J, and B is a single node, because it's a self loop. And in general, I'm going to merge smaller
components into the larger ones. So now B is going to point to J, because the green group's root
node was J. Next I merge C and D. So I find the root node of D, which is D, and
the root node of C, which is C, and I'm going to merge the smaller component into the
larger component, which is the orange group. Now D is going to be part of the orange group.
Now I want to merge D and I, so a similar thing happens, and I now points to C. Next, I want
to merge L and F. F's parent is E, so I'm going to merge L into the red group.
Now I want to merge C and A; this is an interesting example. I find C's root
node,
which happens to be C, and A's root node, which is J. Now, component orange has
four elements, but component green only has three, so I'm going to merge the green component
into the orange component. So now J is going to point to C. Next I want to unify A and B,
so I do a find: if I go up and follow the parent nodes until I reach a root node, A's parent
is J and J's parent is C, so I know that A belongs to the orange group. And if I do a
similar thing with B, I also discover that B's root is also C, which is the orange
group. So I don't need to unify them; they're already unified together. Next, H and G: they
currently don't have a group, so I'm going to arbitrarily merge them into a new group;
it'll be the blue group. Next, H and F. If I look, H's parent is G, and F's parent
is E. The red component is larger, so I'm going
to merge the blue group into it. And since G was the blue group's root node, I make it point to E,
which was the red group's root node. Now I want to merge H and B. H's root node is E, if
we follow up the chain from H to G to E, and B's root node is C, because we go from B to
J to C. So now, since the orange component is larger than the red component, we're going
to point the root of the red component to the root of the orange component. So E now
points to C. Okay, and note that in this example I'm not using a technique called
path compression. This is something we're going to look at in the next video; it's
an optimization on the union find. To summarize: if we want to find out which component a particular
element maps to, what we have to do is find the root of that component by following
the parent nodes until we reach a self loop, a node whose parent is itself, and that
root uniquely determines which component the element belongs to. And to unify two components
together, what we do is find the root nodes of each component, and then, if the root nodes
are different (because if they're the same, they belong to the same component already),
we make one of the root nodes become the parent of the
other root. So, just some remarks about this union find data structure. In general,
we don't really un-union elements, just because this would be inefficient: we'd
have to update the other children which point to that node, and we don't have access to those.
We could probably keep track of that in theory; I just don't see any application
currently, though there might be one. Also remark that the number of components in our union
find is going to be equal to the number of root nodes remaining, because each root node
is responsible for a component. And also remark that the number of root nodes never increases;
it always decreases, because all we do is unify components, so components only get bigger
and bigger and bigger. Now we want to talk about the complexity of the union find. So
I said in the first video that the union find has an amortized time complexity. However,
the implementation I just showed you does not have an amortized time complexity. Not yet,
not without path compression, which is something we're going to look at in the next
video, and which is what makes the union find an absolute beast of a data structure. You must
watch the next video. But just as an example, if we need to check whether H and B belong to
the same group or a different group, it's going to take five hops in the worst case here,
and potentially much more in general. From H, we find the root node, which is C, and then we
go from B and find that the root node is also C. So this takes quite a few hops. Let's
talk about path compression now. This operation is really what makes the union find one of
the most remarkable data structures, because it's how the union find gets to boast its
efficiency. So let's dive right in. But before we get started, it's critical that you watch
the last video, so that you understand how the find and the union operations work. Otherwise,
you won't understand what's going on, what's up with path compression, and how we're
going to get that really nice amortized constant time. Alright, suppose we have this hypothetical
union find. I say hypothetical because, with path compression, I'm almost certain it's
impossible to end up with a structure that looks like this; nonetheless,
it's a good example. So now suppose we want to unify nodes E and L, or really, unify groups
orange and blue, but we have access to E and L, and those are the nodes we're calling the unify
operation on. Well, we would have two pointers that start on E and L, and what we would
want to do is find the root node of E and the root node of L, and then get one
of them to point to the other. With path compression, we do that, but we're also going
to do something else. So let's start by finding the root node of E. E's parent is D,
D's is C, C's is B, B's is A, and then A's is F. So we found the root node of E. But with
path compression, here's what we're going to do: now that we have a reference to the
root node, we're going to make E point to the root node. And similarly, D is also going to point
to the root node, and C, and B, and A. So now everything along the path got compressed,
and now points to that root node. And in doing so, every time we do a lookup on either
A, B, C, D, or E, in constant time we will be able to find out what the
root node is for that component. Because we immediately point to it, we don't have to
traverse a sequence of other nodes. And
we can do this because in a union find, we're
always unifying things together and making the structure more and more compressed; we're never
un-unifying things, if you will. So if we do the same thing for L, we find L's root:
we traverse up all the parents until we find the root, and then we compress the path,
so J points to G, I points to G, H points to G. And so we compress that path. But we
have also found both roots now, so make one point to the other, and we've unified both
groups. So now the group that once held E and the one that once held L have been merged into
done this, and now it's so much more efficient. Now let's have a look at another example.
And this one, I want to compare and contrast, the regular union find operations where we're
doing the last video to the last compression version, we now know. So if I run all those
union instructions, this is what happened. So it's the beginning all t
hese pairs of components,
and then executing the instructions on the right. And this is the final state of our
union find and know that if I'm trying to determine what groups say a and j or n, then
I have to traverse a whole bunch of different nodes. So j goes, I use h h goes to eat. But
now if I include path compression, let's see what happens. So I still have all those components.
But now I as I execute the instruction on the right hand side, this is what happens.
So I get the green group. But
then, because of path compression, that J merged into
H, so already that path is a little shorter. And then I keep executing more instructions,
and as I'm doing it, the path gets compressed dynamically, so I'm getting more and more
compression, even up to the very end. Compared with the last example, we haven't even finished
all the instructions, and we have already reached the final state. With path compression,
as long as there's something to compress along our path, we get to compress the path along
the way, pointing it to the root. So right now we only have one root, E, and almost everything points
directly to E. We can do a lookup on any of our nodes in constant time and know that the root is E, so we
know the component. The structure eventually becomes very stable, because of
this path compression. This is why the union find with path compression is so efficient.
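As a sketch, a union find along the lines described here (path compression inside find, merging smaller components into larger ones) might look like this; the names are mine, not necessarily those of the repository:

```java
public class UnionFindSketch {
    private int[] id;    // id[i] is the parent of i; a self loop marks a root
    private int[] size;  // size of the component rooted at i
    private int numComponents;

    public UnionFindSketch(int n) {
        if (n <= 0) throw new IllegalArgumentException("n <= 0 not allowed");
        id = new int[n];
        size = new int[n];
        numComponents = n;
        for (int i = 0; i < n; i++) { id[i] = i; size[i] = 1; }
    }

    // Find the root of p, compressing the path along the way.
    public int find(int p) {
        int root = p;
        while (root != id[root]) root = id[root];
        // Path compression: point everything on the path straight at the root.
        while (p != root) { int next = id[p]; id[p] = root; p = next; }
        return root;
    }

    public boolean connected(int p, int q) { return find(p) == find(q); }

    public int components() { return numComponents; }

    // Merge the smaller component into the larger one.
    public void unify(int p, int q) {
        int rootP = find(p), rootQ = find(q);
        if (rootP == rootQ) return;  // already in the same group
        if (size[rootP] < size[rootQ]) { int t = rootP; rootP = rootQ; rootQ = t; }
        id[rootQ] = rootP;
        size[rootP] += size[rootQ];
        numComponents--;
    }

    public static void main(String[] args) {
        UnionFindSketch uf = new UnionFindSketch(5);
        uf.unify(0, 1);
        uf.unify(1, 2);
        System.out.println(uf.connected(0, 2)); // true
        System.out.println(uf.components());    // 3
    }
}
```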
Let's have a look at some of the union find source code. Here's the link to the source
code; you can find it on my GitHub repository, github.com/williamfiset/data-structures,
where I also have a bunch of other data structures from past videos. And before we
dive into the code, make sure you watch the other videos pertaining to the union find,
as I will be making some references to them. Okay, let's dig in. Here we are inside the
code. I have a class called UnionFind, and you'll see a few instance variables, so let's
go over them. The first is size, just how many elements we have in our
union find. There
are at least two arrays, one called ID and one called size. So the interest, while the
more interesting was ID, and idea eyes, that array I talked about which at index i points
to the parent node of AI. And if Id add AI is equal to AI, then we know that AI is a
root node. So we're essentially keeping track of all these like tree like structures right
inside an array, which is practical. And also because we create a by ejection between our
elements. And some numbers, this is how
we're able to access them through this ID array.
And just for convenience, I track the number of components; that's sometimes some useful information you might want. So if you create a union find, well, you need to know how many elements are going to be inside your union find, and I make sure that we have a positive number of elements; otherwise I throw an exception. Then I go ahead and initialize some instance variables, and I initialize id[i] to equal i. So initially, everyone is a root node, and every component has a size of one. So find is pretty simple: given a node, it finds which component it belongs to, and also does path compression along the way. If we're at p and want to find the root node of p, we find the root using this one while loop. We initialize a new variable called root to p, and then we loop until root is equal to id[root], aka until we've found a self loop, meaning a root node, and so we can stop; the root is stored in the variable called root. And then what we do is the path compression. This is what I talked about in the last video: starting back at p, we assign everything along the path to equal the root, and this is what compresses the path and gives us that nice amortized time complexity. You can also do this recursively, but I don't like having the overhead, and doing things iteratively can be slightly faster. Okay, so now we have these simple methods we can call. If p and q are inside the same component, connected will return true, because the root of p and the root of q will be equal; otherwise it will return false. And just calling find will do path compression, so even if we're just checking whether two components are connected, it calls the find method and we're compressing the path. Same thing here: if we decide to find the component size relative to some index p, then when we index into the size array we call find, and we'll find the root, but at the same time we'll also do path compression, which is nice. And I would just like to note that the root nodes are the ones that will always contain the component size, because they're the ones that get found; you can think of them as being at the end of the chain. size just returns the number of elements in our union find, and components, the number of disjoint set components, is pretty self explanatory. And the unify method is the last interesting method. So this is the one that unifies two components together.
So the first thing we do is find what the root node for p is and what the root node for q is. And if the root nodes are equal, then they're already in the same group, and we don't do anything. Otherwise, by convention, I just merge the smaller group into the larger group. I know some people like to keep track of the rank, or depth, of a component and merge according to that instead, and that may be slightly more efficient, but it's more work, so I just like to merge the smaller group into the larger group. And also, since the roots are different and we're merging, we know that the number of components, or sets, must have decreased by one. So that's why at the bottom here I take the number of components and subtract one, because we know we're going to have one less component. So this whole time, inside this class, I've been treating p and q as integers, and not as elements like the letters we saw in the slides. And this is because of the bijection: I would do a lookup to find what integer maps to the element p, and what integer maps to the element q, and then I could use those with this union find data structure we created. That turns everything into the realm of numbers, instead of dealing with objects and having all that extra complexity; it's much easier to have an array based union find. You could also have a pointer based union find with node objects, but this is really nice, and it's a really clean implementation. All right, I want to start talking about a very exciting topic, and that is trees. And you'll soon start to realize that there are tons and tons and tons of tree data structures. But today I want to focus on a very popular one, and that is binary trees. And with binary trees, we must talk about binary search trees as well. So today's tutorial is going to focus on binary trees and binary search trees, where they're used and what they are. Then in later tutorials, we're going to cover how to insert nodes into binary search trees, remove nodes, and also do some of the more popular tree traversals, which can apply to other general trees also, not just binary trees. Okay.
So I'm going to give you guys a quick crash course on trees before we get started. We say a tree is an undirected graph which satisfies any of the following definitions; there are actually multiple definitions, but these are the most popular ones. First, a tree is an undirected graph which is connected and acyclic; acyclic means there are no cycles. Another definition: we have n nodes, but n minus one edges. And lastly, for any two vertices, there's only one path between those two vertices. If you had two different paths, that would imply we don't have a tree, because there's another route to get to another node, and so we'd have a cycle. Okay. In the context of trees, we can have something called a rooted tree, where the root is the topmost node of our tree; you can think of it that way. And if you're trying to root the tree and you don't have a root yet, it doesn't matter which node you pick to root the tree, because any node you pick can become the root node. Think of it as picking up the tree by that node, and suddenly it's the new root.
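As a quick aside, the "n nodes, n minus one edges" definition is easy to check in code. This is my own sketch, assuming the graph arrives as an edge list over nodes 0..n-1:

```python
from collections import deque

def is_tree(n, edges):
    """Check whether an undirected graph with nodes 0..n-1 is a tree,
    using the 'n - 1 edges and connected' definition."""
    if len(edges) != n - 1:
        return False
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Breadth first search from node 0 to test connectivity.
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n  # connected iff every node was reached
```

With exactly n - 1 edges, connectivity already rules out cycles, so the other definitions follow automatically.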
We also have the concepts of child and parent nodes. A child node is a node that extends from another node; think of it as going down. A parent node is the inverse of this, so you're going up towards the root. So we have an interesting question, and that is: what is the parent of the root node? The answer is that the root node has no parent, although sometimes it may be useful to say that the parent of the root node is itself. And we see this often when programming; for instance, a file system tree has this property. So if I open up the command line, I'm in some directory, so I can say pwd to print the current directory; I'm somewhere in the file system tree. And if I want to go up to my parent, I can say cd ../, and now I'm up in another directory, the parent directory. And I can keep doing this, going up and up in the file system tree. But if I go directly to the root, which is slash, I see that I'm in the folder slash, the very top, the root of the directory tree. And if I say cd .., then, if I check where I am, I'm again at the root. So in this context, for a file system, the parent of the root is the root. Pretty cool. So just as an example: if we pick the node zero, it has two children, three and two, and a parent, four. We also have the concept of something called a leaf node; this is a node which has no children, and these have been highlighted in orange. They're just at the very bottom of your tree; think of it that way. And there's also another term, subtree, which is a tree entirely contained within another tree. We usually use triangles to denote subtrees. It's possible for a subtree to consist of only a single node, so that's fine.
So if this is our tree, with the circled node being the root, and we pick a particular subtree and look at what's inside of it, then we get another tree, which contains nodes and more subtrees. Then we pick another subtree, look at what's inside of it, and we get another tree. And eventually, we're going to hit the bottom. So now we can pose the question: what is a binary tree? And this one is simple: it's a tree in which every node has at most two children. So both the trees below are binary trees, because every node has at most two children. You can see that in the tree on the right, eight has one child, and that's fine, because the criterion is at most two. Now we're going to play a game. I'm going to give you various structures, and you're going to have to guess whether each is a binary tree or not. So, is this a binary tree? Yes, it is: every node has at most two children. How about this one? No; you can see that seven has three children, so it's not a binary tree. How about this one? I'll let you guys think about it a bit more. Yes, this is a binary tree. It may be a degenerate one, but it's still a binary tree. Alright, let's move on to binary search trees. So what is a binary search tree? Well, first of all, it's a binary tree. But furthermore, it also satisfies what's called the binary search tree invariant, and that is that the left subtree has smaller elements than the value of the current node, and the right subtree has larger elements than that of the current node.
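The invariant can be checked mechanically by passing bounds down the tree. A sketch with a minimal Node class of my own (this version treats duplicates as invalid):

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def is_bst(node, low=None, high=None):
    """True if the tree satisfies the BST invariant: every node lies
    strictly between the bounds inherited from its ancestors."""
    if node is None:
        return True
    if low is not None and node.value <= low:
        return False
    if high is not None and node.value >= high:
        return False
    return (is_bst(node.left, low, node.value) and
            is_bst(node.right, node.value, high))
```

Carrying the bounds down is what catches nodes that are fine relative to their parent but wrong relative to a grandparent, which is exactly the tricky case in the upcoming examples.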
So below are a few binary search trees. We're going to play the same type of game: I'm going to give you some trees, and you're going to have to guess whether they're binary search trees or not. What about this structure? For this one, we could say it depends on whether you want to allow duplicate values inside your tree. It so happens that binary search tree operations can allow for duplicate values; there's nothing wrong with that. But most of the time, we're only interested in having unique values inside our tree. So whether this particular tree counts depends on what your definition of a binary search tree is. How about this tree? Yes, this is a binary search tree. How about this tree? Notice I've only changed the elements within the tree. Yes, this is also a binary search tree; we're not limited to only having numbers within our binary search tree, we can place any type of data which is comparable and can be ordered. How about this tree? No, this is not a binary search tree, and the reason is that the node nine is in the wrong position. When we were inserting nine, we would have had to place it in the right subtree of eight, because nine is larger than eight, so it belongs in eight's right subtree. How about this structure? This structure isn't even a tree, actually, because it contains a cycle, and a requirement for being a binary search tree is that you must be a tree. And this last one, I'm going to give you a bit more time to look at, because it's not a regular type of structure you might think of as a binary search tree. And the answer is yes, this structure does satisfy the binary search tree invariant: every right subtree contains larger elements, and you'll see that that is true, and every left subtree contains smaller elements. It doesn't look like a tree, but it satisfies our definitions of a tree and a binary search tree. Okay, so we've been talking about binary
trees and binary search trees; where are they used? Why are they useful? In particular, binary search trees are used in many implementations of abstract data types for sets, and maps, and so on. They're also used to implement balanced binary search trees, which we'll probably get to in a later video. You also see binary trees in binary heaps; we've already seen this in priority queues, when we were making a binary heap. But we also see binary trees in things like syntax trees: if you're parsing an arithmetic expression, you place it in an abstract syntax tree, and then you can simplify the expression. Compilers and calculators use this to evaluate expressions, so whatever you punch into your calculator gets parsed into a binary tree and evaluated. And lastly, I just threw in the treap, which is another type of probabilistic data structure. So now let's look at the complexity of these binary search trees, because they look very interesting and also very useful, and indeed they are. In the average case, when you're just given some random data, the time complexity is going to be logarithmic, which is really good. So if you're inserting nodes, deleting nodes, searching for a value in the tree, whatever, the average time complexity is going to be logarithmic, and these binary search trees are very easy to implement, so this is really good. However, in the worst case, if your tree degenerates to being a line, then we can have some linear behavior going on, which is really bad. So there are some trade-offs: if you use just a plain binary search tree, it's going to be easy to implement, and on average it's going to have this logarithmic behavior, but in the worst case you might be looking at some linear behavior, which is not so good. Okay, let's have
a look at how to insert some elements into a binary search tree. So let's dive right in. First, to add elements to our binary search tree, we need to make sure that the elements we're adding are actually comparable, meaning that we can order them in some way inside the tree; meaning at every step, we know whether we need to place the element in the left subtree or the right subtree. And we're going to encounter essentially four cases. When inserting an element, we want to compare its value to the value of the current node we're considering, to do one of the following things. Either we're going to recurse down the left subtree, because our element is smaller than the current element; or we're going to recurse down the right subtree, because our element is greater than the current element. Or it might be the case that the current element has the same value as the one we're considering, and so we need to handle duplicate values, depending on whether we're deciding to add duplicate values to the tree or just ignoring them. And lastly, we have the case that we've hit a null node, in which case it's time to create a new node and insert it in our tree. Let's look at some animation now. So on the left, I have a bunch of insert instructions: all these values we want to insert into our binary search tree. And currently the binary search tree is empty. First, we want to insert seven, so seven becomes the root of the tree, because it's the first node. Next, we want to insert 20. 20 is greater than seven, so insert it to the right. Next we want to insert five. We always start at the root when we're inserting, and that's an important point: you start at the root, and then you move your way down the tree to figure out where you want to insert the node. So we start at the root, and then we're like, oh, five is less than seven, so we're going to insert it to the left. Now 15: from the root, go to the right, because 15 is greater than seven, then to the left, because 15 is less than 20. Then add 10. Now four: four is less than seven, move left; four is less than five, move left; create the new node. Now we have four again, so let's see what happens here. From seven we move to the left, and move to the left again. Now we've encountered a value that already exists in our tree. So as I said before, if your tree supports duplicate values, now's the time to add another node, and you would pick by convention whether you want it on the left or on the right. Otherwise, you'd do nothing, and I'm going to choose to do nothing. Insert 33: start at the root, go to the right, because 33 is greater, then go to the right again. Now insert two: two is smaller than everything in our tree, so it goes all the way to the left. Now try and see where 25 would go. So 25: simply go to the right, then we're going to go to the right again, because it's greater than 20, but we're going to go left now, because it's less than 33. And finally six: once left, once right,
and we've inserted everything into our binary search tree. So on average, the insertion time is going to be logarithmic, but in the worst case, this behavior could degrade to linear time. Let's have a look at that. Suppose our instructions are the following: insert 1, 2, 3, 4, 5 and 6. So we start at one, then go and insert two; it goes to the right. Okay, now let's insert three: well, three is greater than everything, so I have to place it to the right again. How about four? Okay, four is also greater than everything. Oh, looks like we're creating this line now. Five, same story; and now six, still greater than everything. So as you can see, this type of linear behavior is really bad, and we don't want to create lines like this. Because if we want to query where six is in the tree, or if we want to remove five, or do anything in our binary search tree, first we have to find the node, and that's going to take linear time. And this is bad. It's one of the reasons why people have invented things like balanced binary search trees, or self-balancing trees, which balance themselves as you insert nodes; we'll get to those later. But that's it for insertion. It's really simple.
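The four insertion cases can be sketched recursively. The Node class and the exact insert sequence here are my own reconstruction of the animation, duplicates ignored as in the video:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(node, value):
    """Insert value into the subtree rooted at node; returns the new root."""
    if node is None:               # hit a null node: create the node here
        return Node(value)
    if value < node.value:         # smaller: recurse down the left subtree
        node.left = insert(node.left, value)
    elif value > node.value:       # greater: recurse down the right subtree
        node.right = insert(node.right, value)
    # equal: a duplicate, and by convention we do nothing
    return node

root = None
for v in [7, 20, 5, 15, 10, 4, 4, 33, 2, 25, 6]:
    root = insert(root, v)
```

Feeding it an already sorted sequence like 1, 2, 3, 4, 5, 6 produces exactly the degenerate line described above, with each new node hanging off the right.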
Don't overcomplicate it. Alright, now that we know how to insert elements into a binary search tree, we might also want to remove elements from a binary search tree. And this is slightly more complicated, but I'm going to make it very simple for you guys. When we're removing elements from our binary search tree, you can think of it as a two step process. First, we have to find the element we wish to remove within the binary search tree, if it exists at all. And in the second stage, we want to replace the node we're removing with its successor, if one exists, in order to maintain the binary search tree invariant. Let me remind you what the binary search tree invariant is: the left subtree has smaller elements than the current node, and the right subtree has larger elements than the current node. Okay, so let's dive into phase one, the find phase. If we're searching for an element inside our binary search tree, one of four things is going to happen. The first thing is we hit a null node, meaning we've gone all the way down our binary search tree and have not found the value we want, so the value does not exist inside our binary search tree. Another thing that can happen is that the comparator value is equal to zero. And what I mean by a comparator is a function that will return minus one if our value is less than the current value, zero if it's equal to it, or plus one if it's greater than the current value. So it tells us, essentially, which subtree we need to go down, or whether we've found the value we're looking for. If it's equal to zero, then we found the value. If it's less than zero, then the value, if it exists, is going to be in the left subtree. And if the comparator returns a value greater than zero, then the value, if it exists, is going to be in the right subtree.
Okay, let's have an example with some animation. Suppose we have four queries: find 14, 25, 37, and 17; and that's our binary search tree there on the right. If we're trying to find 14, as usual, we need to start at the root. 14 is less than 20, so go left; 14 is greater than 10, so go right; 14 is less than 15, go left; 14 is greater than 12, so go right again. And now we have found the value we were looking for. Alright, moving on to 25: 25 is greater than 20, go right; 25 is less than 31, go left; and now we've found 25. See if you can do 37 on your own; pause the video or something. Okay, here's how it goes: go right, go right, go left, go right. Alright, now let's have a look at 17. So for 17 we should go left, then we should go right, then maybe we should go right again. And, oh, we've hit a point where we know that 17 does not exist in our tree. So that's another possibility: the value simply doesn't exist, because we would go to the left of 19, and the left of 19 is a null node.
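The find phase, with its outcomes, looks like this as a sketch; the Node class and the example tree are my own stand-ins for the one on the slide:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def contains(node, value):
    """Return True if value is in the BST rooted at node."""
    if node is None:            # hit a null node: the value doesn't exist
        return False
    if value < node.value:      # comparator < 0: search the left subtree
        return contains(node.left, value)
    if value > node.value:      # comparator > 0: search the right subtree
        return contains(node.right, value)
    return True                 # comparator == 0: found it

# A small BST with 20 at the root, loosely following the example.
tree = Node(20,
            Node(10, None, Node(15, Node(12, None, Node(14)), None)),
            Node(31, Node(25), None))
```

Searching for 14 walks left, right, left, right and succeeds; searching for 17 falls off the tree at a null node and reports that the value isn't there.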
And a null node means we haven't found the value we're looking for. So once we've found the value we're looking for, we're at the remove phase. And in the remove phase, there are four cases, essentially. The first case is that the node we want to remove is a leaf node, so it has no subtrees. Cases two and three, as I like to call them, are that there's only one subtree: there's a right subtree but no left subtree, or there's a left subtree and no right subtree. And finally, case four is that we have both a left and a right subtree. So I'm going to show you guys how to handle each of these cases; it's really, really not that hard. Case one: we have a leaf node. If you have a leaf node and you want to remove it, then you can do so without any side effects, which is really nice. Suppose we have the binary search tree on the right, and we want to remove eight. We do phase one: we find where eight is. Oh, and it is a case one, because it's a leaf node, so we know we can remove it without side effects. So we remove it. Perfect.
That wasn't hard. Now let's look at cases two and three, meaning that there's only one child subtree, either on the left or on the right. In this case, the successor of the node we're trying to remove is going to be the root node of that left or right subtree. Let's look at an example. Suppose we want to remove nine. So first, let's find nine. Okay, we found nine. Now we encounter the case with only a left subtree, since nine doesn't have a right subtree, so the successor is going to be the root node of that left subtree: seven. So now I can remove nine, and seven becomes the successor of nine. Perfect. Now let's do another example, where we want to remove four. First, let's find four. So we find four, and this is our other case, where we only have a left subtree but no right subtree. So what do we do? The successor of four is going to be the root node of that left subtree, so three; we remove four, and three becomes the successor. Alright, that wasn't so bad, was it? Alright, so now for case four: we want to remove a node which has both a left subtree and a right subtree. Now the question is: in which subtree will the successor of the node we're trying to remove be? And the answer is both: the successor can either be the largest value in the left subtree or the smallest
value in the right subtree. And here's perhaps some justification for why there can be two successors. The largest value we can find in the left subtree would satisfy the binary search tree invariant if it becomes the successor, because it would be larger than everything in the left subtree, which is immediate from being the largest in the left subtree, and also smaller than everything in the right subtree, because we found it in the left subtree. Similarly, if we find the smallest value in the right subtree, it would also satisfy the binary search tree invariant, for the following reason: it would be smaller than everything in the right subtree, which is immediate from being the smallest in the right subtree, and also larger than everything in the left subtree, because it was found in the right subtree, and we know things in the right subtree are greater than everything in the left subtree. So we come to the conclusion that there must be two possible successors. So we can choose
either one. Fantastic. So if we have this tree, and we want to remove seven: well, seven is a case four, because it has both a left subtree and a right subtree. So find seven; alright, we found it. And now we get to pick the successor from either the left subtree or the right subtree. I'm going to find the smallest value in the right subtree. What we do is go into the right subtree, and then dig as far left as possible: go left, and go left again. Now we don't have a left child, so we can stop, and this node is going to be the successor; it is the smallest node in the right subtree. And you can see that quite clearly: 11 is smaller than everything in that right subtree. So now what we want to do is copy the value from the node we found in the right subtree, 11, into the node we originally wanted to remove, which is seven. So now I replace seven with 11. But now we have a problem, which is that we have 11 twice in our tree, and we want to remove the highlighted 11, because it is the successor and it should no longer be inside the tree. Luckily for us, removing the successor node is always going to be either a case one, two, or three removal. In this case, it's the case where there's a right subtree but no left subtree, and you would just do this recursively: the remove method calls itself, and it hits one of those three cases. Okay, so our successor has a right subtree; what we want to do is replace it with the root node of that right subtree and then remove it. So remove 11; 14 becomes the successor, and the tree rearranges itself like that. Alright,
now let's have a look at another case. In this example, let's remove 14. First, let's find 14, if it exists within our tree: go right. Alright, we found 14. Now we either want to find the smallest value in the right subtree, like we did last time, or the largest value in the left subtree. Let's do the latter this time and find the largest value in the left subtree. To do this, we go into the left subtree and dig as far right as possible this time: going into the left subtree and digging as far right gives 9, then 13. Now 13 is the largest value in the left subtree. And now what we want to do, as before, is copy the value of the successor, 13, into the node we want to remove, which is 14. So now 14 becomes 13, and we want to remove the remaining 13. So just remove it and replace it with its successor. Perfect. Okay, I just want to go over some more examples.
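All four removal cases, plus the recursive successor removal, fit in one function. A sketch with my own minimal Node class; it takes the smallest value in the right subtree for case four, as in the first example:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def remove(node, value):
    """Remove value from the BST rooted at node; returns the new root."""
    if node is None:
        return None                      # value not found
    if value < node.value:
        node.left = remove(node.left, value)
    elif value > node.value:
        node.right = remove(node.right, value)
    else:
        # Case 1: leaf node, so just drop it.
        if node.left is None and node.right is None:
            return None
        # Cases 2 and 3: only one subtree; its root is the successor.
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        # Case 4: two subtrees. Dig as far left as possible in the right
        # subtree to find the smallest value there, copy it into this
        # node, then recursively remove that successor.
        succ = node.right
        while succ.left is not None:
            succ = succ.left
        node.value = succ.value
        node.right = remove(node.right, succ.value)
    return node
```

The recursive call on the successor is guaranteed to hit case one, two, or three, because the leftmost node of a subtree cannot have a left child.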
Just these two short examples, because removing is not quite so obvious. Alright, let's start with the one on the left. Let's see if we can remove 18 from this strange looking binary search tree. Start at the root and find 18, so dig all the way down. Okay, we found 18. And now, it's one of those case two or threes: it's the one where there's only a left subtree. So the successor is just going to be the root node of that left subtree, 17. So remove 18 and replace it with 17; 17 is the new successor. Perfect. Now let's look at the other case, where we want to remove minus two. First find minus two, so go to the left. Okay, we found it. Now there are two subtrees, so pick one of the two in which to find the successor; we're going to go to the right. Alright, copy the successor's value in, and now remove the leftover node, minus one. And that's it. Okay, and that is removals in a nutshell. All right, I want to finish off binary trees and binary search trees with some tree traversals, in particular pre-order, in-order, post-order, and level order. You see these traversals come up now and again, so they're good to know. I want to focus on pre-order, in-order, and post-order to begin with, because they're very similar. They're also naturally defined recursively, and you can sort of get a feel for why they have the names they do: pre-order prints before the two recursive calls, in-order prints between the recursive calls, and post-order prints after the recursive calls. So if you look at the three functions on the left, the only thing that's different between them is where the print statement is. So let's move on to some detail on how pre-order works.
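Those three functions, with the print in the three different spots, can be sketched like this, collecting values into a list instead of printing. The Node class and example tree are my own, mirroring the BST used in the walkthrough:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def preorder(node, out):
    if node is None:
        return
    out.append(node.value)      # "print" before the two recursive calls
    preorder(node.left, out)
    preorder(node.right, out)

def inorder(node, out):
    if node is None:
        return
    inorder(node.left, out)
    out.append(node.value)      # "print" between the recursive calls
    inorder(node.right, out)

def postorder(node, out):
    if node is None:
        return
    postorder(node.left, out)
    postorder(node.right, out)
    out.append(node.value)      # "print" after the recursive calls

# A BST like the one in the slides, rooted at 11.
tree = Node(11,
            Node(6, Node(3, Node(1), Node(5)), Node(8)),
            Node(15, Node(13, Node(12), Node(14)), Node(17, None, Node(19))))
```

On this BST, the in-order list comes out sorted, the pre-order list starts at the root, and the post-order list ends at the root, which matches the three walkthroughs that follow.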
So on the right, I'm going to maintain a call stack of what gets called, so that when we're recursing back up, we know what node to go back to. And what you need to know about pre-order is that we print the value of the current node, and then we traverse the left subtree, followed by the right subtree. So for pre-order, what we're going to do is push A, print A, and then go left: print B, go down to D, go down to H. Now we recurse back up, so we pop node H off the call stack and go back to D, and then we go to I and explore I. Now we're at the bottom, so we recurse back up and pop I off the call stack. We've already processed D, so go back to B; we've also already processed B, but now we have to do B's right subtree. So we go and explore E. Now we've explored E, so pop its frame off the stack, and having explored B, pop that off the stack too. Now we're back at A, and we need to explore the right subtree of A. So we would print C, then F, then J, and then we're at the bottom, so recurse; then print K, and we're at the bottom again, so recurse, pop node K off the stack, pop node F off the stack, and now explore node C's right subtree: G, then L, and now we're done. We recurse all the way back up, and we would exit our function. And at the bottom, you can see what the pre-order traversal is. Okay, now let's cover the in-order traversal. So,
how the in-order traversal works is: we traverse the left subtree, then we print the value, and then we traverse the right subtree. And for this example, I'm going to be using a binary search tree, not just a generic binary tree, and you'll see something interesting happen. We push node 11 onto the stack, because it's our root. Then we go left, then we go left again, and go left again. Notice that as I was going left, I would push those nodes onto the stack, but I'm not printing their values yet, because when I call in-order, the very first instruction in in-order is to go left; I only print once I've traversed the entire left subtree. If I'm at a leaf node, like I am now, one is a leaf node, then I've already explored the left subtree, if you will, so I can print the current value. Then I recurse, and then I print three, because I've explored three's left subtree. Now we go right, and once I've explored both subtrees, I can print five and recurse. Now I can print six, because I've explored six's left subtree. I can go to eight, then recurse, then print 11. Now we need to finish 11's right subtree. So go right, then left, then left; we explore 12, recurse, and we're going to print 13; then go down to 14 and print 14 as well, because 14 has no subtrees. So pop 14 off the stack, pop 13 off the stack, print 15, because we've explored 15's left subtree, then go right, and right again. And now, the last thing we need to do is finish our function by just popping everything off the stack.
So go back up. Now, did you notice something interesting happened when we did our in-order traversal? What happened was we printed the values of the nodes in increasing order, which is why it's called an in-order traversal: if you do it on a binary search tree, it prints the values in increasing order, which is really neat. That's quite a nice property of the in-order traversal. Now, let's look at the post-order traversal. The post-order traversal says: traverse the left subtree, then traverse the right subtree, and after you're done doing both of those, only then print the value of the node. So this means, if you look at our tree right now, the last value we're going to print should be 11, because we need to process 11's entire left subtree and 11's entire right subtree first. So let's start at 11 and explore its left
subtree. Go all the way down, then print one, because we've explored both its left and right subtrees. Don't print three, because we haven't explored its right subtree yet. Print five, because we've explored both of its subtrees, which don't exist. Now we can print three, because we've explored both of its subtrees. Then, similarly, we go down to eight, print eight, recurse, and print six. Don't print 11, because we still need to do its right subtree. Go 15, 13, 12; print 12. Go up to 13, but don't print it yet; go down to 14, print 14, go back up to 13, and now we can print it. Don't print 15 yet, because we haven't explored all of its right subtree, so go to 17 and 19, and then pop everything off the stack and print on the way back up. And you can see that 11 is indeed the last node we visit. And that's pre-order, in-order, and post-order in a nutshell. Now I want to look at the level order traversal, which is special; it's quite a bit different from the others. In a level order traversal, we want to print the nodes one layer at a time. So start with 11, then we want to print 6 and 15, then 3, 8, 13 and 17, then 1, 5, 12, 14 and 19. And you're like, oh, how am I going to do that?
And the way we're going to obtain this ordering is by doing something called a breadth first search from the root node all the way down to the leaf nodes. If you know what a breadth first search is from graph theory, this is the same thing; a tree is a type of graph, it's no different. So what we're going to do to do our breadth first search is we're going to maintain a queue of the nodes we have left to explore. And how it's going to work is: our queue is originally going to contain only the root node, and we're going to keep processing, pulling out the first value in our queue, until our queue is empty. A bit more detail on that. So here you can see the queue. On the right, I've inserted node 11. And so I would pull out node 11, and I would add eleven's left child and right child to the queue, so they would go at the end of the queue, and I've also removed 11. So the next nodes processed would be six, followed by 15. And then I would keep adding children to the queue so that I reach them eventually. So let's have a look. So I've pulled 11 from
the queue, and added six and 15 to the queue. Now the next thing at the front of the queue is six, so I remove six and add six's children, three and eight, to the queue. Then 15 is next up: add 15's children, which are 13 and 17, to the queue. And next up in the queue is three, so I would add three's children to the queue, and then move on. Explore eight: eight has no children, so I can't add any. Next, 13. As you can see, as I'm exploring nodes, I just add their children, but keep pulling the next thing off the front of the queue. And this gives us the level order traversal that we want. And this is how you do a breadth first search in general. So not too complicated, except we had to use that special trick of using a queue instead of using a stack. So we don't do level order traversals recursively; we need to do them iteratively with a queue. Okay, finally, time to look at some source code for the binary search tree. The source code I'm about to show you can be found at
the following link. The link should also be in the description at the bottom of this video; please star the repository so that other people can also find it more easily. And now let's dive in. All right, here we are inside the source code for the binary search tree. The source code is written in Java. Okay, so the first thing you will notice is I have a class representing a binary search tree, and it takes as a type anything that is comparable. We need things that are comparable so we know how to insert them accordingly within the binary search tree. So to start us off, I have a few instance variables (actually only two, in fact): one to keep track of the number of nodes in the binary search tree, and another one, which is the root of this binary search tree, because this binary search tree is a rooted tree. Next, I have a private node class, which contains a left node and a right node, as well as some data of type T, and that type T comes from here. So it's some comparable type T. Okay,
next I have a method to check if our binary search tree is empty. It simply checks if the size is zero, and size simply returns the node count, which gets either incremented or decremented as we add or remove elements. Okay, here's the public method to add elements to this binary search tree. I also have a private method down here, as you will notice. I use the private method to do the recursive business, and the public method to just check if the element is already contained within the binary search tree. This insertion method will return true if we successfully insert a new element into the binary search tree, and it will return false if that element is already inside the binary search tree. So elements have to be unique inside this binary search tree. Okay, so supposing this branch doesn't get executed, meaning the element does not already exist in the tree, then we're looking at this branch. And I'm saying: okay, add this new element to the binary search tree recursively, and also bump the node count up by one, and return true, because well, this
element doesn't yet exist. So let's look at the recursive method now. Our base case is that we found the null leaf we want to insert our element at, so we create a new node object with two null children, but with the value of the element we want to insert. Otherwise, we're in this branch, and we choose which subtree we want to place our element inside: either it's going to be the left subtree (the first branch) or the right subtree (the second branch). Okay, now let's look at removing. So here's the public method
for removing. Now, I'm only going to remove the element if it exists within the tree, so I check if it's within the tree first; otherwise, this is going to return false, meaning we have not removed anything from this binary search tree. And if it is contained, I'm also going to decrement the node count. Now let's look at the recursive method to remove the node. So this recursive method simply has a normal base case: if the node is null, return null. And in the first step to removing a node, we first have to find it. And we know it exists because of this check; we know that our element is within the tree, so we can remove it. And that is these two cases. So this is like the find phase I was talking about in the last video. So I get a comparison value, and I check if it's less than zero, meaning we're going into the left subtree, or we're going into the right subtree. It's going to be one of these two cases; otherwise,
we found the node we want to remove. So this finds the node, and here's where we do the actual removal. So I talked about four cases in my slides, but in fact, you can think of it as more or less three cases, or even two cases, because two of the cases are very similar. The singleton case, where there's only one node, can also be thought of as a left subtree or right subtree case: this case is the case where the left subtree is null but the right subtree is not, and this case is where the right subtree is null but the left subtree isn't. And in this case down here, which I'll get to later, we have both subtrees. So let's go to the very top. If we only have a right subtree, then I'm going to say that the successor node is just going to be the root node of that right subtree, so node dot right, and then I return that. I also destroy the data within this node, and the node itself. Similarly, for the other case, where I only have a left subtree, what I'm going to do is reach into that left subtree and grab the root node, and I'm going to return that node, and I'm also going to destroy this node, because we know we don't want it anymore. So those two cases are fairly easy. Now let's look at the tricky case: we have both a left subtree and a right subtree. As I mentioned in my slides, we can either find the largest node in the left subtree, or the smallest node in the right subtree. And here, I find the smallest node in the right subtree. So I go down to the right subtree and I dig left, and this is the node, or the successor node, if you will.
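That dig-left step is just a loop down the left children. A minimal sketch, assuming a hypothetical node class with `left` and `right` fields (the names here are mine, not the repository's exact code):

```java
// Hypothetical minimal node class for illustration.
class TreeNode {
  int data;
  TreeNode left, right;
  TreeNode(int data) { this.data = data; }
}

class DigDemo {
  // Dig left: keep walking down the left children until there are none,
  // which lands on the smallest value in the subtree.
  static TreeNode digLeft(TreeNode node) {
    while (node.left != null) node = node.left;
    return node;
  }

  // Dig right: the symmetric helper, which finds the largest value
  // in the subtree (used if you pick the left-subtree variant instead).
  static TreeNode digRight(TreeNode node) {
    while (node.right != null) node = node.right;
    return node;
  }
}
```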
So we copy the data into it, and then we recurse and call ourselves to remove the successor node. If you wanted to find the largest value in the left subtree instead, then you can just uncomment this code, and it would do just that. So that's removing in a nutshell, and I also have these two helper methods to do a dig left and a dig right. Moving on, I also have this method that checks if we contain an element. So given an element, return true or false depending on whether that element is within this binary search tree. And this is very simple; it's equivalent to the find phase. If we reach a null node, we definitely know the element doesn't exist; otherwise, we get our comparison value, which is either going to be less than zero, meaning we need to go to the left subtree (this case), or greater than zero, meaning we need to go to the right subtree, or, if we found the element, that's the zero case, and we're in this case. So pretty simple. Just as a bonus, I also threw in a height
function. So this will calculate the height of the tree, and it will do so in linear time. The public height method will call the private recursive height method, and all this does is fairly simple: if we reach the bottom of the tree (a null node), we're going to return zero; otherwise, we're going to return the maximum height of the left subtree or the right subtree, because one of our subtrees might be taller than the other, and that's going to be the one we want the height from. And every time we recurse, we add plus one. So this corresponds to a depth.
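That recursion is short enough to sketch in full; here an empty subtree contributes zero, so a single node has height one (the class and `Node` here are hypothetical stand-ins for the repository's private node class):

```java
class HeightDemo {
  // Hypothetical bare-bones node; only the links matter for height.
  static class Node {
    Node left, right;
  }

  // Height of the tree in linear time: a null subtree contributes 0,
  // and each level of recursion adds one, so the result is the number
  // of nodes on the longest root-to-leaf path.
  static int height(Node node) {
    if (node == null) return 0;
    return Math.max(height(node.left), height(node.right)) + 1;
  }
}
```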
So the biggest depth is going to be whatever the height of the tree is, if you want to think of it that way. And now for our traversals: I have created this method called traverse, and what it does is it takes a tree traversal order, which is an enum type I'm going to show you in a minute. I pick whichever order you give me, and I return an iterator for that particular order you want to traverse in. So if you tell me you want to traverse this binary search tree in a pre order fashion, then I'm going to return you this iterator, which is this method. If you want to traverse the tree in order, then I return you this inorder traversal iterator. Let's have a look at what this tree traversal order is. Let me open that up right here. So that is simply an enum type, as you can see right here. It's one of these four things: pre order, in order, post order, and level order.
So nothing too crazy. And I decided to do these traversals iteratively, so that you don't have to do them recursively, which would be slightly slower and perhaps less convenient; this is more flexible. Although I don't really want to dive into that code, because it is fairly nasty. If I just open up, say, the inorder traversal, then it does look pretty gross: I have to maintain a stack manually, and I have to check for concurrent modification errors, etc. These are actually great interview questions, like: how do I do an inorder traversal iteratively? Or how do I do a post order traversal iteratively? And so on. So those are definitely great to practice. If you want to actually see what's hidden inside these methods, just go on the GitHub repository and have a look.
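As a taste of what's hidden in those methods, here is one common way to do an inorder traversal iteratively with an explicit stack. This is a sketch of the general technique, not the repository's exact code (which also guards against concurrent modification):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class IterativeInorder {
  // Hypothetical minimal node class for illustration.
  static class Node {
    int value;
    Node left, right;
    Node(int value, Node left, Node right) { this.value = value; this.left = left; this.right = right; }
  }

  // Inorder traversal without recursion: dig left while pushing nodes
  // onto a stack, pop to visit the leftmost unvisited node, then move
  // into the popped node's right subtree and repeat.
  static List<Integer> inorder(Node root) {
    List<Integer> out = new ArrayList<>();
    Deque<Node> stack = new ArrayDeque<>();
    Node trav = root;
    while (trav != null || !stack.isEmpty()) {
      while (trav != null) {     // dig left as far as possible
        stack.push(trav);
        trav = trav.left;
      }
      Node node = stack.pop();   // visit the leftmost unvisited node
      out.add(node.value);
      trav = node.right;         // then explore its right subtree
    }
    return out;
  }
}
```

On a binary search tree this yields the values in increasing order, just like the recursive version.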
But I showed you guys how to do tree traversals in the last slides, so you should be good to do that. Most of the time we do it recursively anyways; I just wanted to be a bit more fancy and go all out iteratively. So this is about it for the binary search tree. Today we're going to be talking about hash
tables, one of the most remarkable data structures of all time. If I had a subtitle, it would be: one data structure to rule them all. So let's get started. There's going to be a lot of things to cover in the hash table series. We're going to start off with what a hash table is, what the heck a hash function is, and why we need one. Then we're going to talk about some popular collision resolution methods, in particular separate chaining and open addressing. Those are the two most popular ones, although there are a ton more. Then we're going to do separate chaining with linked lists, which is a really popular implementation. And there's a ton of stuff in open addressing that needs to be covered, so we're going to be talking about linear probing and quadratic probing and how those are done. I'll be giving lots and lots of examples for those, just because it's not super obvious how they work. And to finish off, I'll be going over double hashing, and finally removing elements from the open addressing scheme, because that's not so obvious either. All right. So to begin with, what is a hash
table? The hash table is a data structure that lets us construct a mapping from some keys to some values, via a technique called hashing, which we'll talk about shortly. So the key could be any value, as long as it's unique, and it maps to a value. For instance, keys could be people's names, and the value could be someone's favorite color. So William's favorite color is green, Micah's is purple, Katherine's is yellow, and so on. And we call these key value pairs: each key is associated with a value. All the keys have to be unique; the values don't have to be unique. For example, Micah's favorite color is purple, but also Leah's favorite color is purple. We often use hash tables to track frequencies. For example, below is a frequency table of the number of times a certain word appeared in Shakespeare's Hamlet, which I parsed to obtain this table. So you can see the word Hamlet appeared 114 times, the word Lord 223 times, but the word cabbage did not appear. So hash tables are often used to track frequencies, which is really handy.
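A frequency table like that is only a few lines with Java's built-in HashMap; a small sketch (the sample words are just examples, not the real Hamlet counts):

```java
import java.util.HashMap;
import java.util.Map;

class WordFrequency {
  // Count how many times each word appears, using a hash table
  // that maps word -> count.
  static Map<String, Integer> frequencies(String[] words) {
    Map<String, Integer> freq = new HashMap<>();
    for (String word : words) {
      // merge() adds 1 to the existing count, or starts the count at 1.
      freq.merge(word, 1, Integer::sum);
    }
    return freq;
  }
}
```

A word that never appears, like "cabbage" above, simply has no entry in the map.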
But you can really construct a mapping between any key value pairs, given that the keys are hashable; this is something we'll talk about shortly. To be able to understand how we construct the mapping within the hash table, we need to understand what a hash function is. Simply put, a hash function, which I will denote from now on as H(x) for some key x, is a function that maps x to a whole number in some fixed range. So that's pretty simple. Let's have a look at an arbitrary hash function. If our hash function is H(x) equals some polynomial, say x squared minus 6x plus nine, modulo 10, then all integer keys get mapped to the range zero to nine inclusive, no matter what integer I plug into H(x). So below, I'm plugging certain values, or keys if you will, into the hash function and getting an output. Notice that the output is not unique, in that there could be two keys that are plugged into the hash function which yield the same value; for instance, H(4) and H(2) map to the same thing, and that's completely fine.
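Writing that hash function out makes the collision easy to check: H(4) = 16 - 24 + 9 = 1 and H(2) = 4 - 12 + 9 = 1.

```java
class PolynomialHash {
  // h(x) = (x^2 - 6x + 9) mod 10, so every integer key lands in [0, 9].
  // Note x^2 - 6x + 9 = (x - 3)^2, which is never negative, so Java's
  // % operator is safe to use here even for negative keys.
  static int h(int x) {
    return (x * x - 6 * x + 9) % 10;
  }
}
```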
So we can also define arbitrary hash functions, not just on integers, but on arbitrary objects, which makes this really powerful. For instance, suppose we have a string s, and we let H(s) be the hash function that I've defined below. All this hash function does is sum the ASCII values of the characters within the string, and then at the end, mod by 50. So that will, for any given string, just output a number. And that's a hash function. So, for example, some simple inputs: H of 'BB' gets 66 plus 66, which is 132, and then we mod that by 50, so we have 32. The empty string gets zero, because we don't even execute the loop, and so on. So we've effectively converted a string to a number.
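The string hash just described can be written directly (a sketch; the class name is mine):

```java
class StringHash {
  // Sum the character values of the string, then mod by 50,
  // so every string maps to a number in [0, 49].
  static int h(String s) {
    int sum = 0;
    for (int i = 0; i < s.length(); i++) {
      sum += s.charAt(i);   // 'B' is 66, so "BB" sums to 132
    }
    return sum % 50;
  }
}
```

As in the walkthrough, h("BB") is 132 mod 50 = 32, and the empty string hashes to 0.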
Great. Alright, here's a mini challenge for you. Suppose we have a database of people below, and each person has three fields: they have an age, a name and a sex. I want you to define a hash function, H of person, which is going to be given a person object as an argument, and I want that hash function to map a person to the set of values zero through five inclusive. You can pause the video and just create any hash function that does this, given the three fields. Alright, so here's my attempt at creating such a hash function. It turns out there's an infinite number of possible hash functions we could define for the hash of a person, and here's a simple one. So I'm going to say: okay, first, let's start with the age. Whatever the age is, that's the starting hash value. Then I'm going to add on the length of the person's name, again an arbitrary choice, then increment by one if the person is male, and then mod by six. So as you can see, hash functions are a bit arbitrarily defined, at this point anyway; we're going to get into some more sophisticated hash functions later. So this particular hash function yields
those values. Alright, something very, very important that we need to talk about is the properties of hash functions. If we have a hash function and two objects, say x and y, and their hash values are equal, then those two objects might be equal. We don't know for sure; we have to explicitly check x against y. Yes, the hash function tells us there was a collision right there, but if the hash values are not equal, then x and y are certainly not equal. Here's a good question: how can we use this to our advantage to speed up object comparisons? The answer is that instead of comparing x and y directly, if we've already computed their hash values, then let's first compare the hash values of x and y instead of comparing x and y explicitly. And in this next example, you'll see why that's important. Consider the problem of trying to determine if two very large files have the same contents; this is something you want to do in an operating system context all the time. If we've computed the hash values for file one and file two, we should first compare the hash values to see if they match or don't match, because this requires constant time work, which is super fast. If possible, we don't want to open the files at all; that would be comparing x and y directly, or file one against file two. But we will have to compare them byte per byte if their hash values are equal. Actually, hash functions for files are much more sophisticated than those we use for hash tables; instead, we use very sophisticated hashing algorithms called cryptographic hash functions and checksums. Alright, another property of hash functions is that they must absolutely be deterministic. This means that if H(x) produces y, then H(x) must always, always, always produce y, and never another value. This is super critical, because if this happens it will screw up our hash table completely; we do not want this. An example of a non-deterministic hash function is something that introduces, say, a global variable or some kind of randomness. You don't want that. In this particular hash function, the first time I run it, the hash value for five is six; but then if I call it again, we've incremented the counter, and now the hash value is seven. Oh, not good. Something else about hash functions is that we try very hard to make them uniform, to minimize the number of hash collisions. We'll get into why hash collisions are bad shortly, but a hash collision is when two objects x and y hash to the same value, that is, H(x) equals H(y). And the reason we want uniformity is so that all the values of our hash function get hit, and we can fill the table uniformly. So for example, in the table we generated earlier, William and Kate have a hash collision. Okay, so now we're able to answer a central question about the
types of keys that we're allowed to put in our hash table. So what makes a key of type T hashable? Here's the answer: since we're going to implement our hash table using these hash functions, we need those hash functions to be deterministic, and to enforce that behavior, you'll see a lot of programming languages enforce that the keys you use be immutable, meaning you can't change them. They're fixed; they're constants. So they're things like immutable strings and integers, but not things like sets or lists that you can add or remove things from. And if we have that condition, and we have a hash function that is defined for all keys of type T, then we say that that type is hashable. Okay.
Alright, so now let's get into the interesting details. How does the hash table work? Well, ideally, we want our hash table to have really quick insertion, lookup, and removal times. And, remarkably, we can achieve all that using the hash function as a way to index into a hash table. And a hash table is just a fancy word for an array; think of it like that. Every time I say hash table, think array. So again, we use the hash function as a way to index into the hash table: the hash function gives us an index to go look up inside the table. And yes, we can do this in constant time, given that we have a uniform hash function, which is super essential. Okay, think of the hash table on the right as an indexable block of memory, so, as I said, an array. We can access these entries with the hash function H(x). So suppose we're going to be inserting integer-string key value pairs into the table that are going to represent rankings of users to their usernames, say in an online programming competition. And this is the arbitrary hash function I chose: x squared plus three, mod 10.
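Since every index in the walkthrough below comes from that formula, here it is spelled out (the class name is mine); for example H(3) = 12 mod 10 = 2, H(1) = 4 mod 10 = 4, and H(10) = 103 mod 10 = 3.

```java
class RankHash {
  // h(x) = (x^2 + 3) mod 10 maps every rank to a table index in [0, 9].
  static int h(int rank) {
    return (rank * rank + 3) % 10;
  }
}
```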
Alright, let's get started. So if I want to insert the key value pair three and ByteEater, meaning in the contest the user whose username is ByteEater got a rank of three, and we want to insert that into the hash table, then what we do is hash the key, which is three, and we get H(3) = three squared plus three, mod 10, which is two. So in our hash table, or array, at index two, we're going to put the key to be three and the value to be ByteEater. Alright? Now, this next user is williamfiset, which is usually what I call myself in online programming competitions; he is rank one. And if we hash one, then we get four. So at index four, we're going to put the key and the value for williamfiset. Next, if we want to insert a user who's got rank 32, again, we hash 32, and we put her in the table where she goes. Then we can do the same thing for ternary wizard, who has rank five, and insert orange knight, again beginning by hashing the key and finding out where it goes. So as we keep doing this, we keep filling the table, but eventually we're bound to have a collision. Now, in the event that we want to do a lookup for a user that has rank R, all we need to do is compute the hash value for R and then look inside our hash table to see where this user is. So say I want to find the user with rank 10: I hash 10, figure out its index is three, and then I get the value, which is orange knight; the orange knight has rank 10. Great. However, how do we handle hash collisions?
For example, users who have ranks two and eight hash to the same value. This is problematic, because we don't have another place to put them inside our table; they would index into the same position. And the answer is: we use one of many hash collision resolution techniques to handle this. There are many, but the two most popular ones are separate chaining and open addressing. The idea behind separate chaining is that we deal with hash collisions by maintaining a data structure, usually a linked list, to hold all the different key value pairs which hashed to some particular value. Open addressing is slightly different: it deals with hash collisions by probing and finding other places in the hash table, offset from the place we originally hashed to. So in open addressing, everything is kept inside one big array; in separate chaining, you have multiple auxiliary data structures.
The time complexity of the hash table is actually pretty remarkable; in fact, it's amazing. On average, we can achieve constant time insertion, removal and search. But if you have a terrible hash function that's not uniform, then you can get linear time, which is really, really bad. Okay, let's talk about something super, super cool, and that is hash tables and separate chaining. Alright, let's dive right in: what is separate chaining? Separate chaining is just one of many, many hash collision resolution techniques. How it works is: when there's a hash collision, meaning two keys hash to the same value, we need some way of handling that within our hash table so it's still functional. What separate chaining does is maintain an auxiliary data structure to essentially hold all the collisions, so we can go back and look inside that bucket, or that data structure of values, for the item we're looking for. And usually, we use a linked list for this, but it's not limited to only linked lists; we can use arrays, binary trees, self-balancing trees, or even a hybrid approach. Okay, so suppose we have the following hash
table; it's just a fancy name for an array of key value pairs, in this case of ages and names. And we associate with each an arbitrary hash value that has been computed with some hash function. So those are the hash values; they don't really matter for now. We're just going to see how we can use separate chaining to handle the collisions. Okay, so on the left is our hash table, so our array, and I'm going to be inserting all of these key value pairs into this hash table via separate chaining, and you'll see how easy it is. Okay, so our first person is Will: his age is 21 and he hashed to three, so we're going to put him in slot three. Leah, age 18, hashed to four; since she hashed to four, we're going to put her at index four. So the hash is a way to index into the array. Okay, Rick, age 61, hashed to two; put him there. And Ray, age 25, hashed to one. Okay, we're starting to get a little bit full in our hash table here, so we might get some collisions pretty soon. Okay: Lara, age 34, hashed to four, but we say, oh shoot, Leah is already there. What do we do? Well, in separate chaining, we just chain along. So essentially, each position in the array is actually a linked list data structure. So we're going to scan through the linked list and see if Lara exists, and if Lara does not exist, then we're going to add Lara at the very end of the chain. Okay, so Ryan also hashed to one, but then we look, and Ryan is not equal to Ray, so we need to add a new entry at position one. Lara, age 34, hashed to four; but nope, a Lara already exists in our hash table, so we're good. And she has the same age, so we don't need to update it. So Finn, age 21, hashed to three.
So Finn has a hash collision with Will, who also hashed to three. So what we're going to do is append Finn to the end of the linked list chain. Note that even though Will and Finn both hashed to the same value, that is, index three, and they have the same age, we can tell them apart, because we store both the key and the value as an entry in our linked list bucket. That's how we're able to tell them apart. Okay, now I want to insert Mark, whose age is 10 and who hashed to four. So scan through the linked list at index four for Mark; he is not found, so we have to append Mark at the very end. All
right, now let's have a look at how we can do lookups in this structure. It's basically the same thing: given a name, we want to find what the person's age is. So suppose we want Ryan; when we hash Ryan, we get one. So we suspect that Ryan should be in bucket one. When I say a bucket, I just mean whatever data structure we're using at index one; in our case, the linked list. So we scan this linked list for Ryan. First Ray: no. Here, we're comparing the key, so we compare the key Ryan to the key Ray, and there's not a match, so keep going. Compare Ryan to Ryan: there's a match. Okay, we found Ryan, and then inside that entry we say: oh, his age is 56, so return 56. Let's do another one: find the age of Mark. Hash Mark, and since our hash functions are deterministic, we know that if there's a Mark, then it's going to be found in position four. Okay, so we look in bucket four, scan through, and the last one is Mark, so return his age, 10. It might also happen that the key we're looking for doesn't exist, and then we just return null. Okay, so here's a good question:
How do we maintain a constant time insertion and lookup time complexity if the hash table gets really full and I have some long linked list chains? Good question. And the answer is that once there are a lot of elements within your hash table, you'll actually want to create a new hash table with a larger capacity, rehash all your items, and reinsert them into this new, larger table, because tables are fixed size. Okay, another good question: how do I remove key value pairs from my hash table with separate chaining? Well, the answer is basically the same procedure: you hash your key, and instead of doing a lookup, you remove a particular entry in the linked list. Here's another question: can I use another data structure to model the bucket behavior? Yes, of course. Common data structures used instead of linked lists include arrays, binary trees, and self-balancing trees. Java uses a hybrid approach in their HashMap: once they get to a certain chain length, they switch to a binary tree, or maybe a self-balancing tree, I'm not too sure. However, these alternative methods are a bit more memory intensive and complex to implement, which is why they may be less popular, but they might be a lot faster; I haven't actually implemented them myself. All right, it's time to have a look at some
separate chaining source code for the hash table. So here I am on my GitHub repository, williamfiset slash data-structures, and you can find the source code here under the hash table folder. I have multiple implementations of the hash table. Today, we're going to be looking at the hash table with separate chaining, and in the later videos, probably one of these three; I haven't figured out which one yet. So let's dive into the code. I have it here on my computer, so let's get going. All right. First things first: I have two classes, one called Entry, and the other one the separate chaining hash table itself. Let's have a look at the Entry class first. These entries represent the individual items, or key value pairs, you would want to be inserting into the hash table. So in Java, we have generics: a generic key, which has to be hashable, and some value. When I create an entry, I give it the key value pair, and I also compute the hash code. So there's a built-in method in Java to compute
the hash code for a particular object, and you can override it to specify the hash code for your particular object, which is really convenient. So compute the hash code and cache it; you absolutely want to cache it so you don't have to recompute this thing multiple times. It should be cheap to compute, but for something like a string, the hash code can take linear time to compute, which is not good. Then here I have an equals method, which doesn't override the Object equals method, because I don't want to have to do any casting. First, it checks if the hashes are not equal; if the hashes are not equal, we know from the first video that the objects are absolutely not equal, so we can return false. Otherwise, I compare the keys to each other. And that's about it for the Entry class. Very simple,
very basic. Now let's get into the meat of things: the hash table itself. Okay, so my default hash table capacity is three, so it holds three items, and the load factor by default, if the user doesn't specify one, will be 0.75; that's the maximum load factor I'm willing to tolerate. Then I have a few important instance variables we need to go over. There's the maximum load factor: once the load factor goes above this value, I resize the table. There's the capacity, which is how big the table is. There's the threshold, which is computed to be capacity times the max load factor; it tells us: hey, you're above the threshold, time to resize. There's the size, how many items are currently in the table. And this is the table itself: an array of linked lists, which themselves hold entries. Pretty simple. There are a few constructors, so you can construct a hash table just using the default settings, with an initial capacity, or with a capacity and a max load factor; this last one is the designated constructor, so let's have a look. We just save the max load factor and compute the capacity, making sure to take the max of the default capacity and the given capacity, just so the capacity isn't too low; there may be weird things happening if the capacity is set to one or something like that. Then calculate the threshold, and finally initialize the table. Really simple. All right, let's have a look at all these methods right here. So
size returns the number of elements inside the hash table, and isEmpty returns whether the hash table is empty. So this is the first really interesting method: normalizeIndex. And it's used when you want to convert a hash value into an index. As it says in the comments here, essentially this strips the negative sign and places the hash value in the domain zero to capacity. Because hash values are integers, they can be anywhere in the domain of an integer, which is about minus two to the 31 to positive two to the 31, around that. So what this does is apply a mask that strips off the negative sign from the hash value, and then mod it by the capacity, to bring it into this domain so we can actually use it as a lookup index.
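In Java, that mask-and-mod step can be sketched like this (the method and parameter names here are my own, not necessarily the ones in the video's source):

```java
public class IndexNormalizer {
    // Strip the sign bit with the mask, then mod by the capacity,
    // so any int hash lands in the domain [0, capacity).
    static int normalizeIndex(int keyHash, int capacity) {
        return (keyHash & 0x7FFFFFFF) % capacity;
    }

    public static void main(String[] args) {
        // A negative hash still maps into [0, 16).
        System.out.println(normalizeIndex(-6413, 16)); // prints 3
    }
}
```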
Next up is clear. So we just clear the table, that's straightforward. containsKey and hasKey are the same thing: given a key, we want to find out, does this key exist within the hash table? So what we're going to do is compute the key's hash and normalize it, and that now gives us the bucket index, so which bucket this key should appear in if it's in the hash table. Then I'm just going to seek for the entry: if the entry is not equal to null, the key exists; if it's equal to null, it doesn't exist. Simple. Put, add and insert are all common names for just putting something into the hash table, or updating a value inside of the hash table too. So we don't want to allow null keys, that's something we absolutely don't want, so we just throw an exception there. Otherwise, we're going to create a new entry, find the bucket it belongs to, and then insert it using this method we'll get to. Okay, get. So given a key, I want to find the value associated with that key. Again, we don't allow null keys. And this will be pretty routine: as always, we want to find which bucket this particular key belongs to, so get the bucket index, then find the entry. Assuming it's not null, then this is the entry you want, so return its value. If it is null, well, the key doesn't exist, so return null. Okay, suppose we remove a key now from the hash table. So if the key is not null, find the bucket index, and we're going to call this private removeEntry method, which is just down here. So given the bucket index, so which bucket this key belongs to, what we're going to do is seek for the entry inside the linked list structure. And if the entry is not null, then we're going to extract the actual linked list, which is a built-in data type in Java, remove the entry from that linked list data structure, decrement the size, and return the actual value; that's all we have to do.
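A rough sketch of that removal logic, with a simplified Entry class and hypothetical names (the video's actual code differs in its details):

```java
import java.util.LinkedList;

// Simplified stand-in for the entry class described in the video.
class Entry<K, V> {
    K key; V value;
    Entry(K key, V value) { this.key = key; this.value = value; }
}

public class RemoveSketch {
    // Seek the entry in the bucket's linked list; if found, remove it
    // from Java's built-in LinkedList and return its value, else null.
    static <K, V> V removeEntry(LinkedList<Entry<K, V>>[] table, int bucketIndex, K key) {
        LinkedList<Entry<K, V>> bucket = table[bucketIndex];
        if (bucket == null) return null;
        for (Entry<K, V> entry : bucket) {
            if (entry.key.equals(key)) {
                bucket.remove(entry);     // caller would also decrement the size counter
                return entry.value;
            }
        }
        return null;                      // key was not in this bucket
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        LinkedList<Entry<String, Integer>>[] table = new LinkedList[4];
        table[2] = new LinkedList<>();
        table[2].add(new Entry<>("apple", 7));
        System.out.println(removeEntry(table, 2, "apple")); // prints 7
        System.out.println(removeEntry(table, 2, "apple")); // prints null
    }
}
```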
Otherwise, it's not there, so return null. So bucketInsertEntry: given a particular bucket, we want to place this entry inside of it. Okay. So first, since we know the bucket index, we can go in our table and directly get the linked list structure. If the bucket is null, we have to create a new linked list. So we're essentially lazily allocating these linked list data structures, which is good, because we don't want to use more memory than we need to. Next up, I find the entry that already exists, in case you want to do an update, for instance. If the existing entry is null, this means that we need to add a new entry to the end of the linked list. So add it to the end of the linked list, increment the size, and check if we're above the threshold; if so, resize the table. And then use null to indicate that there was no previous entry. Otherwise, we need to update that entry: so just update the value in the existing entry and return the value. All right. So bucketSeekEntry, this method we've been using pretty much everywhere, finds and returns a particular entry at a given bucket index if it exists, and otherwise just returns null. You probably know what's going on by now.
So extract the linked list at the particular bucket index; return null if it doesn't exist yet. Otherwise, seek through each of the entries in the linked list and compare the keys to the key we're after, and if there's a match, return that entry; otherwise return null. Okay. Here's the last really complicated method, called resizeTable. So this resizes the internal table, doubling the capacity of the table. First we double the capacity. We recompute the new threshold, because now we have a higher threshold since we have increased the capacity. Then create a new table with this new capacity, so this new table is bigger than the current table. Then go through the current table and look for linked list data structures which are not null. If you find one, loop through all its entries, calculate the bucket index, find the bucket, and essentially insert each entry into this new table. After that, completely remove the old data that was in the old table, and at the end, set the table to be equal to the new table. Okay. And then these next two methods just return all the keys and the values within the hash table, fairly simple. And these last two methods are an iterator and toString, which we don't need to go over. So that's essentially separate chaining in a nutshell: fairly simple to implement with linked lists, much more difficult to implement with, say, a balanced binary tree or something like that. I'm pretty excited, we're going to be talking about the open addressing collision resolution technique for hash tables.
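Before moving on, the separate chaining resize step described a moment ago can be sketched roughly like this (for brevity the sketch stores bare integer hashes instead of full entries, and the names are my own):

```java
import java.util.LinkedList;

// Sketch of separate-chaining resize: double the capacity and
// re-distribute every entry according to its hash in the new table.
public class ResizeSketch {
    @SuppressWarnings("unchecked")
    static LinkedList<Integer>[] resize(LinkedList<Integer>[] table) {
        int newCapacity = table.length * 2;
        LinkedList<Integer>[] newTable = new LinkedList[newCapacity];
        for (LinkedList<Integer> bucket : table) {
            if (bucket == null) continue;                  // skip empty buckets
            for (int keyHash : bucket) {                   // re-insert every entry
                int index = (keyHash & 0x7FFFFFFF) % newCapacity;
                if (newTable[index] == null) newTable[index] = new LinkedList<>();
                newTable[index].add(keyHash);
            }
            bucket.clear();                                // drop the old data
        }
        return newTable;                                   // becomes the new table
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        LinkedList<Integer>[] table = new LinkedList[2];
        table[1] = new LinkedList<>();
        table[1].add(5);                  // both hashes land in bucket 1 of the small table
        table[1].add(7);
        LinkedList<Integer>[] bigger = resize(table);
        System.out.println(bigger[1] + " " + bigger[3]); // the entries spread across buckets
    }
}
```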
So let's get going. First, let's just do a quick recap on hash tables so that everyone's
on the same page. So the goal of the hash table is to construct a mapping from
a set
of keys to a set of values, and the keys need to be hashable. Now, what we do is we define
a hash function on the keys to convert them into numbers, then we use the number obtained
through the hash function as a way to index into the array or the hash table. However,
this isn't a foolproof method, because from time to time, we're going to have hash collisions,
that is two keys that hash to the same value. So we need a way to resolve this and open
addressing is a solution for that. Alright, so when we're using the open addressing collision resolution technique, the one thing to keep in mind is that the actual key value pairs themselves are going to be stored in the table itself, as opposed to, say, an auxiliary data structure like in the separate chaining method we saw in the last video. So this means that we care a great deal about the size of the hash table and how many elements are currently in the hash table, because once there are too many elements inside the hash table, it's going to be really hard to find an open slot or position to place our elements. So just an important piece of terminology: we say that the load factor is the ratio between the number of items in the table and the size of the table. So this means
we need to keep tabs on the load factor. Here's a neat chart from Wikipedia. On the chart, what you can see are two different methods: one of them is chaining, that is separate chaining, and the other is linear probing, an open addressing technique. And we can see from the linear probing one that once it gets to a certain threshold, it gets exponentially bad. So you don't want to go anywhere near that, say, 0.8 mark; in fact, we're going to be keeping it a lot lower than that usually. And what this says is, we always need to keep the load factor, which we denote by the Greek letter alpha, below a certain threshold, and we need to grow the size of our table once that threshold is met. Right, so when we want to insert a key v
so when we want to insert a key v
alue pair into our hash table, here's what we do, we
take the key, we use our hash function defined on our key and we hash the value and this
gives us an original position inside the hash table for where the key should go. But suppose
there's a hash collision, and there's already a key in that slot, well, we can't have two
keys in the same slot. That's not how arrays work. So what we do is we use a probing sequence,
which I will define as p of x. So now on to tell us where to go next. So we hash
ed to
the value, H of K, the original position. And now we're going to probe along using this
probing function in such a way that we're going to eventually find an open spots along
the way. So as it turns out, there are actually an infinite amount of probing sequences to
choose from. So here are a few of them. We have linear probing, which probes via a linear
function. So given an input parameter x, so when we're probing we start, usually x at
zero or one and as When unable to find free slot, th
en we just increment x by one. And
it works the same for all of these probing functions. For linear probing, we use a linear function; for quadratic probing, we use a quadratic function. And then there's double hashing,
which is super neat, actually, what we do is we define a secondary hash function on
our key, find its value, and then use that inside the probing function. And the last
one is the pseudo random number generator probing function that we can use. So given
a random number generator,
we're going to seed it using the hash value of our key, which
we know is deterministic. So it's always going to be the same thing. And then we can use
that inside our probing function, which is pretty neat, and increment by x each time
and we know x increments by one, so we're just getting the next number in the random
number generator, and then the next one after that. Alright, so here's a general insertion method for open addressing; suppose we have a table of size n. Here's how the algorithm goes: first, we initialize x to be one. So x is a variable that we're going to use for the probing, and we're going to increment x each time we fail to hit a free position. Then we get the key hash just by hashing our key, and that is actually going to be the index where we're going to look in the table first. So while the table at that index is occupied, meaning it's not equal to null, we're going to say our new index is the key hash, or the original position we hashed to, plus the probing function, mod n, so that we always land back inside the table, and then we're going to increment x so that the next time we run this loop, we probe at a different position. Eventually, we're going to find a free position; we always set up our probing function in such a way that we will always find a free slot, because we know that the load factor is being kept below a certain amount. Alright, so here's the big issue with open addressing.
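Before getting to that issue, the insertion loop just described can be sketched as follows (the table layout and names are simplified stand-ins, and the loop assumes a free slot is reachable):

```java
import java.util.function.IntUnaryOperator;

public class OpenAddressingInsert {
    // Probe from H(k) by successive offsets p(x) mod n until a null slot
    // is found; assumes the load factor is kept low enough that one exists.
    static int insert(Integer[] table, int keyHash, int value, IntUnaryOperator p) {
        int n = table.length;
        int x = 1;
        int index = Math.floorMod(keyHash, n);
        while (table[index] != null) {                 // slot occupied, keep probing
            index = Math.floorMod(keyHash + p.applyAsInt(x), n);
            x++;
        }
        table[index] = value;
        return index;                                  // where the value landed
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[9];
        IntUnaryOperator linear = x -> x;                  // a simple linear probe
        System.out.println(insert(table, 4, 100, linear)); // prints 4
        System.out.println(insert(table, 4, 200, linear)); // prints 5: probed past the collision
    }
}
```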
And it's that most probing sequences that we choose modulo n, are going to end up producing
some sort of cycle shorter than the table size itself. So imagine your probing sequence
just hops between three different values and cycles. And your table is of size 10. But
you're only ever able to hit three slots, because it's stuck in a cycle. And all of
those three slots are full. Well, you're stuck in an infinite loop. So this is very problematic,
and something we absolutely need to handle. Right, so let's have a look at an example. So right here, I have a hash table, and using open addressing, it's got some key value pairs already inserted. Assume that the circle with a bar through it is the null token. Further assume that we're using the probing sequence p of x equals 4x, and suppose we insert a new key value pair into the table and that the key hashes to eight. So that means we want to insert that key value pair at position eight, but oh, it's already occupied, because there's key five, value five already there. So what do we do? Well, we probe: we compute P of one, which is 4x at one, so we get eight plus four mod 12; well, that's zero. So then we go to slot zero, and we see, oh, that is also occupied, because key one and value one are already there. So now we compute P of two, and that gives us 16 mod 12, which is just four, and then, oh, that cell is already occupied, and then we keep probing. And as you see, right now we've entered a cycle. So we'll keep probing and probing and probing, and we'll always be getting back to the same positions. So although we have a probing function, it does not work in this particular situation; the probing function is flawed. So that's quite concerning, because not all probing functions are viable: they produce cycles that are shorter than the table size. How do we handle this? In general, the consensus is that we don't handle this issue. Instead, we try to avoid it altogether by restricting the domain of probing functions that we choose to be those which produce a cycle of exactly length n. And those probing functions do exist. I have a little asterisk here that says there
are a few exceptions. And this is true. There are some probing functions we can use, which
don't produce a full cycle, but still work. And we're going to have a look at I think,
one of those in the quadratic probing video. Alright, so just to recap: techniques such as linear probing, quadratic probing, and double hashing are all subject to this issue of cycles, and what we need to do is use very specific probing functions which produce a cycle of length n, to avoid not being able to insert an element and being stuck in an infinite loop. So this is a bit of an issue with the open addressing scheme, but it's something we can handle. Although notice that this isn't something you have to worry about if you're in the separate chaining world, just because we have that auxiliary data structure that just captures all
our collisions. Okay, this
is going to be a fun one, we're going to be talking about hash tables, and the linear
probing technique. To get started, let's recap how open addressing works. So in general, if we have a table of size n, here's how we do open addressing, no matter what your probing function is. We start our variable x at one; the key hash is just going to be whatever our hash function gives us for the key, and the first index we're going to check is going to be the original hash position. And while the table at the index is not equal to null, meaning that position is already occupied, we're going to offset the index by the key hash plus the probing function, mod n. And every time we do this, we increment the variable x so that our probing function pushes us along one extra position. And then once we've found a free position, we can insert the key value pair into the table at that position. Alright, so what is linear probing? So linear
probing is simply a probing method which probes according to some linear formula, specifically the linear function p of x equals ax plus b, where we have to make sure that a is not equal to zero; otherwise, we're just adding a constant, which does nothing. Now, I have a small note right here which says that the constant b is obsolete, and if you know why, post in the comments and have a discussion with the others. As we saw in the last video, there's a slight problem with this, and it's that some linear functions are unable to produce a full cycle of order n, and we might end up getting stuck in a
cycle. Here's an example of that. If we picked our linear function to be p of x equals 3x,
and say our key k hashed to four, and for some reason our table size was nine, then
we would end up with the following cycle occurring, assuming that positions four, seven and one
are already taken by other key value pairs. So the fact that we're only probing at those
three locations means we're unable to
reach all the other buckets, which is really bad.
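A quick sketch that reproduces this behavior numerically: it counts how many distinct slots p of x equals a times x can reach for each constant a with n equals nine (my own illustration, not code from the video):

```java
import java.math.BigInteger;
import java.util.HashSet;
import java.util.Set;

public class LinearCycleDemo {
    // Count the distinct slots reachable from `hash` with p(x) = a * x, table size n.
    static int reachableCount(int hash, int a, int n) {
        Set<Integer> slots = new HashSet<>();
        for (int x = 0; x < n; x++) slots.add((hash + a * x) % n);
        return slots.size();
    }

    public static void main(String[] args) {
        int n = 9, hash = 4;
        for (int a = 1; a < n; a++) {
            int gcd = BigInteger.valueOf(a).gcd(BigInteger.valueOf(n)).intValue();
            // A full cycle (all 9 slots) occurs exactly when gcd(a, n) == 1;
            // a = 3 reproduces the {4, 7, 1} cycle from the example.
            System.out.printf("a=%d  gcd(a,n)=%d  reaches %d of %d slots%n",
                    a, gcd, reachableCount(hash, a, n), n);
        }
    }
}
```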
And hence, we're again stuck in an infinite loop, we cannot get stuck in this situation
ever. So how do we avoid it? That's the question: which values of the constant a in p of x equals ax produce a full cycle modulo n? It turns out that this happens when the constant a and n, the table size, are relatively prime to each other. Two numbers are relatively prime if their greatest common divisor is equal to one; so that is, a and n have a GCD of one. And when that happens, the probing function will always be able to generate a complete cycle, and we will always be able to find an empty bucket. Awesome. Alright,
here's an example, with linear probing. So suppose we have an originally empty hash table,
and we want to insert some key value pairs. And we selected our probing function to be
p of x equals 6x. And our table size be n equals nine. And then we also selected a max
load factor of alpha equals about two thirds, and the threshol
d will then be six. So we
resize once we hit six elements. Okay, so based on the probing function we chose and the table size, are we likely to run into an infinite loop while inserting, based on what I told you in the last few slides? The answer is yes: the greatest common divisor of nine and six is equal to three, which is not one. So let's go ahead and attempt to
insert some elements regardless, we may or may not hit any problems. Okay. So first,
let's insert. So on the left, I have some key value pairs I want to insert, and we're going to insert k1 first. Suppose that the hash value for k1 is equal to two; then we would say, alright, the hash of k1 plus the probing sequence at zero, mod nine, is equal to two. So we're going to insert that key value pair at position two. Alright, next one. Now suppose the hash of k2 is equal to two again. So we're going to try to insert this key value pair at position two, but oh snap, we have a hash collision. So we're going to increment x and now offset by the probing function at one instead of the probing function at zero. So instead of inserting it at two, we're going to insert it at eight. All right, so that went in because that slot was free. Now, let's
try inserting key three. So suppose key three hashes to three; then we can insert it at position three, and there's no issue. Now, notice that we're trying to re-insert k2, which already exists within our hash table. So instead of inserting it, we're actually going to be updating its value, because it exists in the hash table. Alright, so from before, we know that the hash value for k2 is two. So then we look at position two, and k2 is not there; it was a hash collision. So we increment x and offset by P of one, and now we're able to find k2, and we can update its value there. Let's go to k5. Okay, so suppose k5 has a hash value of eight. So eight is taken, so we're going to probe, and that leads us to index five. So we're going to insert the key value pair there. Now let's insert k6. Suppose k6 hashes to five; then let's probe once, so five plus six mod nine gives us two. Okay, a hash collision, let's keep probing. So now, five plus 12 mod nine is equal to eight. All right, another hash collision, so we have to increment x and keep probing, and now we're back to five. So we've hit a cycle. Alright,
so we're trapped in a cycle. But we kind of expected this to happen, because we knew that we picked two numbers whose GCD was equal to three and not one. So if we look at all the possible a values we could have selected instead of six, we see that the ones that give a greatest common divisor of one with n, the table size, are 1, 2, 4, 5, 7 and 8, while any multiple of three would give us something else. So this brings us to the realization that for our choice of p of x, if we choose p of x to be one times x, then the greatest common divisor of n and one is always going to be one, no matter what our choice of the table size is, which is why p of x equals one times x is a very popular probing function choice. All right, now suppose we have another hash table, and we wish to insert more key
value pairs, but this time we're actually going to pick a probing function that works, so that we don't ever have any issues. I'm going to pick the table size to be 12 and our probing function to be 5x, so no cycle should occur. All right, so let's go with inserting the first guy. Suppose that k1 has a hash value of 10; then at index 10 we'd put k1, v1. Suppose k2 has a hash value of eight; then slot eight is free, so let's pop that right in there. So now, suppose k3 hashes to 10: hash collision with k1 in there, so we need to keep probing. Alright, so if we use our probing function, which is p of x equals 5x, that gives us three modulo n when we add it to the hash value. Moving on, insert k4. Now suppose the hash value for k4 is 10; then we have to keep probing. So first we hit k3, which we inserted last time, and then, oh, we also hit k2, but we're able to pull out eventually when we hit the third probing value. However, now notice that we've actually reached the threshold of our table: if I go back a few slides, we can see that I picked alpha to be 0.35. So n, which is the table size, times alpha gave us four, and we just finished inserting the fourth element. So it's time that we resize the table. So how we usually resize the tab
le is in an exponential fashion, such as doubling or tripling, or so on. But we need to double in such a way that the GCD of n and a still holds. So after a doubling, n is equal to 24, and the GCD property is still good. Alpha is a constant, so it's still 0.35, so our new threshold is just going to be eight, and we don't change the probing function. Alright, so let's allocate a new chunk of memory for this new table. And now we need to place all the old elements in our old table into this new t
able, using the new table size for n. So we scan across all these elements; we start at zero, nothing there, and move along. From before, we knew that the hash value for k4 was 10, so inside the new table it's going to go at position 10. Scan along, nothing here. From before, we know that k3 hashed to 10, so it should go in position 10 in this new table, but we have a hash collision, so we have to keep probing. So if we add our probing function at one, which is 5x, then we get 10 plus five, which is 15. So we're going to insert it at position 15 in the new table. Keep scanning: nothing here, and nothing here, nothing here. Eventually, we hit k2. We know from before that k2 hashed to eight, so we're going to insert it at position eight. Now, we know k1 hashed to 10, so we're going to try and insert at position 10; that's taken, so we probe, and the next position, 15, is also taken. So we add five again, and that gives us 20. So insert k1, v1 at 20. Alright, so now we throw away the old table, and we keep this new table as the new hash table we're working with, and we were at inserting k5. Now suppose k5 hashes to two, and that spot is free. So we are good. So
here's a frequently asked question: sweet, I know how insertion works, now how do I remove key value pairs from the hash table using open addressing? And my answer to this is that this topic is going to merit its own video, and we're going to do it after we see all the other open addressing techniques, because it's actually non-trivial. All right, let's talk about hash tables and how quadratic probing works. Let's dive right in. So let's recall how we insert key value pairs into a table of size n using the open addressing collision resolution scheme. First, we initialize a variable called x to be one, which we're going to increment every time we're unable to find a free slot; then we compute the key hash, and that's going to be the first index we're going to check, and we're going to run the loop.
While we're unable to find a free slot, meaning the table at that index is not equal to null, so it's already occupied, we're going to offset the key hash using our probing function; our probing function in our case is going to be a quadratic function. And then we also increment x, and eventually we will find an open slot to insert our key value pair. So what is the idea behind quadratic probing? Quadratic probing is simply probing according to a quadratic formula, specifically where our probing function looks something like p of x equals ax squared plus bx plus c, and a, b, and c are all constants. And we don't want a equal to zero; otherwise, we degrade to linear probing. But as we saw in the previous videos, not all quadratic functions are viable, because they don't produce a cycle of order n, and we might get stuck in an infinite loop. As it turns out, most randomly selected quadratic probing functions will end up producing a cycle. Here's an example.
Suppose that we chose our probing function to be p of x equals 2x squared plus two, the key we wanted to insert hashed to four, and the current table size was nine; then we would end up with a two-cycle. So the first time around, we would probe and get four; we'd probe again and get seven; suppose those two entries are full, and then we would end up in a cycle again. So our probing function is only ever able to hit the buckets four and seven, which makes it impossible to reach all the other buckets: 0, 1, 2, 3, 5, 6, and eight. So we're stuck in an infinite loop when four and seven are already occupied.
This is really, really bad. So the question then is, how do we pick a probing function which always works? The answer is that there are numerous ways, but here are the three most popular ones I have found. The first way is to select the probing function to be p of x equals x squared, keep the table size a prime number greater than three, and make sure our load factor is always kept less than or equal to one half. The second one is to say our probing function equals x squared plus x, divided by two, and make sure the table size is a power of two. And the last and final one says that p of x should be the alternating sign of x squared, and keep the table size a prime number where n is congruent to three mod four. For example, our table size could be 23, because that's a prime number congruent to three mod four. So any of these will work, but as you can see, they're very specific in how they work and what the table size should be, and that's a common theme in quadratic probing. So we're going to focus on the second one, where p of x equals x squared plus x, divided by two, and the table size is a power of two.
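A quick sketch showing why this pairing works: with p of x equals x squared plus x, all divided by two, and a power-of-two table size, the first n probes visit every slot exactly once (my own demonstration, not code from the video):

```java
import java.util.HashSet;
import java.util.Set;

public class TriangularProbeDemo {
    // Slots visited in the first n probes from `hash` with p(x) = (x*x + x) / 2.
    static int distinctSlots(int hash, int n) {
        Set<Integer> slots = new HashSet<>();
        for (int x = 0; x < n; x++)
            slots.add((hash + (x * x + x) / 2) % n);
        return slots.size();
    }

    public static void main(String[] args) {
        // With a power-of-two table size, the first n probes cover every slot.
        for (int n : new int[] {8, 16, 32, 64})
            System.out.println("n = " + n + ": " + distinctSlots(0, n) + " distinct slots");
    }
}
```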
Alright, suppose we have an initially empty hash table, and we want to insert some key value pairs, and we're using the following quadratic probing function: p of x equals x squared plus x, divided by two, and the table size is a power of two, so it's eight. And let's set the load factor to 0.4; therefore, the table threshold is going to be three. So just to emphasize: the table size must absolutely be a power of two, otherwise this will not work. Okay, so let's insert the first guy. Suppose that k1 hashes to six; then we're going to insert k1 at position six. Right, next k2: suppose k2 hashes to five; then we're going to insert it at five, no collision there. Suppose k3 hashes to five; then we have a collision, so we need to handle that. So we're going to try probing, an offset of one, and that brings us to six. So we probe again, and that brings us to index zero, which is free. So we're going to insert the k3, v3 key value pair there. Let's insert k four. Oh, but before we can do that, we've reached the table
threshold, so we have to resize the table first. Okay, so let's allocate a new block of memory, and let's double the size of the table to keep it a power of two. So our new table size is 16, the max load factor doesn't change, our new threshold is six, and the probing function doesn't change. Now we need to place the entries in the old hash table into the new hash table. So from before, we know that k3 hashed to five, so we're going to put it at index five; we don't have a collision there. Then there's no element at positions one, two, three or four, and then we get to position five and find k2 right there. We know from before that k2 also hashed to five, so we're going to try position five; there's a hash collision, so we probe once, and then we get five plus one equals six, position six, where we insert k2. Now let's try to insert k1. We know from before that k1 hashed to six, but we can't put it there because we have a collision. So we're going to probe along, and we're going to put it in position seven. All right, and that does it for resizing the table. So let's
keep going, we still have a few more elements to insert inside our table. So suppose that k4's hash is 35,410. When we mod that by 16, it gives us position two, so we're going to put k4, v4 at position two. We've already seen k3, and we know its hash value is equal to five. Since k3 is already in our hash table, we're going to be performing an update. So k3's hash plus the probing function at zero gives us five, so let's update the value to be v5 instead of v3, which it was before. Now suppose that the key k6 hashes to minus 6413; that hashes to three mod 16, and that slot is free, so just insert it. And the last one, k7: suppose it hashes to two. Well, we have a collision right there, so let's probe along. When we probe, our probing function gives us an offset of one, but that position is also taken, so we keep probing; now we are at position five, but that's also taken, so keep probing again. So we probe a fourth time and get an offset of six, and that gives us eight. That slot is free, so we're going to insert it
there. Alright, let's talk about hash tables and the double hashing open addressing collision resolution technique. So just a quick recap for those of you who don't know how we do insertions into hash tables with open addressing: we start with a variable x initialized to one, we compute the hash value of the key, and we set that to be the first index that we're going to check in our hash table to see if it's not null. So the goal is to find an empty slot. We're going to loop while we are still hitting spots where there are already keys, and every time that happens, we're going to offset our key hash using a probing function; in our case, it's going to be the double hashing probing function. And we're also going to increment our value of x so that we keep probing along further and further. Once we find a free spot, then we can insert our key value pair into the hash table. Okay, so what's
the deal with double hashing? Double hashing is just a probing method like any other, but what's special about it is that we probe according to a constant multiple of another hash function. Specifically, our probing function looks something like this: we give it as input the key k, which is a constant, and the variable x, and we compute x times h two of k, where h two of k is a secondary hash function. So here's an important note: h two of k must hash the same type of keys as h one, okay? So if your key is a string, well, h two of k must hash strings; if h one of k hashes integers, then h two of k must also hash integers. And here's a note I put at the bottom: notice that double hashing actually reduces to linear probing, except that the constant is unknown until runtime, because we dynamically compute it with h two of k. So since double hashing reduces to linear probing at runtime, we may be stuck with the same issue we saw with linear probing, which is that we get stuck in an infinite cycle. So if p of x equals 3x, meaning our secondary hash function at runtime calculated a value of three for the constant, and say h one of k was four and the table size was nine, then we might end up with the following cycle occurring. So the cycle produces the values four, seven, and one, which makes it impossible to reach any of the buckets 0, 2, 3, 5, 6, and eight.
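This cycle can be verified numerically; for contrast, the sketch below also tries a prime table size (11 is my own arbitrary choice), anticipating the fix described next:

```java
import java.util.HashSet;
import java.util.Set;

public class DoubleHashCycleDemo {
    // Slots reachable from h1 probing with p(x) = x * h2, table size n.
    static Set<Integer> reachable(int h1, int h2, int n) {
        Set<Integer> slots = new HashSet<>();
        for (int x = 0; x < n; x++) slots.add(Math.floorMod(h1 + x * h2, n));
        return slots;
    }

    public static void main(String[] args) {
        // h1 = 4, h2 = 3, n = 9: only the {4, 7, 1} cycle is reachable.
        System.out.println(reachable(4, 3, 9));
        // With a prime table size (11, an arbitrary choice), every slot is reachable.
        System.out.println(reachable(4, 3, 11).size()); // prints 11
    }
}
```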
So if four, seven and one are already occupied, that means we're stuck in an infinite loop
inside our hash table. And we're not able to insert our key value pair because we're
not able to go anywhere else. So this is an issue we have to deal with. So to fix the
issue of cycles with double hashing, here's one strategy: we're going to pick our table
size to be a prime number, and we're going to compute a value called delta. So delta
is just going to be H2(k) mod N. But occasionally delta might be zero, and if that's
the case, then we're guaranteed to be stuck in a cycle, because we're not going
to go anywhere when we probe — we're going to be multiplying by zero. So when this
happens, just set delta to be equal to one. All right. So here's the justification why
this works. So notice that delta is always going to be between one, inclusive, and N, non
inclusive. So the greatest common divisor of delta and N is going to be one, since
N is a prime number. So therefore, with these conditions, we know that the probing sequence
is guaranteed to have order N, so we're going to be able to hit every single slot in our hash
table. So given that there's at least one free slot in the hash table, which there will
be because we're keeping our load factor below a certain threshold, we're going to be
able to reach that slot. Okay, so here's a core question, how do we
construct our secondary hash function? Suppose the keys we're using have type T. Whenever
we use double hashing, we need to construct H2(k) to hash keys that are also of type
T. So it would be really nice if we had a systematic way of generating these new hash
functions every time we need one, right? Especially since we might be dealing with multiple different
data types T. So luckily for us in computer science, almost every object we ever
create is constructed of the same fundamental building blocks, in particular integers, strings,
real numbers, fixed length vectors, and so on. So we can use this to our advantage. Luckily,
there are many well known hash functions for these fundamental data types, and we can combine
them when we want to construct our new hash function H2(k). Frequently, when we compose
hash functions, we select them from a pool of hash functions called universal hash functions,
which generally operate on these fundamental data types, which is quite convenient. Alright,
let's have a look at an example with double hashing. So suppose we have an originally
empty hash table above, and we selected our probing function to be P(x) = x times H2(k),
which is what we need it to be, and our table size N = 7. Notice
that that's a prime number. And we set our max load factor to be alpha = 0.75,
so our threshold to resize is five. So once we hit five elements, we need to grow
the table size. All right, so let's insert all these key value pairs on the left into
this hash table. So the first key value pair is k1 and v1. Now suppose that the hash
value for the first hash function, H1(k1), is 67, and H2(k1) is 34. The first thing
we do is we compute delta: we compute H2(k1) modulo seven, which is the table
size, and we obtain six. And then we compute where this key value pair should go, and it should
go at position four. So now k2: suppose H1(k2) is two, and H2(k2)
is negative 79. So delta is five, but we're just going to insert at position two, because
there's no collision. Next, H1(k3) is two, and H2(k3) is 10. These are just
arbitrary numbers I chose for these hash functions. So then delta would be three in this
case, because 10 mod seven is three. So we have a hash collision, because we're trying
to insert our k3 at index two, but k2 is already there. So what we need to do
is probe: we compute the position to be our original hash function
position, H1(k3), which is two, plus one times our delta value, mod seven.
That gives us five, so we're going to hop to position five right there. Now
let's insert k4. So suppose now that H1(k4) is equal to two, and H2(k4)
is equal to seven. So let's compute delta: H2(k4) modulo seven is
equal to zero, because seven mod seven is zero. So when this happens, we know that
we need to set delta equal to one so that we don't get stuck in an infinite loop. So
now, H1(k4) plus zero times delta, mod seven, gives us two. So we have a hash
collision at index two, so we keep probing. So now we probe with one times
delta, and that gives us index three, where we're able to insert. So now we're going to try to insert k3,
which is already in our hash table, actually, but with a new value. So we're going to be
performing an update this time. So suppose H1(k3) is equal to two — actually, we should
have known that from before — and H2(k3) is 10. So compute delta, which is three. We have a collision
at index two, and probing along we find k3, which was already in there, and update its value. Now suppose the first
hash function for k6 gives three, and the secondary hash function of k6 gives 23. Then
that gives us a delta value of two. So when we try to insert it, it goes to position three,
which is occupied, so we need to probe. So we compute three plus one times delta, mod seven; that gives us five. There's
another collision, so then we compute the offset at three plus two times delta, mod seven, which
gives us zero. And we find a free slot, so we're able to insert our key value pair there.
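The whole insertion sequence above can be replayed in code. This is a sketch under my own simplified assumptions (the hash values are fed in directly rather than computed, and there is no resizing), not the full implementation from the repository:

```java
public class DoubleHashInsert {
  static final int N = 7; // prime table size, as in the example
  static Object[] keys = new Object[N];

  // Insert using the probing function P(x) = x * delta,
  // where delta = H2(k) mod N, replaced by 1 whenever it comes out zero.
  static int insert(String key, int h1, int h2) {
    int delta = ((h2 % N) + N) % N; // normalize negative hashes such as -79
    if (delta == 0) delta = 1;      // avoid the zero-delta infinite cycle
    int i = ((h1 % N) + N) % N;
    while (keys[i] != null && !keys[i].equals(key)) {
      i = (i + delta) % N;          // keep hopping by a constant multiple of H2(k)
    }
    keys[i] = key;
    return i;
  }

  public static void main(String[] args) {
    System.out.println(insert("k1", 67, 34)); // lands at 4
    System.out.println(insert("k2", 2, -79)); // lands at 2
    System.out.println(insert("k3", 2, 10));  // collides at 2, lands at 5
    System.out.println(insert("k4", 2, 7));   // delta 0 -> 1, lands at 3
    System.out.println(insert("k6", 3, 23));  // probes 3, 5, lands at 0
  }
}
```

The printed positions match the walkthrough above: 4, 2, 5, 3, and finally 0.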
So notice that we've actually inserted five elements, so it's time to resize and grow the table,
because we've reached the maximum threshold. So one strategy when we're trying to resize
the table with double hashing, because we need to keep our table size a prime
number, is to double the table size and then find the next prime number above that value.
So in our case, when we double, we get 14, and the next prime number above 14 is 17. So 17
is going to be the new table size. So allocate a new table of size 17, and go through the
elements in the old table and insert them into the new table. So from before, H1(k6)
is three, and H2(k6) is 23. We compute delta, and we know we're going to
put it at position three, and there's no hash
collision. Next up, there's nothing in position one. At position
two, we have k2: the hash value of k2 is two and the secondary hash
value of k2 is negative 79, from before. We compute delta to be six. So we're going
to insert that at position two, with no collision. Next one, k4. So we know H1(k4) is
two, and H2(k4) would be seven. So we compute our delta value, and notice that our
delta value now is not zero, which it was before, but seven, because now we mod by 17,
and seven mod 17 is seven. So we have a hash collision here at position two, and we need to keep probing. So compute the
offset at one times delta, which gives us nine mod 17, so position nine. Next one, insert k1. Suppose
H1(k1) is two and H2(k1) is 34; then compute delta, and that gives us zero.
So when delta is zero, set it to be one. So now, H1(k1) plus zero times delta gives
us two. So there's a collision, and we need to keep probing. Now compute the offset at one times
delta; there's still a collision, so keep incrementing the x value. So now two times delta gives us four,
and we find a free slot, so insert that key value pair there. And the last one, k3. So
from before, H1(k3) is two and H2(k3) is 10, so delta is then 10. And we have
a hash collision at two. So one times delta now gives us 12, and that slot is free. And we've reached
the end of the old table, so get rid of the old table and replace it with the new table.
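The resizing rule described here — double the capacity, then bump it up to the next prime — is easy to sketch. The method names below are my own, not the repository's:

```java
public class PrimeResize {
  // Trial division is plenty for table sizes.
  static boolean isPrime(int n) {
    if (n < 2) return false;
    for (int d = 2; (long) d * d <= n; d++)
      if (n % d == 0) return false;
    return true;
  }

  // Double the current capacity, then walk up to the next prime number.
  static int nextTableSize(int capacity) {
    int n = 2 * capacity;
    while (!isPrime(n)) n++;
    return n;
  }

  public static void main(String[] args) {
    System.out.println(nextTableSize(7)); // 7 -> 14 -> 17, matching the example
  }
}
```

Keeping the size prime is what makes the GCD argument from earlier hold after every resize.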
And let's finish inserting our last key value pair from before, which is k7. Suppose
H1(k7) is equal to 15 and H2(k7) is three; then our delta value is three, and
we're able to insert it at position 15. All right, I know a lot of you have been anticipating
the remove video, for how we remove elements from a hash table using the open addressing
scheme. So let's have a look first at what issues
there are when we remove elements if we do it naively — I think this is valuable. So
suppose we have an originally empty hash table, and we're using a linear probing
function, so P(x) is simply going to be equal to x. And we want to perform the following
operations: insert three keys, k1, k2, and k3, then remove k2,
and then get the value of k3 after. And for the sake of argument, let's assume that
we have a hash collision between k1, k2, and k3, whose hash values are all equal to one. This is a possible
scenario. So let's insert them. So k1 hashes to one, so we're going to insert at
position one. Now let's insert k2. It has a hash collision with k1, which is already
in the table, so we're going to linearly probe and insert it in the next slot over. Next,
insert k3. It also hashes to one, so let's probe — okay, another hash collision —
so let's keep probing, and finally we insert it in position three. Now, we are going to remove k2, and we're going
to use the naive removing method, where we're just going to clear the contents of the bucket.
To find k2, we hash k2 and get one. But k1 is not equal to k2, so we haven't found the
element yet. So then we probe, and then we have found k2. So we're just going to remove
it, like so. Alright, now let's query the table and obtain the value for k3, and let's
see what happens. So when we hash k3, we get one, and k1 is not equal to k3, so
let's continue the search. And then we hit no element. So what does this mean? Since
the value in the bucket at index two is now null, we're forced to conclude that k3 does
not exist within the hash table; otherwise, we would have encountered it before reaching
the null position. That's how finding an element works inside a hash table. So this method
is clearly flawed. The naive removing method doesn't work, because k3 clearly exists
within the hash table — we can see it there at index three. So here's a solution to the removing problem. When we remove an element, we're going to place a unique
marker called a tombstone, instead of a null element, to indicate that a specific key value
pair has been deleted or removed. And when we're doing a search, we're just going to
skip over the tombstones. So let's replace the deleted bucket with a tombstone, like we
should have done, and see what happens when we now search for the key k3. Okay, so
k3 hashes to one. k1 is not equal to k3, so keep probing. Alright,
we hit a tombstone, so we know that that element was deleted, so keep probing. All right, we
have found k3, and we can return its value v3 as our answer for the search.
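A sketch of that tombstone-aware search with linear probing (the TOMBSTONE sentinel and table layout here are my own simplified setup, mirroring the k1/k2/k3 example where all three keys hash to index one):

```java
public class TombstoneSearch {
  static final Object TOMBSTONE = new Object();

  // Linear-probe for `key` starting at `home`, skipping tombstones.
  // Returns the slot index, or -1 if a null cell ends the search.
  static int find(Object[] table, int home, Object key) {
    int n = table.length;
    for (int x = 0; x < n; x++) {
      int i = (home + x) % n;
      if (table[i] == null) return -1;      // empty slot: the key cannot be further along
      if (table[i] == TOMBSTONE) continue;  // deleted slot: keep searching past it
      if (table[i].equals(key)) return i;
    }
    return -1;
  }

  public static void main(String[] args) {
    Object[] table = new Object[6];
    table[1] = "k1";
    table[2] = TOMBSTONE; // k2 was removed here
    table[3] = "k3";
    System.out.println(find(table, 1, "k3")); // 3: the tombstone did not end the search
  }
}
```

If slot 2 held null instead of a tombstone, the same search would wrongly report that k3 is absent, which is exactly the naive-removal bug above.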
Awesome. Okay. So a quick question about tombstones: if I have a lot of tombstones cluttering my hash
table, how do I get rid of them? So the thing with tombstones is that we're actually going
to count them as filled slots in the hash table, so they're going to increase the load
factor, and they'll all be removed when we resize the hash table. But there's also another
way to get rid of them: when we insert a new key value pair, we can replace
a bucket that has a tombstone with the new key value pair. And I want to give you guys
an example of how that is done. All right. Suppose this is our hash table of size eight,
and we're using the quadratic probing function P(x) = (x squared plus x) divided by
two. And let's see how tombstones come into play when we want to do a lookup.
So suppose we want to look up the key k7 inside the hash table, and the hash value for
k7 is five. So we're trying to find the key k7. k7 hashes to five, so
we check there first: k7 is not equal to k4, so let's keep probing. So we probe quadratically,
and we're just going to go to the next slot over, and that's position six. Now, position six
is special, because it's the first position we encounter which has a tombstone in it.
So we're going to want to store this position for later, to perform an optimization. Okay,
keep probing, because we haven't found k7 yet. So when we probe at x equals two,
we go to position one — still no k7, because we have a tombstone, so we have to keep
going. Alright, we hit yet another tombstone, so let's keep probing, and aha, we have found
our key k7. So we can return the value v7. Now, however, we can do an optimization,
because we don't want to probe an additional four times to find k7, since we just
hit a bunch of tombstones along the way. So an optimization we can do is to relocate the
key value pair (k7, v7) to the first position where there was a tombstone, so that
next time we do a lookup, it'll be much faster. We call this lazy deletion or lazy relocation.
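A sketch of that relocation step, again with my own simplified, key-only code rather than the repository's (a real table would move the value alongside the key):

```java
public class LazyRelocation {
  static final Object TOMBSTONE = new Object();

  // Find `key` by linear probing; if we passed a tombstone on the way,
  // move the entry into that earlier slot and tombstone the old one.
  static int findAndRelocate(Object[] table, int home, Object key) {
    int n = table.length, firstTombstone = -1;
    for (int x = 0; x < n; x++) {
      int i = (home + x) % n;
      if (table[i] == null) return -1;
      if (table[i] == TOMBSTONE) {
        if (firstTombstone == -1) firstTombstone = i; // remember the first tombstone
      } else if (table[i].equals(key)) {
        if (firstTombstone != -1) {    // lazy relocation: shorten future probe chains
          table[firstTombstone] = table[i];
          table[i] = TOMBSTONE;
          return firstTombstone;
        }
        return i;
      }
    }
    return -1;
  }
}
```

After one lookup pays the full probing cost, every later lookup of that key stops at the earlier slot.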
Alright, so we replace the first tombstone there with (k7, v7). And now we have
two copies of that key value pair, so we're going to want to replace the old one with
a tombstone. Okay. All right, today we're going to be having a look at some source code
for a hash table that uses open addressing as a collision resolution scheme. And you
can find all the source code on GitHub at williamfiset/data-structures —
just go to the hash table folder to find a whole bunch of hash table implementations.
I have three different open addressing implementations here: in particular, quadratic probing,
linear probing, and double hashing. They're all very similar to each other, so I will
only be looking at one right now. But if you're curious, you can go on GitHub and check them
out for yourself. The one that's really different, or slightly more different, is the double hashing; but other than that, they're essentially
the same thing. Today, I've decided that we're going to have a look at the quadratic
probing file. So let's dive into the code. Alright, here we are inside the code for the
hash table that uses quadratic probing. So let's dive right in. So I have a class called
HashTableQuadraticProbing, and notice that it takes in two generic types, K and V. So
this is the key type, and this is the value type. You're going to have to specify these
when you instantiate an object of this hash table. So I have
a bunch of instance variables that we're going to need. The first is the load factor —
this is going to be the maximum load factor that we're willing to tolerate — then the current
capacity of our hash table, the maximum threshold we're willing to tolerate, and the modification
count, which is for the iterator. Next, I have two instance variables that keep track of
the total number of buckets being used, and the key count, which tracks the number of
unique keys currently inside the hash table. Now, since we're doing open addressing,
we are storing these key value pairs directly inside an array. So instead of having one
array with a wrapper class for an entry, I've decided to just allocate two different arrays,
one for keys and one for values — it makes the code a lot easier and shorter. Next,
this is just an instance variable we're going to be using, or rather setting, when we call
the get method. And this is the tombstone I was talking about in the last video.
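The tombstone itself can just be a single shared sentinel object compared by identity, along these lines (a sketch, not the repository's exact code):

```java
public class Sentinel {
  // One unique object shared by every deleted slot; identity comparison
  // means it can never be mistaken for a real key.
  static final Object TOMBSTONE = new Object();

  static boolean isDeleted(Object slot) {
    return slot == TOMBSTONE;
  }
}
```

Because the comparison is `==` rather than `equals`, no user-supplied key can ever collide with the marker.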
This is the marker we're going to be using for deletions: every time we delete an entry,
we're going to mark it with a tombstone, and we know this tombstone object is unique. Alright,
so these are just some default constants. So whenever you want to initialize a hash
table without any parameters, you can use these constants. So this is the default load
factor — I set it pretty low, but you can change it up as you like — and you can initialize it
with a default capacity. And this is the designated constructor, so let's have a look. So the
capacity can't be negative or zero. And I check if the user passed in some sort of a weird
load factor, because we don't want that. So then set the max value for the load factor,
then calculate the capacity. And to ensure that the capacity is a power of two, I need
to essentially round up the capacity. So that's done with this next-power-of-two method:
essentially, it finds the power of two that is just above the current number. Or
it'll take the default capacity, which itself is a power of two, so we don't have
to worry about that. So either way, it's going to be a power of two. Then we compute the
threshold, which is just the load factor times the capacity, and initialize our tables. Alright,
so let's get rolling. So this is the quadratic probing function I chose. So for P(x), we
give it the variable x, and then we compute x squared plus x, all divided by two. Next, this is
a really important method, the normalize index method: given a hash value, it essentially strips
the negative sign and mods by the capacity. So it dumps our hash value into the domain
zero to the capacity, non inclusive. Then this is the clear method, and it's pretty self explanatory.
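Those two helpers are worth sketching, because the probing function P(x) = (x² + x)/2 only visits every slot when the capacity is a power of two — which is exactly why the constructor rounds the capacity up. The names here are mine:

```java
public class QuadraticHelpers {
  // P(x) = (x^2 + x) / 2 — always an integer, since x^2 + x is even.
  static int probe(int x) {
    return (x * x + x) / 2;
  }

  // Strip the sign bit and mod by the capacity, mapping any hash into [0, capacity).
  static int normalizeIndex(int keyHash, int capacity) {
    return (keyHash & 0x7FFFFFFF) % capacity;
  }

  public static void main(String[] args) {
    int capacity = 8; // a power of two
    java.util.Set<Integer> slots = new java.util.HashSet<>();
    for (int x = 0; x < capacity; x++) slots.add(probe(x) % capacity);
    System.out.println(slots.size()); // 8 -> the probe sequence covers every slot
  }
}
```

For capacity 8, the offsets P(0..7) mod 8 come out as 0, 1, 3, 6, 2, 7, 5, 4 — all eight slots, so probing can never cycle without finding a free one.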
Just clear the contents of the hash table and start fresh. Then some helper methods:
size, getting the key count, and checking if our hash table is empty. And put, add, and insert are essentially
all the same method. So let's look at the insert method. This inserts a key value pair
inside the hash table, or updates a value if the key already exists. All right, we don't
want to allow null keys — if that happens, we throw an exception. If the number of buckets used
is greater than or equal to the threshold we're tolerating, we're going to resize the
table before doing anything else. Otherwise, we want to calculate the hash value from the
key using the built in hashCode method — and you can override this for your particular
object. Okay, now I need to explain what i, j and x are. So i is going to be the current index
we're at in the hash table; because we're going to be bouncing around, this i value is going
to change a lot. And j is the position of the first tombstone we encounter, if we encounter
one; otherwise, it's minus one. And we're going to be using this for an optimization.
And x is the probe offset, which I've set to one, initially. Okay, so this is a do while
loop. It's a little long, but it's pretty easy to understand. Alright, so first, we check in the key table:
have we hit a tombstone? If we have, and j is equal to minus one, that means we haven't
recorded a tombstone yet, so save where we found this tombstone. Okay, so this next check checks
if the key table has an entry that's not null, meaning there's a key inside of it. So we
have to check whether that's the key we're inserting, which may already exist in the hash table. So that's what this does: it compares
the current key at index i with the key we're trying to insert. If they match, and j is equal
to minus one, meaning we haven't hit a tombstone, then just update the value, bump the
modification count, and always return the old value that was there before, just in case you want to use
it. If we had hit a tombstone, then we instead want to do a swap with the tombstone slot.
Alright, next up: the current cell is null, so we can do an insertion. So j equal to
minus one means that we haven't seen a tombstone so far, so increment the number of used buckets
and the key count, and then store our key value pair. Otherwise, we have seen a tombstone,
and instead of inserting at element i, the element is inserted where the deleted
token was found — so we're inserting at j instead of i. And we're going to return null, because there was
nothing there before. Okay, and if we finish a loop iteration — so we get through all these if statements
and we haven't returned — that means that we need to keep probing: we had a hash collision,
because the slot we landed on had something in it. So we offset from
where we originally hashed to by the probing function, incrementing
x at the same time, so this will just hop to the next spot. And we'll do this while
we haven't found an empty slot — and we will find an empty slot. Alright, so containsKey and hasKey just check if
a key exists within the hash table. And to do this, I'm being pretty lazy: I'm just
calling the get method, and I'm setting an instance variable in there called the contains
flag, which gets set to true or false depending on whether the key is inside our hash table or not — because
hasKey and the get method would otherwise have essentially the same code. So that's
a bit of a code smell. Alright, so let's look at the get method, since it's getting used
in the hasKey method. So, similar setup: find the original hash index, set i equal to the hash,
and set j and x to what they were before. So essentially do all the same stuff as insert, mostly, except set
the flag — so that's different: we set the flag to be true when we identify that the
key is indeed inside the hash table. And here, our else condition is just shorter: we return
if we hit a null cell, instead of inserting a new element, and set the contains flag
to be false. Okay, so pretty easy. And the remove method is actually quite a bit shorter,
interestingly. So, same setup: check the key is not null, find the hash, and set x to be equal to
one. Here, we don't really care about tombstones too much, so we don't have a j position. So
we're going to start at the hash index and quadratically probe until we find a spot. So for every loop,
we're going to increment i, or rather find a new offset position each time the loop completes. So here's
what we do: we ignore tombstones, so just skip over those. Then, if we hit a null cell, the
key was not found in the hash table, and we can return null. Otherwise, the key we want to
remove is in the hash table, and we can do this equality check because we checked if it's null
before, so we're not going to get a null pointer exception. So decrement the key count, bump the modification
count, extract the old value, dump a tombstone here, and just wipe whatever value is in there.
So this is where we set those tombstones in the remove method, and then just return the
old value for good measure. Alright, and, okay, these two methods are pretty self explanatory:
they just return the keys and return the values that are contained within our hash table.
So the next really interesting method is this resizeTable method. This gets
called when we are inserting new elements and need to grow the table size. And remember
that in this particular quadratic probing implementation, we always need the capacity
to be a power of two. But since we know that the capacity is already a power of two, multiplying
it by two will keep it a power of two, so that's fine. So we compute the new threshold and allocate
new memory for a new table — I call it the old table, but it's actually going to be the new
table shortly. So here I perform a kind of interesting maneuver: I swap the current
table, that is, the smaller table, with this new table, which I call the old table, in order
to be able to call the insert method down here — we'll get to that. So swap the key tables,
swap the value tables, and reset the key count and the bucket count. And the reason
I call it the old key table is because, after the swap, the new table is
actually the current pointer for what was the old table. That might sound confusing,
but I'm using the table we had before — or the pointer to it — to do insertions on. Alright,
so loop through the old table values, and if we encounter an entry that's
not null and not a tombstone, then we reinsert it. And because we're avoiding reinserting tombstones,
we're getting rid of all the tombstones. So even though our table might have been cluttered
with tombstones, we're just getting rid of all of them here. Alright, so that's that.
The iterator you can probably have a look at yourself — it's just looping through
all the keys and returning them one at a time. And that's a pretty standard toString method.
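The swap maneuver described here can be sketched like this, simplified to a key-only table with my own names — not the repository's exact code:

```java
public class ResizeSketch {
  static final Object TOMBSTONE = new Object();
  static Object[] keyTable;
  static int capacity, usedBuckets, keyCount;

  static int probe(int x) { return (x * x + x) / 2; }

  // Plain quadratic-probing insert into the live table.
  static void insert(Object key) {
    int hash = key.hashCode() & 0x7FFFFFFF;
    int i = hash % capacity;
    for (int x = 1; keyTable[i] != null; x++) i = (hash + probe(x)) % capacity;
    keyTable[i] = key;
    usedBuckets++; keyCount++;
  }

  static void resizeTable() {
    capacity *= 2;                    // a power of two stays a power of two
    Object[] oldKeyTable = keyTable;  // after the "swap", this points at the old data
    keyTable = new Object[capacity];  // the fresh array becomes the live table
    usedBuckets = 0; keyCount = 0;
    for (Object k : oldKeyTable)
      if (k != null && k != TOMBSTONE) // skipping tombstones purges them all
        insert(k);
  }
}
```

The point of the swap is that the freshly allocated array becomes the live table before reinsertion, so the ordinary insert method can be reused, and tombstones vanish because they are simply never carried over.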
So that's how you do quadratic probing with open addressing. Today, I want to talk about
the Fenwick tree, also sometimes called the binary indexed tree, and you'll see why very
soon. This is one of my personal favorites, because it's such a powerful data structure,
it boasts efficiency, and it's really, really simple to code. So let's dive right
in. So the things I'm covering in the Fenwick tree video series are just some standard stuff:
go over some motivation for why this data structure exists, analyze the time complexity,
and then go into some implementation details. So in this video, we'll get to the range query,
and in later videos, how to do point updates and how to construct the Fenwick tree in linear
time. You can also do things like range updates, but I'm not going to be covering that in this
series, at least not yet. Okay, so what is the motivation behind the Fenwick tree? So
suppose we have an array of integer values, and we want to query a range and find the
sum of the values in that range. Well, one thing we can do is start at the beginning position and scan
up to where we want to stop, summing all the individual values in that range.
And that's fine, we can do this. However, it'll soon get pretty slow, because we're
doing linear queries, which is really bad. However, if we do something like compute all
the prefix sums for the array, then we can do queries in constant time, which is really,
really good. So if we set the prefix array P at index zero to be zero, and then we go along our array
and add each element to the current prefix sum, we get five, and then five plus negative
three gives us two, and then two plus six is eight, and so on. So this is an elementary
form of dynamic programming right here, calculating out the prefix sums. And then if we want to find
the sum from, say, index two to seven non inclusive, then we can just take the difference between those
two indices in the P array, and that's a constant time thing to compute: the sum of the values
between two and seven. So that's really great. However, there's a slight flaw
in this, which is: what happens when we want to update a value in our original array A?
Say we want to update index four to be three. Well, now we have to recompute all the prefix
sums. So this is really bad, because recalculating all those prefix sums takes
linear time. This is why the Fenwick tree was essentially created. So what is the
Fenwick tree? The Fenwick tree is a data structure that supports range queries on arrays and
also point updates — also range updates, but we won't be covering that in this video. So
its construction is linear time, point updates are logarithmic, range sum queries are
also logarithmic, and range updates are logarithmic. But you can't, say, add elements to the array —
you can't make the array bigger — or entirely remove elements. Okay, so let's look at how we can do range queries
on this thing. The first thing you need to know is that, unlike a regular array, in a Fenwick tree
each cell is not responsible for itself, but rather for a range of other cells as well.
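In code, the "range of responsibility" of a one-based index is just its least significant set bit, and the cascading prefix sum the rest of this section describes falls out of repeatedly subtracting it. A sketch with my own names:

```java
public class FenwickSketch {
  // Range of responsibility of one-based index i: isolate the least significant set bit.
  static int lsb(int i) {
    return i & -i; // e.g. lsb(12) = 4, lsb(10) = 2, lsb(11) = 1
  }

  // Prefix sum over [1, i]: cascade down by stripping the LSB until we hit zero.
  static long prefixSum(long[] tree, int i) {
    long sum = 0;
    for (; i != 0; i -= lsb(i)) sum += tree[i];
    return sum;
  }

  // Interval sum over [i, j], both inclusive: the difference of two prefix sums.
  static long rangeSum(long[] tree, int i, int j) {
    return prefixSum(tree, j) - prefixSum(tree, i - 1);
  }
}
```

So for index 7 the loop touches cells 7, 6, 4 and then stops at zero, exactly the cascade walked through below.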
So we're going to say that a cell is responsible for other cells depending on the value
of the least significant set bit in its binary representation. So on the left, I have a one
based array — Fenwick trees are one based, that's very important — and on the side of that,
I also put the binary representation of each of the numbers, so you can clearly see what they
look like in binary. So if we have, say, the index 12, its binary representation is 1100.
The least significant set bit is at position three in
the binary representation, so that index is responsible for, we're going to say, two to
the power of the position minus one — so four cells below itself. Similarly, 10 has a binary
representation of 1010, and the least significant bit is at position two,
so it's responsible for two cells below itself. And 11 has its least significant bit at position
one, so it's solely responsible for itself. So here, I've outlined the least
significant bits — I think that's really hard to say — for all the odd numbers, which are just responsible
for themselves. That's what the blue bars indicate; the blue bars don't represent values,
they represent a range of responsibility. And that's really important for you to keep
in mind. So now, here are all the cells which have a range of responsibility of two, and now the cells that have
a range of responsibility of four. Notice that all these ranges of responsibility are powers
of two: 8 is responsible for eight cells, and 16 for 16 cells. So now, how do we do a range
query on this, now that we're not really working with a standard array, but rather this weird
type of range of responsibility? And the answer is, we're going to calculate the prefix
sum up to a certain index, and that's what's eventually going to allow us to do range
queries. So let's figure out how to do prefix sums, just like we did for a regular array,
but on this Fenwick tree. So the idea is we're going to start at some index and cascade downwards
until we reach zero — you'll soon see what I mean. So for example, let's find the prefix
sum up to index seven. So we're calculating the prefix sum from index one to seven,
inclusive. Usually, everything in a Fenwick tree is inclusive. So if we look at where
we are at seven, we can get the value in the array at position seven. And then we want
to cascade downwards. So the next guy below us is six, and then four. Notice that we're
like hopping down: we start at seven, and then we move down to six. And then from six,
we go down two cells, because six has a range of responsibility of two, and then we're at four,
and four has a range of responsibility of four, which brings us all the way down to zero. And
we're always going to be able to go from where we are all the way down to zero. So the prefix
sum for seven is the array at index seven, plus the array at index six, plus the array at index
four. All right, now let's find the prefix sum for index 11. So we always start at where
we are, so that's 11, and then we're going to cascade down. So the cell directly below
us is 10, and 10 has a range of responsibility of two, so we're going to go down two — that's eight — and eight brings us all
the way down directly to zero. Okay, cool. And one last one: let's find the prefix sum
up to index four. So we start at four, and four has a range of responsibility of exactly
four, so already we're back down to zero and we can stop. Okay, let's pull this all
together and try to do an interval sum between i and j. So let's calculate the interval sum
between, say, 11 and 15. The idea is we're going to calculate the prefix sums of 15 and of
11. So we're going to calculate the prefix sum of 15, and then we're going to calculate the
prefix sum up to 11. But notice that we're not going to calculate up to 11 inclusive,
but exclusive, because we still want the value at 11 itself included in our range. Okay, so if we start at 15, then we
cascade down until we hit zero. So 15 has a range of responsibility of one; subtract one
from 15 and we get to 14. And 14 has a range of responsibility of two, because the least significant
bit of 14 is at position two. Okay, now we go to 12, and then keep cascading down. So the prefix
sum up to 15 is the array at index 15, plus 14, plus 12, plus eight. All right, now the prefix
sum up to 11, not inclusive — so we're going to start at 10. Now we want to cascade down
until we hit zero. So 10 has a range of responsibility of two; subtract two from 10 and we get to eight.
Now, eight has a range of responsibility of eight, so cascade down: eight minus
eight is zero. So now, the range sum is the sum of the cells we touched for 15 minus those for
10, and that's how we would calculate the range sum. So notice that in the worst case,
we're querying a cell whose binary representation is all ones — these are the numbers
of the form two to the n, minus one. So a power of two has one
bit set and all zeros, but when you subtract one, then you get a whole bunch of ones. So
if you look at 15, almost all of its bits are ones. And those are the worst cases. So
in the worst case, we might be asked to query say 15, and seven, both of which have a lot
of ones in them. So we're doing about to log base two of N operations. But in the average
case, this actually pretty good. And we're going to implement this in such a way that
it's just going to be bit manipulation. So this i
s like super fast. So the range query
algorithm is pretty much this. You see, it's literally almost no code. For the range query from
i to j, we have a Fenwick tree of size n, and I'm going to define a function called prefix
sum which does that cascading down operation. So we start at i, and while
i is not equal to zero, we sum up the values in our Fenwick tree and we
remove the value of the least significant bit. We keep doing
this until our value is zero, and then we can return the sum. The range query is then
just the difference between those two prefix sums. So, really neat little algorithm. I want to talk
about Fenwick trees and point updates. So let's dive right in. But before we get to
that, make sure you've checked out the Fenwick tree range query video that I
posted last, just to get the context of how the Fenwick tree is set up and how we're doing
operations on it. Okay, so just to refresh your brain on how we actually did
a prefix
sum, and how we did those range queries. What we did was start at a value at
some index, it doesn't matter which, and then continuously remove the least significant
bit until we hit zero. That's the cascading down effect. So let's say we started at 13: 13's least
significant bit was one, so we removed one and we got 12. Then we found that
the least significant bit of 12 was four, so we removed four. And then the least significant bit
of eight was eight, so now
we reach zero. And once we reach zero, we know that
we're done. Doing a point update is very analogous to this, except instead of removing, we're
going to be adding the least significant bit, as you'll see. So for instance, if we want
to add a value at index nine, then we have to find out all the cells which are responsible
for nine, because remember, each cell is responsible for a range of cells. So we start
with nine, find the least significant bit, and add it to nine, and we get 10. Then
10's least significant bit has a value of two, so we add it to 10 and get 12. Then we find the least
significant bit of 12, which is four, and adding it gives 16. And then we would do the same thing again,
but we're out of bounds, so we would know to stop. So if I draw a line outwards
from nine, all the cells that I hit are the ones I have to update. Remember, those
lines represent a range of responsibility, and the lines that I hit are exactly the ones
I got when I added the least significant bits. Okay. So say we add
some constant x at position
six in the Fenwick tree, which cells do I need to modify? So we start at six, find
the least significant bit of six, which is two, and add it to six, so we get eight. Then find the least significant bit
of eight, which is eight, and add it to get 16. So if I draw a line out from six, then indeed the cells that I
hit are six, eight, and 16. So the required updates for our Fenwick tree are that we need
to add x to positions six, eight, and 16. So the algorithm is really, really simple. If
we have a Fenwick tree stored in an array of size n, then, with i being the position
we originally want to update, while i is less than n, we're going to add x to the tree
at position i and then add on the value of the least significant bit of i to i. And that's it, where
the function LSB gives the least significant bit of i. There are built-in functions to do
that, usually, and we're going to have a look at some source code. Alright, so that was
point updates. And now we actually need to construct the Fenwick tree. Let's talk about
Fenwick tree construction. We've already seen how to do range queries and point updates
in the last two videos. But we haven't even seen how to construct the Fenwick tree yet.
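Before moving on, here is a compact recap sketch of those two operations in Java, using one-based indexing; the class and variable names are my own, and this is not necessarily the repository's exact code.

```java
// Minimal sketch of the two Fenwick tree operations covered so far.
// The tree array is one-based: index 0 is unused.
class FenwickOps {
    // Isolates the least significant bit, e.g. lsb(12) = 4.
    static int lsb(int i) {
        return i & -i;
    }

    // Prefix sum of tree[1..i]: cascade down by removing the LSB each step.
    static long prefixSum(long[] tree, int i) {
        long sum = 0;
        while (i != 0) {
            sum += tree[i];
            i -= lsb(i);
        }
        return sum;
    }

    // Range sum over [i, j]: the difference of two prefix sums.
    static long rangeSum(long[] tree, int i, int j) {
        return prefixSum(tree, j) - prefixSum(tree, i - 1);
    }

    // Point update: add x at position i by cascading up via the LSB.
    static void pointAdd(long[] tree, int i, long x) {
        while (i < tree.length) {
            tree[i] += x;
            i += lsb(i);
        }
    }
}
```

Building a tree purely through point updates and then querying it exercises both cascades at once.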
And the reason I've kept this for last is you can't understand the Fenwick tree construction
without having previously understood how point updates work. Alright, so let's dive right
in. So we could do the naive construction of a Fenwick tree. So if we're given an array
of values, a, and we want to transform this into a
Fenwick tree, what we could do is initialize
our Fenwick tree to be an array containing all zeros and add the values into the Fenwick
tree one at a time using point updates to get a total time complexity of order n log
n. However, we can do better, we can actually do this in linear time. So why bother with
n log n? Alright, so in the linear construction, we're going to be given an array of values
we wish to convert into the Fenwick tree, a legitimate Fenwick tree, not just the array
of values th
emselves. And the idea is we're going to propagate the values throughout our
Fenwick tree in place. We're going to do this by updating the
immediate cell that is responsible for us. Eventually, as we pass through the entire tree, everyone's
going to be updated, and we're going to have a fully functional Fenwick tree at the end
of the day. So it kind of relies on this cascading idea: you propagate something to the
parent who's responsible for you, and then that parent propagates its value to its parent,
and so on. It's almost like delegating the value. So let's see
how this works. Oh, one more thing before that: if the current position is position
i, then the immediate cell above us which is responsible for us, our parent, let's call it j, is
given by i plus the least significant bit of i. Alright, so if we start at one, well,
the least significant bit of one is one, so the parent is at position two. Notice that
there was a four at position two, but we're going to add on the value at position one, which
is three, so now position two has a value of seven. Next we want to update position two, so
find out which cell is responsible for position two: two plus the least significant bit
of two is four, so four is immediately responsible for two.
So go to index four and add the seven. Then who is responsible for three? Well, that's
four. So go to position four and add the value at index three. Now, who's responsible
for four? Well, eight is responsible for four. So go to position eight and add the value
at index four. And you see how we keep
doing this, updating our parent, the immediate cell responsible for us; now seven is updating
eight. But oh, eight doesn't have a parent.
Because our Fenwick tree is too small, it only has 12 cells, but the parent that would
be responsible for eight is 16, and 16 is out of bounds. So we just ignore it; it's
not relevant. Now we keep going. So i is nine, and nine's least significant bit is
one, so j is 10. That's where the parent is. So keep propagating that value: 10's parent is
12, and eleven's parent is also 12. And now we're in the same sort of situation we had with eight,
where we have an out of bounds situation, so we ignore it. So the values that are there
right now are the values of the Fenwick tree. And with these values, we can do range queries
and point updates, not with the original array that we had. So let's look at the construction
algorithm itself. In case you want to implement this, we will have a look at some source code
in the next video. But if you're using another language that I'm not using, this can be helpful.
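In that spirit, here is a sketch in Java of the linear construction, with one-based indexing (values[0] is unused) and my own names, not necessarily the repository's exact code:

```java
// Sketch of the linear-time Fenwick tree construction: clone the one-based
// values array, then propagate each cell's value to its immediate parent,
// which sits at position i + lsb(i).
class FenwickBuild {
    static long[] construct(long[] values) {
        int n = values.length;            // indices 1..n-1 hold the data
        long[] tree = values.clone();     // deep copy: we work in place
        for (int i = 1; i < n; i++) {
            int parent = i + (i & -i);    // immediate cell responsible for i
            if (parent < n) tree[parent] += tree[i]; // ignore out of bounds
        }
        return tree;
    }
}
```

With this array-length convention (data in indices 1 to n-1), the bounds check is strictly less than n; the transcript's "less than or equal" applies when n is the index of the last cell rather than the array length.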
So, given an array of values you want to turn into a Fenwick tree, let's say its length is n.
I recommend you actually clone, or make a deep copy of, the
values, just so that you don't accidentally manipulate the values array while you're constructing
your Fenwick tree. That could be problematic, because we're doing all this stuff in place.
So clone the values array, then start i at one and go up to n, and compute j,
the parent, as i plus the least significant bit of i. Do an if statement to
check if j is less than n; that might actually need to be less than or equal to, and
thinking about it, because everything is one-based in a Fenwick tree, yeah, I'm pretty
sure it should be less than or equal to. Alright, let's have a look at some Fenwick tree source
code. I'm here in my GitHub repository. You can find it at this link, which I'll put in
the description below: williamfiset/data-structures. The Fenwick tree
source code is right here under the Fenwick tree folder. So let's dive right in. I have
it here locally on my computer in my text editor. Alright, so this source code is provided to
you in Java, but it's really easy to translate to any language you're working in. I
create a Fenwick tree class which has two constructors. One will create an empty
Fenwick tree for a given size which you then populate yourself, and another one, which
you give it an array of values, like we saw in the last video, and constructs the Fenwick
tree in a linear time. So this is probably the constructor you want to use, and not the
other one. But I give you the option to use either or. So one thing that you guys should
know is that the values array you pass in needs to be one-based. In the last video,
I
was hesitant on whether to go less than or less than or equal to the
length of the array, and that's going to depend on whether your array is one-based or zero-based.
Usually everything in a Fenwick tree is one-based, in which case it would be less
than, if you're using this as the length. All right. But other than that, this is just
the construction algorithm we saw: propagate the value to the parent, that's
right here, and ignore it if it's out of bounds. So pretty sim
ple stuff. Probably
one of the most interesting methods is the least significant bit method, which is going
to return the value of the least significant bit for some integer i. This bit magic
right here essentially isolates and returns the least significant bit value. Something
that's a bit more readable is this right here, which uses Java's built-in method to find
the least significant bit. However, using raw bit manipulation like this
without an additional function call will be faster. Okay, so the prefix sums: this is
something we covered in the first video, which allows you to compute the prefix sum from
one to i, both inclusive, and all this is being done one-based. So this does the cascading
down we were talking about: start with a sum equal to zero, and add the values at
the indices you hit along the way while cascading down. And this line, line 55, is equivalent
to i minus-equals the least significant bit of i, which is a lot more readable than this
crazy bit manipulation, which essentially just clears that bit. But you want to use as much
bit manipulation as you can to keep your Fenwick tree fast, even though it's already really,
really fast; the more bit manipulation you use, the fewer machine-level
operations you're going to do, and so your program is going to be that much faster. Okay,
if you want to find the sum between i and j inclusive, then we can call the prefix sum method right
here, and essentially just take the difference. So
that's easy. Now adding: this is for
a point update, at position i add k to it, and k can be positive or negative, that
doesn't really matter. What you're going to do is, for i, update everyone
who's responsible for you, all the ranges that are responsible for you, and for each
of those, add the constant k. Then you propagate up to
your parent by setting i equal to i plus the least significant bit of i, and you keep
doing that while you're still within the tree at some valid index. And this additional
method, added for fun, lets you set index i equal to k; this might sometimes
be useful. So just get the value at i and then call the add method with the difference. Pretty simple
stuff. As you can see, this is about 80-ish lines, and half of it is comments. So this is a really
simple and yet extremely fast data structure. An interesting topic I want to talk about
today is the suffix array. This is an incredibly powerful data struct
ure to have in your toolbox
when you're doing some string processing. Suffix arrays are a relatively new data
structure, appearing around the early 90s due to the heavy memory consumption needs
of suffix trees. Let's start with the basics and talk about just what a suffix is. For
our purposes, a suffix is a non-empty substring at the end of a string. For example, if we
ask ourselves what all the possible suffixes of the string "horse" are, we are able to come
up with five unique suffixes, and
they are e, se, rse, orse, and horse. Now we can answer
the question what is a suffix array? The answer is a suffix array is the array containing
all the sorted suffixes of a string. Let's see an example of this. Suppose you want
to find the suffix array for the word camel. On the left, I constructed a table with all
the suffixes of camel and the indices of where that particular suffix started in a string
camel. Then, on the right hand side, I sorted all the suffixes in lexicographic order in
a table. The actual suffix array is the array of sorted
indices highlighted in orange, we do not need to actually store the suffixes themselves
if we know where the suffix begins in the original string. This is an ingenious idea,
and it provides us with a compressed representation of the sorted suffixes without actually needing
to physically store the suffixes themselves. All we need is the original string and the
array of associated indices. In summary, a suffix array is an array of indices wh
ich
store the sorted suffixes of a string. Furthermore, for a bit of history on the suffix array,
it is a data structure originally designed to be a space efficient alternative to a suffix
tree, which in turn is itself meant to be a compressed version of another data structure
called a trie. As a side note, even though the suffix array is very different from
the suffix tree, suffix arrays can do just about virtually anything the suffix tree
can with some additional auxiliary information such a
s a longest common prefix array, which
will be the topic of our next video. In this video, we're going to talk about perhaps the
most important piece of information associated with the suffix array. And that is the longest
common prefix array, also known as the LCP array. The LCP array is an array where each
index stores how many characters two sorted suffixes have in common with each other. Let's
have a closer look. Perhaps the best way to show what the LCP array is, is to do an example.
In the
following example, we'll find the LCP array for the string ABABBAB. The first thing we'll want to do is construct the suffix array for our string
to find out what all the sorted suffixes are. Notice that the very first entry that we placed
in our LCP array, that is, the middle column, is zero. This is because this index is undefined,
so we'll ignore it for now. To begin constructing our LCP array, let's start by looking at the
first two suffixes and seeing how many characters th
ey have in common with each other. We noticed
that this is two so we placed two in the first index of our LCP array. Now we move on to
the next two suffixes. Notice that they also have an LCP value of two, and the next two
don't have anything in common, so we placed zero. The next two only have one character
in common, then three characters in common, and lastly, only one character in common.
So the result is the following LCP array, highlighted in purple. In summary, the LCP
array is an array which tracks how many characters two sorted suffixes have in common with each
other. Although very simple, you'd be surprised how much information can be derived from such
a simple construction. Another thing worth noting is that the very first index in our
LCP array was undefined. Since we store the LCP array as an integer array, by convention,
we usually set this first index to be zero, so that it doesn't interfere with any operations
we might want to perform on the LCP array. And this is fine for
most purposes. Lastly,
we saw what the LCP array was, but not how to construct it very efficiently. There are
other algorithms than the one I showed you for constructing the LCP array, which run
in a much better time complexity, that is, in n log n time, and even in linear time.
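As a sanity check, the ABABBAB example above can be reproduced with a short naive sketch (O(n² log n) suffix sorting, fine for illustration; class and method names are my own, not the repository's):

```java
import java.util.Arrays;

// Naive sketch: sort the suffix start indices by comparing the suffixes
// themselves, then compare adjacent sorted suffixes to get the LCP array.
class SuffixLcp {
    static Integer[] suffixArray(String s) {
        int n = s.length();
        Integer[] sa = new Integer[n];
        for (int i = 0; i < n; i++) sa[i] = i;
        Arrays.sort(sa, (a, b) -> s.substring(a).compareTo(s.substring(b)));
        return sa;
    }

    static int[] lcpArray(String s, Integer[] sa) {
        int n = s.length();
        int[] lcp = new int[n]; // lcp[0] is undefined; by convention it is 0
        for (int i = 1; i < n; i++) {
            String a = s.substring(sa[i - 1]), b = s.substring(sa[i]);
            int k = 0, m = Math.min(a.length(), b.length());
            while (k < m && a.charAt(k) == b.charAt(k)) k++;
            lcp[i] = k;
        }
        return lcp;
    }
}
```

For "ABABBAB" this yields the suffix array [5, 0, 2, 6, 4, 1, 3] and the LCP array [0, 2, 2, 0, 1, 3, 1], matching the walkthrough above.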
In this video, I want to discuss a new application of suffix arrays and LCP arrays, and that
is finding and counting unique substrings. There are a variety of interesting problems
in computer science, especially bioinformatics, that require you to find all the unique substrings
of a string. The naive algorithm has a terrible time complexity of n squared, which requires
a lot of space. The idea is to generate all the substrings of the string and dump them
inside a set. A superior approach is to use the information stored inside the LCP array.
This provides not only a quick but also a space efficient solution. I'm not saying that
this is the canonical way of finding all unique substrings, because there exist other
notable
algorithms such as Rabin-Karp in combination with Bloom filters. Let's now look at an example
of how to find all the
unique substrings of the string AZAZA. For every string there are exactly n times
n plus one over two substrings; the proof of this I will leave as an exercise to the listener,
but it's really not that hard to derive. Now notice that among all the substrings here, there
are a few duplicate ones. I have highlighted the repeated substrings: there are exactly
six of them, and nine unique ones. Now let's use the information inside the LCP array to
figure out which of those substrings really were the duplicate ones. In the table on the
right, I generate the LCP array for the string AZAZA. Remember what the LCP array represents:
it tells us that two adjacent sorted suffixes of the original string share a certain number
of characters with each other. So if the LCP value at a certain index is, say, five, then
there are five characters in common between those two suffixes; in other words, there
are five repeated substrings between those two suffixes, since they come from the same
larger string. So if we inspect the first LCP position at index one, we see that it has
a value of one. We know that the repeated string is the first character, so we know
that A is a repeated substring. The next LCP value is three, so there are three repeated
substrings between AZA and AZAZA, namely A, AZ, and AZA. The next interesting LCP value is two, for
the last two suffixes, so there are two repeated substrings; here we can eliminate Z and ZA.
We can then come up with an interesting way of counting all unique substrings. We
know how many substrings there are, and that is n times n plus one over two. And we also
know the number of duplicate substrings: that is the sum of all the LCP values. If this
doesn't make immediate sense, try out a few examples and play around with it. If we go
back to the example we just did with the string AZAZA and we set n equal to five, which
is the length of the string, then we get the correct answer of nine by plugging in
n equals five and then subtracting all the repeated substring values summed up in the LCP array.
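The counting rule just described can be sketched and checked directly; the suffix sorting here is naive, and the class name is my own:

```java
import java.util.Arrays;

// Sketch: number of unique substrings = n(n+1)/2 - sum of the LCP values.
class UniqueSubstrings {
    static long countUnique(String s) {
        int n = s.length();
        Integer[] sa = new Integer[n];
        for (int i = 0; i < n; i++) sa[i] = i;
        Arrays.sort(sa, (a, b) -> s.substring(a).compareTo(s.substring(b)));
        long duplicates = 0; // total repeated substrings = sum of LCP values
        for (int i = 1; i < n; i++) {
            String a = s.substring(sa[i - 1]), b = s.substring(sa[i]);
            int k = 0, m = Math.min(a.length(), b.length());
            while (k < m && a.charAt(k) == b.charAt(k)) k++;
            duplicates += k;
        }
        return (long) n * (n + 1) / 2 - duplicates;
    }
}
```

For "AZAZA" this returns 9, matching the example: 15 total substrings minus the 6 duplicates counted by the LCP array.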
Alright, so if you're looking for a programming challenge concerning counting substrings,
like I mentioned here, check out the link to the Kattis online judge for some practice.
I also included some source code for the suffix array and LCP array, available on GitHub
at github.com/williamfiset/data-structures. There is a really neat problem called the longest
common substring problem, or its generalization, the k-common substring problem, which is really
what I want to focus on today. First, let's state the problem and then discuss multiple
ways of solving it. Suppose we have n strings. How do we find the longest common substring
shared between at least k of them, with K being anywhere from two to n, the number of
strings. As an example, consider the three st
rings, s one, s two, and s three, with the
value of k equal to two, meaning that we want a minimum of two strings from our pool of
three strings to share the longest common substring between them. In this situation,
the longest common substring is BCA. But note that the longest common substring is
not required to be unique, because there can be multiple. The traditional approach to solving
this problem is to employ a technique called dynamic programming, which can solve the problem
and a time
complexity equal to the product of the string lengths. Obviously, this method
can get unwieldy very quickly, and you should avoid using it whenever possible. A far superior
approach, and the way to solve this problem, is to use a suffix array, which can find the
solution in linear time, proportional to the sum of the string lengths. So how do we do this?
How do we use a suffix array to solve the longest common substring problem? Let's consider
the three strings we saw earlier, s one, s two and s thr
ee. What we will first want to
do is concatenate them all together into a larger string, which I will call t, which
is short for text. However, when doing this, we must be careful to place unique sentinel
values between our strings. We need to do this for multiple reasons, the main one
being that we must avoid any intermingling of suffixes when we construct the suffix array.
Additionally, I would like to note that these sentinel values need to be lexicographically
less than any of the charact
ers contained in any of our strings. So in the ASCII table,
the pound sign, the dollar sign and the percent sign are all less than any alphabetic character
contained within s one, s two and s three, so we're good in doing the concatenation.
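That concatenation step might look like this as a sketch. It assumes alphabetic input and only a handful of strings, so the sentinels '#', '$', '%', and so on stay below 'A' in ASCII; note this version also appends a sentinel after the last string, a common convention:

```java
// Sketch: join the strings, appending a unique sentinel after each one.
// '#' is 35, '$' is 36, '%' is 37 in ASCII, all below 'A' (65), so every
// sentinel sorts before any alphabetic character.
class SentinelConcat {
    static String concat(String... strings) {
        StringBuilder sb = new StringBuilder();
        char sentinel = '#';
        for (String s : strings) {
            sb.append(s).append(sentinel);
            sentinel++; // each separator must be unique
        }
        return sb.toString();
    }
}
```

Because each sentinel is unique, the longest common prefix of two suffixes can never run across a sentinel, which is exactly the non-intermingling property we need.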
Once we have the string t, we are able to construct the suffix array for t. This procedure
can be accomplished in linear time with a linear suffix array construction algorithm.
You will notice that on this slide, I am displaying both the LCP array values on
the leftmost
column and the sorted suffixes of t as they would appear in the suffix array on the right
column. I also gave each suffix a certain color to match it with the original string
it belongs to. In this slide, you can see that the suffixes
starting with Sentinel values got sorted to the top because they were lexicographically
less than all the characters in the string. And this is to be expected. For our purposes,
we want to ignore them because we injected them into the string t ourselv
es. So back
to the central problem: how do we find the longest common substring of k strings, given
that we now have the suffix array and the LCP array constructed? The answer is, we're
going to look for k suffixes, each of a different color, which share the largest LCP value. Let's
do an example with k equals three. Since k equals three and we have three strings, this means
that we need one suffix of each color. We can achieve a maximum LCP of two if we select the
following three adjacent suffixes. Notice
that each of them is a different color, meaning
that they came from different original strings, and that the minimum LCP value in the selected
window is two; note that we must ignore the first entry in the window. This means that
the longest common substring solution is the string CA, of length two, which is shared amongst
all three of s one, s two and s three. Let's do another example. But this time, let's change
the value of K to be equal to two. Since k equals two, we want to have two suffixes
of
different colors and the maximum longest common prefix value between them. In this case, there
is a unique solution, which is the string BCA, with a length of three, shared between
s one and s two, but not s three. So far, we've covered only some trivial cases; things
get much messier when not all the different colors you need are exactly adjacent to
each other. A trick we will use to overcome this will be to use a sliding window technique
to capture the correct number of suffix colors. Here's what we'll do: at each step, we'll adjust
the bottom or the top endpoint so that the window contains exactly k suffixes of different
colors. Once we have a valid window with the correct number of colors, we'll want to query
the minimum LCP value for that window range. In the picture below, you can see that the minimum
value in the LCP array for that window is two; again,
ignore the first entry, which is five. Since the minimum value for that window is two,
all the suffixes in that window share the prefix AG, which has a length of two, as
we would expect. Here are some more details on how we are actually going to perform the
query to obtain the minimum LCP value on the current window we are considering. Turns out
that this is a really popular problem, and it can be solved in a variety of ways. Since
we're dealing with a sliding window and not just any arbitrary range query, we can use
a linear-time solution to the sliding window minimum problem to obtain
the value we
want. Alternatively, I would recommend using a minimum range query data structure such
as a segment tree to perform logarithmic range queries on the LCP array. This is theoretically
a little slower, but it is much easier to implement in my opinion. So to implement the
sliding window, we will also need an additional data structure to keep track of the colors in our window; I recommend using
a hash table for this task. On the bottom left, I drew a table to indicate how much
of each color we are capturing in the current window; a valid window will require at least
k colors to have a count greater than zero. In the example that we'll follow,
let's suppose that k equals three, meaning all three colors will need to be present in
our window for a valid query to occur. By query I mean, querying the LCP array for the
minimum value in the current window, and then possibly updating the best longest common
substring value found so far. Right now you can see that our window
is missing some blue.
So our rule when we're missing a certain color is to expand the window down. And when we
already meet the criteria, we shrink the window down. So let's expand our window downwards
to capture some blue. Now we have enough blue and enough of each color to do a valid query.
When we perform the query, we would see that the longest common substring for the window
we're in will have length three, representing the string AG. Now, since we have enough colors,
we can then reduce the
window size: I removed one green suffix, and we still have at least
one of each color. So we can again perform a query to find out that we get the same result
as last time. Now I shrink the window even more. And now we don't have enough green.
So we cannot perform a valid query here. What we need to do is expand the window downwards
until we hit a green suffix. Luckily, there was a green suffix right there. And we can
perform a query to attempt to update the best longest common substring value
found so far.
In this window, the longest common substring is only of length one. So the longest common
substring for this window is the string A, which is not as long as the longest common
substring we found before, which was AG of length three, so we can ignore this one and
keep searching. Now we shrink the interval, and we're short one color, and that is red.
Let's expand downwards until we reach a red suffix. So we expand and find a blue suffix;
this is not what we wanted. So let's keep searching: we expand and find a green suffix,
and we keep doing this until a red suffix is reached. This video is getting a little long,
so I'm going to break it up here. And in the next video, we'll look at a full example of
this method being applied. However, in the meantime, if you're looking for a challenge
regarding the longest common substring problem, make sure to check out the Life Forms problem
on the Kattis website. And if you're looking for an implementation of the longest common substring
algorithm, as I just described, make sure to check out my algorithms repository
at github.com/williamfiset/algorithms. We're going to finish where we left off in the last
video. In this video, I want to do a full example solving the longest common substring
problem with a suffix array. For this example, we're going to have four
strings, s one, s two, s three and s four. I have also selected the value of k to be
equal to two, meaning that we want a minimum of two strings of our pool of four to share
the longest common substring between them. I have also provided you with the concatenated
text we'll be working with as well as the solution at the bottom of the screen. In case
you want to pause the video and figure it out for yourself. The first step in finding
the longest common substring between a set of our four strings is to build the suffix
array and the LCP array, which I have displayed on the right side and the left side, respectively.
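Before walking through the slides, here is how the whole procedure might be assembled in code. This is my own simplified sketch, not the repository implementation: it uses naive suffix sorting, a plain linear scan for the window minimum instead of a sparse table or segment tree, and hypothetical example strings in the usage below.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the sliding-window longest common substring algorithm for at
// least k of the given strings. Assumes alphabetic input and few enough
// strings that the sentinels '#', '$', ... all stay below 'A' in ASCII.
class LongestCommonSubstring {
    static Set<String> lcs(String[] strings, int k) {
        int m = strings.length;
        // 1) Concatenate with unique sentinels and record each position's
        //    color (which source string it came from). Sentinel positions
        //    inherit their string's color, but those suffixes are skipped.
        StringBuilder sb = new StringBuilder();
        for (int c = 0; c < m; c++) sb.append(strings[c]).append((char) ('#' + c));
        String t = sb.toString();
        int n = t.length();
        int[] color = new int[n];
        for (int c = 0, pos = 0; c < m; c++)
            for (int j = 0; j <= strings[c].length(); j++) color[pos++] = c;
        // 2) Build the suffix array and LCP array (naively, for clarity).
        Integer[] sa = new Integer[n];
        for (int i = 0; i < n; i++) sa[i] = i;
        Arrays.sort(sa, (a, b) -> t.substring(a).compareTo(t.substring(b)));
        int[] lcp = new int[n];
        for (int i = 1; i < n; i++) {
            String a = t.substring(sa[i - 1]), b = t.substring(sa[i]);
            int x = 0, len = Math.min(a.length(), b.length());
            while (x < len && a.charAt(x) == b.charAt(x)) x++;
            lcp[i] = x;
        }
        // 3) Slide a window over the sorted suffixes. The first m entries
        //    are the sentinel suffixes, which sort to the top; ignore them.
        Set<String> best = new HashSet<>();
        Map<Integer, Integer> count = new HashMap<>(); // colors in window
        int lo = m, distinct = 0, bestLen = 0;
        for (int hi = m; hi < n; hi++) {
            if (count.merge(color[sa[hi]], 1, Integer::sum) == 1) distinct++;
            while (distinct >= k) { // valid window: query, then shrink top
                int windowLcp = Integer.MAX_VALUE;
                // Min LCP over the window, ignoring its first entry.
                for (int i = lo + 1; i <= hi; i++)
                    windowLcp = Math.min(windowLcp, lcp[i]);
                if (lo < hi && windowLcp > 0) {
                    if (windowLcp > bestLen) { bestLen = windowLcp; best.clear(); }
                    if (windowLcp == bestLen) best.add(t.substring(sa[lo], sa[lo] + windowLcp));
                }
                if (count.merge(color[sa[lo]], -1, Integer::sum) == 0) distinct--;
                lo++;
            }
        }
        return best;
    }
}
```

For example, with the hypothetical strings "abcde" and "xbcdy" and k equal to two, the only answer of maximum length is "bcd".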
While I will be conducting the longest commo
n substring algorithm, notice the variables
on the left as they change. The window LCP and window LCS values will track the longest
common prefix and the longest common substring for the current window, and the LCS
length and the LCS set will track the best values so far. So let's get started. Initially,
our window starts at the top, and we want the window to contain two different colors.
So our rule is to expand downwards when we do not meet this criterion. As I expand down,
the suffix
is green, so we still only have one color. I expand again, and still one
more green suffix. I expand downwards again, and now we arrive at a blue suffix. And here
we are able to perform a range query for our window. However, our query isn't fruitful,
because the window longest common prefix value is zero, so there's no longest common substring
here. When we satisfy the window color criteria, like we do now, we decrease the window size
by one, and still nothing interesting. Decrease the window size
again, and this time, look what we found: the current window contains a longest common
prefix of length two. So we obtain the longest common substring BC and add it to our solution
set. Now we keep shrinking the interval size because we meet the color requirement. Our
window size is now too small, because k is two, so we need two different color strings.
So we expand once more. Now something interesting has happened because we find an LCP value
of three, which
is larger than our current best value. So we update the solution set
to have the string BCD instead of just BC, which is one character longer. Now we
get to shrink the window size. The window is now too small, so we expand; the LCP value
is zero here, so that is no good. So shrink the window size. Now expand to meet the color
requirement; we get an LCP value of one, but that doesn't beat our current best, which
is three. So shrink the window. Now we need to meet the color requirement, so expand;
we have only blue strings, so keep expanding: an LCP value of one for this window range.
That's no good, so shrink. Now we have an LCP value of two; we're getting closer to
our best, but still not good enough. So we have to shrink and let that one go. Now expand.
Now something interesting is going on here: we have a window LCP value of three, which
is equal to our best so far. So instead of saying that CDE, our newfound longest
common substring value, beats BCD, which is of the same lengt
h, we keep both in the solution
set. Alright, so now let's shrink our window interval, because we meet the color requirement.
Now expand it; we still need one more color, so expand again. Our LCP window value is zero,
so shrink, and shrink again. Now expand: an LCP value of one here, that's not good enough.
So shrink; expand, an LCP value of two. Okay, we might be getting closer, but we meet the
color requirement, so shrink. Now expand to meet the color requirement. These two strings
have an LCP value of zero, so shrink. Now expand, now shrink. Now we've reached the end and
found our solution to the longest common substring problem with a four strings and a k value
of two. As I was doing the window expanding and shrinking, I want you to notice that each
time the window either expanded or shrank, I only ever moved one of the endpoints downwards.
And they were always going downwards. So we know that the number of Windows has to be
linear in proportion to the number of suffixes that we have. And
the number of suffixes that
we have is the length of our text, T. So we come to the conclusion that there must be
a linear amount of windows that we must consider, which is really good. Because we want our
time complexity. Be quick, today's topic is going to be on an efficient way to solve the
longest repeated substring problem. The longest repeated substring problem is
another one of these fairly common problems in computer science; lots of problems can actually be reduced to this problem, so it's important that we have an efficient way of solving it. The naive approach requires n squared time and lots of space. What we want to do instead is use the information stored inside the longest common prefix array to yield a faster and more space efficient solution. Let's do an example: what is the longest repeated substring of the string abracadabra? Feel free to pause the video and figure it out yourself. The answer is the substring Abra, which is the longest substring that appears at least twice in the string, so we call it the longest repeated substring. Here you can see the first instance of Abra on the left, and now you can see the second repeated instance of Abra on the right. These two occurrences happen to be disjoint and do not overlap, but in general overlapping is permitted for the longest repeated substring. Now let's solve the problem using a suffix array and an LCP array, which I have generated on the right hand side. I'll give you a moment to pause the video and inspect the LCP array in case you notice anything special in relation to the longest repeated substring. Now that you already know what the answer is, effectively, what we're looking for in the LCP array is the maximum value.
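To make that concrete, here is a small, self-contained sketch of the idea (my own illustration, not the video's implementation): it builds the suffix array naively by sorting whole suffixes, which is fine for short strings, then scans adjacent suffix pairs for the maximum LCP value.

```java
import java.util.Arrays;

public class LongestRepeatedSubstring {
    // Returns the longest substring appearing at least twice in text.
    // Naive O(n^2 log n) suffix sort; a real implementation would build
    // the suffix array and LCP array in O(n log n) or O(n) instead.
    static String lrs(String text) {
        int n = text.length();
        Integer[] sa = new Integer[n];
        for (int i = 0; i < n; i++) sa[i] = i;
        // Sort suffix start indices by the suffixes they denote.
        Arrays.sort(sa, (a, b) -> text.substring(a).compareTo(text.substring(b)));
        String best = "";
        // The LCP of sorted-adjacent suffixes is a repeated substring;
        // the answer is the maximum such LCP.
        for (int i = 1; i < n; i++) {
            int a = sa[i - 1], b = sa[i], lcp = 0;
            while (a + lcp < n && b + lcp < n
                    && text.charAt(a + lcp) == text.charAt(b + lcp)) lcp++;
            if (lcp > best.length()) best = text.substring(b, b + lcp);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(lrs("abracadabra")); // abra
    }
}
```

Note that when there are ties, you would collect every substring achieving the maximum rather than just one, as the upcoming example shows.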
Intuitively, we want to do this because we know that the suffixes are already sorted. So if two adjacent suffixes have a large longest common prefix value, then they share a good amount of characters with each other. We also know that if the LCP value at a certain index is greater than zero, then the string shared between the two adjacent suffixes is guaranteed to be repeated, because it is common between two suffixes, each of which starts at a different location in the string. So here, again, is abracadabra. Since our LCP value was four, we know that the first four characters from the suffix abracadabra form one part of the repeated substring. Next, we know that the suffix above it, which shares the LCP value of four, also shares four characters present in that longest repeated substring. Now, I
want you to look at another interesting case: can you find the longest repeated substring of the string ABABBABA? Well, if you did, you will find that you get not just one longest repeated substring, but two longest repeated substrings, since there can be ties. In a case like this, you're not looking for a single largest value, but all the largest values that might exist. For the first maximum, you can see that we'll want the first three characters from the suffix ABABBABA, and for the second maximum, we'll want the first three characters from the suffix BABA. Visually, for the first longest repeated substring, we have ABA, which appears at the start and then again closer to the end. The other repeated substring is found using the second largest common prefix value; its first occurrence is found here, and the second one just next to it. That is all for repeated substrings; pretty easy once the suffix array and the LCP array have already been constructed. There is a practice problem on Kattis if you want to tackle a longest repeated substring problem which requires you to have an efficient solution.

Today, I'm going to introduce probably one of the most important types of trees in computer science: balanced binary search trees. Balanced binary search trees are very different from the traditional binary search tree, because they not only conform to the binary search tree invariant, but they are also balanced. What I mean by
balanced is that they're self-adjusting to maintain a logarithmic height in proportion to the number of nodes they hold. This is very important because it keeps operations such as insertion and deletion extremely fast, because the tree is much more squashed. In terms of complexity, a binary search tree has average-case logarithmic operations, which is quite good. However, the worst case still remains linear, because the tree could degrade into a chain for some inputs; one such input is a sequence of increasing numbers. To avoid this linear complexity, we've invented balanced binary search trees, in which the worst case is logarithmic for all operations, which makes them very appealing. Central to how nearly all balanced binary search tree implementations keep themselves balanced is the concept of tree rotations, which is going to be the main topic of this video. Later, we'll actually look at some specific types of balanced binary search trees to see how these rotations come into play. So the secret ingredient to most balanced binary search tree implementations is the combination of two things: one, a clever tree invariant, and two, tree rotations. A tree invariant is simply a property or a rule that you impose on your tree, such that it must be met at the end of every operation. To ensure that the invariant is always satisfied, a series of tree rotations are normally applied. We'll get back to the concept of tree invariants in a later video, so don't worry about them so much for now.
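To ground the idea of an invariant, here is a tiny sketch (my own example, not from the video) that checks the plain binary search tree invariant; a balanced BST imposes an additional rule, such as a balance-factor condition, on top of this one.

```java
class Node {
    int value;
    Node left, right;
    Node(int value) { this.value = value; }
}

class BstInvariant {
    // Every node's value must lie strictly between the bounds imposed
    // by its ancestors; long bounds avoid edge cases at the int limits.
    static boolean holds(Node n, long lo, long hi) {
        if (n == null) return true;
        if (n.value <= lo || n.value >= hi) return false;
        return holds(n.left, lo, n.value) && holds(n.right, n.value, hi);
    }

    static boolean holds(Node root) {
        return holds(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }
}
```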
Right now, we're going to look at how tree rotations work. Suppose our invariant is not satisfied, and to fix it we need to do a right rotation about node A. Assuming node A has a left child B, we can perform a right rotation to put B where node A was and push node A down to become B's right child. When I first learned about this in my undergrad, I was mind blown. I thought it was absurd, even to the point of being illegal, that we should be allowed to do this and just move around nodes in a tree. But what I've realized since is that a transformation like this is totally legitimate, since we're not breaking the binary search tree invariant in any way. If you inspect the left tree, you'll discover that in terms of ordering and placement, node D is less than node B, which is less than E, which is less than A, which is less than C. Then you inspect the right tree and remark that, well, this is also true. A bit more detail on why performing tree rotations is a valid operation: first, you have to remember that all balanced binary search trees are binary search trees, meaning that the binary search tree invariant holds. So for every node N, the values in the left subtree are less than the value of N, and the values in the right subtree are all greater than the value of N. In this sense, it doesn't matter what the tree structure itself looks like; all that fundamentally matters is that the binary search tree invariant holds. This means we are free to shuffle, transform, or even rotate the values and nodes in our tree as we please, as long as the binary search tree invariant remains satisfied. Now let's look at how these rotations are done in great detail. For simplicity, since the rotations are symmetric, we'll only do the right rotation, and you should be able to figure out the left rotation on your own.
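Before walking through the pointers one by one, here is the whole right rotation as code (a sketch under the simple representation without parent pointers):

```java
class Node {
    int value;
    Node left, right;
    Node(int value) { this.value = value; }
}

class Rotations {
    // Right rotation about node A: A's left child B becomes the root of
    // this subtree, B's old right subtree moves under A, and A becomes
    // B's right child. Returns the new subtree root so the caller can
    // relink the parent (usually on the recursive callback).
    static Node rightRotation(Node a) {
        Node b = a.left;
        a.left = b.right;
        b.right = a;
        return b;
    }
}
```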
First, have a look at the tree on the right. Notice that there are directed edges pointing downwards, and another node P above A, which may or may not exist; this is why there is a dotted line on that edge. If node A does have a parent node P, then it is important that we take it into account when doing the rotation. In either case, we start with a pointer reference to node A; this is the orange arrow. Then we'll want a pointer to node B. After that, set A's left pointer to point to B's right child, then change B's right pointer to point to node A, and we've successfully done a right rotation. If we rearrange the nodes, this is what we would end up with. However, notice that there's a slight problem: if node A had a parent node, then either the parent's left or right pointer would still be referencing A. This is problematic, since B is A's successor after the rotation, so we change that link to now point to B. This step is usually done on the recursive callback, using the return value of the right rotate function. We just finished looking at the case where each node has a reference to the left and right child nodes. But in some balanced binary search tree implementations, it's more convenient for nodes to also have a reference to the parent node. This complicates tree rotations, because now instead of updating three pointers, we need to update six pointers. Let's have a look. In this case, where we also have a parent link, every node is, in a sense, doubly linked. We start off with a pointer referencing A, and the first thing we'll want to do is also reference node B and node P, so we don't lose them as we shuffle around pointers. Next, we adjust the left subtree of A to make A's left pointer reference B's right subtree. Of course, throughout this example, assume B is not null; if you're paranoid, you can add an extra if statement to check for this condition. However, it would be a mistake to assume B's right subtree is not null; we actually have to check against this before setting B's right child's parent to reference A. Next, let's make A the right subtree of B, so set B's right pointer to reference A. Now make A's parent pointer reference B. The last thing we need to do is adjust the reference to the parent node P: make B's parent pointer reference P. And now the very last thing, I promise, is to make P's left or right pointer reference the successor node B. Notice that we need to check that P is not null, because it might not exist; this will be the case for the root node. If we readjust the tree, you will see that we correctly did a right rotation. There are a lot of steps in doing this right rotation, and it's a very error prone process, which is why I wanted to do it in such detail.

Today we're going to look at how to insert nodes into an AVL tree in great detail. We'll be making use of the tree rotation technique we looked at in the last video, so if you didn't watch that, simply roll back one video. Alright, before we get too far, I should mention what an AVL tree is. An AVL tree is one of many types of balanced binary search trees which allow for logarithmic insertion, deletion, and search operations. Something really special about the AVL tree is that it was the first type of balanced binary search tree to be discovered. Soon after, a whole bunch of other types of balanced binary search trees started to emerge, including the 2-3 tree, the AA tree, the scapegoat tree, and the AVL tree's main rival, the red-black tree. What you need to know about next is the property that keeps the AVL tree balanced, and this is the
balance factor. Simply put, the balance factor of a node is the difference between the height of the right subtree and the height of the left subtree. The balance factor can also be defined as the left subtree height minus the right subtree height, but that convention would flip a lot of what I'm about to say, and it may also be why you find inconsistent ideas about which way to do tree rotations on various Google search results pages. So for consistency, let's keep the balance factor as the right subtree height minus the left subtree height. For clarity, because people get this wrong or define it differently: the height of node X is calculated as the number of edges between X and the furthest leaf. So if your tree only has one node, the tree has height zero, not height one, because there are no edges. The invariant on the AVL tree that keeps it balanced is forcing the balance factor of every node to be either minus one, zero, or plus one. If the balance factor of a node is anything else, we need to resolve that with tree rotations. In terms of the information we need to store in each node to actually make the AVL tree work, we'll need the actual value the node stores. This value must be comparable, so we know how to insert it and what position it goes to in the tree. Then we'll also need the balance factor and the height of the node, as well as the left and right child pointers.
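As a sketch, that per-node bookkeeping might look like this (illustrative names; an int value rather than a generic comparable, to keep it short):

```java
class AvlNode {
    int value;            // the comparable value the node stores
    int height;           // edges to the furthest leaf; a leaf has height 0
    int bf;               // balance factor = height(right) - height(left)
    AvlNode left, right;
    AvlNode(int value) { this.value = value; }
}

class AvlUpdate {
    // Null subtrees count as height -1, so a leaf works out to
    // 1 + max(-1, -1) = 0.
    static int height(AvlNode n) { return n == null ? -1 : n.height; }

    // Recompute this node's height and balance factor from its children.
    static void update(AvlNode n) {
        int lh = height(n.left), rh = height(n.right);
        n.height = 1 + Math.max(lh, rh);
        n.bf = rh - lh;
    }
}
```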
As the algorithm executes, we'll need to update these values, so keep that in mind. A slide or so back, I said that the balance factor of a node must always be minus one, zero, or plus one. A natural question to ask is: how do we handle the case where this is not true? The answer is that if this is not true, then the balance factor must be either plus two or minus two, which we can easily handle with tree rotations. The rotations we need to perform, depending on the structure of the tree, can be broken down into four distinct cases. The first such case is when the tree is what we call left heavy and there are two left child nodes. This is an easy case to fix, because all we need to do is perform a right rotation about the yellow node to balance. The next case is the left-right case, where you have a left child, but then that node has a right child. To fix this, you do a left rotation about the left child, so the green one in the leftmost image. What happens then is that this transforms into the left-left case we just saw, which we can resolve with a right rotation to balance. The third case is the right-right case, which is symmetric to the left-left case; so instead of doing a right rotation, we do a left rotation about the green node. Last but not least is the right-left case, which is symmetric to the left-right case. For this case, you would perform a right rotation about the yellow node in the leftmost image to transform this case into the right-right case, and then do a left rotation about the green node in the middle image. Next, I want to show you some pseudocode for inserting nodes into an AVL tree, because it's not all that obvious or easy. This first method is the public facing insert method, which returns true or false depending on whether the value was successfully inserted or not. For simplicity, we're going to ban duplicate values in our AVL tree. So if the value already exists, or the value is null, this method returns false. If the value is not null, and it doesn't already exist in the tree, we call our private recursive insert method, where we pass in
a pointer to the root node and the value we want to insert. The private recursive method is also simple. If we hit the base case, a null node, we simply return a new instance of the node with the value we want to insert. Otherwise, we compare the value we're trying to insert against the current node's value to determine if we should go down the left or the right subtree. After that, on the recursive callback, we call the update method, which updates the balance factor and height values for this node, and lastly we rebalance the tree with the balance method. Now let's take a look at the update and balance methods and what they're actually doing. The update method updates the balance factor and height values of our node. To calculate the height of the node, we get the maximum height of the left and right subtrees and then add one. Notice that I initialize the left and right subtree heights to minus one; this is because it will cancel out with the plus one in the max function in the case where the node has no subtrees, giving the correct height of zero for leaf nodes. Lastly, I update the balance factor for this node by finding the difference between the right subtree and left subtree heights. The balance method is slightly more involved, but not too crazy. Here we check if our balance factor has an illegal value of minus two or plus two. If the balance factor is minus two, then we know that the node is left heavy, and we dig further into the left subtree to determine if we're dealing with a left-left case or a left-right case. We do a similar thing if the balance factor is plus two, except we're dealing with the right-right or the right-left case. If the balance factor is not plus two or minus two, then we know that the balance factor is either plus one, zero, or minus one, and in those cases we don't need to do anything. Inside the four cases we might run into, notice that all we do are calls to the left rotation and right rotation methods that we saw in the last video. Also notice that the left-right and right-left cases call the left-left and right-right case methods respectively, since they reduce to those cases after a first rotation. In the last video, we looked at this right rotation method, but since we're dealing with an AVL tree in this video, we actually need to augment that method to update the height and balance factor values for the nodes we're moving around when we do rotations. This is a subtle detail you must not forget; otherwise, your height and balance factor values will be inconsistent. The left rotation case is symmetric to this one, so you should be able to figure it out pretty easily.

In this video, we're going to look at how to remove elements from an AVL tree, and what you'll discover is that removing elements from an AVL tree is almost identical to removing elements from a regular binary search tree. So for the majority of this video, we're actually going to be looking at how to remove elements from a binary search tree, and at the very end augment that algorithm for AVL trees specifically. So let's get started. Just for review,
I want to look at how to remove nodes in a binary search tree in detail. We can generally break removal up into two steps: finding and replacing. In the find phase, you find the element you want to remove from the tree, if it exists, and then replace it with a successor node. The successor node is necessary for us to maintain the binary search tree invariant. More details on the find phase: this is when we're searching for the element in the tree to see if it exists, and basically one of four things can happen. First, we hit a null node, in which case we know that the value we're looking for doesn't exist. Second, our comparator returns a value of zero, meaning we found the node we want to remove. Third, the comparator value is less than zero, so the value we're looking for, if it exists, is going to be found in the left subtree. Or fourth, the comparator value is greater than zero, in which case the value, if it exists, is in the right subtree. So let's do an example of finding nodes in a binary search tree. Suppose we're looking for 14. Well, we should have a reference to the root node, so this is where we start. We compare 20 and 14, and we know 14 is less than 20, so we go to the left subtree. We know 14 is greater than 10, so we go to the right subtree. We know 14 is less than 15, so we go to the left subtree. 14 is greater than 12, so the right subtree. And finally, there we found it, the node we were looking for. Now let's see what happens with a node that doesn't exist. Let's try and find 26. Again, we start at the root node, then we go to the right subtree because 26 is greater than 20, then we go to the left subtree because 26 is less than 31. And once we're at 25, we would go to the right subtree, and then discover that 26 does not exist in the tree. So once we find the node, assuming it exists, we need to replace that node with its successor. The motivation for this is that if we just removed the node without finding a successor, then there'd be a gap in the tree. When we're looking for a successor node, one of four cases will happen: either we're at a leaf node, in which case there are no subtrees; the node to remove has no left subtree; the node to remove has no right subtree; or the node to remove has both a left subtree and a right subtree. We'll see how to handle all these cases. In the
first case, where the node to remove is a leaf node, we can simply remove it without any side effects. The successor node in this case would simply be a null node. So suppose we want to remove node eight from this tree. The first thing we do is find out where eight is in the tree: we go down the tree, and then we find node eight. Then we discover that, oh, it's a leaf node, so we can just remove it. Alright, now for cases two and three, where there's only a left or a right subtree. In these cases, the successor node is the immediate child of that left or right subtree. The reason the successor is the immediate node down from the node we're removing is that it is the next node which is either greater than it (in the case of a right subtree) or less than it (in the case of a left subtree). Let's do an example. Suppose we want to remove node nine. The first thing we do is find where node nine is in the tree: start at the root and go to the right subtree to find the node we want to remove, which is nine. Then we inspect nine and discover that it only has a left subtree, so the successor node is its immediate child on the left, which is seven. So what we do is get a reference to seven and get ready to remove nine. Then we remove nine and make seven the successor by linking it back up to five. And if we rebalance the tree, that's what it looks like, and node nine has been removed. The last case is when the node we're trying to remove has both a left subtree and a right subtree. The question in this case is: in which subtree will we find the successor of the node we're trying to remove? And surprisingly, or not surprisingly, the answer is both. The successor can either be the largest value in the left subtree or the smallest value in the right subtree. Once the successor node has been found in either subtree, we replace the value of the node to remove with the value in the successor node. However, the story doesn't end there: we must not forget to remove the duplicate value of the successor node that still exists in the tree. One common strategy to resolve this is to recursively call the remove function again, but with the value to remove being the value in the successor node.
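Putting the four cases together, here is a sketch of the whole removal routine for a plain BST with int values (my own condensed version, not the video's exact code):

```java
class Node {
    int value;
    Node left, right;
    Node(int value) { this.value = value; }
}

class BstRemove {
    // Removes value from the subtree rooted at node and returns the new
    // subtree root, so parents can relink on the recursive callback.
    static Node remove(Node node, int value) {
        if (node == null) return null;            // find phase: not found
        if (value < node.value) {
            node.left = remove(node.left, value);
        } else if (value > node.value) {
            node.right = remove(node.right, value);
        } else {
            // Cases 1-3: at most one child; the successor is that child
            // (or null for a leaf).
            if (node.left == null) return node.right;
            if (node.right == null) return node.left;
            // Case 4: two children. Use the smallest value in the right
            // subtree as the successor, copy it up, then remove the
            // duplicate recursively.
            Node successor = node.right;
            while (successor.left != null) successor = successor.left;
            node.value = successor.value;
            node.right = remove(node.right, successor.value);
        }
        return node;
    }
}
```

For an AVL tree, the only change would be to invoke the update and balance steps on the callback before returning the node.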
Let's see an example, because this isn't completely trivial. Let's remove node seven from this tree, which is the root node. We would start at the root node and discover that, in fact, this is the node we want to remove, and notice that it has two non-empty subtrees, so we must choose a successor. To choose a successor, we either pick the smallest value in the right subtree or the largest value in the left subtree. Let's find the smallest value in the right subtree. To do this, go right once, and then dig as far left as possible. Now that we've found the successor node 11, we copy its value into the node we're trying to remove, which is the root node seven. Notice that now there are two instances of 11 in our tree, and we want unique values in our tree. So to remove the 11 that is currently highlighted, we recursively call our remove method, but on 11 this time. And luckily for us, this will always result in a case one, two, or three removal. To remove 11, we notice that it only has a right subtree, so its successor is its immediate right child. So we restage the removal and get ready to remove 11. We remove 11 and then hook 14 back up to 18. And if we rebalance the tree, then we can see that the duplicate 11 value is gone. Alright, the moment we've been waiting for: how do we augment the binary search tree removal algorithm for AVL trees? The solution is simple: you only need to add two lines of code to ensure that the tree remains balanced and that the balance factor and height values remain up to date. On the recursive callback, you invoke the update and balance methods you saw in the insert video, which ensure that when the node is removed from the AVL tree, the tree remains balanced. It's as easy as that.

Today we're going to have a look at some of the source code for the AVL tree. The source code I'm going to present in this video can be found on GitHub at github.com/williamfiset/data-structures. Make sure you have watched the last three videos on the AVL tree, about tree rotations, insertions, and removals in AVL trees, before continuing, so you can understand the source code I'm presenting. I don't normally do this, but I'm going to present a
live demo of the AVL tree in action. I'm here in my GitHub repository, compiling some of the Java code for the AVL tree, and then I'm going to run it. You can see it generated a random tree with some values, and notice that the tree is relatively well balanced for the number of nodes that are in it. I randomly inserted some nodes, and you might expect the tree to be a bit more sloppy if it were just a simple binary search tree, but the AVL tree really keeps the tree quite rigid. So I think we're ready to dive into the source code now. Here we are in the source code of a recursive AVL tree implementation in the Java programming language. Let's get started. If you look at the class definition for the AVL tree, you will notice that this class takes a generic type argument T, which extends Comparable. This generic type basically says that the values we're going to be inserting inside the tree need to be comparable, in a sense, meaning we need to be able to insert them and know how to insert them. If you look inside the node subclass which I've created, you can see that the value we're actually storing inside the node is of type T, so it's a comparable, and that's very important. The other instance variables I'm storing inside the node subclass for the AVL tree are, of course, the balance factor as an integer, and the height of the node, because we want to be able to query the height of a node in constant time, so I'm storing it inside every node. And then, of course, there are the left and right child pointers, right here. Notice that this node class implements the TreePrinter.PrintableNode interface; that's just an interface I have somewhere in this folder that allows me to display the tree in the terminal. So this isn't a requirement if you're actually building an AVL tree, and nor are these overrides; those are also for being able to display the tree in the terminal, which is really handy
for debugging, actually. Alright, so for instance variables at the AVL tree class level, we have the root node. This should really be private, although I'm using it for testing, so I'm leaving it package private. I'm also keeping track of the number of nodes inside the tree in this node count variable. Something else I'm using for testing is this height function, although it's really good as a sanity check to make sure that your AVL tree height is relatively low. Then there are methods like size and isEmpty, which are pretty self explanatory, and the display method I call to actually print the tree in the terminal. Next, the contains method checks if a certain value already exists inside the tree. This is the public facing method, which calls the private method, and I give it the root as the initial node to start off with. To check if a node exists inside the tree: if we hit the base case, then we know that the value does not exist. Otherwise, we compare the value we're searching for to the value inside the node. That gives a comparator value which is either less than zero, meaning that if the value does exist, it'll be found in the left subtree; greater than zero, meaning the value, if it exists, is in the right subtree; or equal to zero, which means we found the node inside the tree, so we can return true. Next, the insert method. We're trying to insert this value variable; if it's null, we don't want anything to do with it, so we just return false. We return a boolean for the insert method to indicate whether an insertion was successful or not. If the value does not already exist inside the tree, then we call the private insert method and assign its result to the root, and also increment the node count, because we know we're going to successfully insert something if we're inside this block, and at the same time we return true. Alright, let's have a look at the private insert method. If we reach the base case, meaning we traversed all the way down our tree and the node is null, then we know this is the position where we need to insert the new node, so we create a new node and give it the value. Otherwise, we're searching for the insertion position. So again, invoke the comparator function: value.compareTo(node.value). The .compareTo comes from the Comparable interface at the class level; that's why we're allowed to invoke .compareTo on a generic type here and another generic type there. So again, if the comparator is less than zero, then insert in the left subtree; otherwise, insert in the right subtree. And here are the extra two lines you need to add for the AVL tree: calling the update method to update the balance factor and height value for this node, and also rebalancing the tree, if necessary, at this node. Alright, so let's look at the update method. This is to update the balance factor and height of a node. Here are two variables that grab the heights of the left subtree and the right subtree. Then I update the height for this node; that's always one plus the maximum of either the left or the right subtree height. Once we have the appropriate values for left and right, we can of course update the balance factor, which is the difference between the right and left subtree heights. Okay,
So we've done the update; now we can call the balance method. Inside the balance method, we're looking to readjust nodes whose balance factors are either minus two or plus two. A minus two means our tree is left heavy, and inside the left heavy case we can run into the left-left case or the left-right case. Inside the right heavy case, we have the right-right case and the right-left case. To identify those sub-cases, you check the balance factor on either node.left or node.right, respectively. And if the balance factor of the node is not minus two or plus two, then we know the balance factor is either zero, plus one, or minus one, which is fine, and we don't have to touch anything; we don't have to rebalance the tree if either of those is true, so we just simply return the node. Now we can look at the individual cases themselves. The left-left case just performs a right rotation. The left-right case does an initial left rotation and then calls the left-left case, so we're reusing methods we already wrote, since this case degrades to that case after one rotation. A similar thing happens for the right-right case and the right-left case: we get to call the right-right case after one right rotation. Then here are the notorious left rotation and right rotation methods. For the AVL tree, it's very important that we call the two update methods to update the height and balance factor values after we've rotated about this node, because those values will undoubtedly change. It's also important that this is the order you do them in; you can't do them in the inverse order, because you have to update the child node first to get the correct balance factor and height values for the parent.
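The two rotations, with the child-then-parent update order just emphasized, can be sketched like so (my own illustrative names, not the exact source):

```java
// AVL rotations with the required update order: child first, then parent.
class RotNode {
    int value;
    int height, bf;
    RotNode left, right;
    RotNode(int v) { value = v; }

    static int h(RotNode n) { return n == null ? -1 : n.height; }

    static void update(RotNode n) {
        n.height = 1 + Math.max(h(n.left), h(n.right));
        n.bf = h(n.right) - h(n.left);
    }

    // Right rotation about `node`; returns the new subtree root.
    static RotNode rightRotation(RotNode node) {
        RotNode newParent = node.left;
        node.left = newParent.right;
        newParent.right = node;
        update(node);       // update the child (old parent) first...
        update(newParent);  // ...then the new parent
        return newParent;
    }

    static RotNode leftRotation(RotNode node) {
        RotNode newParent = node.right;
        node.right = newParent.left;
        newParent.left = node;
        update(node);
        update(newParent);
        return newParent;
    }
}
```

For example, right-rotating the left-left chain 3 → 2 → 1 yields 2 as the new root with children 1 and 3.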
Okay, now we're at the remove method. In this remove method, we want to remove a given element (or value, rather) from the tree. If the element is null, well, we know it doesn't exist in the tree, so just return false. Otherwise, make sure that the value exists in the tree, then remove it by calling the private remove method, decrease the node count, and simply return true from the remove method to indicate that the removal was successful. So let's look at this private remove method, where all the action is happening. If we hit the base case, we just return null. Then we get our comparator value, and we do what we always do: we check if the comparator is less than zero, and if so, we dig into the left subtree, because we know the value we're looking for is smaller than the current value, and then we recursively call the remove method, passing down node.left (so we're decreasing the region we're searching in) as well as the element. Otherwise, we do the same thing but for the right, so recurse down the right subtree. If neither of those is true, then we know the comparator value is equal to zero, meaning we found the node we want to remove. We know from the slides that this is where the four cases come up. If node.left is null, then we know we want to return node.right; or if node.right is null, meaning the right subtree is null, then return node.left. And then there's the final tricky case, where we're trying to remove a node that still has both subtrees: the left subtree is there and the right subtree is there, but we still want to remove that node. The successor can either be the smallest value in the right subtree or the largest value in the left subtree, and I'm using a heuristic right here to actually determine which one I want, just because I think it's a good heuristic. Here's what it is: if the height of the left subtree is greater, I want to remove nodes from that subtree; otherwise, if the right subtree has the larger height, then I want to remove nodes from that subtree. So that's the heuristic I'm using to choose which subtree I remove from. I just made up this heuristic, but I think in general it'll work pretty well.
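The findMin/findMax digs used to locate the successor can be sketched as follows (illustrative names, not the video's exact code):

```java
// Helpers for picking a BST successor: dig left for the min, right for the max.
class BstNode {
    int value;
    BstNode left, right;

    BstNode(int v, BstNode l, BstNode r) { value = v; left = l; right = r; }

    // Smallest value in the subtree: keep following left links.
    static int findMin(BstNode node) {
        while (node.left != null) node = node.left;
        return node.value;
    }

    // Largest value in the subtree: keep following right links.
    static int findMax(BstNode node) {
        while (node.right != null) node = node.right;
        return node.value;
    }
}
```

With the heuristic above, the successor is findMax of the left subtree when the left side is taller, and findMin of the right subtree otherwise.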
So here's what we do. Let's say the left subtree has the larger height: then I find the successor value by going down the left subtree once and digging all the way to the right to find the max value, and I swap that value into the current node. Then comes the recursive part: we recurse on removing the successor value instead of the element we passed in (or rather, the element we passed in becomes the successor value). So we've removed the element, but now we also have to remove the successor value, because we don't want duplicate values in our tree. Then we basically do the same thing but on the other side if the condition goes the other way. Here's what I was mentioning in the last video: you want to call the update and rebalance methods on the callback of this remove method, so that the tree remains balanced even though we're removing nodes. These are the findMin and findMax methods I was calling right up here, which just dig left or dig right depending on the case. And here's an iterator to traverse the tree; I don't think I need to go over this. Here's another method for testing, which validates the binary search tree invariant, also not particularly important. And this is what I was calling in the main function: it just randomly inserts values into the tree and then displays it. So when I invoked this file in the terminal, this is what executed. So that is an AVL tree.
It's a pretty cool data structure, and I quite enjoyed writing it up. Today's data structure is the indexed priority queue. This is going to prove to be a very useful data structure that you wish you had known a long time ago. Just before we get started: this video builds off concepts from the previous priority queue videos, which simply go over the basics. Strictly speaking, you can probably get by without watching all those videos, as I will be doing a quick recap. But for those of you who want to know priority queues in full detail, please check out the description for links to those. So what exactly is an indexed priority queue? Well, it's a traditional priority queue variant which, on top of having all the regular priority queue operations, also supports quick updates and deletions of key-value pairs. One of the big problems the indexed priority queue solves is being able to quickly look up and dynamically change the values in your priority queue on the fly, which is often very useful. Let's look at an example. Suppose a hospital has a waiting room with n people who need different levels of attention. Each person in the waiting room has a certain condition that needs to be dealt with. For instance, Mary is in labor, so she has a priority of nine. Akash has a paper cut, so he has a priority of one. James has an arrow in his leg; he has a priority of seven. Nadia's stomach hurts; she gets a priority of three. Richard has a fractured wrist, so his priority is five. And lastly, Leah's stomach also hurts. We want to process these patients by highest priority first. This means the hospital would serve Mary first, followed by James. However, then something happens: suppose Nadia's stomach condition worsens and she starts vomiting. Her priority needs to get updated to six, and because of this Nadia gets served next, once they're finished with James. During this time, Richard gets impatient and leaves; he goes to another clinic down the street, so he no longer needs to be accounted for. Further suppose that Akash goes to take a drink of water, slips on the floor and, as a result, cracks his head and needs immediate attention, so we increase his priority to ten. Once Nadia is dealt with, Akash should be next, and lastly this is followed by Leah. As we saw in the hospital example, it is very important to be able to dynamically update the priority of certain people.
The indexed priority queue is a data structure which lets us do this efficiently. The first step to using an indexed priority queue is to assign index values to all the keys, thus forming a bidirectional mapping. Since we use an indexed priority queue to track who should get served next in our hospital, we need to assign each person a unique key index value between zero and n (non-inclusive). Note that this mapping is intended to be bidirectional, so I would advise using a bidirectional hash table to be able to flip back and forth between a key and its key index. Basically, any operation on the indexed priority queue will require the associated key index of a particular key. You're probably wondering why I'm saying that we need to map keys to indices in the domain zero to n (non-inclusive). The reason is that priority queues are typically implemented as heaps, which under the hood are actually arrays, so we want to facilitate being able to index into those arrays; this will become apparent shortly. I will say, though, that often (and I mean very often) the keys themselves are already integers in the range zero to n, so there's no need to actually construct this bidirectional mapping; it's already there implicitly. However, it is handy to be able to support any type of key, such as names or general objects. We can think of an indexed priority queue as an abstract data type with certain operations we want it to support.
Here are about a dozen or so operations we want our indexed priority queue to support: deleting keys, getting the value associated with a key, checking if a key exists in the priority queue, getting the key index with the smallest value, getting the smallest value in the indexed priority queue, being able to insert and update key-value pairs, and finally the specialized update operations increase key and decrease key, which I'll talk about at the end. For all these operations, you need the key index associated with the particular key you're dealing with. Throughout these slides, I will be denoting the key index simply as the variable ki, to distinguish it from other index values, so keep your eye out for that.
An indexed priority queue can be implemented in several ways, some with really good time complexity, using specialized heap structures. However, we're going to cover the binary heap implementation for simplicity. You will notice that the time complexities for all these operations are either constant or logarithmic, which is really good. In a traditional priority queue, the remove and update operations are linear, because we are not maintaining a mapping to the positions where our values exist within the heap. Before we dive into the indexed priority queue per se, I want to spend a few slides giving a refresher on the traditional priority queue data structure, which only supports values, not key-value pairs like the indexed priority queue. Still, both data structures are very similar, and most people would consider them the same, although there are key differences
in my opinion. Recall that a very common way to represent a binary heap is with an array, since every node is indexed sequentially. If we were to represent the following binary heap as an array, we would get this array of values. If we know the index i of a node, we can figure out what its left and right child nodes are using simple formulas: the left child is at 2i + 1 and the right child at 2i + 2, assuming zero-based indices. As an example, what are the children of the node at index four? Well, we can just plug i = 4 into the two formulas I just gave you to obtain the indices nine and ten. Of course, you can always run the math backwards and figure out what the parent of a given node is. Both of these are especially useful if you're walking up or down the tree. Whenever you want to insert a new value into the priority queue, you insert the new value at the insertion position at the bottom right of the binary tree. Suppose we want to insert the value eight; this would violate the heap invariant, so we need to bubble up the value until the invariant is met. So swap nodes five and twelve. The heap invariant is still not satisfied, so swap again: swap nodes two and five, and now the heap invariant is satisfied.
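The zero-based index formulas above can be captured in a few one-liners (my own sketch):

```java
// Zero-based binary heap index math: children of i are 2i+1 and 2i+2,
// and the parent of i is (i - 1) / 2 using integer division.
class HeapIndex {
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }
    static int parent(int i) { return (i - 1) / 2; }
}
```

Plugging in i = 4 gives children 9 and 10, and running the math backwards maps both 9 and 10 to parent 4.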
Now let's review how removals are done in a traditional priority queue. To remove an item, search for the element you want to remove, swap it with the last node, perform the removal, and finally bubble up or down the swapped value. For this example, suppose we want to remove the node with the value five. We don't know where the node with value five is within the priority queue, so we have to search for it; this is one of the major differences between the priority queue and the indexed priority queue. So start at node zero and process each node sequentially until a node with a value of five is found, if any. We found one, and to actually remove it from the heap, swap it with the bottom right-most node. Once this is done, remove node five from the tree. Now the purple node we swapped into five's position may not satisfy the heap invariant, so we need to move it either up or down the tree. In this case, since the purple node has a value of one, which is smaller than its children, and we're dealing with a max heap, we'll want to move the node down. That was a quick recap of just about everything you need to know about a traditional priority queue.
Now let's talk about implementing an indexed priority queue with a binary heap. For the following examples, suppose we're dealing with n people with different priorities that we need to serve. Perhaps we're prioritizing people for a queue at a hospital, a waiting line at a restaurant, or who knows what. The main thing is, we'll assume the values can dynamically change, and that we always want to serve the person with the lowest priority. To figure out who to serve next, we'll use a min indexed priority queue to sort by lowest value first.
The first step, as mentioned before, is to assign each person a unique key index value between zero and n (non-inclusive). These are the key index values in the second column beside each person's name. Then I'm going to give each person an initial value to place inside the indexed priority queue. These values will be maintained by the indexed priority queue once inserted. Note that the values can be any comparable values you want, not only integers as shown here; this means we can have strings, objects, or whatever type of data we want. If I were to build a min indexed priority queue out of the key-value pairs from the last slide, this is the heap I would get. Remember that unlike the previous example, we're sorting by smallest value first, since we're dealing with a min heap. If you want to access the value for a given key k, first you need to figure out what its key index is, and then you look it up in the values array maintained by the indexed priority queue.
Here's a good question: what value does the key Bella have in the indexed priority queue? Well, first find Bella's key index value, which is one, then index into the values array at position one to find the corresponding value for the key. So Bella has a value of 15. Awesome, now we know how to get the value for a particular key in the indexed priority queue. But how do we get the index of the node for a particular key? To do that, we'll need to maintain some additional information, namely a position map we can use to tell us the index of a node in the heap for any given key index. For convenience, I will store the position map as an array called pm inside the priority queue. As an example, let's find out which node represents the key Dylan. First find the key index for Dylan, which happens to be three, then look up index three in the position map to tell you where Dylan is in the heap. It looks like Dylan is at the node at index seven, highlighted in orange.
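Since the position map and (as we'll see next) the inverse map are inverses of one another, a quick consistency check makes the relationship concrete (my own sketch; the array contents are made up for illustration):

```java
// pm[ki] = heap node index where key index ki lives.
// im[node] = key index stored at that heap node.
// The two arrays must invert each other: pm[im[node]] == node.
class IpqMaps {
    static boolean isConsistent(int[] pm, int[] im) {
        for (int node = 0; node < im.length; node++) {
            if (im[node] < 0) continue;          // empty slot
            if (pm[im[node]] != node) return false;
        }
        return true;
    }
}
```

Any swap that updates one map but not the other would break this invariant, which is exactly why the swap method shown later touches both.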
Now here's a follow-up question: where is George in the heap? I'll give you a quick moment to figure that out before I give you the answer. All right, so as with just about every operation, the first step is to find the key index for the key we care about, which is George. Then use the key index to do a lookup inside the position map and find the node for George, which happens to be node one. Great, now we know how to look up the node for a given key. But how do we find the key for a given node? This inverse lookup will prove to be a very useful operation. Say you want to know which key is associated with the root node at index zero: you need to be able to do an inverse lookup to figure that out. To do the inverse lookup from node indices to keys, we'll need to maintain an inverse lookup table. I will denote this lookup table as im, short for inverse map. Let's see if we can figure out which person is represented by the node at index two. To do that, simply do a lookup in the inverse map at index two. This gives us the key index associated with that node, and knowing the key index enables us to retrieve the actual key by doing a lookup in our bidirectional hash table. In this case, the node at position two represents the key Ana. Here's a follow-up question to make sure you're still paying attention: which key is represented by the node at index position three? Same as before: find the key index through the inverse map, then with the key index figure out the actual key from the bidirectional hash table. We can then conclude that the node at position three represents the key Isaac.
What we just covered was how an indexed priority queue is structured internally and what all the moving parts are. Now we want to actually do some useful operations with this indexed priority queue, such as inserting new key-value pairs, removing key-value pairs based on a key, and also updating the value associated with a key. These are all possible, and they're actually very similar to how you would do it with a regular priority queue. Insertion is nearly the same, except that we have to update the position map pm and the inverse map im to reflect the movement of key-value pairs. Suppose we want to insert the key Mary with a value of two into the priority queue. What we first have to do is assign Mary a unique key index value; then we insert the new key-value pair at the insertion position. Notice that upon insertion we updated our arrays at index 12 to reflect that the new value was in fact inserted. Currently the heap invariant is not satisfied, since the new node at index 12 has a value less than the one at node five. To resolve this, we're going to swap the newly inserted value upwards until the heap invariant is satisfied. When swapping nodes, we need to update the position map and the inverse map by swapping the values of the two nodes we're exchanging; the values array does not need to be touched, since it gets indexed by the key index value we get from the map, and not the node index per se. It turns out that the heap invariant is still not satisfied, so we need to keep swapping upwards. Let's have a look at some pseudocode for insertions.
In this snippet, the user calling this function needs to provide a valid key index ki, as well as a non-null value they want to insert. The first thing I do is store the value associated with the key inside the values array. Then I update the position map and the inverse map to reflect the fact that a new key-value pair has been inserted into the priority queue. Finally, I move the node up the heap until the heap invariant is satisfied. Let's take a closer look at the swim method to see how that happens. You'll notice that it has two supporting functions: swap and less. Swap simply exchanges two nodes based on index, and less determines if node i has a value less than node j. The logic for the swim function is fairly simple: it begins by finding the index of the parent node and walking up the tree. For every iteration, we walk up one layer in the tree if the index of the current node is not the root node and the value of the current node is less than that of the parent node; remember that we're dealing with a min heap and want small values to be as high up as possible in the heap. To actually issue a node exchange, simply call the swap function with the index of the current node and the parent node, and then update the current node and the parent node index values.
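Putting the swim, swap, and less pieces together, a minimal Java sketch might look like this (field and method names follow the slides, but the code is my own illustration, not the video's exact source):

```java
// Swim for a min-indexed binary heap: pm maps key index -> node,
// im maps node -> key index, values is indexed by key index.
class IpqSwim {
    int[] pm, im, values;

    IpqSwim(int[] pm, int[] im, int[] values) {
        this.pm = pm; this.im = im; this.values = values;
    }

    // Does the value at node i compare less than the value at node j?
    boolean less(int i, int j) { return values[im[i]] < values[im[j]]; }

    // Exchange nodes i and j: only the index maps move, never values.
    void swap(int i, int j) {
        pm[im[j]] = i;
        pm[im[i]] = j;
        int tmp = im[i];
        im[i] = im[j];
        im[j] = tmp;
    }

    // Walk node i up one layer at a time while it beats its parent.
    void swim(int i) {
        for (int p = (i - 1) / 2; i > 0 && less(i, p); p = (i - 1) / 2) {
            swap(i, p);
            i = p;
        }
    }
}
```

In a two-node example where the child's value is smaller than the root's, one call to swim moves the child's key index to node zero.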
I also want to talk about swapping and how it actually occurs, because it's slightly different than in a traditional priority queue. In this swap method, we're not actually moving around the values in the array; we're only swapping around index values. Remember that the values array is indexed by the key index, not the node index, so effectively the values array can remain constant while we update the inverse map and the position map. First, we update the positions at which key index values are found in the indexed priority queue. Remember what the position map is: it tells us which node index a given key index is found at. So we can do a straightforward swap by indexing into the inverse map to find the key index values and swapping indices i and j. Following this, we simply update the key index values associated with nodes i and j in the inverse map; this is just a simple, straightforward exchange.
Next up, I want to talk about polling and removing elements from an indexed priority queue. Polling is still a logarithmic operation; however, removing is improved from a linear time complexity to a logarithmic one, since node position lookups are now constant time, while repositioning the nodes after a removal (to restore the heap invariant) is still logarithmic. So let's jump into an example. Suppose we want to poll the root node. This is something we often want to do, since it gives us the key-value pair with the lowest value in the heap. The first step is to exchange the root node with the bottom right-most node; as we do that, remember to swap the index values. Now we can remove the root node from the tree. As we remove the value, let's make sure we store the key-value pair we're removing so we can return it from our function later. Then clean up the removed node. Finally, restore the heap invariant by moving the swapped purple node down. Since the left and the right child nodes have an equal value of two, let's default to selecting the left one.
Let's do a slightly more involved removal by removing a specific key from the indexed priority queue. In this example, let's remove the key Laura. The first thing we want to do is get the key index for Laura, which is the key we're working with; in this case the key index is equal to 11. Once we know the key index, we can use that information to locate the node within the heap by looking inside the position map. Then swap the node we want to remove, that is, the node containing the key-value pair for Laura, with the bottom right-most node. Store the key-value pair before the actual removal, clean up the removed node, and finally restore the heap invariant by moving the swapped purple node up or down. We're going to move the purple node down, because it has a large value. Alright, let's look at some pseudocode for removing key-value pairs. The code is very short: only five lines of implementation and three lines of cleanup. The first thing to do is exchange the position of the node we want to remove with the bottom right-most node, which is always at index position sz, short for the size of the heap. Once we've exchanged the nodes, the right-most node is now in the position where the node we wanted to remove was, so we need to move it either up or down the heap to satisfy the heap invariant. We don't know which it will be, so we try to move the node both up and down, hence sink and swim. Lastly, I just clean up the values associated with the removed node. You can also return the key-value pair being removed, but I didn't do that here.
Let's take a quick look at the sink method so we understand how it works. To sink a node, we want to select the child with the smallest value, defaulting to the left child if there's a tie. In the next block, I try to update the smallest child to be the right child, but first I need to check that the right child's node index is within the size of the heap and that its value is actually less than that of the left node. The stopping condition is if we're outside the size of the heap, or we cannot sink the current node any further. Lastly, we want to make sure we swap the current node with whichever node had the smallest value.
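The sink method just described might be sketched like this (my own illustration, reusing the pm/im/values layout from the slides):

```java
// Sink for a min-indexed binary heap: move node i down toward the
// smaller child until the heap invariant holds or i has no children.
class IpqSink {
    int[] pm, im, values;
    int sz;  // number of elements currently in the heap

    IpqSink(int[] pm, int[] im, int[] values, int sz) {
        this.pm = pm; this.im = im; this.values = values; this.sz = sz;
    }

    boolean less(int i, int j) { return values[im[i]] < values[im[j]]; }

    void swap(int i, int j) {
        pm[im[j]] = i;
        pm[im[i]] = j;
        int tmp = im[i];
        im[i] = im[j];
        im[j] = tmp;
    }

    void sink(int i) {
        while (true) {
            int left = 2 * i + 1, right = 2 * i + 2;
            int smallest = left;                          // default to the left child
            if (right < sz && less(right, left)) smallest = right;
            if (left >= sz || !less(smallest, i)) break;  // no children, or already in order
            swap(i, smallest);
            i = smallest;
        }
    }
}
```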
The last core operation we want to do on an indexed priority queue is key-value pair updates. Similar to removals, updates in a min indexed binary heap also take logarithmic time, due to the constant-time lookup to find where the node is and the logarithmic time to adjust where the key-value pair should appear in the heap. Suppose we want to update the value of the key Carly to now have a value of one. Like every operation, find the key index of the key we want to work with; the key index for Carly happens to be two. Then we can use that key index value to update the values array with the new value. Of course, the heap invariant may not be satisfied, so we need to move the node either up or down until it is. The pseudocode for updating the value of a key is super simple: simply update the value in the values array and move the node either up or down the heap. It's that easy.
Alright, so there was nothing too special about updates, but the part I really wanted to get to was increase key and decrease key. In many applications, such as Dijkstra's or Prim's algorithm, it's often useful to be able to update a given key so that its value only ever gets smaller, or only ever gets larger; in the event that a worse value is given to the indexed priority queue, it should not be updated. In such situations it is often useful to define a more restrictive form of the update operation, called increase key or decrease key respectively, depending on whether you want to only ever increase or only ever decrease the value associated with a key. Both of these operations are very simple: they consist of doing an if statement before performing an update operation, which either increases or decreases the value associated with the key. So, bottom line, these two functions are just convenience methods that wrap an update operation with an if statement.
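As a sketch, the two wrappers might look like this (my own illustration, showing only the guarding if statement; the real methods would also swim or sink the node afterwards):

```java
// increaseKey / decreaseKey for a min heap: only accept strictly
// better values in the stated direction, ignore worse ones.
class IpqUpdateKey {
    // Accept only a strictly smaller value; the node would then swim up.
    static boolean decreaseKey(int[] values, int ki, int value) {
        if (value < values[ki]) {
            values[ki] = value;
            return true;
        }
        return false;  // worse value: leave the heap untouched
    }

    // Accept only a strictly larger value; the node would then sink down.
    static boolean increaseKey(int[] values, int ki, int value) {
        if (values[ki] < value) {
            values[ki] = value;
            return true;
        }
        return false;
    }
}
```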
Today, we're going to look at some source code for an indexed priority queue. Just before we get started, make sure you watch my video on the indexed priority queue, where I go over the implementation details and why an indexed priority queue is an important data structure. All the source code for this video can be found on my data structures repository at github.com/williamfiset/data-structures; the link can be found in the description below.
Here we are in the source code for a min indexed binary heap, presented in the Java programming language. To get started, notice that our min indexed binary heap requires that we pass in a type of object which is Comparable; this is so we can order our key-value pairs within the heap. You'll also notice that I'm extending a min indexed D-ary heap. This is just to be more generic: all I do in the constructor is initialize the heap to have at most two children per node, while the D-ary heap supports, in general, D children per node. So let's look at the D-ary heap implementation, where all the fun stuff is happening. Let's go over all the instance variables. sz is the number of elements in the heap; N is a constant representing the maximum number of elements in the heap; D is the degree of each node, so for the binary heap this number is two. The two arrays child and parent track the child and parent indices for each node, so we don't have to compute them dynamically. pm and im are the position map and the inverse map, which we're going to use to track the key indices for our keys. And lastly there is the values array, which contains the values associated with the keys; it's very important to notice that this array is indexed by the key indices, not by the node indices per se. In the constructor, we're given a degree and a certain maximum size for our heap. We store the degree and the maximum size, then I initialize all our arrays, compute what the child and parent indices should be, and initialize the position map and inverse map to have all negative one values.
Then we have a few straightforward methods such as size, isEmpty, and contains. You'll notice that for contains, we don't pass in the key of our key-value pair; instead we pass in the key index, and we're going to do this for all our methods. I also have a few convenience bounds-checking methods that you'll see again throughout these videos, such as a key-in-bounds-or-throw check, which just makes sure that the key index is valid. After this check is done, we can check whether the key exists within our heap by looking inside the position map and checking that the value is not equal to negative one. Then we have things like peekMinKeyIndex, which gets the key index for the node at the top of the heap, and similarly pollMinKeyIndex, and also peekMinValue and pollMinValue. The reason these are separate is that sometimes we just want to look at the value at the top of the heap but not necessarily remove it; the poll version will actually remove it.
Now let's look at insert. To insert a key-value pair, we need to make sure that the key-value pair does not already exist in the heap; otherwise we throw an exception. Then I validate that the value is not null, and if it is, we throw an exception. Otherwise, we simply update our indices in the position map and inverse map, assign the values array to have the value we passed in, and then swim up that node, which we know will be at position sz; we also increment the size variable so that our heap is one element larger. valueOf is pretty straightforward: just do a lookup inside the values array. Then delete is slightly more interesting. We make sure the key exists, then we get the index of where that key exists within the heap. We swap that node with the last node in the heap, then we reposition the node we swapped into its position by moving it either up the heap or down the heap. We capture the value at the position for the key index so that we can return it later, we clean up the node we want to delete, and finally we return that value.
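The delete steps just walked through can be sketched end to end like this (my own simplified illustration, not the repository's exact code):

```java
// delete(ki) for a min-indexed binary heap: constant-time node lookup,
// swap with the last node, then sink and swim to restore the invariant.
class IpqDelete {
    int[] pm, im, values;
    int sz;

    IpqDelete(int[] pm, int[] im, int[] values, int sz) {
        this.pm = pm; this.im = im; this.values = values; this.sz = sz;
    }

    boolean less(int i, int j) { return values[im[i]] < values[im[j]]; }

    void swap(int i, int j) {
        pm[im[j]] = i;
        pm[im[i]] = j;
        int tmp = im[i];
        im[i] = im[j];
        im[j] = tmp;
    }

    void swim(int i) {
        for (int p = (i - 1) / 2; i > 0 && less(i, p); p = (i - 1) / 2) {
            swap(i, p);
            i = p;
        }
    }

    void sink(int i) {
        while (true) {
            int left = 2 * i + 1, right = 2 * i + 2;
            int smallest = left;
            if (right < sz && less(right, left)) smallest = right;
            if (left >= sz || !less(smallest, i)) break;
            swap(i, smallest);
            i = smallest;
        }
    }

    // Delete the key with key index ki and return its value.
    int delete(int ki) {
        int i = pm[ki];    // constant-time lookup of the node
        swap(i, --sz);     // exchange with the bottom right-most node
        sink(i);           // we don't know which way the swapped node
        swim(i);           // must move, so try both
        int value = values[ki];
        pm[ki] = -1;       // clean up
        im[sz] = -1;
        return value;
    }
}
```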
update is also pretty easy: just make sure that the key exists and the value is not null, then get the index for the node, capture the old value, update to the new value, move the node within the heap, and finally return the old value. Then increase and decrease are just short for increaseKey and decreaseKey. We're dealing with a min D-ary heap here, so make sure you take that into account when looking at the logic. We make sure the key exists and the value is not null; then, in decrease, if the new value is less than the value already in the heap (the values array at the key index), we can update it and then swim the node up. Notice that I didn't call the update method here: in the update method we both sink and swim, because we don't know which way the node will go, whether it's going to bubble up or bubble down; but in the decrease key method, we do know. The same goes for the increase key method, except I switched the order of the arguments to the less comparator, so that the values array access is in the first position and the value being passed in is on the right.
These are just the sink and swim methods I talked about in the slides, which we can go over quickly. To sink node i, we get the minimum child of node i; since we're working with a D-ary heap, we need to search all the children of the node and find the one with the least value. This is going to be node j; we then swap i and j, set i equal to j, find the minimum child again, and repeat until we can't sink the node any more. Same thing for swim, except that we need to find the parent of node i, which we can do with a lookup in the parent array at index i; we swap with the parent and keep doing this until node i is no longer less than its parent. minChild just looks at all the children of node i, finds the one with the minimum value, and returns its index. Also pretty straightforward.
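The D-ary index math and the minChild scan can be sketched as follows (my own illustration; here values is indexed directly by node for simplicity, unlike the real code, which indexes by key index):

```java
// D-ary heap index math: node i's children occupy d*i+1 .. d*i+d,
// and its parent is (i - 1) / d.
class DaryHeap {
    static int parent(int d, int i) { return (i - 1) / d; }

    static int firstChild(int d, int i) { return d * i + 1; }

    // Scan all (up to d) children of node i and return the index of the
    // one with the minimum value, or -1 if node i is a leaf.
    static int minChild(int d, int i, int[] values, int sz) {
        int index = -1;
        int from = d * i + 1;
        int to = Math.min(sz, from + d);
        for (int j = from; j < to; j++) {
            if (index == -1 || values[j] < values[index]) index = j;
        }
        return index;
    }
}
```

Setting d to 2 recovers the binary heap formulas from earlier in the video.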
Swap we covered in the slides: it basically swaps the indices in the position map and the inverse map for nodes i and j. Then we have the less function, which simply compares the values for nodes i and j. The other convenience methods are there just because I didn't want to include all the logic for throwing exceptions everywhere, so I kind of wrapped them in helper functions. And this last one is just for tests, to make sure that our heap is indeed a min heap. So that's all there is to an indexed D-ary heap. I hope there wasn't too much to digest, and thank you for watching. Please give this video a thumbs up if you learned something, and subscribe for more mathematics and computer science videos. Thanks.