Link to the video that explains DNS Round Robin: https://youtu.be/3nInD2RGb2c
In this video, we take a look at load balancers: what a load balancer is, the principles behind it, and how to design and deploy a system that's scalable and resilient.
Load balancers are essential for web systems that need to handle high loads. We'll cover the different types of load balancers, their advantages and disadvantages, and system design considerations such as health checks, monitoring, and failover. Whether you're a developer or an administrator, by the end of this video you'll have a solid understanding of how load balancers can be used to improve web scalability and system design.
00:00 Intro
00:15 Setup
00:57 Load Balancer Types
01:06 Hardware Load Balancers
01:30 Software Load Balancers
02:00 Cloud Load Balancers
02:16 Load Balancer Sub-Types
02:26 Layer 4 Load Balancers
02:58 Layer 7 Load Balancers
03:43 Routing Algorithms
05:12 Additional Features
07:37 Failover
Psst. Hey kid. Want to learn about load balancers? Give me 10 minutes of your attention, and by the end of this video you will be able to destroy any system design interview that dares to ask you about load balancers. Ready? Say you have set up an infrastructure with a bunch of servers - a couple of clones of the same server. Your primary task is to direct traffic to the clones so that none of them are idling or overworking. So you decide to create a separate server that would be responsible for redirecting requests to the clone servers. You decide to name that server a load balancer, patent it, trademark it. But you suddenly get a knock on the door for patent fraud, since that concept already exists - it was first created in 1669 by Lord Balancoire [This is a joke btw]. So you decide to use a pre-existing load balancer solution, and here are the options you can choose from. They usually come either as a dedicated machine, as software, or as a cloud-based solution. Let's go over these one by one. Hardware load balancers are dedicated machines with specialized physical hardware. By specialized hardware I mean configured FPGA or ASIC chips that allow them to redirect requests in a really efficient manner, since these machines are purpose-built for this task. The hardware and software on them are tuned to work like clockwork. They are really costly and usually used in data centers. The cheaper alternative is running a software load balancer on a regular server. You can slap one of these bad boys on your server and be done with it. Most of them are free and open source, so you don't need to pay $53,000 for a dedicated machine to re-route your 10-users-per-month traffic to your over-engineered architecture. You know what else is free? These videos. And you better click the like and subscribe buttons as a thank you right now, young man! There are also cloud-based solutions, where you can click five buttons and add "Solutions Architect" to your resume. These will do everything behind the scenes, but to configure them correctly you need to finish this video to understand the terminology. Now that you have picked the type of load balancer, you need to choose the subtype. There are two subtype options: layer 4 and layer 7 load balancers. Can you guess why the layer 4 load balancers are named
that way? That's right - because they operate on the 4th layer of the OSI stack. Well, I lied: they actually look at both layers 3 and 4, but it's called layer 4 for some reason. This means they can't look at the contents of the request packets, such as the headers and the body; they can only look at the source IP, destination IP, and ports, and based on this information they need to decide which server a request should be forwarded to, according to your configuration. On the other hand, a layer 7 load balancer, which actually covers layers 5, 6, and 7, can look at the contents of the packet and decide where to redirect the request based on that. For our primary task you can pick either one. However, if you had other requirements, like re-routing requests to the /payment path to a separate server that handles payments, or returning content from a separate static content server when an image or video is requested, then you should pick layer 7. If you wanted a much faster response time - for cases like handling a video feed, where speed is really important and you don't care much about the contents of each packet - then layer 4 would be more suitable for you. Now that you have picked a suitable type of load balancer, we have a new problem. Say a request comes in, and now we need to redirect it to a server - but which one should the load balancer redirect it to? Well, this can be done in different ways, and those ways are called routing algorithms. The simplest one is called random: every time a request comes in, it just picks a random server. Nothing fancy. Then there is Round Robin: it just forwards requests to the servers in a cycle. There is a more sophisticated version called Weighted Round Robin, where a weight is assigned to each server - usually based on its performance, if the servers have different hardware - and the decision to move to the next server in the cycle is based on the weight.
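Weighted round robin can be sketched in a few lines. This is a minimal toy version with hypothetical server names and weights (real load balancers like nginx use a smoother interleaving; this is the simple expand-by-weight form):

```python
from itertools import cycle

# Hypothetical backends, with weights proportional to their capacity.
servers = {"server-a": 3, "server-b": 1}

# Repeat each server according to its weight, then cycle through the list.
expanded = [name for name, weight in servers.items() for _ in range(weight)]
rotation = cycle(expanded)

def pick_server() -> str:
    """Return the next server in the weighted cycle."""
    return next(rotation)

# server-a receives 3 out of every 4 requests.
first_four = [pick_server() for _ in range(4)]
print(first_four)  # ['server-a', 'server-a', 'server-a', 'server-b']
```

One drawback of this naive expansion is that a heavy server gets its requests back to back; smoother variants spread them through the cycle.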
This is useful in cases where you have clones of your application on differently performing servers. The weight trick can sometimes also be applied to the random approach. Then there is the Least Connections algorithm: it checks which of the servers has the fewest active connections at the moment and routes the request there. There are many more, like Resource Based, which checks the load on each server, or Response Time, which routes to the server that responds the fastest. OK. Now you have solved the problem for this particular case, and you have an understanding of load balancers in their core form - a machine that balances load. But modern ones have many more features.
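Before moving on: the Least Connections pick described above is just a minimum over current connection counts. A quick sketch, with made-up numbers:

```python
# Hypothetical snapshot of active connections per server.
active_connections = {"server-a": 12, "server-b": 4, "server-c": 9}

def least_connections(conns: dict) -> str:
    """Pick the server currently handling the fewest connections."""
    return min(conns, key=conns.get)

print(least_connections(active_connections))  # server-b
```

In a real load balancer the counts change on every accepted and closed connection, so this minimum is recomputed per request.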
Let's go over some of those. The first one is called SSL termination. Say you have a more complex architecture, consisting of multiple servers talking with each other. The load balancer receives an HTTPS request, which - if you don't know - means that it is encrypted. It decrypts the request to see the contents, decides where to redirect it, and passes it to the first line. The first line decrypts it to see the contents, does some work, and passes it down to the others. But here is the problem: decrypting takes processing power, and if each server needs to do that, the system as a whole loses time it could have spent doing useful work. So the SSL termination feature was added. It allows the load balancer to decrypt the request and pass the unencrypted version to the rest of the system, and when the unencrypted response is returned, it encrypts it again before sending it back to the user. The next one is Response Caching. Say you have a logo on your website plastered on every page. Every time a user connects, you have to go to the static content server, grab the logo, and return it - every single time. This might overload the server if there are too many users. This is why Response Caching was created: it allows the load balancer to cache frequently used content in RAM, which improves response time and reduces the load on the static server. Another one is Health Checks. Say your load balancer uses the round robin routing algorithm, and one day one of your clone servers decides to retire and pursue a less stressful career as a toaster. Your load balancer, blissfully unaware of its decision, continues to redirect requests to a toaster, which doesn't return a response. So one in three requests does not receive a response.
Health Checks solve that issue by adding the ability to check the health of a server before passing requests to it, to make sure that it's available. If it's not, the load balancer will pick another one. There are actually a lot more features; it would be difficult to go over each of them without making the video an hour long, and they also depend on the load balancer software. So make sure you read the documentation for the load balancer that you pick. Now you have set up everything to work correctly.
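The health-check idea described above boils down to probing each backend and keeping only the ones that respond in the routing pool. A minimal sketch - the probe here is a fake stand-in for a real HTTP GET or TCP connect, and the server names are made up:

```python
# Keep only the backends whose probe succeeds. The probe function is
# injected so it could be an HTTP GET, a TCP connect, or a fake (as here).
def filter_healthy(servers, probe):
    """Return the subset of servers whose health probe succeeds."""
    return [s for s in servers if probe(s)]

# Hypothetical pool: "server-c" has retired to become a toaster.
def fake_probe(server: str) -> bool:
    return server != "server-c"

pool = filter_healthy(["server-a", "server-b", "server-c"], fake_probe)
print(pool)  # ['server-a', 'server-b']
```

A real load balancer runs this check on a timer (say, every few seconds) and re-adds a server once it starts passing probes again.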
However, there is one small issue. Adding a single load balancer introduces a problem to the whole system, called a single point of failure. That means if the load balancer dies, the whole system becomes unresponsive. To solve this, you add one more load balancer, in either an active-active or active-passive configuration. This is called failover. Active-passive means that you have one actively working load balancer and a passive backup one on standby. The active one sends heartbeat requests every couple of seconds to the passive one, telling it that it's alive. If for some reason the passive one stops receiving heartbeats, it takes over the IP of the active one and becomes active itself. In an active-active configuration, both are splitting the traffic; if one of them fails, the other one starts handling all of it. This is also used as a means of horizontally scaling the load balancers to distribute the load. But how do you redirect requests to two load balancers? The user first connects to the DNS to get the IP address of the load balancer, but a plain DNS record points to a single IP address. To solve this, you use something called DNS round robin.
Check this video to learn how that works.
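As a rough preview of that idea: with DNS round robin the name resolves to several A records, and the DNS server rotates the order between answers so clients spread across the load balancers. A toy illustration (the IPs are made-up documentation addresses):

```python
from collections import deque

# Hypothetical A records for one name - one IP per load balancer.
records = deque(["203.0.113.10", "203.0.113.11"])

def resolve() -> list:
    """Return the A records, rotating the order on every query,
    the way a round-robin DNS server would."""
    answer = list(records)
    records.rotate(-1)  # the next query gets a different first IP
    return answer

print(resolve())  # ['203.0.113.10', '203.0.113.11']
print(resolve())  # ['203.0.113.11', '203.0.113.10']
```

Clients typically take the first IP in the answer, so rotating the order splits traffic between the two load balancers.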