Demystifying Load Balancing Algorithms: An Intermediate Guide
In our latest Medium blog post, we delved into the world of web servers, but there is more to explore. One of the key challenges in running web servers is distributing incoming traffic efficiently across multiple server instances. Load balancing is what makes high availability, scalability, and better performance possible for web applications, and the logic behind it is the load-balancing algorithm. In this intermediate-level guide, we take a closer look at load-balancing algorithms: their types, how they work, and the scenarios in which to use each.
The Need for Load Balancing
Before we get into the nitty-gritty of load-balancing algorithms, let’s understand why load-balancing is crucial. A load balancer, whether in software or hardware, serves the purpose of preventing any single server from becoming overloaded. A load-balancing algorithm is the logic that a load balancer uses to distribute network traffic between servers (an algorithm is a set of predefined rules).
If a single server goes down, the load balancer redirects traffic to the remaining online servers. When a new server is added to the server group, the load balancer automatically starts to send requests to it.
In this manner, a load balancer performs the following functions:
- Distributes client requests or network load efficiently across multiple servers
- Ensures high availability and reliability by sending requests only to servers that are online
- Provides the flexibility to add or subtract servers as demand dictates
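To make these three functions concrete, here is a minimal Python sketch of a server pool. It is an illustration under assumptions, not the design of any real load balancer: the `ServerPool` class and its `add`, `remove`, `mark_down`, `mark_up`, and `route` methods are hypothetical names, and the health state is set by hand rather than by real health checks.

```python
from itertools import count


class ServerPool:
    """Hypothetical pool that tracks which servers exist and which are online."""

    def __init__(self, servers):
        # Map server name -> True if online. Every server starts online.
        self.online = {name: True for name in servers}
        self._counter = count()

    def add(self, name):
        # A newly added server starts receiving requests right away.
        self.online[name] = True

    def remove(self, name):
        self.online.pop(name, None)

    def mark_down(self, name):
        self.online[name] = False

    def mark_up(self, name):
        self.online[name] = True

    def healthy(self):
        return [name for name, up in self.online.items() if up]

    def route(self):
        # Rotate through whichever servers are currently online.
        candidates = self.healthy()
        if not candidates:
            raise RuntimeError("no healthy servers available")
        return candidates[next(self._counter) % len(candidates)]


pool = ServerPool(["A", "B", "C"])
pool.mark_down("B")                      # server B fails
print([pool.route() for _ in range(4)])  # only A and C receive traffic
pool.add("D")                            # a new server joins the group
print(pool.healthy())
```

In practice the `mark_down` and `mark_up` calls would be driven by periodic health checks, which this sketch leaves out.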
Benefits of Load Balancing
Scalability
Imagine you’re running a popular e-commerce website, and traffic spikes during holiday sales. Without load balancing, a single server might crumble under the overwhelming requests. Load balancing ensures that these requests are distributed evenly across multiple servers, keeping your website responsive and available.
High Availability
Hardware failures are inevitable. Load balancing helps mitigate the risk by rerouting traffic to healthy servers when one goes down. This redundancy ensures that your website remains operational even when hardware fails.
Performance Optimization
Load balancing also enhances performance. By intelligently distributing traffic, it minimizes server overload, reducing response times and improving user experience.
Types of Load Balancing Algorithms
Now, let’s explore the various load-balancing algorithms commonly used:
1. Round Robin
In round robin, requests from clients are distributed across the servers cyclically, as the diagram illustrates.
In the diagram above, let’s say we have three servers, A, B, and C, and requests from clients 1, 2, 3, 4, 5, and 6 arrive at the load balancer. The load balancer forwards the request from client 1 to server A, the request from client 2 to server B, and the request from client 3 to server C. For client 4, the load balancer goes back to server A, client 5 goes to server B, and the same cycle repeats.
Requests are spread evenly, which keeps the servers responsive when they are alike. But what happens if server B has more RAM, CPU, and other resources than servers A and C? Every server still receives the same share of requests, so servers A and C may become overloaded and fail while server B has capacity to spare. Round robin is therefore best suited to pools where all servers have the same configuration.
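The cycle described above can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular load balancer implements it; the server names come from the example, while the `round_robin` function and the `assignments` dictionary are names made up for this sketch.

```python
from itertools import cycle


def round_robin(servers):
    """Return an iterator that yields the servers in a fixed, repeating order."""
    return cycle(servers)


servers = ["A", "B", "C"]
picker = round_robin(servers)

# Requests from clients 1..6 arrive one after another.
assignments = {client: next(picker) for client in range(1, 7)}
print(assignments)
# {1: 'A', 2: 'B', 3: 'C', 4: 'A', 5: 'B', 6: 'C'}
```

Note that the picker never looks at the servers themselves, which is exactly why it cannot react to differences in capacity.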
2. Least Connections
Round robin, and the weighted round robin described in the next section, do not take into account the number of connections currently open on each server.
Suppose the requests from clients 1 and 2 are still connected on server A and the request from client 3 is still connected on server B, while the requests from clients 4, 5, and 6 have already disconnected. When new requests arrive, round robin forwards them to server A, then server B, then server C, regardless of that state. Load piles up on server A, and its resources may be exhausted quickly.
This is where the least connections method helps. It takes the current load on each server into account: the load balancer forwards each new request to the server with the fewest active connections.
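Here is a small Python sketch of that selection rule, using the scenario above (two open connections on server A, one on server B, none on server C). The `LeastConnectionsBalancer` class and its `acquire` and `release` methods are hypothetical names chosen for illustration, and the tie-breaking rule (earliest listed server wins) is an assumption of this sketch.

```python
class LeastConnectionsBalancer:
    """Hypothetical balancer that tracks active connections per server."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}

    def acquire(self):
        # Pick the server with the fewest active connections.
        # min() returns the first minimum, so ties go to the earliest listed server.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a client disconnects.
        if self.active[server] > 0:
            self.active[server] -= 1


lb = LeastConnectionsBalancer(["A", "B", "C"])
lb.active.update({"A": 2, "B": 1, "C": 0})  # state from the scenario above

print([lb.acquire() for _ in range(3)])
# ['C', 'B', 'C']
```

Unlike round robin, none of the three new requests lands on the already busy server A.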
3. Weighted Round Robin
When servers have different configurations, the administrator can assign each server a weight, or ratio, according to the number of requests it can handle. Let’s say server A can handle 3 requests per second on average, server B can handle 2 requests per second, and server C can handle 1 request per second.
The load balancer is then configured with the weights A=3, B=2, and C=1. You can see this in the diagram below:
Now, as requests arrive, the load balancer forwards the first three to server A, the fourth and fifth to server B, and the sixth to server C. The seventh, eighth, and ninth requests start the same cycle again, just like round robin.
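The weighted cycle can be sketched by repeating each server in the schedule as many times as its weight. This is a deliberately simple Python illustration of the behavior described above; the `weighted_round_robin` function is a made-up name, and real load balancers may interleave the picks instead of sending them in bursts.

```python
from itertools import cycle


def weighted_round_robin(weights):
    """Return an iterator that yields each server `weight` times per cycle."""
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)


picker = weighted_round_robin({"A": 3, "B": 2, "C": 1})
first_nine = [next(picker) for _ in range(9)]
print(first_nine)
# ['A', 'A', 'A', 'B', 'B', 'C', 'A', 'A', 'A']
```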
4. Hashing methods
Hashing algorithms are used for persistent connections, which means sticking a client to a specific server. This may be needed because of the wide range of content served to clients, such as videos. Sending repeat requests to the same server lets content be served from that server’s cache, which reduces response latency and improves CPU utilization.
Different hashing methods can be used:
- URL Hash method
- Source IP Hash method
The diagram below shows how a request is hashed on its URL or on the client’s IP address and then forwarded to the corresponding server.
URL Hash Method
The load balancer generates a hash value from the HTTP URL in each incoming request and uses that value to choose a server. Requests for the same URL are therefore always sent to the same server.
Source IP Hash Method
In this method, the load balancer uses the client’s (source) IP address, in some implementations combined with the server-side (destination) IP address, to generate a hash key. Based on this key, the load balancer directs the client to a specific server.
Because the same key can be regenerated after a session breaks, this method is particularly useful for ensuring that a client is reconnected to the same server it was using before.
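Both hashing methods come down to the same step: turn a key into a number and map that number onto the server list. The Python sketch below shows one way to do it, assuming the key is either a URL path or a client IP address as a string. The `pick_by_hash` function, the example URL, and the example IP address are all hypothetical, and hashing only the client IP is a simplification. It uses `hashlib` rather than the built-in `hash()` because string hashes from `hash()` can change between Python processes.

```python
import hashlib


def pick_by_hash(key, servers):
    """Map a string key (URL or client IP) to one server, deterministically."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the pool size.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["A", "B", "C"]

# URL hash: the same URL always maps to the same server.
url_server = pick_by_hash("/videos/intro.mp4", servers)
assert url_server == pick_by_hash("/videos/intro.mp4", servers)

# Source IP hash: the same client IP always maps to the same server,
# even if the client disconnects and comes back later.
ip_server = pick_by_hash("203.0.113.7", servers)
assert ip_server == pick_by_hash("203.0.113.7", servers)
```

One consequence of the modulo step is that changing the number of servers changes where most keys land, so stickiness in this sketch only holds while the pool stays the same.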
5. Least Response Time
Least Response Time directs requests to the server with the lowest response time. This algorithm is suitable for applications where server response times are crucial, but it requires continuous monitoring of server response times to be effective.
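As a rough sketch of the monitoring this requires, the Python example below keeps a moving average of response times per server and picks the lowest. Everything here is an assumption made for illustration: the `LeastResponseTimeBalancer` class, the choice of an exponentially weighted moving average, the smoothing factor `alpha`, and the rule that unmeasured servers are tried first.

```python
class LeastResponseTimeBalancer:
    """Hypothetical balancer that picks the server with the lowest average latency."""

    def __init__(self, servers, alpha=0.3):
        self.alpha = alpha  # weight given to the newest measurement
        self.avg_ms = {name: None for name in servers}

    def record(self, server, elapsed_ms):
        # Update the exponentially weighted moving average for this server.
        previous = self.avg_ms[server]
        if previous is None:
            self.avg_ms[server] = float(elapsed_ms)
        else:
            self.avg_ms[server] = self.alpha * elapsed_ms + (1 - self.alpha) * previous

    def pick(self):
        # Servers without any measurement are tried first so each one gets sampled.
        for name, avg in self.avg_ms.items():
            if avg is None:
                return name
        return min(self.avg_ms, key=self.avg_ms.get)


lb = LeastResponseTimeBalancer(["A", "B", "C"])
for server, ms in [("A", 120), ("B", 40), ("C", 90)]:
    lb.record(server, ms)
print(lb.pick())  # B is currently the fastest

lb.record("B", 400)  # B slows down sharply
print(lb.pick())     # the choice moves to C
```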
6. Random
The Random algorithm, as the name suggests, randomly selects a server to handle each request. While simple, it does not guarantee an even distribution: over a short window, some servers can receive noticeably more requests than others.
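A short Python sketch makes the trade-off easy to try out. The `pick_random` function is a made-up name, and the seed and request count are arbitrary values chosen only so that repeated runs behave the same way.

```python
import random
from collections import Counter


def pick_random(servers, rng=random):
    """Choose one server uniformly at random for a single request."""
    return rng.choice(servers)


servers = ["A", "B", "C"]
rng = random.Random(42)  # fixed seed so repeated runs give the same picks

counts = Counter(pick_random(servers, rng) for _ in range(30))
print(counts)  # per-server totals are usually not exactly 10 each
```

With more requests the totals tend to even out, but nothing in the algorithm prevents a short burst from hitting one server.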
When to Use Which Algorithm?
The choice of a load-balancing algorithm depends on your specific use case:
- Round Robin is a good choice when all servers are equally capable and response times are consistent.
- Least Connections is best when requests hold connections open for varying lengths of time, so that the number of active connections is a good indicator of each server’s current load.
- Weighted Round Robin and Weighted Least Connections (least connections adjusted by each server’s weight) are suitable when your servers have different processing capacities, and you want to balance the load based on their capabilities.
- IP Hash is ideal for maintaining session state when a client needs to interact with the same server consistently.
- Least Response Time is useful when you need to minimize response times, but you must monitor server response times continuously.
- Random can be a quick choice if you need simplicity and load balancing is not critical.
Conclusion
Load balancing is a fundamental component in the realm of web servers, playing a pivotal role in ensuring the availability, scalability, and optimal performance of web applications. By distributing client requests intelligently across multiple servers, load balancing prevents overloading of individual servers and guarantees that your online services remain responsive and reliable, even in the face of server failures.
As you venture further into the world of web servers and load balancing, consider the unique requirements of your application. The choice of a load-balancing algorithm can significantly impact your application’s performance and reliability. Select the algorithm that best aligns with your goals, whether it’s even distribution, resource-aware load balancing, or session persistence. Your choice will be a key factor in providing a seamless and responsive user experience in the digital landscape.