Load balancer

What is a load balancer?

A load balancer is a device that distributes network or application traffic across a cluster of servers. Load balancing improves responsiveness and increases availability of applications.

A load balancer sits between the client and the server farm, accepting incoming network and application traffic and distributing it across multiple backend servers using various methods. By balancing application requests across multiple servers, a load balancer reduces individual server load and prevents any one application server from becoming a single point of failure, thus improving overall application availability and responsiveness.

Load balancing is the most straightforward method of scaling out an application server infrastructure. As application demand increases, new servers can be easily added to the resource pool, and the load balancer will immediately begin sending traffic to the new server.

How do load balancers work?

When one application server becomes unavailable, the load balancer directs all new application requests to other available servers in the pool.

To handle more advanced application delivery requirements, an application delivery controller (ADC) is used to improve the performance, security and resiliency of applications delivered to the web. An ADC is not only a load balancer, but a platform for delivering networks, applications and mobile services in the fastest, safest and most consistent manner, regardless of where, when and how they are accessed.

Load balancing algorithms and methods

Load balancing uses various algorithms, called load balancing methods, to define the criteria that the ADC appliance uses to select the service to which to redirect each client request. Different load balancing algorithms use different criteria.

  • The Least Connection Method
    This is the default method. When a virtual server is configured to use the least connection method, it selects the service with the fewest active connections (see the sketch after this list).
  • The Round Robin Method
    This method continuously rotates a list of services that are attached to it. When the virtual server receives a request, it assigns the connection to the first service in the list, and then moves that service to the bottom of the list.
  • The Least Response Time Method
    This method selects the service with the fewest active connections and the lowest average response time.
  • The Least Bandwidth Method 
    This method selects the service that is currently serving the least amount of traffic, measured in megabits per second (Mbps).
  • The Least Packets Method 
    This method selects the service that has received the fewest packets over a specified period of time.
  • The Custom Load Method
    When using this method, the load balancing appliance chooses a service that is not handling any active transactions. If all of the services in the load balancing setup are handling active transactions, the appliance selects the service with the smallest load.
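
To make the selection concrete, here is a minimal sketch of the least connection method in Java. The Service class and its activeConnections counter are illustrative assumptions, not any vendor’s API:

```java
import java.util.List;

// Minimal sketch of the least connection method. The Service class and
// its activeConnections counter are hypothetical, for illustration only.
class Service {
    final String name;
    int activeConnections;

    Service(String name, int activeConnections) {
        this.name = name;
        this.activeConnections = activeConnections;
    }
}

class LeastConnectionBalancer {
    // Pick the service with the fewest active connections.
    static Service select(List<Service> services) {
        Service best = services.get(0);
        for (Service s : services) {
            if (s.activeConnections < best.activeConnections) {
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Service> pool = List.of(
            new Service("server-1", 12),
            new Service("server-2", 7),
            new Service("server-3", 9));
        System.out.println("Chosen: " + select(pool).name); // prints server-2
    }
}
```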

Why are load balancers needed?

Traffic volumes are increasing and applications are becoming more complex. Load balancers provide the bedrock for building flexible networks that meet evolving demands by improving performance and security for many types of traffic and services, including applications.

Common Load Balancing Schemes

The most common load balancing schemes are:

  • Even Task Distribution Scheme
  • Weighted Task Distribution Scheme
  • Sticky Session Scheme
  • Even Size Task Queue Distribution Scheme
  • Autonomous Queue Scheme

Even Task Distribution Scheme

An even task distribution scheme means that the tasks are distributed evenly between the servers in the cluster. This makes the scheme very simple and easy to implement. The even task distribution scheme is also known as “Round Robin”, meaning the servers receive work in a round robin fashion (evenly distributed).
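
As a minimal sketch (the server names are placeholders), round robin distribution needs nothing more than a counter that walks the server list and wraps around at the end:

```java
import java.util.List;

// Minimal round robin sketch: a counter walks the server list,
// wrapping around at the end, so tasks are spread evenly.
class RoundRobinBalancer {
    private final List<String> servers;
    private int next = 0;

    RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    synchronized String nextServer() {
        String server = servers.get(next);
        next = (next + 1) % servers.size();
        return server;
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb =
            new RoundRobinBalancer(List.of("server-1", "server-2", "server-3"));
        for (int i = 0; i < 6; i++) {
            System.out.println("Task " + i + " -> " + lb.nextServer());
        }
    }
}
```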

Even task distribution load balancing is suitable when the servers in the cluster all have the same capacity, and the incoming tasks statistically require the same amount of work.

Even task distribution ignores the difference in the work required to process each task. That means that even if each server is given the same number of tasks, you can have situations where one server has more tasks requiring heavy processing than the others. This may happen due to the randomness of incoming tasks. It often evens itself out over time, since the overloaded server may suddenly receive a set of lightweight tasks too.

So, although even task distribution load balancing distributes the tasks evenly onto the servers in the cluster, this may not result in a truly even distribution of the work load.

DNS Based Load Balancing

DNS based load balancing is a simple scheme where you configure your DNS to return different IP addresses to different computers when they request an IP address for your domain name. This achieves an effect that is similar to the even task distribution scheme, except that most computers cache the IP address and thus keep coming back to the same IP address until a new DNS lookup is made.
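
You can observe this effect from the client side with an ordinary DNS lookup. The sketch below prints every address returned for a host name; a domain configured for DNS based load balancing returns more than one (example.com is a placeholder):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Prints every IP address the DNS returns for a host name. A domain
// configured for DNS based load balancing returns several addresses;
// most clients cache and reuse the first entry until the next lookup.
public class DnsLookup {
    public static void main(String[] args) throws UnknownHostException {
        InetAddress[] addresses = InetAddress.getAllByName("example.com");
        for (InetAddress address : addresses) {
            System.out.println(address.getHostAddress());
        }
    }
}
```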

While DNS based load balancing is possible, it is not the best way of reliably distributing traffic across multiple computers. You are better off using dedicated load balancing software or hardware.

Weighted Task Distribution Scheme

A weighted task distribution load balancing scheme distributes the incoming tasks onto the servers in the cluster using weights. That means that you can specify the weight (ratio) of tasks a server should receive relative to other servers. This is useful if the servers in the cluster do not all have the same capacity.

For instance, if one of three servers has only 2/3 of the capacity of the other two, you can use the weights 3, 3, 2. This means that for every 8 tasks received, the first server gets 3 tasks, the second server 3 tasks, and the last server only 2 tasks. That way the server with 2/3 capacity receives only 2/3 as many tasks as the other servers in the cluster.
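
One simple way to sketch this (not a production algorithm) is to expand the weights into a rotation list, so with weights 3, 3, 2 the third server appears twice for every eight entries:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal weighted distribution sketch: each server is repeated in the
// rotation according to its weight, so for weights 3, 3, 2 the third
// server receives 2 of every 8 tasks.
class WeightedRoundRobin {
    private final List<String> rotation = new ArrayList<>();
    private int next = 0;

    WeightedRoundRobin(List<String> servers, int[] weights) {
        for (int i = 0; i < servers.size(); i++) {
            for (int w = 0; w < weights[i]; w++) {
                rotation.add(servers.get(i));
            }
        }
    }

    synchronized String nextServer() {
        String server = rotation.get(next);
        next = (next + 1) % rotation.size();
        return server;
    }

    public static void main(String[] args) {
        WeightedRoundRobin lb = new WeightedRoundRobin(
            List.of("server-1", "server-2", "server-3"), new int[] {3, 3, 2});
        for (int i = 0; i < 8; i++) {
            System.out.println("Task " + i + " -> " + lb.nextServer());
        }
    }
}
```

A more polished implementation would interleave the entries (often called smooth weighted round robin) so a server does not receive its whole share in one burst.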

As mentioned earlier, weighted task distribution load balancing is useful when the servers in the cluster do not all have the same capacity. However, weighted task distribution still does not take the work required to process the tasks into consideration.

Sticky Session Scheme

The two previous load balancing schemes are based on the assumption that any incoming task can be processed independently of previously executed tasks. This may not always be the case though.

Imagine if the servers in the cluster keep some kind of session state, like the session object in a Java web application (or in PHP, or ASP). If a task (HTTP request) arrives at server 1, and that results in writing some value to session state, what happens if subsequent requests from the same user are sent to server 2 or server 3? Then that session value might be missing, because it is stored in the memory of server 1.

The solution to this problem is called Sticky Session Load Balancing. All tasks (e.g. HTTP requests) belonging to the same session (e.g. the same user) are sent to the same server. That way any stored session values that might be needed by subsequent tasks (requests) are available.
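
A common way to implement stickiness is to hash a session identifier onto the server list, so the same session always maps to the same server. The following is a minimal sketch; real load balancers typically track sessions with cookies or connection tables instead:

```java
import java.util.List;

// Minimal sticky session sketch: hashing the session id onto the server
// list sends every request in the same session to the same server.
class StickySessionBalancer {
    private final List<String> servers;

    StickySessionBalancer(List<String> servers) {
        this.servers = servers;
    }

    String serverFor(String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        StickySessionBalancer lb =
            new StickySessionBalancer(List.of("server-1", "server-2", "server-3"));
        // Both requests carry the same session id, so both hit the same server.
        System.out.println(lb.serverFor("session-42"));
        System.out.println(lb.serverFor("session-42"));
    }
}
```

Note that plain hashing reshuffles sessions whenever the server list changes; consistent hashing or cookie-based tracking avoids that problem.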

With sticky session load balancing it isn’t the tasks that are distributed out to the servers, but rather the task sessions. This will of course result in a somewhat more unpredictable distribution of work load, as some sessions will contain few tasks, and other sessions will contain many tasks.

Another solution is to avoid using session variables completely, or to store the session variables in a database or cache server, accessible to all servers in the cluster. I prefer to avoid session variables completely if possible, but you may have good reasons to use session variables.

Even Size Task Queue Distribution Scheme

The even size task queue distribution scheme is similar to the weighted task distribution scheme, but with a twist. Instead of blindly distributing the tasks onto the servers in the cluster, the load balancer keeps a task queue for each server. The task queues contain all requests that each server is currently processing, or which are waiting to be processed.
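
Here is a minimal sketch of the bookkeeping involved, assuming the load balancer is notified whenever a task completes (all names are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of even size task queue distribution: a per-server
// counter of tasks in progress stands in for the task queues. New tasks
// go to the server with the fewest queued tasks; completions shrink the
// queue, so faster servers free up and receive more work automatically.
class EvenQueueBalancer {
    private final Map<String, Integer> queueSizes = new HashMap<>();

    EvenQueueBalancer(List<String> servers) {
        for (String server : servers) {
            queueSizes.put(server, 0);
        }
    }

    // Assign a task to the server with the smallest queue.
    synchronized String assignTask() {
        String best = null;
        for (Map.Entry<String, Integer> e : queueSizes.entrySet()) {
            if (best == null || e.getValue() < queueSizes.get(best)) {
                best = e.getKey();
            }
        }
        queueSizes.merge(best, 1, Integer::sum);
        return best;
    }

    // Called when a server reports that it finished a task.
    synchronized void taskCompleted(String server) {
        queueSizes.merge(server, -1, Integer::sum);
    }

    public static void main(String[] args) {
        EvenQueueBalancer lb =
            new EvenQueueBalancer(List.of("server-1", "server-2"));
        String first = lb.assignTask();
        String second = lb.assignTask();   // goes to the other server
        System.out.println(first + " and " + second + " each have one task");
        lb.taskCompleted(first);           // first now has the shortest queue
        System.out.println("Next task goes to " + lb.assignTask());
    }
}
```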

When a server finishes a task, for instance has finished sending back an HTTP response to a client, the task is removed from the task queue for that server.

The even size task queue distribution scheme works by making sure that each server queue has the same number of tasks in progress at the same time. Servers with higher capacity will finish tasks faster than servers with lower capacity. Thus the task queues of the higher capacity servers will empty faster, and sooner have space for new tasks.

As you can imagine, this load balancing scheme implicitly takes both the work required to process each task and the capacity of each server into consideration. New tasks are sent to the servers with the fewest tasks queued up. Tasks are removed from the queues when they are finished, which means that the time it takes to process a task automatically impacts the size of the queue. Since how fast a task is completed depends on the server capacity, server capacity is automatically taken into consideration too. If a server is temporarily overloaded, its task queue will become larger than the task queues of the other servers in the cluster. The overloaded server will thus not have new tasks assigned to it until it has worked its queue down.

The load balancer will have to do a bit more accounting using this scheme. It has to keep track of task queues, and it has to keep track of when a task is completed, so it can be removed from the corresponding task queue.

Autonomous Queue Scheme

In the autonomous queue load balancing scheme, all incoming tasks are stored in a task queue. The servers in the cluster connect to this queue and take the tasks they have capacity to process.

In this scheme there is no real load balancer; each server takes the load it is able to handle. There is just the task queue and the servers. If a server falls out of the cluster, its tasks are kept unprocessed on the task queue and are processed by other servers later. Each server thus functions autonomously of the other servers. No load balancer needs to know which servers are part of the cluster, and the task queue does not need to know about the servers. Each server just needs to know about the task queue.
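
Here is a minimal, single-process sketch of the scheme using a shared blocking queue that each worker thread pulls from when it is free; in practice the queue would be a message broker reachable over the network:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Minimal autonomous queue sketch: there is no load balancer, only a
// shared task queue. Each worker pulls the next task whenever it is
// free, so capacity and per-task cost are accounted for implicitly.
public class AutonomousQueue {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> tasks = new LinkedBlockingQueue<>();
        for (int i = 0; i < 6; i++) {
            tasks.put("task-" + i);
        }

        Runnable worker = () -> {
            try {
                while (true) {
                    String task = tasks.take(); // blocks until a task is available
                    System.out.println(Thread.currentThread().getName()
                        + " processing " + task);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when interrupted
            }
        };

        Thread w1 = new Thread(worker, "server-1");
        Thread w2 = new Thread(worker, "server-2");
        w1.start();
        w2.start();

        Thread.sleep(500); // let the workers drain the queue
        w1.interrupt();
        w2.interrupt();
    }
}
```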

Autonomous queue load balancing also implicitly takes the work load and capacity of each server into consideration. Servers only take tasks from the queue when they have capacity to process them.

Autonomous queue has a slight disadvantage compared to even size task queue distribution. A server that wants a task needs to first connect to the queue, then download the task, and then possibly provide a response. This is 2 to 3 network roundtrips (depending on whether a response needs to be sent back).

The even size task queue distribution scheme has one network roundtrip fewer. The load balancer sends a request to a server, and the server sends back a response (if needed). That is just 1 to 2 network roundtrips.

Load Balancing Hardware and Software

In many cases you don’t have to implement your own load balancer. You can buy ready-made hardware and software for this purpose. Many web server implementations have some kind of built-in software load balancing functionality. Be sure to look at what already exists before you start implementing your own load balancing software.
