The Gatekeepers of Scale: Understanding Firewalls and Load Balancers in Modern Infrastructure

Introduction: The Perimeter and the Traffic Flow

As the internet grows and your application reaches millions of users, the backend becomes far more vulnerable and volatile if it is exposed directly to the raw, unmanaged traffic of the public internet. Being a great software engineer doesn’t only mean writing elegant and efficient code; you should also understand the underlying infrastructure that supports your code and is responsible for delivering it properly. This "physical" perimeter around a server is what determines whether your logic survives a traffic spike or a cyber attack.

The internet is an open network carrying a mix of legitimate users, malicious bots, and overwhelming volume, all hitting a single entry point: your bare-bones server, where all your critical operations live.

This is where the “physical” gatekeepers come in: the firewall and the load balancer, which filter and organise this chaos before it ever reaches the server. Understanding this architecture is what separates a production-grade developer from someone who just writes code.

The Firewall: Your Digital Security Guard

A firewall is a layer of defence that sits in front of the load balancer and acts like a police officer, deciding which packets are "safe" based on a set of predefined security rules. It checks all incoming and outgoing requests.
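
To make the idea of "predefined security rules" concrete, here is a minimal sketch in Python of first-match rule evaluation with a default-deny fallback. The Rule class, the decide function, and the example rules are all hypothetical and exist only for illustration; real firewalls match on many more fields (protocol, direction, interface and so on).

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    action: str          # "allow" or "deny"
    source: str          # source network in CIDR notation
    port: Optional[int]  # destination port; None matches any port


# Hypothetical rule set, evaluated top to bottom. The first match wins.
RULES: List[Rule] = [
    Rule("deny", "203.0.113.0/24", None),  # block an example "bad" range
    Rule("allow", "0.0.0.0/0", 443),       # HTTPS from anywhere
    Rule("allow", "10.0.0.0/8", 22),       # SSH only from the internal network
]


def decide(src_ip: str, dst_port: int, rules: List[Rule] = RULES) -> str:
    """Return the action of the first matching rule, or "deny" if none match."""
    for rule in rules:
        in_range = ip_address(src_ip) in ip_network(rule.source)
        port_ok = rule.port is None or rule.port == dst_port
        if in_range and port_ok:
            return rule.action
    return "deny"  # default deny: anything not explicitly allowed is dropped
```

With these example rules, decide("198.51.100.5", 443) allows the HTTPS request, while decide("198.51.100.5", 22) falls through to the default and is denied.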

Different types of firewalls:

Proxy-based firewalls: These are proxies that sit between the client and the server. When a client sends a request to the server, the proxy intercepts it, sanitises it and forwards it to the server. Similarly, when the server generates a response, the proxy intercepts it, checks that it contains only the requested, safe content, and then sends it on to the client.
A proxy-based firewall is kind of like a bouncer at a bar. This bouncer stops guests before they enter the bar to make sure they are not underage, armed, or in any other way a threat to the bar and its patrons. The bouncer also stops patrons on their way out to ensure that they have a safe way to get home and are not planning to drink and drive.
The downside of having a bouncer at the bar is that when many guests enter or leave at the same time, a long queue forms and the waiting time goes up. Similarly, when there is a spike in traffic, a proxy-based firewall becomes a point of congestion, which increases latency.

Stateful firewalls: These firewalls save information about open connections and use it to analyse incoming and outgoing traffic, rather than evaluating each packet in isolation. They rely on a lot of context when making decisions. They are faster than proxy-based firewalls because a packet that belongs to an already-approved connection can be matched against the saved state instead of being fully inspected again.
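
The connection tracking described above can be sketched roughly as follows. This is a simplified illustration, not how any particular product works: the Flow tuple, the StatefulFirewall class, and the is_new_connection flag (standing in for something like a TCP SYN) are hypothetical names chosen for the example.

```python
from typing import Iterable, NamedTuple, Set


class Flow(NamedTuple):
    client_ip: str
    client_port: int
    server_ip: str
    server_port: int


class StatefulFirewall:
    def __init__(self, open_ports: Iterable[int]):
        self.open_ports = set(open_ports)
        self.connections: Set[Flow] = set()  # saved state: approved connections

    def handle(self, flow: Flow, is_new_connection: bool) -> str:
        # Fast path: the packet belongs to a connection we already approved.
        if flow in self.connections:
            return "allow"
        # Slow path: only a connection-opening packet is checked against the
        # rules (here, simply a set of open server ports).
        if is_new_connection and flow.server_port in self.open_ports:
            self.connections.add(flow)
            return "allow"
        # Mid-stream packets with no known connection are dropped.
        return "deny"

    def close(self, flow: Flow) -> None:
        """Forget a connection once it has been closed."""
        self.connections.discard(flow)
```

The point of the sketch is the first branch: once a connection is in the table, later packets are approved with a single lookup.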

Web application firewalls (WAF): Traditional firewalls help to secure private networks, while a web application firewall protects the server by monitoring the HTTP requests coming to it. It protects the server from attacks such as cross-site request forgery (CSRF), cross-site scripting (XSS), file inclusion, SQL injection, etc.

A WAF is a type of reverse proxy that operates through a set of rules, often called policies. These policies are responsible for filtering out malicious traffic; during a DDoS attack, for example, rate limiting can be implemented by adding a policy. One of the most widely used commercial WAFs is Cloudflare’s WAF, which protects millions of applications from attack every day.
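
As an illustration of what a rate-limiting policy does, the sketch below counts requests per client IP in fixed time windows and rejects anything over the limit. It is only one of several possible approaches and is not taken from any specific WAF; the FixedWindowRateLimiter class, its parameters, and the injectable clock are assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client in each time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        # client IP -> (window index, requests seen in that window)
        self.counts: Dict[str, Tuple[int, int]] = {}

    def allow(self, client_ip: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self.counts.get(client_ip, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so reset the counter
        if count >= self.limit:
            self.counts[client_ip] = (window, count)
            return False  # over the limit: the policy rejects the request
        self.counts[client_ip] = (window, count + 1)
        return True
```

For example, FixedWindowRateLimiter(limit=100, window_seconds=60) would let each client IP make up to 100 requests per minute and reject the rest until the next window begins.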

The Load Balancer: The Traffic Conductor

After a request passes the firewall, it needs to be routed so that it does not land on a server that is already busy serving many requests under heavy load. Load balancers come to the rescue: their core responsibility is distribution and redundancy, ensuring no single piece of hardware becomes a bottleneck or a single point of failure.
A load balancer behaves like a traffic officer standing at a multi-lane junction, directing cars into the clearest lane so the highway never grinds to a halt. To do this, load balancers use a variety of algorithms, the most common being round robin, which hands each new request to the next server in a fixed rotation. The load balancer is a high-speed "Traffic Hub" that sits between the firewall and your server rack, masking the IP addresses of your individual backend servers from the public.
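
Round robin is simple enough to sketch in a few lines of Python. The RoundRobinBalancer class and the backend addresses below are hypothetical and only meant to show the rotation.

```python
from itertools import cycle
from typing import Iterable


class RoundRobinBalancer:
    def __init__(self, backends: Iterable[str]):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the list forever: 1, 2, 3, 1, 2, 3, ...
        self._rotation = cycle(backends)

    def pick(self) -> str:
        """Return the backend that should receive the next request."""
        return next(self._rotation)


lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
first_five = [lb.pick() for _ in range(5)]
# first_five == ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.1", "10.0.0.2"]
```

Clients only ever talk to the load balancer, so the backend addresses in the list stay hidden from the public.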

Why is it so important?

When the number of users grows, vertical scaling (adding more power to a single server) stops being possible beyond a certain point, and horizontal scaling comes into the picture. As traffic grows, more servers are added to share the load that would otherwise fall on one server, and load balancers do the job of distributing it. The load balancer also makes sure a server is in a good state before sending requests to it, typically by sending it health-check requests.
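
The health-check idea can be sketched as a round-robin balancer that skips servers marked unhealthy. This sketch assumes the checks run separately from request handling (for example on a timer), which is one common arrangement. The HealthAwareBalancer class and the probe callback are hypothetical; in practice the probe might be an HTTP request to a health endpoint on each server.

```python
from typing import Callable, Dict, List


class HealthAwareBalancer:
    def __init__(self, backends: List[str], probe: Callable[[str], bool]):
        self.backends = list(backends)
        self.probe = probe  # hypothetical callback: True if the server is OK
        self.healthy: Dict[str, bool] = {b: True for b in self.backends}
        self._next = 0

    def run_health_checks(self) -> None:
        """Probe every backend and record the result."""
        for backend in self.backends:
            try:
                self.healthy[backend] = bool(self.probe(backend))
            except Exception:
                self.healthy[backend] = False  # a failing probe means unhealthy

    def pick(self) -> str:
        """Round robin over the backends, skipping unhealthy ones."""
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if self.healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy backends available")
```

If one of three servers fails its probe, pick() alternates between the remaining two, and the failed server rejoins the rotation once a later check succeeds.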

Real World Scenario: The Shopping Mall Model

Let’s understand the above through a real-world scenario: a shopping mall. When you enter a shopping mall, the security guard scans your bag so that you cannot take any harmful items inside. When you exit, it is the security guard’s responsibility to stop you from leaving with unbilled items. This is your firewall. Once you are inside and done with shopping, someone directs you to the billing counter with the shortest queue. That person acts like a load balancer.