Introduction
An Application Load Balancer (ALB) is AWS’s Layer 7 load balancing service for HTTP and HTTPS traffic. It does more than spread requests across servers. It inspects request content, applies routing rules, and sends each call to the right target group based on the host, path, headers, or other application-level details.
That matters because most production workloads are no longer simple web sites. They are APIs, microservices, container services, and serverless backends with different URL paths, versions, and environments. A single ALB can route /api to one service, /images to another, and the same domain to different backends based on host name.
ALBs are different from Classic Load Balancers and Network Load Balancers. Classic Load Balancers are the older generation and lack many modern features. Network Load Balancers operate at Layer 4 and are built for very high performance, static IP use cases, and non-HTTP protocols. ALBs sit in the middle: application-aware, flexible, and built for modern web traffic.
According to AWS documentation, ALBs support content-based routing, containerized workloads, and Lambda targets. This guide walks through planning, creation, routing, security, monitoring, optimization, and best practices so you can build an ALB that works in production, not just in a lab.
Understanding Application Load Balancers
An ALB operates at Layer 7 of the OSI model, which means it understands HTTP request details such as the host header, URL path, query string, and methods. That gives you routing logic that is impossible with a simple Layer 4 balancer. Instead of sending traffic blindly, the ALB makes decisions based on what the client is asking for.
The core building blocks are straightforward. Listeners accept incoming traffic on ports such as 80 and 443. Rules evaluate request conditions and decide what action to take. Target groups define where traffic should go. Targets are the actual destinations, such as EC2 instances, IP addresses, or Lambda functions. AWS explains this structure in its Application Load Balancer listener and rule documentation.
Common use cases include path-based routing for microservices, host-based routing for multiple domains, blue/green deployments, and API routing. For example, app.example.com might go to one target group, while api.example.com goes to another. A release can be shifted gradually by sending a portion of traffic to a green environment.
Use an internet-facing ALB when clients on the public internet need access. Use an internal ALB when the traffic stays inside your VPC, such as between application tiers or internal APIs. This distinction is important for security and network design. ALBs also integrate with Amazon ECS, Amazon EKS, AWS Lambda, Auto Scaling, and AWS Certificate Manager for flexible application delivery.
Key Takeaway
ALBs are best when routing decisions depend on application content. If you need host-based or path-based routing for web traffic, ALB is usually the right first choice.
Common ALB routing patterns
- Path-based routing: /api, /admin, and /static each map to different backend services.
- Host-based routing: tenant1.example.com and tenant2.example.com route to separate tenants or services.
- Blue/green routing: Production traffic shifts from one target group to another during deployment.
- API routing: Specific endpoints can be isolated without changing DNS structure.
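As a concrete sketch, a path-based rule can be expressed through the ELBv2 API. The ARNs below are placeholders, and the boto3 call is shown commented out rather than executed; this illustrates the request shape, not a definitive implementation.

```python
# Sketch of a path-based routing rule for an ALB listener.
# All ARNs are placeholders; run the commented boto3 call with real values.

api_rule = {
    "ListenerArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:listener/app/example/PLACEHOLDER",
    "Priority": 10,  # lower numbers are evaluated first
    "Conditions": [
        {"Field": "path-pattern", "Values": ["/api/*"]},
    ],
    "Actions": [
        {
            "Type": "forward",
            "TargetGroupArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/api/PLACEHOLDER",
        },
    ],
}

# import boto3
# boto3.client("elbv2").create_rule(**api_rule)
```

A matching rule for /admin would use a different priority and target group; unmatched requests fall through to the listener's default action.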
Planning Your ALB Architecture
Good ALB design starts with the application, not the AWS console. Before you create anything, define traffic patterns, latency targets, availability goals, and scaling behavior. A high-traffic API with strict uptime requirements needs a very different design from an internal admin portal. AWS recommends placing load balancers across multiple Availability Zones for resilience, and that should be the default assumption for production workloads.
Choose your VPC and subnets carefully. For internet-facing ALBs, place subnets in at least two Availability Zones with public route table access. For internal ALBs, use private subnets and keep traffic inside the network boundary. If your application spans multiple AZs, your ALB should too. This helps you survive a zonal failure without rearchitecting later.
One ALB is often enough for several services if they share the same security posture, certificate strategy, and operational team. Multiple ALBs make sense when you need isolation between environments, different domains, separate compliance boundaries, or independent release cycles. Don’t split too early, but don’t force everything behind a single entry point if governance becomes messy.
Plan security groups, DNS, logging, and TLS certificates before deployment. It is easier to design for these from day one than to retrofit them after traffic is live. If you need detailed guidance on network design, the AWS VPC documentation and AWS Certificate Manager documentation are the right starting points.
Pro Tip
Design the ALB around failure domains. If an Availability Zone disappears, your application should keep serving traffic without manual intervention.
Architecture questions to answer first
- How many domains, services, and environments will the ALB support?
- Do you need public access, private access, or both?
- Will traffic be bursty or steady?
- Are you terminating TLS at the ALB or passing it through?
- Do you need centralized logging for compliance or incident response?
Creating an Application Load Balancer
In the AWS Console, creation starts in the Elastic Load Balancing area. Select Application Load Balancer, name the resource, choose the scheme, select the IP address type, and assign subnets. AWS supports IPv4 and dualstack configurations, so decide whether IPv6 is part of your network plan. The official AWS ALB creation guide walks through these fields in detail.
The load balancer name should be clear and consistent with your naming standard. Use a format that communicates environment, application, and function. For example, a name like prod-orders-web-alb is easier to manage than a random label. Select subnets in at least two AZs. If you are creating an internet-facing ALB, make sure those subnets are public and have routes to an internet gateway.
Security groups control inbound and outbound traffic at both the load balancer and target level, so configure them deliberately from the start. A public ALB usually allows inbound 80 and 443 from the appropriate source ranges, while a private ALB should allow only internal networks or application tiers. After the ALB is created, configure listeners for HTTP and HTTPS. Most production systems should redirect HTTP to HTTPS rather than leaving port 80 open for content delivery.
Once the ALB is live, validate the DNS name, listener status, and target health. If the DNS name resolves but targets are unhealthy, the issue is usually in the target group, health check path, security group, or application startup process.
Console creation checklist
- Select Application Load Balancer.
- Choose internet-facing or internal.
- Select IPv4 or dualstack.
- Pick subnets in at least two AZs.
- Attach security groups.
- Create listeners for 80 and 443.
- Verify DNS resolution and target health.
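The checklist above maps directly onto the parameters of the create-load-balancer call. A minimal sketch, with placeholder subnet and security group IDs and the boto3 call left commented out:

```python
# Sketch of the API parameters behind the console checklist.
# Subnet and security group IDs are placeholders for illustration only.

alb_params = {
    "Name": "prod-orders-web-alb",  # environment-app-function naming standard
    "Scheme": "internet-facing",    # or "internal" for private ALBs
    "Type": "application",
    "IpAddressType": "ipv4",        # or "dualstack" if IPv6 is part of the plan
    "Subnets": ["subnet-aaaa1111", "subnet-bbbb2222"],  # at least two AZs
    "SecurityGroups": ["sg-0123456789abcdef0"],
}

# import boto3
# boto3.client("elbv2").create_load_balancer(**alb_params)
```

Listeners for ports 80 and 443 are created separately against the returned load balancer ARN.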
Configuring Target Groups and Health Checks
Target groups are where the real routing decisions end up. They define the backend application endpoints and the health checks that determine whether those endpoints should receive traffic. In practice, a target group is your control point for availability. If targets are marked unhealthy, the ALB stops sending them traffic. AWS documents target group behavior in the target group guide.
ALBs support three main target types. Instance targets register EC2 instances directly. IP targets are useful for containers, on-premises endpoints, and service discovery patterns. Lambda targets let you route HTTP requests to serverless functions without a fleet of servers. Choose the target type based on how your application actually runs, not on what seems simplest during setup.
Health checks deserve real attention. Configure the protocol, path, interval, timeout, and thresholds so they reflect true readiness. A health endpoint should not just return “200 OK.” It should confirm the service can actually process real work, reach dependencies if necessary, and respond within your latency budget. If your app needs warmup time, don’t point health checks at the root path if that path returns success before the app is truly ready.
Use deregistration delay to let in-flight requests finish during deployment or scale-in. Use slow start when newly registered targets need time to warm up before taking full traffic volume. These features reduce disruption during deployments and scale events.
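Both behaviors are set as target group attributes. A minimal sketch, assuming a placeholder target group ARN and illustrative timeout values; the boto3 call is shown but not executed:

```python
# Sketch: tune connection draining and warmup via target group attributes.
# The ARN and timing values are placeholders to adapt to your workload.

target_group_attributes = {
    "TargetGroupArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/orders/PLACEHOLDER",
    "Attributes": [
        # Let in-flight requests finish for up to 30s before removing a target.
        {"Key": "deregistration_delay.timeout_seconds", "Value": "30"},
        # Ramp newly registered targets up to full traffic share over 60s.
        {"Key": "slow_start.duration_seconds", "Value": "60"},
    ],
}

# import boto3
# boto3.client("elbv2").modify_target_group_attributes(**target_group_attributes)
```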
Warning
Do not use an over-simplified health endpoint that always returns success. It can hide real failures and send traffic to an application that is up but not usable.
Health check design tips
- Use a dedicated endpoint such as /health or /ready.
- Make the endpoint fast and deterministic.
- Include dependency checks only if the dependency is required for service readiness.
- Set thresholds to avoid flapping from brief spikes or startup delays.
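A readiness-aware health endpoint can be sketched with nothing but the standard library. The `READY` flag and `/health` path here are illustrative assumptions; the point is that the endpoint reports 503 until the application has actually finished initializing:

```python
# Minimal readiness-aware health endpoint sketch (stdlib only).
# READY would be flipped by real startup code once caches, config,
# and required dependencies are confirmed working.
import json
from http.server import BaseHTTPRequestHandler

READY = {"started": False}

def readiness_status():
    """Return (http_status, payload). 200 only when real work is possible."""
    if not READY["started"]:
        return 503, {"status": "starting"}
    return 200, {"status": "ok"}

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        code, payload = readiness_status()
        body = json.dumps(payload).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep health-check polling out of the application logs
```

Pointing the ALB health check at this path means a booting instance stays out of rotation until it flips `READY`, instead of receiving traffic the moment the process binds its port.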
Setting Up Routing and Listener Rules
Listener rules are where ALBs become intelligent. The default action applies when no rule matches, so set it deliberately rather than leaving it vague. Common defaults include forwarding to the main production target group or returning a fixed 404 response for unexpected traffic. AWS explains listener rule behavior in its listener rules documentation.
Host-based routing is ideal when a single ALB serves multiple domains or subdomains. For example, billing.example.com can route to one service while support.example.com routes to another. Path-based routing is useful when one domain hosts multiple services, such as /api, /admin, and /docs. This pattern is common in microservices and platform teams.
ALB rules also support redirects, fixed responses, and forwarding to multiple target groups. Redirects are useful for HTTP to HTTPS enforcement or legacy URL migration. Fixed responses are clean for maintenance windows, blocked paths, or simple error handling. Weighted forwarding supports safer releases by splitting traffic across multiple target groups.
For monoliths, keep rules simple and avoid unnecessary complexity. For microservices, separate by path or host and keep ownership clear. For multi-tenant systems, host-based routing by tenant domain or path-based routing by tenant ID can isolate traffic cleanly. The rule set should be readable by the operations team at 2 a.m., not just by the developer who built it.
Practical routing examples
| Scenario | Recommended routing approach |
|---|---|
| Single web app | One listener, one default target group, HTTP to HTTPS redirect |
| Microservices | Path-based routing by service prefix |
| Multi-tenant SaaS | Host-based routing by tenant domain |
| Blue/green release | Weighted forward to old and new target groups |
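The blue/green row in the table corresponds to a weighted forward action. A minimal sketch with placeholder target group ARNs and an illustrative 90/10 split; the boto3 call is commented out:

```python
# Sketch: weighted forwarding across blue and green target groups.
# ARNs are placeholders; weights are relative shares of traffic.

weighted_action = {
    "Type": "forward",
    "ForwardConfig": {
        "TargetGroups": [
            {"TargetGroupArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/blue/PLACEHOLDER", "Weight": 90},
            {"TargetGroupArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/green/PLACEHOLDER", "Weight": 10},
        ],
    },
}

# import boto3
# boto3.client("elbv2").modify_listener(
#     ListenerArn="arn:aws:elasticloadbalancing:REGION:ACCOUNT:listener/app/example/PLACEHOLDER",
#     DefaultActions=[weighted_action],
# )
```

Shifting the release forward is then a matter of updating the weights in steps (90/10, 50/50, 0/100) while watching target health and error rates between each step.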
Securing Your Application Load Balancer
TLS termination should usually happen at the ALB. That lets you centralize certificate management and offload encryption work from backend targets. Use AWS Certificate Manager (ACM) to provision and manage certificates, then attach them to the HTTPS listener. AWS documents this workflow in its ACM overview and ALB integration guides.
Force HTTPS with a redirect from port 80 to 443. This is one of the simplest security wins you can implement. It prevents accidental plaintext access and makes browser behavior predictable. If you need strict enforcement, combine the redirect with HSTS at the application layer so clients remember to use HTTPS.
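The redirect is configured as the default action of the port 80 listener. A minimal sketch, with a placeholder load balancer ARN and the boto3 call commented out:

```python
# Sketch: HTTP listener whose only job is a permanent redirect to HTTPS.
# The load balancer ARN is a placeholder.

redirect_listener = {
    "LoadBalancerArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:loadbalancer/app/example/PLACEHOLDER",
    "Protocol": "HTTP",
    "Port": 80,
    "DefaultActions": [
        {
            "Type": "redirect",
            "RedirectConfig": {
                "Protocol": "HTTPS",
                "Port": "443",
                "StatusCode": "HTTP_301",  # permanent redirect
            },
        }
    ],
}

# import boto3
# boto3.client("elbv2").create_listener(**redirect_listener)
```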
Security groups and network ACLs help restrict traffic, but they solve different problems. Security groups are stateful and usually the primary control at the ALB and target level. Network ACLs are stateless and work at the subnet boundary. In most cases, security groups do the heavy lifting, while NACLs provide broader guardrails. For compliance-sensitive systems, that layered approach matters.
ALBs can also integrate with authentication systems such as Amazon Cognito or OIDC providers. This is useful when you want the load balancer to handle authentication before traffic reaches the app. For secure headers, use the ALB to enforce TLS and redirects, then configure the application to emit headers like HSTS, X-Content-Type-Options, and Content-Security-Policy. PCI DSS and other frameworks often expect strong transport protection and control over public-facing services; see the PCI Security Standards Council for baseline expectations.
Note
Security at the ALB is not a substitute for application security. Treat the ALB as one control layer in a larger defense-in-depth design.
Monitoring, Logging, and Troubleshooting
ALB access logs are essential when you need to understand who hit what, when, and how the request was handled. You can deliver these logs to Amazon S3 and use them for auditing, incident review, and traffic analysis. AWS documents the format and configuration in the ALB access logging guide.
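Access log entries are space-delimited with quoted fields, so they split cleanly with a quote-aware tokenizer. The sketch below names only the leading columns of the format and uses a fabricated sample line for illustration; consult the AWS log format reference for the full field list.

```python
# Sketch: parse the leading fields of an ALB access log entry.
# FIELDS covers only the first columns of the documented format;
# the sample line below is fabricated illustrative data.
import shlex

FIELDS = [
    "type", "time", "elb", "client", "target",
    "request_processing_time", "target_processing_time", "response_processing_time",
    "elb_status_code", "target_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
]

def parse_alb_log_line(line):
    """Split a space-delimited, quote-aware log line into a field dict."""
    return dict(zip(FIELDS, shlex.split(line)))

sample = (
    'http 2024-05-01T12:00:00.000000Z app/prod-orders-web-alb/abc123 '
    '203.0.113.10:54321 10.0.1.5:8080 0.001 0.012 0.000 200 200 120 512 '
    '"GET http://example.com:80/api/orders HTTP/1.1" "curl/8.4.0"'
)
entry = parse_alb_log_line(sample)
```

Aggregating `elb_status_code` and `target_processing_time` across a log window is often enough to tell whether an incident originated at the edge or behind it.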
The key CloudWatch metrics are request count, target response time, HTTP code counts, and target health state. These tell you if the ALB is receiving traffic, whether your targets are slow, and whether errors are originating at the application or the edge. A spike in 5xx responses should trigger an investigation into target health, application logs, and recent deployments.
Create CloudWatch alarms for unhealthy host counts, elevated latency, and unusual error rates. Don’t wait for users to complain. Pair alarms with operational runbooks so your team knows exactly where to look first. If the ALB is healthy but the app is failing, the issue is often behind the load balancer rather than in it.
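An unhealthy-host alarm can be sketched as a put-metric-alarm request. The dimension values, SNS topic, and thresholds below are placeholders to adapt; the boto3 call is commented out:

```python
# Sketch: alarm when any target in a group is unhealthy for 3 minutes.
# Dimension values come from the ALB and target group ARN suffixes;
# the values and SNS topic here are placeholders.

unhealthy_host_alarm = {
    "AlarmName": "prod-orders-web-alb-unhealthy-hosts",
    "Namespace": "AWS/ApplicationELB",
    "MetricName": "UnHealthyHostCount",
    "Statistic": "Maximum",
    "Dimensions": [
        {"Name": "LoadBalancer", "Value": "app/prod-orders-web-alb/abc123"},
        {"Name": "TargetGroup", "Value": "targetgroup/orders/def456"},
    ],
    "Period": 60,
    "EvaluationPeriods": 3,  # require 3 consecutive bad minutes to avoid flapping
    "Threshold": 0,
    "ComparisonOperator": "GreaterThanThreshold",
    "AlarmActions": ["arn:aws:sns:REGION:ACCOUNT:ops-alerts"],
}

# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**unhealthy_host_alarm)
```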
A systematic debugging workflow works best. Start with target health. Then inspect listener rules. Then check security groups and NACLs. Finally, read the application and access logs side by side. This removes guesswork and helps you isolate whether the problem is routing, networking, or the service itself.
“Most load balancer incidents are not load balancer problems. They are health check, routing, or application readiness problems that only show up at the ALB layer.”
Common failure patterns
- Targets return 5xx because the app cannot connect to a database.
- Health checks fail because the path is wrong or the port is closed.
- Requests hit the wrong service because a listener rule is too broad.
- HTTPS fails because the certificate does not match the host name.
Optimizing Performance and Cost
ALBs scale automatically, so you do not need to pre-provision capacity the way you might with older architectures. That said, scale does not remove the need for design discipline. AWS handles the load balancer scaling layer, but your targets, health checks, and routing rules still determine end-user experience during traffic spikes.
Reduce latency by placing targets close to the ALB in the same region and, where possible, in the same AZ path used by the request. Keep health checks efficient and avoid expensive database calls. Simplify routing rules so every request does not have to evaluate a long chain of conditions. Small inefficiencies compound under load.
For cost control, consolidate services behind one ALB when the security model and ownership model allow it. That can reduce the number of load balancers you pay for and simplify certificate management. Use multiple ALBs only when separation is worth the extra cost and operational overhead.
According to AWS pricing, ALB charges combine an hourly rate with a usage-based Load Balancer Capacity Unit (LCU) component, while other load balancing options have different pricing and feature tradeoffs. If you need Layer 4 behavior or static IP use cases, a different load balancer may be more appropriate. For general web traffic, ALB usually gives the best balance of features and operational simplicity.
Pro Tip
Test spike behavior before production rollout. Even though ALBs scale automatically, bad health checks or weak targets can still create a user-visible outage during a traffic burst.
Ways to validate performance
- Run pre-production load tests with realistic concurrency.
- Measure target response time before and after rule changes.
- Watch for slow-start behavior after deployment.
- Compare consolidated versus split ALB designs on cost and ops overhead.
Using ALBs with Containers and Serverless
ALBs are a natural fit for containers because service discovery changes constantly. In Amazon ECS and Amazon EKS, tasks and pods scale up and down frequently, and IP-based registration is often the cleanest way to keep the ALB in sync with running workloads. AWS supports this pattern through service integrations that register and deregister targets automatically.
For ECS, an ALB commonly fronts one or more services and forwards to IP or instance target groups depending on the network mode. For EKS, ingress controllers and ALB integrations can route traffic to Kubernetes services and pods without manual target maintenance. That makes the ALB part of the delivery pipeline rather than a static network appliance.
Blue/green and rolling deployments work well with containers. A new task set or deployment group can receive a portion of traffic, be checked for health, and then take over gradually. This is safer than replacing all traffic at once. For Lambda, ALB target groups let HTTP requests trigger functions directly, which is useful for lightweight APIs, event-style endpoints, or short-lived backend logic. The AWS Lambda and ALB documentation explains the integration model.
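A Lambda function behind an ALB target group receives an HTTP-style event and must return a status/headers/body response dict. A minimal sketch, assuming a hypothetical `/ping` route; the local invocation at the end simulates the ALB event shape:

```python
# Sketch: Lambda handler shaped for an ALB target group.
# The /ping route is a hypothetical example endpoint.
import json

def handler(event, context):
    """Return an ALB-compatible response for a simple HTTP-style event."""
    path = event.get("path", "/")
    if path == "/ping":
        code, body = 200, {"message": "pong"}
    else:
        code, body = 404, {"error": "not found"}
    return {
        "statusCode": code,
        "statusDescription": f"{code} " + ("OK" if code == 200 else "Not Found"),
        "isBase64Encoded": False,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }

# Simulate the ALB invocation locally with a minimal event.
response = handler({"httpMethod": "GET", "path": "/ping"}, None)
```

Because the ALB expects this exact response envelope, a handler that returns a bare string or omits `statusCode` will surface as a 502 at the load balancer even though the function itself succeeded.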
Operationally, pay attention to port mapping, service health, and autoscaling thresholds. A container may be healthy from the orchestrator’s perspective but still fail the ALB health check if the port is wrong or the readiness endpoint is too strict. That mismatch is a common source of confusion in containerized environments.
Best Practices and Design Patterns
Use naming and tagging standards from the beginning. Include environment, application, and purpose in the ALB name, and tag resources with owner, cost center, and application tier. This makes cleanup, chargeback, and incident response much easier. It also helps when multiple teams share the same AWS account structure.
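As a sketch of that tagging standard applied through the API, with a placeholder ARN and example tag values, and the boto3 call left commented out:

```python
# Sketch: apply the owner/cost-center/environment tagging standard to an ALB.
# The ARN and tag values are placeholders to replace with your standard.

tagging_request = {
    "ResourceArns": [
        "arn:aws:elasticloadbalancing:REGION:ACCOUNT:loadbalancer/app/prod-orders-web-alb/abc123",
    ],
    "Tags": [
        {"Key": "Environment", "Value": "prod"},
        {"Key": "Application", "Value": "orders"},
        {"Key": "Owner", "Value": "platform-team"},
        {"Key": "CostCenter", "Value": "cc-1234"},
    ],
}

# import boto3
# boto3.client("elbv2").add_tags(**tagging_request)
```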
Separate dev, staging, and production when the change cadence or risk profile differs. Development traffic should not affect production listener rules, certificates, or access logs. Staging should mirror production as closely as practical, especially for certificate validation, health checks, and routing behavior. If the environments diverge too much, your testing loses value.
Use multiple target groups and weighted routing when releases need caution. Weighted forwarding lets you compare old and new versions under real traffic before full cutover. That is a practical way to reduce release risk without building a separate traffic management system. Centralize load balancing when teams share standards and governance. Decentralize when autonomy and fault isolation matter more.
Automation should be the default. Use CloudFormation or Terraform to define ALBs, listeners, target groups, and rules as code. That creates repeatability and peer review. For governance, align with the AWS Well-Architected Reliability Pillar and your internal change control process so load balancer changes are traceable.
Design patterns worth standardizing
- One public ALB per major application or domain family.
- One internal ALB for east-west service access where appropriate.
- Blue/green target groups for production releases.
- Infrastructure as code for every listener, rule, and certificate attachment.
Conclusion
Application Load Balancers solve a specific problem very well: routing HTTP and HTTPS traffic based on application context. If you plan the architecture carefully, create the ALB with the right subnets and security groups, configure target groups and health checks correctly, and keep routing rules simple enough to operate, you get a reliable traffic layer that scales with your application.
That reliability matters because the ALB sits in the path of your users, APIs, and services. A good design improves availability, reduces deployment risk, and makes troubleshooting far easier. A bad design creates silent failure modes, confusing health checks, and unnecessary cost. The difference is usually in the planning and the operational discipline, not in the AWS console clicks.
For production workloads, the pattern is clear: secure the listener with ACM, redirect HTTP to HTTPS, monitor CloudWatch metrics and access logs, and use weighted routing or slow start when changing live traffic. If you run containers or serverless functions, let the ALB support the delivery model instead of fighting it. Keep the system observable, documented, and automated.
Vision Training Systems helps IT professionals build these skills with practical cloud training focused on real deployment work. If your team is standardizing AWS networking or preparing for production migrations, use this guide as your baseline and turn it into your internal ALB runbook. The best time to fix an ALB design is before users notice the problem.