How Disney+Hotstar Defends Millions of Requests per Second
While API gateways offer many security features, they themselves must be secured. Find out how Disney+Hotstar fortifies its API gateway.
Finally, we get to see ‘elite level cricket’ in the US.
In 1844, the St. George’s Cricket Club in New York hosted the first international game of cricket between the United States and Canada.
Now, 180 years later, things have come full circle. We just saw the US beat Canada in the ICC Men’s T20 World Cup in Dallas. And we are not surprised to see the demand for sports streaming skyrocketing.
Meanwhile, the OTT streaming industry is struggling because of streaming rights, regional constraints, blackouts, seasonality of sports, and other exclusivity caveats.
Remember the 2022 FIFA World Cup? The crashing apps? The buffering streams? Well, poor scalability and performance issues are not the only tech blunders that plague streaming giants.
Authentication exploits can be pretty serious as well, resulting in financial losses, availability issues, and a negative impact on the user experience.
API gateways provide an important layer of security for APIs. But streaming services like Netflix and Disney+Hotstar have complex microservices architectures that require additional security measures to fortify the API gateway itself.
They need to properly authenticate requests to origin APIs while also preventing attackers from exploiting them, and that was one of the key challenges Disney+Hotstar faced.
So, how does Disney+Hotstar defend millions of requests per second against potential exploits? We’ll soon find out.
First, let’s look at the infrastructure that Disney+Hotstar employs to handle watch events on their platform.
Source: Techahead
Disney+Hotstar utilizes AWS for hosting, using EC2 instances for almost all traffic and S3 as their data store. They employ a mix of on-demand and spot instances, optimizing costs with ML algorithms, and process large volumes of data using AWS EMR clusters.
Their infrastructure, geared for events like IPL matches, includes 500+ CPU-intensive instances (c4.4xlarge/c4.8xlarge) with a combined 16 TB of RAM and 8,000 CPU cores.
When a user sends a watch event from their device, it is first received by Kafka, which streams the events. The event is processed and saved to data stores, including Elasticsearch for powerful search capabilities. Redis acts as a high-speed buffer for reading and writing data, ensuring swift responses.
The API server handles these buffered responses and interfaces with the API gateway to deliver back the required information to the user’s device, ensuring seamless real-time streaming.
To facilitate secure and efficient user access to the streaming platform, Disney+Hotstar's architecture incorporates a robust authentication mechanism that complements the event-driven infrastructure responsible for handling watch events.
Old architecture for user authentication
The old architecture of Disney+Hotstar involved using JWT tokens to authorize client requests. Clients sent requests with JWT tokens, which then passed through an API proxy and a load balancer before reaching the origin services. Each origin service independently authenticated the JWT token using a dedicated in-house token library.
Source: Disney+ Hotstar
Limitations of this architecture
Distributed authentication burden: Each client-facing service was responsible for handling authentication, requiring a deep understanding of the authentication process. This setup posed a security risk as it increased the chances of errors. Additionally, sharing token secrets with multiple services violated the "Least Privilege" principle, expanding the attack surface and potential for security breaches.
Version management issues: Different services were running varying versions of the in-house token SDK, leading to inconsistencies. This disparity caused complications in rolling out updates, such as key rotations and other security enhancements, which could lead to potential vulnerabilities or service disruptions.
Inconsistent error handling: Unauthenticated requests often resulted in inconsistent error responses across different services. This inconsistency made it challenging to maintain a uniform interface and business contract between clients and services, potentially leading to a poor user experience and difficulty in debugging issues.
New architecture with centralized gateway authentication
The new architecture consolidates all authentication tasks into a dedicated Gateway AuthService and integrates authentication checks as plugins in the Emissary API Gateway for enhanced flexibility and security.
A quick look at the Emissary API Gateway
The Emissary API Gateway is a Kubernetes-native API Gateway built on the Envoy Proxy that evaluates incoming client API requests and routes them to the appropriate backend APIs through Mappings and Listeners, providing robust security and authentication features. It is designed to be a self-service API Gateway that empowers developers to manage day-to-day configurations while allowing cluster/platform operators to manage global configuration, such as authentication and security resources.
Some key changes this new architecture brought
Centralized authentication: Introduction of the Gateway AuthService which handles all authentication requests. This service performs token validation, silent refresh, and envelope generation, ensuring consistent authentication across all services.
API gateway integration: Implementing authentication checks as plugins in the Emissary API Gateway. These plugins are invoked based on specific criteria in the request path, allowing for tailored authentication measures for different APIs.
Token authentication: Gateway AuthService conducts basic JWT token authentication by verifying the token signature, expiration, and other relevant data.
3rd party auth integration: Adoption of pluggable authentication for third-party requests, enabling flexible and customizable authentication processes.
Silent token refresh: Token lifecycle management is centralized. The Gateway Auth Service manages the silent refresh of tokens without client intervention, ensuring uninterrupted access to APIs and enhancing user experience.
User session identity: To streamline user session management and avoid passing user tokens across multiple services, a new identity structure known as "Envelope" is introduced. This envelope is generated once at the Gateway, enriched with common data, and can be consumed by all origin services in the request chain, simplifying validation and data access.
Consistent and enriched responses: The introduction of the envelope ensures consistent unauthenticated error responses and enriched payload data. Origin services rely on the pre-enriched envelope data for authentication, reducing the likelihood of errors and inconsistencies.
These improvements ensure that Hotstar can safely manage user identity tokens, maintain robust security, and protect origin services from invalid requests while providing a flexible and extensible authentication framework.
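The basic token check described above (signature, expiration) can be sketched as follows. This is a minimal illustration using only the Python standard library, not Disney+Hotstar's implementation: the HS256 algorithm, the shared secret, and the function names `sign_token` and `validate_token` are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(raw: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Issue an HS256 token (hypothetical helper, here only so the demo is self-contained)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def validate_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if signature and expiry are valid; raise ValueError otherwise."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(_b64url_decode(header_b64))
    # Pin the algorithm so a token cannot downgrade itself (e.g. to "none").
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or current >= exp:
        raise ValueError("token expired or missing exp claim")
    return claims
```

Doing this once at the gateway means the secret lives in one service instead of being shared with every origin service, which is the "Least Privilege" concern raised about the old architecture.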
How Disney+Hotstar solved major challenges with the Gateway Authentication solution
1. User session identity (Envelope)
To securely handle user identity, Disney+ Hotstar introduced a new identity structure called “Envelope.” It replaces the traditional JWT token method, which can be fragile. The Envelope, modeled as a Protocol Buffer, carries user identity info and can be enriched with additional data as needed. It’s valid only for the duration of a client request and passed between internal services using a dedicated SDK.
2. Centralized data enrichment
Disney+ Hotstar enriches user data at the edge when generating the Envelope. For instance, “User Cohorts” categorizes users with similar behavior for targeted experiences. Users who prefer certain sports get notifications when relevant events are live. Similarly, users with expiring subscriptions receive reminders.
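To make the Envelope and edge enrichment ideas concrete, here is a rough sketch. The real Envelope is a Protocol Buffer whose schema is not shown in this article, so a frozen Python dataclass stands in for it; every field name, claim key, and cohort rule below is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Envelope:
    """Stand-in for the Protocol Buffer message; field names are assumptions."""
    user_id: str
    session_id: str
    cohorts: Tuple[str, ...] = ()


# A cohort rule inspects a user profile and returns a cohort label, or None.
CohortRule = Callable[[dict], Optional[str]]


def sports_fan(profile: dict) -> Optional[str]:
    # Hypothetical rule: users with a favourite sport can get live-event notifications.
    sport = profile.get("favourite_sport")
    return f"fan:{sport}" if sport else None


def expiring_subscription(profile: dict) -> Optional[str]:
    # Hypothetical rule: flag subscriptions that end within a week for reminders.
    days = profile.get("days_to_expiry")
    return "subscription-expiring" if days is not None and days <= 7 else None


def build_envelope(claims: dict, profile: dict, rules: List[CohortRule]) -> Envelope:
    """Build the Envelope once at the gateway from already validated token claims."""
    cohorts = []
    for rule in rules:
        label = rule(profile)
        if label is not None:
            cohorts.append(label)
    return Envelope(
        user_id=claims["sub"],
        session_id=claims["sid"],
        cohorts=tuple(cohorts),
    )


envelope = build_envelope(
    {"sub": "user-1", "sid": "session-9"},
    {"favourite_sport": "cricket", "days_to_expiry": 3},
    [sports_fan, expiring_subscription],
)
print(envelope.cohorts)
```

Making the stand-in immutable mirrors the idea that the Envelope is generated once per request and then only read by the origin services downstream.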
3. Force session block and token refresh
To manage security and update user access, Hotstar blocks or refreshes tokens under specific conditions (e.g., logout, flagged malicious activity, or subscription changes). Instead of using costly methods like Redis checks or local caching for every request, they implemented a Bloom Filter. This space-efficient, probabilistic data structure checks if a session is likely blocked.
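A Bloom filter can answer "definitely not in the set" or "possibly in the set"; it can produce false positives but never false negatives. The sketch below shows the general technique in plain Python. The sizes, the double-hashing scheme, and the follow-up `authoritative_lookup` step for "possibly blocked" sessions are assumptions for illustration, not details confirmed by Disney+Hotstar.

```python
import hashlib
from typing import Callable, Iterator


class BloomFilter:
    """Fixed-size Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step so positions vary
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def is_session_blocked(
    session_id: str,
    bloom: BloomFilter,
    authoritative_lookup: Callable[[str], bool],
) -> bool:
    """Skip the expensive lookup for the common case of a session that is not blocked."""
    if not bloom.might_contain(session_id):
        return False
    # Possible false positive: confirm against the source of truth (assumed step).
    return authoritative_lookup(session_id)


blocked_sessions = {"session-logged-out"}
bloom = BloomFilter()
for sid in blocked_sessions:
    bloom.add(sid)

print(is_session_blocked("session-logged-out", bloom, blocked_sessions.__contains__))
```

Because most sessions are not blocked, most requests are answered from a few bit checks in memory, and only the rare "possibly blocked" case pays for a slower lookup.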
That’s how Disney+Hotstar built a new-age authentication system that handles user token validation, token refreshes, user logouts, subscription changes, user data enrichment in the Envelope, and security features at the gateway.
Disney+Hotstar’s ability to handle millions of requests per second relies on a robust and scalable architecture. This architecture utilizes cloud-based services like Amazon API Gateway to efficiently manage and route incoming API requests.
How Amazon API Gateway works
Source: Amazon API Gateway
Amazon API Gateway acts as a front door for applications to access data, business logic, or functionality from backend services.
Clients such as connected users, web/mobile applications, IoT devices, and private applications make requests that are proxied through API Gateway.
API Gateway can cache responses and is monitored by Amazon CloudWatch for operational insights.
The requests are then routed to various AWS services like AWS Lambda, Amazon EC2, Amazon Kinesis, and Amazon DynamoDB or other public and private endpoints, enabling secure and scalable API interactions.
Amazon API Gateway offers efficient API development, performance at any scale, cost savings through tiered pricing, support for RESTful APIs, easy monitoring with CloudWatch integration, and many other advantages.
By centralizing API authentication at the gateway, as Disney+Hotstar did, you reduce the risk that comes with each microservice independently managing access, token verification, and other parts of the authentication process.
To further enhance API gateway security, you can also implement rate limiting, conduct continuous monitoring, remove unused APIs, deploy a web application firewall, and use behavioral analytics to detect and respond to potential threats.
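As one example of the measures listed above, rate limiting is often implemented with a token bucket. The following is a generic sketch of that technique, not something taken from Disney+Hotstar's or Amazon's implementation; the class name and parameters are illustrative.

```python
import time
from typing import Optional


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: float, now: Optional[float] = None) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request fits, else False."""
        current = time.monotonic() if now is None else now
        elapsed = max(0.0, current - self.last)
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = current
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per client key (for example, per API key or per IP address).
bucket = TokenBucket(rate_per_sec=1, capacity=2, now=0.0)
decisions = [bucket.allow(now=0.0) for _ in range(3)]
print(decisions)
```

The `now` parameter exists only to make the behavior deterministic in examples and tests; in a running service you would let it default to the monotonic clock.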