
Securing Kafka® Infrastructure at Uber

April 7, 2022 / Global
# Filename: $JAVA_HOME/conf/security/java.security
…
security.provider.9=sun.security.smartcardio.SunPCSC
# Here 10 is the position of the UPKIProvider in the list of providers available with the JDK
security.provider.10=com.uber.kafka.UPKIProvider

# Key and Trust manager algorithms for fetching keys and certs
ssl.keymanager.algorithm=UPKI
ssl.trustmanager.algorithm=UPKI
# Inject custom UPKIProvider to retrieve key and certs from uPKI
security.providers=com.uber.kafka.security.provider.UPKIProvider
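The settings above rely on the standard Java security provider lookup: once a provider that advertises an algorithm named `UPKI` is registered, a call such as `KeyManagerFactory.getInstance("UPKI")` resolves to that provider's implementation. The following is a minimal sketch of that mechanism, not Uber's UPKIProvider. The class names `DemoUpkiProviderSketch`, `DemoProvider`, and `DemoKeyManagerFactory` and the provider name `DemoUPKI` are hypothetical, the factory returns no key managers, and the `Provider` constructor used here assumes Java 9 or later. It registers the provider programmatically with `Security.addProvider` rather than through `java.security`.

```java
import java.security.KeyStore;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;
import javax.net.ssl.KeyManager;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.KeyManagerFactorySpi;
import javax.net.ssl.ManagerFactoryParameters;

// Hypothetical stand-in for a custom provider; every name here is illustrative.
final class DemoUpkiProviderSketch {

    /** A JCA provider that advertises a KeyManagerFactory under the algorithm name "UPKI". */
    public static final class DemoProvider extends Provider {
        public DemoProvider() {
            super("DemoUPKI", "1.0", "Sketch: serves KeyManagerFactory.UPKI");
            // Maps "<engine type>.<algorithm>" to the implementing SPI class.
            put("KeyManagerFactory.UPKI", DemoKeyManagerFactory.class.getName());
        }
    }

    /** Placeholder SPI; a real one would return key managers backed by fetched keys and certs. */
    public static final class DemoKeyManagerFactory extends KeyManagerFactorySpi {
        @Override protected void engineInit(KeyStore keyStore, char[] password) { }
        @Override protected void engineInit(ManagerFactoryParameters params) { }
        @Override protected KeyManager[] engineGetKeyManagers() { return new KeyManager[0]; }
    }

    /** Registers the provider, then resolves "UPKI" through the standard lookup. */
    static String resolveKeyManagerProvider() {
        // addProvider appends at the lowest priority; it is a no-op if already installed.
        Security.addProvider(new DemoProvider());
        try {
            return KeyManagerFactory.getInstance("UPKI").getProvider().getName();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("UPKI algorithm not registered", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveKeyManagerProvider());
    }
}
```

A `TrustManagerFactory.UPKI` entry would be registered the same way with a `TrustManagerFactorySpi` subclass. The sketch only demonstrates name resolution; how keys and certificates are actually obtained from uPKI is outside what it shows.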
Figure 1: uPKI Identity provisioning and initial identity fetch on service launch
Figure 2: Rotating workload identities
Actor: Generic term for an entity that is the subject of an authorization decision. In the Kafka example, the Actor value is fetched during the authentication stage, after a successful TLS handshake. Each Actor is identified by an Actor ID; for example, for a service ‘TripService’, the identity provided to it could be `spiffe://upki/tripService`, which acts as the Actor ID. Functionally, this is the equivalent of a java.security.Principal.
Resource: A resource consists of a domain and a value on which authorization is enforced. For example, for a Kafka topic ‘trips’, the resource is named `urn:uber:infra:kafka:topics:trips`, where `urn:uber:infra:kafka:topics` is the domain and ‘trips’ is the value. Other Kafka domains could be `urn:uber:infra:kafka:clusters` and `urn:uber:infra:kafka:groups`, to enforce authorization on clusters and consumer groups respectively.
Operation: The action that the Actor is attempting on the Resource. When registering a resource domain with Charter, one can provide a list of permissible operations. For example, for the domain `topics.kafka://`, the available operations are `write`, `read`, `alter`, `delete`, and `describe`.
{"actor": "spiffe://upki/tripService",
 "resource": "urn:uber:infra:kafka:topics:trips",
 "operation": "write"}
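To make the Actor / Resource / Operation tuple concrete, here is a minimal sketch of a policy check over such tuples. It is a hypothetical in-memory stand-in, not Charter's API: the class `AuthorizationSketch`, its methods, and the deny-by-default behavior are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory policy store; names and behavior are illustrative only.
final class AuthorizationSketch {
    // Maps "resource|operation" to the set of Actor IDs granted that operation.
    private final Map<String, Set<String>> grants = new HashMap<>();

    /** Joins a resource domain and a value into a full resource name. */
    static String resourceName(String domain, String value) {
        return domain + ":" + value;
    }

    /** Records that the actor may perform the operation on the resource. */
    void allow(String actor, String resource, String operation) {
        grants.computeIfAbsent(key(resource, operation), k -> new HashSet<>()).add(actor);
    }

    /** Assumed deny-by-default: only an explicit grant for the exact tuple returns true. */
    boolean isAllowed(String actor, String resource, String operation) {
        return grants.getOrDefault(key(resource, operation), Collections.emptySet())
                     .contains(actor);
    }

    private static String key(String resource, String operation) {
        return resource + "|" + operation;
    }

    public static void main(String[] args) {
        AuthorizationSketch policy = new AuthorizationSketch();
        String trips = resourceName("urn:uber:infra:kafka:topics", "trips");
        policy.allow("spiffe://upki/tripService", trips, "write");

        System.out.println(policy.isAllowed("spiffe://upki/tripService", trips, "write"));
        System.out.println(policy.isAllowed("spiffe://upki/tripService", trips, "read"));
    }
}
```

In this sketch the `write` check succeeds because of the explicit grant, while the `read` check fails because no grant exists for that tuple. A production authorizer would typically also cache decisions and handle lookup failures, which this example leaves out.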
Figure 3: Sequence diagram showing how a Producer sends a message to a Kafka cluster
Figure 4: Deep dive into how UPKIProvider fetches keys and certs from uPKI and furnishes them to the JVM
Figure 5: Authorization workflow
Figure 6: Two-way Authorizer lookup with “allow.everyone.if.no.acl.found=true”
Figure 7: Latency Improvements seen with JDK11
Prateek Agarwal

Prateek Agarwal is a Staff Software Engineer on Uber’s Streaming Data Team. He is passionate about distributed systems, security, and automation. He has been working on highly available, fault-resilient streaming systems, including core Kafka, Zookeeper, and Kafka ecosystem services.

Ryan Turner

Ryan Turner is a Staff Software Engineer leading Platform Authentication and Kubernetes Security initiatives and a maintainer of the SPIRE project.

KK Sriramadhesikan

KK Sriramadhesikan is a Senior Staff Security Engineer on Uber’s Security Engineering team. He leads secure authentication and authorization across Uber’s security infrastructure.

Posted by Prateek Agarwal, Ryan Turner, KK Sriramadhesikan