Finding Kafka’s throughput limit in Dropbox infrastructure

Finding Kafka’s throughput limit in Dropbox infrastructure

Understanding Kafka’s throughput limit in Dropbox infrastructure is crucial in making proper provisioning decision for different use cases, and this has been an important goal for the team. There is a rich set of factors that can affect a Kafka cluster’s workload: number of producers, number of consumer groups, initial consumer offsets, message per second, size of each message, and the number of topics and partitions involved, to name a few. Each point in that multidimensional space corresponds to a unique traffic pattern that can be applied to Kafka, and it’s represented by a vector of parameters: .

Source: blogs.dropbox.com