Kafka Event Stream Modeling

Last updated November 30, 2022

Table of Contents

  • Core Apache Kafka concepts
  • Considerations to balance
  • Modeling to support your product logic
  • Further reading

Apache Kafka on Heroku is a powerful tool for creating modern application architectures, and for dealing with high-throughput event streams. Moving to a world of streaming event data, though, isn’t as simple as switching out the relational database that your ORM interacts with. To get the most out of streaming data, you must tune both your data and your Kafka configuration to support your product’s logic and needs.

Core Apache Kafka concepts

As covered in the Apache Kafka on Heroku article, a number of core concepts are critical for understanding and tuning Apache Kafka on Heroku. The two that matter most for this discussion are topics and partitions.

Topics are the primary channel- or stream-like construct in Kafka, representing a type of event, much like a table would represent a type of record in a relational data store.

Topics are composed of some number of partitions. Each partition contains a discrete subset of the events (or messages, in Kafka parlance) belonging to a given topic.

Tuning the number and usage of these partitions is central to adapting Kafka to your product and to balancing ordering, parallelism, and resilience concerns.
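
To make this concrete, here is a minimal sketch, using the kafka-python client, of creating a topic with an explicit partition count. The broker address, topic name, and counts are illustrative placeholders, and the SSL configuration that Heroku Kafka requires is omitted; on Heroku you'd typically create topics with the heroku kafka:topics:create CLI command or the dashboard instead.

```python
from kafka.admin import KafkaAdminClient, NewTopic

# Placeholder broker address; Heroku Kafka also requires SSL credentials.
admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

# The partition count bounds consumer parallelism; the replication factor
# provides resilience to broker failure.
admin.create_topics([
    NewTopic(name="session-events", num_partitions=12, replication_factor=3)
])
```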

Considerations to balance

The following are the key properties to balance when evaluating the partition structure for use with a given topic.

Message ordering

Messages within a given partition are strictly ordered, but this ordering isn’t guaranteed across partitions.

Consumer group parallelism

A consumer group can have as many parallel consumers of a topic as there are partitions in the topic.
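
As a sketch of what this looks like in practice with the kafka-python client (topic, group, and broker address are placeholders, and Heroku Kafka's SSL settings are omitted): run at most one copy of this process per partition, and the group coordinator assigns each member a disjoint subset of the topic's partitions.

```python
from kafka import KafkaConsumer

# Every process started with the same group_id joins one consumer group.
consumer = KafkaConsumer(
    "session-events",              # placeholder topic name
    group_id="session-analytics",  # placeholder consumer group id
    bootstrap_servers="localhost:9092",
)

for message in consumer:
    # Ordering is guaranteed only within message.partition.
    print(message.partition, message.offset, message.key, message.value)
```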

Resource utilization

A high partition count can increase resource utilization and lengthen the time it takes to recover or re-elect leaders after a broker failure.

Custom partition functions

Producers can apply arbitrary logic when assigning messages to the partitions within a topic: basic key hashing for even distribution, or custom logic that maintains the ordering and throughput semantics a given product needs.
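
The kafka-python client, for example, exposes this hook as a partitioner callable that receives the serialized key and the topic's partition lists. The sketch below assumes that client; the hashing scheme, names, and broker address are illustrative, not a prescribed approach.

```python
import hashlib
import random

from kafka import KafkaProducer

def sticky_key_partitioner(key_bytes, all_partitions, available_partitions):
    # Messages without a key are spread across the available partitions.
    if key_bytes is None:
        return random.choice(available_partitions or all_partitions)
    # A stable hash sends a given key to the same partition every time,
    # preserving per-key ordering while distributing keys evenly.
    digest = hashlib.md5(key_bytes).digest()
    return all_partitions[int.from_bytes(digest[:4], "big") % len(all_partitions)]

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # placeholder broker address
    partitioner=sticky_key_partitioner,
)
producer.send("session-events", key=b"account-42", value=b'{"event": "login"}')
producer.flush()
```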

While not exhaustive, weighing these attributes provides a strong basis for designing your topic’s partition structure.

Modeling to support your product logic

If strict ordering of your events isn’t paramount, but you require extremely high parallelism for throughput, it makes sense to choose a partition count high enough to serve a scaled-out consumer group, but low enough to avoid imposing undue burden on the cluster.

If strict ordering is important in your product’s logic, be clear about the domain within which that ordering matters. For instance, is ordering required globally, across all changes to all state? Or is it only required for changes related to a given user or account? Over what time period does ordering matter? It’s often reasonable to build a compound key based on the attributes that matter for ordering, and to consistently hash messages to partitions based on those keys. For instance, partitioning by the combination of user_id and session_id provides strict ordering of events within a given user’s session, but doesn’t maintain ordering across sessions or users.
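
Under those assumptions, the default hashing partitioner already gives you this behavior once the compound key is built; in this kafka-python sketch, the topic and field names are hypothetical.

```python
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # placeholder broker address
    key_serializer=str.encode,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def publish_session_event(user_id, session_id, event):
    # Events sharing a (user_id, session_id) compound key hash to the same
    # partition and are therefore strictly ordered relative to one another.
    # Events for other sessions or users may land on other partitions.
    producer.send("session-events", key=f"{user_id}:{session_id}", value=event)

publish_session_event("user-123", "sess-9", {"action": "page_view", "path": "/pricing"})
producer.flush()
```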

Further reading

The following are excellent resources from the broader Kafka community that can be useful in optimizing the way your partitions are modeled for your application’s needs.

  • How to choose the number of topics/partitions in a Kafka cluster? by Confluent
  • Message delivery semantics in the core Apache Kafka documentation
  • Best Practices for Running Kafka Connectors on Heroku

Keep reading

  • Apache Kafka on Heroku
