What is Apache Kafka | Kafka Streaming API | RapidAPI (2022)

Table of Contents

  • What is Kafka (Apache Kafka)?
    • How does it work?
      • Example
      • Kafka Infrastructure
  • What is Event Streaming?
    • What is Kafka Streams API?
    • What Can I Use It For?
  • Kafka Use Cases
  • How to Use Kafka Types on RapidAPI
    • 1. Navigate to the Demo Kafka API
    • 2. Inspect Dashboard Panels
    • 3. Review the Schema and Types For the Products Topic
    • 4. Consume and Produce events in the Dashboard
  • Conclusion
    • What is Kafka stream processing?
    • Is Kafka an API?
    • What is Kafka used for?
    • What are topics in Kafka?
    • What is Kafka in simple words?
    • Footnotes

What is Kafka (Apache Kafka)?

Apache Kafka is an open-source distributed event streaming platform

kafka.apache.org

Kafka was developed at LinkedIn in the early 2010s. The software was soon open-sourced, put through the Apache Incubator, and has grown in use. The platform’s website claims that over 80% of Fortune 100 companies use or trust Apache Kafka [1].

The “Apache” part of the name is drawn from the Apache Software Foundation. This foundation is a community of developers that work on producing and maintaining open-source software projects.

Furthermore, the platform was born out of the deconstruction of monolithic application architectures [2], creating an ecosystem of Producers, Consumers, Topics, Logs, Events, Connectors, Clients, and Servers. All of these terms name actors in the system that supports event streaming.

This article will briefly discuss how Apache Kafka works, its use cases, and relevant terminology. After that, I’ll walk you through how to discover and test Kafka Topics with RapidAPI easily.

How does it work?

There’s a helpful paradigm shift that has to occur when understanding Kafka after working with REST APIs. Kafka is focused on Events, not things [3].

Theoretically, during the operations of an application, an Event may occur. That Event contains data describing what happened. Subsequently, the Event is emitted and stored in a Log. The Log is a series of Events ordered by the time they occurred. Each Log is defined as a Topic. Therefore, each Event is defined by its Topic, timestamp, and data [4]. Finally, other devices can subscribe to the Topic and retrieve data.
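To make the vocabulary concrete, here is a minimal sketch of an Event and a Topic as an append-only Log. It is a toy in-memory model, not Kafka's implementation or client API; the class names, the `orders` topic, and the sample values are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional
import time


@dataclass(frozen=True)
class Event:
    """One immutable record: where it belongs, when it happened, what happened."""
    topic: str
    timestamp: float
    data: dict


@dataclass
class Topic:
    """A named, append-only log of events (toy model, not Kafka's implementation)."""
    name: str
    log: list = field(default_factory=list)

    def append(self, data: dict, timestamp: Optional[float] = None) -> int:
        """Store a new event at the end of the log and return its offset."""
        ts = time.time() if timestamp is None else timestamp
        self.log.append(Event(self.name, ts, data))
        return len(self.log) - 1

    def read(self, offset: int = 0) -> list:
        """Return every event from the given offset onward, oldest first."""
        return self.log[offset:]


# Hypothetical topic and values, used only to exercise the model.
orders = Topic("orders")
orders.append({"item": "harmonica", "quantity": 1}, timestamp=1.0)
orders.append({"item": "guitar", "quantity": 2}, timestamp=2.0)
```

Events are never changed once appended; a reader only chooses the offset from which it wants to start.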

This is only part of the picture. Next, we need to discuss Producers and Consumers. Whenever a Client application or device creates Events, it is known as a Producer. Intuitively, applications or devices that subscribe to Topics (logs of events) are Consumers.

Example

On your phone, you have a stock market application, and you’re following the daily price of Amazon (AMZN). The application on your phone (Consumer) displays the current price for AMZN (the Topic). Your phone is constantly updating the price (reading Events), giving you real-time estimates.

How are you able to get real-time price updates? A stockbroking server emits price changes (Producer), and those changes are logged in a Topic stored in an Apache Kafka server cluster. The mobile application, in this example, reads the Topic and updates the price.

This is a simple example and does not encompass all the possible use cases for Kafka, but it helps visualize the terminology.
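The same example can be sketched in a few lines. This is a toy in-memory stand-in, not a Kafka client: the function and class names are hypothetical, and the prices are invented values.

```python
# In-memory stand-in for a Kafka Topic: an append-only list of price events.
amzn_topic = []


def produce_price(log, price, timestamp):
    """Producer side: the stockbroking server appends a price-change event."""
    log.append({"symbol": "AMZN", "price": price, "timestamp": timestamp})


class PriceConsumer:
    """Consumer side: the phone app remembers how far it has read."""

    def __init__(self, log):
        self.log = log
        self.offset = 0            # position of the next unread event
        self.current_price = None  # latest price shown to the user

    def poll(self):
        """Read any new events and keep the latest price for display."""
        new_events = self.log[self.offset:]
        self.offset = len(self.log)
        if new_events:
            self.current_price = new_events[-1]["price"]
        return new_events


app = PriceConsumer(amzn_topic)
produce_price(amzn_topic, 101.50, timestamp=1)
produce_price(amzn_topic, 101.75, timestamp=2)
first_batch = app.poll()   # both events are new to the app
produce_price(amzn_topic, 102.10, timestamp=3)
second_batch = app.poll()  # only the event added since the last poll
```

Note that the Producer never talks to the Consumer directly; both only touch the Topic, which is what keeps them decoupled.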

Kafka Infrastructure

It’s easy to imagine Apache Kafka working in the abstract. However, what does the technology stack behind the platform look like?

Apache Kafka is deployed as a cluster of servers. The servers work together, using replication to improve performance and to protect against data loss. Servers, in Kafka, are referred to as Brokers because they work as the intermediary between Producers and Consumers.
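As a rough illustration of why replication protects against data loss, the sketch below keeps a copy of the log on several toy brokers. It is heavily simplified and every name in it is an assumption: in real Kafka, follower brokers fetch data from a leader rather than having it pushed to them, and replication is managed per partition.

```python
class Broker:
    """Toy broker holding one copy of a topic's log (not Kafka's real broker)."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []
        self.online = True


def replicated_append(leader, followers, event):
    """Write to the leader, then copy the event to every online follower."""
    leader.log.append(event)
    for follower in followers:
        if follower.online:
            follower.log.append(event)


leader = Broker(1)
followers = [Broker(2), Broker(3)]
replicated_append(leader, followers, {"page": "/home"})
replicated_append(leader, followers, {"page": "/cart"})

# If the leader is lost, any follower still holds the full log.
leader.online = False
surviving_copy = followers[0].log
```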

Kafka can be deployed:

  • on virtual machines
  • in containers
  • on-premises
  • in the cloud

Additionally, you might choose to manage the cluster using an orchestration platform like Kubernetes.

What is Event Streaming?

“[…] event streaming is the practice of capturing data in real-time […] storing these event streams durably for later retrieval; manipulating, processing, and reacting to the event streams […] and routing the event streams to different destination technologies as needed.”

Intro to Streaming, Apache Kafka Documentation

Event streaming is the core function of Apache Kafka. The stream is the lifecycle of the event as the relevant data makes its way from creation to retrieval/consumption. Apache Kafka includes a series of APIs accessed through a language-independent protocol to interact with the event streaming process. The five APIs are the:

  • Producer API
  • Consumer API
  • Streams API
  • Connect API
  • Admin API

Next, we’ll focus on the Streams API.

What is Kafka Streams API?

The Kafka Streams API allows data to be transformed with very low latency. Processing with the Streams API can be stateless, stateful, or windowed over a specified time range.
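The three processing styles are easier to tell apart side by side. The sketch below uses plain Python rather than the Kafka Streams API (which is a Java library), so the function names and the sample page-view events are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical page-view events: (timestamp_seconds, page)
events = [(0, "/home"), (12, "/cart"), (47, "/home"), (61, "/home"), (75, "/cart")]


def stateless_filter(stream, page):
    """Stateless: each event is judged on its own, nothing is remembered."""
    return [event for event in stream if event[1] == page]


def stateful_count(stream):
    """Stateful: a running count per page is carried across events."""
    counts = Counter()
    for _, page in stream:
        counts[page] += 1
    return dict(counts)


def windowed_count(stream, window_seconds):
    """Windowed: counts are kept separately for each fixed-size time window."""
    windows = defaultdict(Counter)
    for timestamp, page in stream:
        window_start = (timestamp // window_seconds) * window_seconds
        windows[window_start][page] += 1
    return {start: dict(counts) for start, counts in windows.items()}


cart_views = stateless_filter(events, "/cart")
totals = stateful_count(events)
per_minute = windowed_count(events, 60)
```

The stateful and windowed versions need somewhere to keep their counts between events, which is the kind of bookkeeping a stream processing library takes off your hands.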

The Apache Kafka project itself ships only Java client libraries, and Kafka Streams is a Java library, so you won’t find client libraries for other languages inside the project. Client libraries do exist for other languages, but they are maintained outside the Apache Kafka project. Confluent is a popular source of such client libraries.

Besides the language constraint, the Streams API is OS and deployment agnostic.

One important point with the Kafka Streams API is that it’s deployed with your Java application. It’s not set up on a different server. The API becomes a dependency in your Java application. This is slightly different than a typical API deployment.

Introducing the Streams API into the conversation often raises a question, what’s the difference between the Consumer API and the Streams API?

What Can I Use It For?

In the previous section, I introduced the Streams API and raised the question: What’s the difference between the Consumer API and the Streams API?

First, let’s reiterate that Topics reside in Kafka’s storage layer, while streams are considered part of the processing layer [6]. The Consumer API and the Streams API are part of the processing layer.

The difference between the Streams API and the Consumer API is defined by the features of the Streams API. The Streams API supports exactly-once processing semantics, fault-tolerant stateful processing, event-time processing, stream/table processing (more like processing data with a traditional database), interactive queries, and a testing kit [7].

With the Consumer API, you would need to implement all of these features yourself. Sometimes organizations want that lower-level control. However, for newcomers, the Streams API is a godsend for event stream processing.

Kafka Use Cases

Some of the most common use cases for Apache Kafka are [8]:

  • Messaging
  • Activity Tracking
  • Metrics
  • Log aggregation
  • Stream Processing
  • Event Sourcing
  • Commit Log

In the final part of this article, I’ll discuss how to discover and test Kafka Topics with RapidAPI.

How to Use Kafka Types on RapidAPI

RapidAPI is the first API platform to allow discovery of Kafka clusters and topics, viewing of schemas, and testing/consuming of records from a browser.

Next, let’s test sending and receiving events as part of the event processing stream. This is possible with RapidAPI’s support for Kafka APIs on its marketplace dashboard and through the enterprise hub.

This how-to section does not cover creating an Apache Kafka cluster and setting up topics. Thankfully, RapidAPI has published a Demo Kafka API to help us get started. Therefore, we will use this API in the rest of the article for convenience.

1. Navigate to the Demo Kafka API

Follow this link to the Demo Kafka API on the RapidAPI marketplace.

If you haven’t already, you can create an account on RapidAPI for free to explore, subscribe, and test thousands of APIs.

2. Inspect Dashboard Panels

The dashboard is divided into three panels. For REST APIs, the left side panel lists different routes. However, for Kafka APIs, the left panel lists the available Topics. For the Demo Kafka API, this includes:

  • Products
  • Transactions
  • Page Views

The center panel allows us to interact with the Kafka cluster by inputting different values, submitting events, or consuming a real event stream. There are two parent tabs, Consume and Produce.

The Consume tab allows us to connect to an event stream and specify the partition, time offset, and maximum records to retrieve. Additionally, we can inspect the topic schema and configuration. Clicking the View Records button connects us to the stream and starts displaying events. The animation at the top of the right panel informs us that we are subscribed and listening to the Products topic.
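The three options on the Consume tab can be pictured as arguments to a single read. The sketch below is a plain-Python illustration, not the RapidAPI dashboard's code or a Kafka client call; `view_records`, its parameters, and the sample records are hypothetical.

```python
def view_records(partitions, partition, from_timestamp, max_records):
    """Return up to max_records events from one partition, starting at from_timestamp."""
    selected = [
        record for record in partitions[partition]
        if record["timestamp"] >= from_timestamp
    ]
    return selected[:max_records]


# Hypothetical Products topic with two partitions; the values are made up.
products = {
    0: [
        {"timestamp": 10, "value": "guitar"},
        {"timestamp": 20, "value": "drum"},
        {"timestamp": 30, "value": "flute"},
    ],
    1: [{"timestamp": 15, "value": "piano"}],
}

records = view_records(products, partition=0, from_timestamp=20, max_records=1)
```

Choosing a partition narrows where to read, the time offset decides where in that partition to start, and the maximum caps how much comes back.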

Similarly, we can select the Produce tab in the middle panel to add a new event to the stream.

3. Review the Schema and Types For the Products Topic

First, select the Products topic in the left panel.

Then, in the middle panel, click the Produce tab.

There are three sub-tabs in this section:

  • Data
  • Headers
  • Options

The Data tab displays the Key Schema and Value Schema. Also, we have a raw data input box to submit our events based on the defined schema. Here, we can test different Kafka schema types and various Kafka data types.

Next, inspect the schema definition in the middle panel and compare it to the streamed events already in the log.

This will help us in the next step when we connect to the Products stream, create an event, and observe the event added to the stream.

4. Consume and Produce events in the Dashboard

Now, let’s create a real event and simultaneously consume it all in the dashboard!

First, click on the Consume tab in the center panel. You can leave the default selections as they are. Click the View Records button. Events will populate the right panel.

Next, after you connect to the Products topic, select the Produce tab in the middle panel.

In the Key input field, paste the following JSON code:

{ "category": "harmonica"}

For the Value input, copy and paste the code below:

{ "name": "Hohner", "productId": "987654321"}

Finally, click the Produce Records button to send this event off into the stream.

The input fields clear, and the event immediately appears in our stream!
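If the dashboard rejects your input, a quick local check can confirm that the Key and Value snippets are valid JSON. This is only a convenience sketch using Python's standard `json` module; the `record` dictionary is a hypothetical way to picture the key/value pair and is not a format required by RapidAPI or Kafka.

```python
import json

# The Key and Value snippets from the steps above, as raw text.
key_text = '{ "category": "harmonica"}'
value_text = '{ "name": "Hohner", "productId": "987654321"}'

# json.loads raises json.JSONDecodeError if either snippet is malformed.
key = json.loads(key_text)
value = json.loads(value_text)

# Picture the event as one key paired with one value.
record = {"key": key, "value": value}
```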

If you do not see the new event, double-check that you are still listening to the event stream. You should see the animated dots below the header in the right panel.

Conclusion

Congratulations on producing and consuming your first events with RapidAPI! This article took a quick look at what Apache Kafka is, how it works, and what it can do. Additionally, I gave a brief tutorial on testing and inspecting your Apache Kafka APIs or public event streams with the RapidAPI dashboard. If you’re looking for more, you can read more about Apache Kafka on RapidAPI in the docs. Or, check out how to add your Kafka API to RapidAPI.

What is Kafka stream processing?

Event streaming is the practice of capturing data in real-time; storing these event streams durably for later retrieval; manipulating, processing, and reacting to the event streams; and routing the event streams to different destination technologies as needed.

Is Kafka an API?

Kafka is deployed as a cluster of servers. However, the APIs for Kafka are not part of the server deployment. They are deployed as part of your application. Then, your application uses the APIs which communicate with the server cluster.

What is Kafka used for?

Kafka is used for event-driven architectures. The Kafka platform helps collect, store, and make events available for client applications to stream data in real-time.

What are topics in Kafka?

Topics, in Kafka, are a series of events stored in a log that belong to a similar category. Hence, a topic may be a Product or Custom Activity.

What is Kafka in simple words?

Kafka is a series of servers that work together to facilitate event-streaming. This provides computers, phones, and devices the ability to react to data in real-time.

Footnotes

[1] “Apache Kafka.” Apache Kafka, kafka.apache.org/.

[2] Berglund, Tim. “What Is Apache Kafka®?” www.youtube.com, youtu.be/FKgi3n-FyNU. Accessed 15 Apr. 2021.

[3] See Berglund [2], who explains the deconstruction of monolithic applications at the beginning of the video.

[4] See Berglund [2]; the process is explained throughout the video.

[5] Berglund, Tim. “1. Intro to Streams | Apache Kafka® Streams API.” www.youtube.com, youtu.be/Z3JKCLG3VP4?t=538. Accessed 15 Apr. 2021.

[6] Noll, Michael. “Streams and Tables in Apache Kafka: Event Processing Fundamentals.” Confluent, www.confluent.io/blog/kafka-streams-tables-part-3-event-processing-fundamentals/. Accessed 15 Apr. 2021.

[7] Noll, Michael. “Kafka: Consumer API vs Streams API.” Stack Overflow, stackoverflow.com/a/44041420. Accessed 15 Apr. 2021. Commented answer to a question about the difference between the Consumer API and the Streams API.

[8] “Apache Kafka.” Apache Kafka, kafka.apache.org/uses. Accessed 15 Apr. 2021. List of common use cases.

Article information

Author: Otha Schamberger

Last Updated: 12/28/2022
