Kafka Setup Guide

This guide provides instructions for setting up Apache Kafka (version 3.7.0) on your system. The setup includes necessary directories, binaries, configurations, and libraries for running Kafka and its components effectively.

Running Apache Kafka and Zookeeper on Your Server

Table of Contents

  1. Prerequisites
  2. Installing Java
  3. Installing Zookeeper
  4. Installing Kafka
  5. Configuring Zookeeper
  6. Configuring Kafka
  7. Setting Up Systemd Services
  8. Starting Services
  9. Verifying the Installation
  10. Troubleshooting

Prerequisites

  • A server running a Linux-based operating system (e.g., Ubuntu, CentOS).
  • Access to the terminal with sudo privileges.
  • Basic understanding of terminal commands.

Installing Java

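Apache Kafka requires Java to run. The commands below use apt and apply to Debian-based systems such as Ubuntu; on CentOS, install the equivalent OpenJDK package with your system's package manager.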
  1. Update your package index:
    sudo apt update
    
  2. Install OpenJDK:
    sudo apt install openjdk-17-jdk
    
  3. Verify the installation:
    java -version
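
The version check in step 3 can also be scripted. The following is a minimal Python sketch, not part of Kafka or of this setup: the function names parse_java_major and installed_java_major are illustrative, and it assumes that `java -version` prints a quoted version string (the JVM writes it to stderr).

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found in java -version output")
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_292" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version` and return the major version of the default JVM."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    # The version banner normally goes to stderr; stdout is included as a fallback.
    return parse_java_major(result.stderr + result.stdout)
```

Calling installed_java_major() on the server should return 17 after the OpenJDK 17 install above.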
    

Installing Zookeeper

Zookeeper is required for managing Kafka brokers. Here’s how to install it:
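  1. Download Zookeeper (older releases that have been removed from downloads.apache.org remain available under https://archive.apache.org/dist/zookeeper/):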
    wget https://downloads.apache.org/zookeeper/zookeeper-3.8.0/apache-zookeeper-3.8.0-bin.tar.gz
    
  2. Extract the downloaded file:
    tar -xzf apache-zookeeper-3.8.0-bin.tar.gz
    
  3. Move to /opt:
    sudo mv apache-zookeeper-3.8.0-bin /opt/zookeeper
    
  4. Create a configuration file:
    sudo nano /opt/zookeeper/conf/zoo.cfg
    
    Add the following configuration:
    tickTime=2000
    dataDir=/var/lib/zookeeper
    clientPort=2181
    maxClientCnxns=60
    
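  5. Create the data directory and give ownership to the account that will run Zookeeper (replace your_username accordingly):
    sudo mkdir -p /var/lib/zookeeper
    sudo chown -R your_username /var/lib/zookeeper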
    

Installing Kafka

  1. Download Kafka 3.7.0 (older releases are hosted on the Apache archive):
    wget https://archive.apache.org/dist/kafka/3.7.0/kafka_2.12-3.7.0.tgz
    
  2. Extract the downloaded file:
    tar -xzf kafka_2.12-3.7.0.tgz
    
  3. Move to /opt:
    sudo mv kafka_2.12-3.7.0 /opt/kafka
    

Configuring Zookeeper

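The Zookeeper configuration file, /opt/zookeeper/conf/zoo.cfg, was created during installation. Review it and adjust dataDir or clientPort if the defaults do not suit your server. The clientPort value must match the port used in Kafka's zookeeper.connect setting.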

Configuring Kafka

  1. Edit the Kafka broker configuration file (a default copy ships with the distribution):
    sudo nano /opt/kafka/config/server.properties
    
    Add or modify the following configurations:
    broker.id=0
    listeners=PLAINTEXT://:9092
    log.dirs=/var/lib/kafka/logs
    zookeeper.connect=localhost:2181
    
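  2. Create the logs directory and give ownership to the account that will run Kafka (replace your_username accordingly):
    sudo mkdir -p /var/lib/kafka/logs
    sudo chown -R your_username /var/lib/kafka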
    

Setting Up Systemd Services

Create a Zookeeper Service

  1. Create the service file:
    sudo nano /etc/systemd/system/zookeeper.service
    
  2. Add the following content:
    [Unit]
    Description=Apache Zookeeper
    After=network.target
    
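    [Service]
    # zkServer.sh start launches Zookeeper in the background and exits,
    # so systemd has to track the forked process.
    Type=forking
    User=your_username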
    ExecStart=/opt/zookeeper/bin/zkServer.sh start /opt/zookeeper/conf/zoo.cfg
    ExecStop=/opt/zookeeper/bin/zkServer.sh stop
    Restart=on-failure
    
    [Install]
    WantedBy=multi-user.target
    

Create a Kafka Service

  1. Create the service file:
    sudo nano /etc/systemd/system/kafka.service
    
  2. Add the following content:
    [Unit]
    Description=Apache Kafka
    After=zookeeper.service
    
    [Service]
    User=your_username
    ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
    ExecStop=/opt/kafka/bin/kafka-server-stop.sh
    Restart=on-failure
    
    [Install]
    WantedBy=multi-user.target
    

Starting Services

  1. Reload systemd to recognize the new services:
    sudo systemctl daemon-reload
    
  2. Start Zookeeper:
    sudo systemctl start zookeeper
    
  3. Start Kafka:
    sudo systemctl start kafka
    
  4. Enable both services to start on boot:
    sudo systemctl enable zookeeper
    sudo systemctl enable kafka
    

Verifying the Installation

  1. Check the status of Zookeeper:
    sudo systemctl status zookeeper
    
  2. Check the status of Kafka:
    sudo systemctl status kafka
    
  3. Create a test topic:
    /opt/kafka/bin/kafka-topics.sh --create --topic test --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
    
  4. List topics to confirm:
    /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
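
The topic check in steps 3 and 4 can be automated, for example in a post-install script. The sketch below wraps the same kafka-topics.sh command shown above; the function names and the default arguments are illustrative, and it assumes the tool prints one topic name per line.

```python
import subprocess


def parse_topic_list(output):
    """Turn the output of `kafka-topics.sh --list` into a list of topic names."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_topics(kafka_home="/opt/kafka", bootstrap="localhost:9092"):
    """Run kafka-topics.sh --list against the broker and return the topic names."""
    result = subprocess.run(
        [f"{kafka_home}/bin/kafka-topics.sh", "--list", "--bootstrap-server", bootstrap],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits with an error
    )
    return parse_topic_list(result.stdout)
```

After step 3, the expression "test" in list_topics() should evaluate to True.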
    

Troubleshooting

  • Check logs if Kafka or Zookeeper fail to start:
    • Zookeeper logs: the .out file under /opt/zookeeper/logs/ (for example zookeeper.out; the exact file name can vary by version)
    • Kafka logs: /opt/kafka/logs/server.log
  • Ensure the correct version of Java is installed.
  • Verify that no other services are using the same ports (2181 for Zookeeper, 9092 for Kafka).
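
The port check in the last item can be done with a few lines of Python. This is a rough sketch using only the standard library; port_in_use and report are hypothetical helper names, and the check only tests whether something accepts TCP connections on the local interface.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def report(ports=None):
    """Check the default Zookeeper and Kafka ports and return name -> bool."""
    ports = ports or {"Zookeeper": 2181, "Kafka": 9092}
    return {name: port_in_use(port) for name, port in ports.items()}
```

Before the services are started, a True value points to a port conflict. After they are started, True for both entries means Zookeeper and Kafka are listening.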

Conclusion

You have successfully installed and configured Apache Kafka and Zookeeper on your server. Both services are set up to run automatically at startup, ensuring reliable operation. You can now start producing and consuming messages using Kafka.

Directory Structure

The following is the directory structure for the Kafka installation:
kafka_2.12-3.7.0
├── bin                  # Contains scripts to run Kafka and Zookeeper
├── config               # Configuration files for Kafka and Zookeeper
├── kafka-consumer.py    # Python consumer script
├── kafka-logger.py      # Python logger script
└── libs                 # Required libraries and dependencies

Key Directories and Files

  • bin/: Scripts to start, stop, and manage Kafka and Zookeeper instances.
  • config/: Contains properties files for configuring Kafka and its components.
  • libs/: Contains all the necessary jar files required for Kafka to function.

Prerequisites

Before installing Kafka, ensure that you have the following prerequisites:
  • Java: Kafka requires Java 8 or later. Ensure that it is installed and the JAVA_HOME environment variable is set correctly.
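  • Zookeeper: Kafka uses Zookeeper for cluster management. You can run a standalone Zookeeper installation or the copy bundled with the Kafka distribution (started with bin/zookeeper-server-start.sh).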

Installation Steps

  1. Download Kafka:
    • Download the Kafka binary from the Apache Kafka website.
    • Extract the downloaded tar.gz file to your desired installation directory.
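  2. Verify Installation:
    • Confirm that the extracted directory contains the bin/, config/, and libs/ directories described above.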

Configuration Files

The configuration files are located in the config/ directory. Key files include:
  • server.properties: Configuration for Kafka brokers.
  • zookeeper.properties: Configuration for Zookeeper.
  • connect-distributed.properties: Configuration for running Kafka Connect in distributed mode.

Example of server.properties

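# Basic Kafka server configuration
broker.id=0
listeners=PLAINTEXT://localhost:9092
log.dirs=/tmp/kafka-logs
num.partitions=1
# Zookeeper connection string; must match clientPort in zookeeper.properties
zookeeper.connect=localhost:2181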

Example of zookeeper.properties

# Basic Zookeeper configuration
tickTime=2000
dataDir=/tmp/zookeeper
clientPort=2181

Starting Kafka

  1. Start Zookeeper:
    • Use the following command to start Zookeeper:
      bin/zookeeper-server-start.sh config/zookeeper.properties
      
  2. Start Kafka Server:
    • Once Zookeeper is running, start the Kafka server:
      bin/kafka-server-start.sh config/server.properties
      

Using Kafka Command-Line Tools

Kafka provides various command-line tools located in the bin/ directory. Here are some commonly used commands:
  • Create a Topic:
    bin/kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
    
  • List Topics:
    bin/kafka-topics.sh --list --bootstrap-server localhost:9092
    
  • Produce Messages:
    bin/kafka-console-producer.sh --topic my-topic --bootstrap-server localhost:9092
    
  • Consume Messages:
    bin/kafka-console-consumer.sh --topic my-topic --from-beginning --bootstrap-server localhost:9092
    
Example session for an installation extracted to /var/www/devops/kafka-setup/kafka_2.12-3.7.0/. Zookeeper and the Kafka server each run in the foreground, so use a separate terminal for each command:
cd /var/www/devops/kafka-setup/kafka_2.12-3.7.0/

# Start Zookeeper
/var/www/devops/kafka-setup/kafka_2.12-3.7.0/bin/zookeeper-server-start.sh config/zookeeper.properties

# Start the Kafka server on port 9092
/var/www/devops/kafka-setup/kafka_2.12-3.7.0/bin/kafka-server-start.sh config/server.properties

# In another terminal, create a topic
/var/www/devops/kafka-setup/kafka_2.12-3.7.0/bin/kafka-topics.sh --create --topic kafka-logs-topic1 --bootstrap-server localhost:9092

# Show the topic's messages in the terminal
/var/www/devops/kafka-setup/kafka_2.12-3.7.0/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic kafka-logs-topic1 --from-beginning

# Test the setup: manually produce a message and check that the consumer prints it
/var/www/devops/kafka-setup/kafka_2.12-3.7.0/bin/kafka-console-producer.sh --topic kafka-logs-topic1 --bootstrap-server localhost:9092

Python Integration

For Python integration, the kafka-consumer.py and kafka-logger.py scripts can be used. These scripts demonstrate how to interact with Kafka using Python.
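
The contents of kafka-logger.py are not reproduced in this guide, so the following is only a sketch of what a logger of this kind might look like, not the script itself. It assumes the kafka-python library and the kafka-logs-topic1 topic used in the commands above; KafkaLogHandler and build_kafka_logger are hypothetical names. The handler takes a plain send callable, which keeps it usable without a running broker.

```python
import logging


class KafkaLogHandler(logging.Handler):
    """Forward formatted log records, encoded as UTF-8, to a send(bytes) callable."""

    def __init__(self, send):
        super().__init__()
        self._send = send

    def emit(self, record):
        try:
            self._send(self.format(record).encode("utf-8"))
        except Exception:
            # Follow the logging convention: report the failure, do not raise.
            self.handleError(record)


def build_kafka_logger(topic="kafka-logs-topic1", servers="localhost:9092"):
    """Return (logger, producer); every record logged is sent to the topic."""
    # Imported here so the handler above works without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=servers)
    handler = KafkaLogHandler(lambda payload: producer.send(topic, payload))
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("kafka-logger")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger, producer
```

With a broker running, logger, producer = build_kafka_logger() followed by logger.info("hello") and producer.flush() should make the message appear in a console consumer subscribed to the same topic.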

Example of Using kafka-consumer.py

Make sure you have the kafka-python library installed:
    pip install kafka-python
    
Run the consumer script:
    python kafka-consumer.py
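
For readers who do not have the script at hand, a minimal consumer built on kafka-python could look like the sketch below. It is an illustration under stated assumptions, not the contents of kafka-consumer.py: the topic name comes from the commands earlier in this guide, and decode_value and consume are hypothetical names.

```python
def decode_value(raw):
    """Decode a message payload as UTF-8, replacing bytes that cannot be decoded."""
    if raw is None:  # messages may carry no value
        return ""
    return raw.decode("utf-8", errors="replace")


def consume(topic="kafka-logs-topic1", servers="localhost:9092"):
    """Print every message in the topic, starting from the earliest offset."""
    # Imported here so decode_value can be used without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=servers,
        auto_offset_reset="earliest",  # similar to --from-beginning in the console consumer
        value_deserializer=decode_value,
    )
    for message in consumer:
        print(f"{message.topic}[{message.partition}]@{message.offset}: {message.value}")
```

Calling consume() blocks and prints messages as they arrive; stop it with Ctrl+C.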

Conclusion

This guide provides a basic setup for Apache Kafka, covering installation, configuration, and usage of command-line tools. For advanced configurations and deployments, refer to the official Kafka documentation.
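For troubleshooting, check the broker's application logs (server.log in the logs/ directory of the Kafka installation) or adjust the logging settings in the log4j.properties file found in the config/ directory. Note that /tmp/kafka-logs, the log.dirs location in the example above, holds topic data rather than application logs.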