Set up Kafka Schema Registry with Avro serialization for data processing

Advanced · 45 min · Apr 19, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure Confluent Schema Registry with Avro serialization for production Kafka deployments. Includes schema evolution, producer/consumer integration, and SSL security configuration.

Prerequisites

  • Apache Kafka cluster running
  • Java 11 or higher
  • Administrative access
  • 4GB+ RAM recommended
  • Network access to Kafka brokers

What this solves

Schema Registry provides centralized schema management for Kafka topics, enabling schema evolution and data compatibility across your streaming applications. This tutorial sets up Confluent Schema Registry with Avro serialization, authentication, and SSL security for production data processing pipelines.
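To make the idea concrete, here is a toy in-memory model of what a registry does: each subject (usually "<topic>-value") holds an ordered list of schema versions, and every distinct schema gets a global id that producers embed in each message. `ToyRegistry` is an illustration only, not the real API:

```python
# Toy model of Schema Registry's core bookkeeping (illustration only):
# subjects map to ordered version lists; identical schemas share one id.
class ToyRegistry:
    def __init__(self):
        self.subjects = {}   # subject -> ordered list of schema ids
        self.schemas = {}    # schema text -> global schema id

    def register(self, subject, schema):
        # Registering the same schema twice returns the same id (idempotent)
        sid = self.schemas.setdefault(schema, len(self.schemas) + 1)
        versions = self.subjects.setdefault(subject, [])
        if sid not in versions:
            versions.append(sid)
        return sid

    def latest(self, subject):
        return self.subjects[subject][-1]

reg = ToyRegistry()
v1 = reg.register("users-value", '{"type":"record","name":"User","fields":[]}')
again = reg.register("users-value", '{"type":"record","name":"User","fields":[]}')
assert v1 == again  # re-registration is idempotent
```

The real registry adds compatibility checks on top of this bookkeeping, which is what makes safe schema evolution possible.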

Step-by-step installation

Install Java and dependencies

Schema Registry requires Java 11 or higher. Install the required dependencies first.

# Ubuntu / Debian
sudo apt update
sudo apt install -y openjdk-11-jdk wget curl unzip

# AlmaLinux / Rocky Linux
sudo dnf update -y
sudo dnf install -y java-11-openjdk java-11-openjdk-devel wget curl unzip

Create dedicated user for Schema Registry

Run Schema Registry under a dedicated user account for security isolation.

sudo useradd --system --home-dir /opt/schema-registry --shell /bin/false --create-home schema-registry

Download and install Confluent Platform

Download the Confluent Platform which includes Schema Registry, or install just the Schema Registry component.

cd /tmp
wget https://packages.confluent.io/archive/7.5/confluent-7.5.0.tar.gz
tar -xzf confluent-7.5.0.tar.gz
sudo mv confluent-7.5.0 /opt/confluent
sudo chown -R schema-registry:schema-registry /opt/confluent

Configure Schema Registry properties

Create the main configuration file (/opt/confluent/etc/schema-registry/schema-registry.properties) with connection details for your Kafka cluster.

# Schema Registry listeners
listeners=http://0.0.0.0:8081

# Kafka cluster connection
kafkastore.bootstrap.servers=localhost:9092
kafkastore.topic=_schemas
kafkastore.topic.replication.factor=3

# Schema Registry identity
schema.registry.group.id=schema-registry

# Security settings
kafkastore.security.protocol=PLAINTEXT
schema.registry.inter.instance.protocol=http

# Avro compatibility settings
avro.compatibility.level=BACKWARD
schema.compatibility.level=BACKWARD

# Response media types
response.mediatype.preferred=application/vnd.schemaregistry.v1+json
response.mediatype.default=application/vnd.schemaregistry.v1+json
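A properties file like the one above can be sanity-checked from Python before starting the service. A minimal sketch, assuming the simplified key=value format shown here (no escapes or line continuations); `parse_properties` is a hypothetical helper, not part of any Confluent tooling:

```python
# Parse a simplified Java-style .properties snippet and check that the
# keys Schema Registry needs are present. Ignores blank and # lines.
def parse_properties(text):
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

example = """
# Kafka cluster connection
listeners=http://0.0.0.0:8081
kafkastore.bootstrap.servers=localhost:9092
kafkastore.topic=_schemas
"""

props = parse_properties(example)
for required in ("listeners", "kafkastore.bootstrap.servers", "kafkastore.topic"):
    assert required in props, f"missing required property: {required}"
assert props["kafkastore.topic"] == "_schemas"
```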

Configure Schema Registry with SSL

Enable SSL encryption for production deployments. Create SSL configuration.

# HTTPS listeners
listeners=https://0.0.0.0:8081

# SSL keystore configuration
ssl.keystore.location=/opt/confluent/ssl/schema-registry.keystore.jks
ssl.keystore.password=changeme
ssl.key.password=changeme

# SSL truststore configuration
ssl.truststore.location=/opt/confluent/ssl/schema-registry.truststore.jks
ssl.truststore.password=changeme

# SSL client authentication
ssl.client.auth=required

# Kafka SSL connection
kafkastore.bootstrap.servers=localhost:9093
kafkastore.security.protocol=SSL
kafkastore.ssl.keystore.location=/opt/confluent/ssl/client.keystore.jks
kafkastore.ssl.keystore.password=changeme
kafkastore.ssl.key.password=changeme
kafkastore.ssl.truststore.location=/opt/confluent/ssl/client.truststore.jks
kafkastore.ssl.truststore.password=changeme

Create SSL certificates

Generate SSL certificates for Schema Registry and client authentication.

sudo mkdir -p /opt/confluent/ssl
cd /opt/confluent/ssl

Generate CA certificate

sudo keytool -genkey -keyalg RSA -alias ca-cert -keystore ca.keystore.jks -validity 365 -dname "CN=example.com,OU=IT,O=Example,L=City,ST=State,C=US" -storepass changeme -keypass changeme

Export CA certificate

sudo keytool -export -alias ca-cert -file ca-cert -keystore ca.keystore.jks -storepass changeme

Create truststore and import CA

sudo keytool -import -alias ca-cert -file ca-cert -keystore schema-registry.truststore.jks -storepass changeme -noprompt

Generate server certificate

sudo keytool -genkey -keyalg RSA -alias schema-registry -keystore schema-registry.keystore.jks -validity 365 -dname "CN=schema-registry.example.com,OU=IT,O=Example,L=City,ST=State,C=US" -storepass changeme -keypass changeme

Sign server certificate with CA

sudo keytool -certreq -alias schema-registry -file schema-registry.csr -keystore schema-registry.keystore.jks -storepass changeme
sudo keytool -gencert -alias ca-cert -infile schema-registry.csr -outfile schema-registry.crt -keystore ca.keystore.jks -storepass changeme -validity 365
sudo keytool -import -alias ca-cert -file ca-cert -keystore schema-registry.keystore.jks -storepass changeme -noprompt
sudo keytool -import -alias schema-registry -file schema-registry.crt -keystore schema-registry.keystore.jks -storepass changeme

Set permissions

sudo chown -R schema-registry:schema-registry /opt/confluent/ssl
sudo chmod 600 /opt/confluent/ssl/*.jks

Create systemd service

Create a systemd unit file (/etc/systemd/system/schema-registry.service) to manage Schema Registry startup and monitoring.

[Unit]
Description=Confluent Schema Registry
After=network.target
Requires=network.target

[Service]
Type=simple
User=schema-registry
Group=schema-registry
ExecStart=/opt/confluent/bin/schema-registry-start /opt/confluent/etc/schema-registry/schema-registry.properties
ExecStop=/opt/confluent/bin/schema-registry-stop
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=schema-registry
KillMode=process
TimeoutStopSec=300

# Security settings
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/opt/confluent/logs

# Environment
Environment=JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
Environment=SCHEMA_REGISTRY_HEAP_OPTS="-Xmx1G -Xms1G"
Environment=SCHEMA_REGISTRY_JVM_PERFORMANCE_OPTS="-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35"

[Install]
WantedBy=multi-user.target

Configure log directory and permissions

Set up logging directory with correct ownership and permissions.

sudo mkdir -p /opt/confluent/logs
sudo chown schema-registry:schema-registry /opt/confluent/logs
sudo chmod 755 /opt/confluent/logs

Configure authentication

Enable basic authentication for Schema Registry access control.

# Enable authentication
authentication.method=BASIC
authentication.realm=SchemaRegistry
authentication.roles=admin,developer,readonly

JAAS configuration

schema.registry.resource.extension.class=io.confluent.kafka.schemaregistry.security.SchemaRegistrySecurityResourceExtension
confluent.schema.registry.auth.mechanism=JETTY_AUTH
schema.registry.auth.mechanism=JETTY_AUTH

Create user credentials file

Define users and their access levels for Schema Registry.

# Format: username: password,role1,role2
admin: admin123,admin
producer: producer123,developer
consumer: consumer123,readonly
app-user: app-secret,developer
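For auditing which roles each account carries, the file can be parsed mechanically. A sketch that assumes exactly the "username: password,role1,role2" format shown above; `parse_credentials` is a hypothetical helper, and real deployments should manage secrets with a dedicated tool rather than plaintext files:

```python
# Parse a Jetty-style credentials file ("username: password,role1,role2")
# into a dict of users with their roles, skipping comments and blanks.
def parse_credentials(text):
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        username, _, rest = line.partition(":")
        password, *roles = [part.strip() for part in rest.split(",")]
        users[username.strip()] = {"password": password, "roles": roles}
    return users

creds = parse_credentials("admin: admin123,admin\nproducer: producer123,developer")
assert creds["producer"]["roles"] == ["developer"]
```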

Enable and start Schema Registry

Start the Schema Registry service and enable it to start on boot.

sudo systemctl daemon-reload
sudo systemctl enable schema-registry
sudo systemctl start schema-registry
sudo systemctl status schema-registry

Configure Avro schema management

Create sample Avro schema

Define an Avro schema for your data structure with proper field types and evolution support.

{
  "namespace": "com.example.users",
  "type": "record",
  "name": "User",
  "version": "1",
  "fields": [
    {
      "name": "id",
      "type": "long",
      "doc": "User unique identifier"
    },
    {
      "name": "username",
      "type": "string",
      "doc": "User login name"
    },
    {
      "name": "email",
      "type": "string",
      "doc": "User email address"
    },
    {
      "name": "created_at",
      "type": {"type": "long", "logicalType": "timestamp-millis"},
      "doc": "User creation timestamp"
    },
    {
      "name": "metadata",
      "type": ["null", {
        "type": "map",
        "values": "string"
      }],
      "default": null,
      "doc": "Optional user metadata"
    }
  ]
}
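Before registering, it is worth a pre-flight check that optional fields are written in the evolution-safe shape: a ["null", ...] union with null listed first and "default": null, since Avro validates the default against the first branch of the union. A minimal stdlib-only sketch:

```python
# Check that every union-typed field in the schema puts "null" first
# and declares a null default, so later evolution stays compatible.
import json

schema = json.loads("""
{"type": "record", "name": "User", "fields": [
  {"name": "id", "type": "long"},
  {"name": "metadata", "type": ["null", {"type": "map", "values": "string"}],
   "default": null}
]}
""")

for field in schema["fields"]:
    if isinstance(field["type"], list):  # union type
        assert field["type"][0] == "null", field["name"]
        assert field.get("default", "missing") is None, field["name"]
```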

Register schema with Schema Registry

Upload the schema to Schema Registry using the REST API.

# Register schema for a subject (topic). The REST API expects the Avro
# schema wrapped as an escaped string in a {"schema": "..."} envelope,
# so wrap the raw schema file with jq before posting it
jq -n --rawfile s /tmp/user-schema.json '{schema: $s}' | \
  curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data @- \
  http://localhost:8081/subjects/users-value/versions

Register with authentication

curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -u admin:admin123 \
  --data '{"schema": "{\"namespace\": \"com.example.users\", \"type\": \"record\", \"name\": \"User\", \"fields\": [{\"name\": \"id\", \"type\": \"long\"}, {\"name\": \"username\", \"type\": \"string\"}, {\"name\": \"email\", \"type\": \"string\"}, {\"name\": \"created_at\", \"type\": {\"type\": \"long\", \"logicalType\": \"timestamp-millis\"}}, {\"name\": \"metadata\", \"type\": [\"null\", {\"type\": \"map\", \"values\": \"string\"}], \"default\": null}]}"}' \
  http://localhost:8081/subjects/users-value/versions
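The same registration can be scripted from Python with only the standard library. A sketch that builds the request without sending it; the URL, subject name, and credentials mirror this tutorial's examples and are assumptions about your setup, and `build_register_request` is a hypothetical helper:

```python
# Build a Schema Registry registration request: the Avro schema goes
# into a {"schema": "..."} envelope as an escaped JSON string, with
# HTTP basic auth for the admin user.
import base64
import json
import urllib.request

def build_register_request(registry_url, subject, schema_dict, user, password):
    payload = json.dumps({"schema": json.dumps(schema_dict)}).encode()
    req = urllib.request.Request(
        f"{registry_url}/subjects/{subject}/versions",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req

req = build_register_request(
    "http://localhost:8081", "users-value",
    {"type": "record", "name": "User",
     "fields": [{"name": "id", "type": "long"}]},
    "admin", "admin123",
)
# To actually send it: urllib.request.urlopen(req)
assert json.loads(req.data)["schema"].startswith('{"type"')
```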

Configure producer and consumer integration

Java producer configuration

Configure a Kafka producer to use Avro serialization with Schema Registry.

# Kafka cluster connection
bootstrap.servers=localhost:9092

# Schema Registry settings
schema.registry.url=http://localhost:8081
basic.auth.credentials.source=USER_INFO
basic.auth.user.info=producer:producer123

# Avro serialization
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer

# Producer settings
acks=all
retries=2147483647
max.in.flight.requests.per.connection=1
enable.idempotence=true
compression.type=snappy

# Schema evolution
auto.register.schemas=false
use.latest.version=true

Java consumer configuration

Configure a Kafka consumer to deserialize Avro messages using Schema Registry.

# Kafka cluster connection
bootstrap.servers=localhost:9092

# Schema Registry settings
schema.registry.url=http://localhost:8081
basic.auth.credentials.source=USER_INFO
basic.auth.user.info=consumer:consumer123

# Avro deserialization
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer

# Consumer settings
group.id=user-consumer-group
auto.offset.reset=earliest
enable.auto.commit=false

# Avro-specific settings
specific.avro.reader=true
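What these serializers put on the wire is worth understanding for debugging: each message starts with one magic byte (0) and a 4-byte big-endian schema id, followed by the Avro-encoded body; the consumer uses the id to fetch the schema from the registry. A stdlib-only sketch of framing and unframing (the helper names are illustrative, not a Confluent API):

```python
# Frame/unframe the Confluent wire format:
# [magic byte 0][4-byte big-endian schema id][Avro payload]
import struct

def frame_message(schema_id, avro_payload):
    return struct.pack(">bI", 0, schema_id) + avro_payload

def unframe_message(raw):
    magic, schema_id = struct.unpack(">bI", raw[:5])
    if magic != 0:
        raise ValueError("not Confluent wire format")
    return schema_id, raw[5:]

framed = frame_message(42, b"\x02avro-bytes")
schema_id, body = unframe_message(framed)
assert (schema_id, body) == (42, b"\x02avro-bytes")
```

A "Schema not found" deserialization error usually means the consumer resolved a schema id from this header that its registry (or subject) does not know about.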

Python client configuration

Configure Python producer and consumer using confluent-kafka with Avro support.

# Install Python dependencies
pip install "confluent-kafka[avro]"

# producer.py — uses the legacy AvroProducer helper from
# confluent-kafka[avro]; newer code should prefer
# confluent_kafka.schema_registry.avro.AvroSerializer
from confluent_kafka import avro
from confluent_kafka.avro import AvroProducer

# Producer configuration (Kafka + Schema Registry with basic auth)
producer_conf = {
    'bootstrap.servers': 'localhost:9092',
    'schema.registry.url': 'http://localhost:8081',
    'schema.registry.basic.auth.credentials.source': 'USER_INFO',
    'schema.registry.basic.auth.user.info': 'producer:producer123'
}

# Load the value schema from file; keys are plain Avro strings
value_schema = avro.load('/tmp/user-schema.json')
key_schema = avro.loads('"string"')

# Create producer
producer = AvroProducer(producer_conf,
                        default_key_schema=key_schema,
                        default_value_schema=value_schema)

# Send message
user_data = {
    'id': 12345,
    'username': 'john_doe',
    'email': 'john@example.com',
    'created_at': 1703097600000,
    'metadata': {'role': 'user', 'department': 'engineering'}
}
producer.produce(topic='users', value=user_data, key='user_12345')
producer.flush()

Configure schema evolution

Enable compatibility checking

Configure compatibility levels to control schema evolution policies.

# Set global compatibility level
curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -u admin:admin123 \
  --data '{"compatibility": "BACKWARD"}' \
  http://localhost:8081/config

Set subject-specific compatibility

curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -u admin:admin123 \
  --data '{"compatibility": "BACKWARD_TRANSITIVE"}' \
  http://localhost:8081/config/users-value

Create evolved schema version

Add a new optional field to demonstrate backward-compatible schema evolution.

{
  "namespace": "com.example.users",
  "type": "record",
  "name": "User",
  "version": "2",
  "fields": [
    {
      "name": "id",
      "type": "long",
      "doc": "User unique identifier"
    },
    {
      "name": "username",
      "type": "string",
      "doc": "User login name"
    },
    {
      "name": "email",
      "type": "string",
      "doc": "User email address"
    },
    {
      "name": "created_at",
      "type": {"type": "long", "logicalType": "timestamp-millis"},
      "doc": "User creation timestamp"
    },
    {
      "name": "last_login",
      "type": ["null", {"type": "long", "logicalType": "timestamp-millis"}],
      "default": null,
      "doc": "Last login timestamp (new field)"
    },
    {
      "name": "metadata",
      "type": ["null", {
        "type": "map",
        "values": "string"
      }],
      "default": null,
      "doc": "Optional user metadata"
    }
  ]
}
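The rule this evolution relies on can be checked locally before touching the registry: under BACKWARD compatibility, every field added in v2 must carry a default so consumers on v2 can still read records written with v1. A stdlib-only sketch of that check (a simplification of what the registry's compatibility endpoint actually verifies):

```python
# Local check mirroring the BACKWARD rule for the evolution above:
# every field added relative to v1 must declare a default value.
import json

v1_fields = {"id", "username", "email", "created_at", "metadata"}
v2 = json.loads("""
{"type": "record", "name": "User", "fields": [
  {"name": "id", "type": "long"},
  {"name": "username", "type": "string"},
  {"name": "email", "type": "string"},
  {"name": "created_at",
   "type": {"type": "long", "logicalType": "timestamp-millis"}},
  {"name": "last_login",
   "type": ["null", {"type": "long", "logicalType": "timestamp-millis"}],
   "default": null},
  {"name": "metadata", "type": ["null", {"type": "map", "values": "string"}],
   "default": null}
]}
""")

new_fields = [f for f in v2["fields"] if f["name"] not in v1_fields]
assert all("default" in f for f in new_fields), "new fields need defaults"
assert [f["name"] for f in new_fields] == ["last_login"]
```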

Register evolved schema

Upload the new schema version and verify compatibility.

# Test compatibility before registering; like registration, this
# endpoint expects the {"schema": "..."} envelope around the raw file
jq -n --rawfile s /tmp/user-schema-v2.json '{schema: $s}' | \
  curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -u admin:admin123 \
  --data @- \
  http://localhost:8081/compatibility/subjects/users-value/versions/latest

Register new version if compatible

jq -n --rawfile s /tmp/user-schema-v2.json '{schema: $s}' | \
  curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -u admin:admin123 \
  --data @- \
  http://localhost:8081/subjects/users-value/versions

Verify your setup

# Check Schema Registry health
curl http://localhost:8081/

List all subjects

curl -u admin:admin123 http://localhost:8081/subjects

Get latest schema version

curl -u admin:admin123 http://localhost:8081/subjects/users-value/versions/latest

Check compatibility settings

curl -u admin:admin123 http://localhost:8081/config

Verify service status

sudo systemctl status schema-registry

Check logs

journalctl -u schema-registry -f

Common issues

Symptom | Cause | Fix
Schema Registry won't start | Kafka not available | Ensure the Kafka cluster is running and accessible
SSL connection fails | Certificate mismatch | Verify the certificate CN matches the hostname
Schema registration fails | Authentication required | Check the credentials in the client configuration
Compatibility check fails | Breaking schema changes | Give newly added fields default values
Consumer deserialization error | Schema not found | Verify the subject name matches the topic naming strategy
Permission denied errors | Incorrect file ownership | chown -R schema-registry:schema-registry /opt/confluent
