Kafka Schema Registry
1. Kafka Schema Registry
![[Figure 1] Kafka Schema Registry Architecture](/blog-software/docs/theory-analysis/kafka-schema-registry/images/kafka-schema-registry-architecture.png)
[Figure 1] Kafka Schema Registry Architecture
Kafka Schema Registry manages the Schemas of the Kafka Messages exchanged between Kafka Producers and Kafka Consumers. [Figure 1] shows the Architecture of Kafka Schema Registry. Instead of a separate Database, Kafka Schema Registry keeps its state in a Kafka Topic, named `_schemas` by default.
Because all Schema-related information is recorded in the `_schemas` Topic, Kafka Schema Registry itself is Stateless and can easily Scale-out for load balancing. The Schema information stored in the Topic is also cached in Memory, so Kafka Schema Registry rarely accesses the Topic directly. It generally does so only when a Schema is registered, changed, or deleted, or when Kafka Schema Registry is initialized.
The `_schemas` Topic records Schema-related information as Key/Value pairs, as in [Schema Key 1] and [Schema Value 1]. The Key serves as the unique identifier of a Schema, and the Value contains the Schema information. [Figure 1] also shows how Kafka Schema Registry operates while a Kafka Producer sends Kafka Messages and a Kafka Consumer receives them.
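To make the Key/Value layout and the Memory cache more concrete, the following is a minimal sketch that rebuilds an in-memory cache by replaying records from the Topic. The field names (`keytype`, `subject`, `version`, `magic`, `id`, `schema`, `deleted`) follow records commonly observed in the Topic and should be treated as assumptions here. The helpers `make_record` and `replay` are hypothetical names used only for illustration, not part of Kafka Schema Registry.

```python
import json


def make_record(subject, version, schema_id, schema):
    """Build one (key, value) pair shaped like a schema record in the topic.

    The field layout is an assumption for this sketch.
    """
    key = {"keytype": "SCHEMA", "subject": subject, "version": version, "magic": 1}
    value = {
        "subject": subject,
        "version": version,
        "id": schema_id,
        "schema": json.dumps(schema),  # the schema itself is kept as a JSON string
        "deleted": False,
    }
    return json.dumps(key), json.dumps(value)


def replay(records):
    """Rebuild the in-memory lookup tables by reading every record once."""
    schemas_by_id = {}
    ids_by_subject_version = {}
    for raw_key, raw_value in records:
        key = json.loads(raw_key)
        # Skip anything that is not a schema record, and skip empty values.
        if key.get("keytype") != "SCHEMA" or raw_value is None:
            continue
        value = json.loads(raw_value)
        schemas_by_id[value["id"]] = value["schema"]
        ids_by_subject_version[(value["subject"], value["version"])] = value["id"]
    return schemas_by_id, ids_by_subject_version
```

Because the tables can be rebuilt from the Topic at any time, any instance can serve lookups from Memory after startup, which is what makes Scale-out straightforward.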
1.1. Schema Registration Process
Schemas are registered in Kafka Schema Registry through its REST API, in the following steps:
- A client sends a Schema registration request to Kafka Schema Registry through the REST API.
- Kafka Schema Registry records the Schema in the `_schemas` Topic.
- Kafka Schema Registry then caches the Schema in Memory and returns the cached Schema when a Producer or Consumer requests it.
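The registration request in the first step can be sketched with the Python standard library. This assumes the Confluent Schema Registry REST endpoint `POST /subjects/{subject}/versions`, which takes the Schema as a JSON string in a `schema` field and answers with the assigned `id`. The base URL and subject name shown in the usage note are placeholders, and `build_register_request` and `register_schema` are hypothetical helper names.

```python
import json
import urllib.request


def build_register_request(base_url, subject, schema):
    """Build (but do not send) the HTTP request that registers a schema."""
    # The schema definition is embedded as a JSON string inside the JSON body.
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/subjects/{subject}/versions",
        data=body,
        method="POST",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )


def register_schema(base_url, subject, schema):
    """Send the registration request and return the schema ID from the response."""
    request = build_register_request(base_url, subject, schema)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["id"]
```

Calling `register_schema("http://localhost:8081", "users-value", {"type": "string"})` against a running registry would return the Schema ID assigned to that Schema.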
1.2. Message Serialization and Deserialization Process Using Schema
A Producer and a Consumer use a Schema as follows:
- The Producer requests a Schema registered in Kafka Schema Registry by its name (Subject) and Version, and receives the Schema together with its Schema ID.
- The Producer serializes the Message based on the received Schema and sends the serialized Message, along with the Schema ID, to the Kafka Topic.
- The Consumer receives the serialized Message and the Schema ID from the Kafka Topic.
- The Consumer requests the Schema from Kafka Schema Registry based on the received Schema ID.
- The Consumer deserializes the Message based on the received Schema.
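How the Schema ID travels "along with" the serialized Message can be sketched as follows. This assumes the framing used by the Confluent serializers: one magic byte with value 0, then the Schema ID as a 4-byte big-endian integer, then the serialized payload. The function names `frame` and `unframe` are hypothetical.

```python
import struct

MAGIC_BYTE = 0
# 1-byte magic marker followed by a 4-byte big-endian unsigned schema ID.
HEADER = struct.Struct(">bI")


def frame(schema_id, payload):
    """Producer side: prefix the serialized payload with the schema ID header."""
    return HEADER.pack(MAGIC_BYTE, schema_id) + payload


def unframe(message):
    """Consumer side: split a message into (schema_id, serialized payload)."""
    if len(message) < HEADER.size:
        raise ValueError("message is too short to contain a schema ID header")
    magic, schema_id = HEADER.unpack_from(message)
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER.size:]
```

The Consumer uses the extracted Schema ID to look up the Schema before it decodes the remaining bytes.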
[Code 1] shows Python example code in which the Producer requests a Schema, serializes a Message based on that Schema, and sends the serialized Message along with the Schema ID to a Kafka Topic. These steps can be implemented easily with the `confluent_kafka.schema_registry` Python Package.
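A minimal sketch of the Producer-side steps with the `confluent_kafka` package is given here. It assumes an Avro Schema already registered under the Subject `<topic>-value`, a registry at `http://localhost:8081`, and a broker at `localhost:9092`. The topic name, URLs, and the helpers `subject_for` and `produce_record` are illustrative assumptions, and constructor arguments may differ between package versions.

```python
def subject_for(topic):
    """Subject name for message values, assuming the '<topic>-value' convention."""
    return f"{topic}-value"


def produce_record(record, topic="users", version=1,
                   registry_url="http://localhost:8081",
                   bootstrap_servers="localhost:9092"):
    """Fetch a schema by subject and version, serialize a dict, and send it."""
    # Imported inside the function so the sketch loads without the client library.
    from confluent_kafka import Producer
    from confluent_kafka.schema_registry import SchemaRegistryClient
    from confluent_kafka.schema_registry.avro import AvroSerializer
    from confluent_kafka.serialization import MessageField, SerializationContext

    # 1. Request the schema by name and version; the reply carries its ID.
    registry = SchemaRegistryClient({"url": registry_url})
    registered = registry.get_version(subject_for(topic), version)

    # 2. Serialize with that schema; the serializer prepends the schema ID.
    serializer = AvroSerializer(
        schema_registry_client=registry,
        schema_str=registered.schema.schema_str,
        conf={"auto.register.schemas": False},
    )
    payload = serializer(record, SerializationContext(topic, MessageField.VALUE))

    # 3. Send the framed bytes to the topic.
    producer = Producer({"bootstrap.servers": bootstrap_servers})
    producer.produce(topic=topic, value=payload)
    producer.flush()
    return registered.schema_id
```

With a running broker and registry, `produce_record({"name": "alice"})` would send one record and return the Schema ID that was embedded in the Message, provided the registered Schema has a matching `name` field.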
[Code 2] shows Python example code in which the Consumer receives the serialized Message along with the Schema ID from a Kafka Topic, requests the Schema from Kafka Schema Registry based on that Schema ID, and deserializes the Message based on the returned Schema. This can likewise be implemented easily with the `confluent_kafka.schema_registry` Python Package.
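A minimal sketch of the Consumer-side steps with the `confluent_kafka` package is given here. It assumes Avro-encoded values, a registry at `http://localhost:8081`, and a broker at `localhost:9092`. The topic name, group ID, and the helpers `consumer_config` and `consume_one` are illustrative assumptions. The sketch also assumes a package version in which `AvroDeserializer` can be built without a reader Schema, so that it resolves the Schema from the ID embedded in each Message.

```python
def consumer_config(bootstrap_servers, group_id):
    """Consumer settings for this sketch; read from the earliest offset."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "auto.offset.reset": "earliest",
    }


def consume_one(topic="users", group_id="example-group",
                registry_url="http://localhost:8081",
                bootstrap_servers="localhost:9092", timeout=10.0):
    """Poll one message and return it deserialized, or None if nothing arrived."""
    # Imported inside the function so the sketch loads without the client library.
    from confluent_kafka import Consumer
    from confluent_kafka.schema_registry import SchemaRegistryClient
    from confluent_kafka.schema_registry.avro import AvroDeserializer
    from confluent_kafka.serialization import MessageField, SerializationContext

    # The deserializer reads the schema ID from each message and fetches
    # the matching schema from the registry.
    registry = SchemaRegistryClient({"url": registry_url})
    deserializer = AvroDeserializer(schema_registry_client=registry)

    consumer = Consumer(consumer_config(bootstrap_servers, group_id))
    consumer.subscribe([topic])
    try:
        message = consumer.poll(timeout)
        if message is None or message.error():
            return None
        context = SerializationContext(topic, MessageField.VALUE)
        return deserializer(message.value(), context)
    finally:
        consumer.close()
```

With a running broker and registry, `consume_one()` would return the next record as a Python dict, decoded with the Schema whose ID was carried in the Message.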
2. References
- Kafka Schema Registry : https://medium.com/@tlsrid1119/kafka-schema-registry-feat-confluent-example-cde8a276f76c