esthesis CORE - Documentation Help

Avro support

Introduction

Apache Avro is a widely used data serialization system developed by the Apache Software Foundation. It provides a compact, fast, and schema-based format for data exchange between different components of distributed systems. Avro focuses on efficient data storage and transmission, making it suitable for use cases where performance and interoperability are critical.

At its core, Avro uses a schema to define the structure of the data. The schema describes the fields, their types, and the relationships between them. Avro schemas are defined in JSON, which makes them easy to read, write, and process in different programming languages.
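To make the idea of a JSON-defined schema concrete, here is a minimal sketch of an Avro record schema loaded with Python's standard json module. The record name, namespace, and fields are hypothetical and chosen for illustration only; they are not the esthesis schemas described later in this page.

```python
import json

# Hypothetical schema for illustration only; NOT one of the esthesis .avsc files.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "SensorReading",
  "namespace": "example.avro",
  "fields": [
    {"name": "deviceId", "type": "string"},
    {"name": "temperature", "type": "double"},
    {"name": "seenAt", "type": ["null", "long"], "default": null}
  ]
}
"""

# An Avro schema is plain JSON, so any JSON parser can read it.
schema = json.loads(SCHEMA_JSON)


def field_types(record_schema: dict) -> dict:
    """Map each field name of a record schema to its declared Avro type."""
    return {f["name"]: f["type"] for f in record_schema["fields"]}


print(field_types(schema))
```

The union type ["null", "long"] with a null default is the usual Avro way to declare an optional field.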

Avro data is serialized in a binary format, which results in compact data representation and efficient storage. This binary format enables fast serialization and deserialization, making Avro suitable for high-performance applications and systems with stringent latency requirements.
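The compactness comes from the fact that the binary encoding carries no field names or delimiters; the schema supplies that information. The following sketch hand-rolls the Avro binary encoding rules for long, string, and double (zig-zag variable-length integers, length-prefixed UTF-8, and 8-byte little-endian IEEE 754) for a hypothetical two-field record, and compares the size with a JSON rendering of the same values. It is meant for illustration only; in practice an Avro library does this work, and the function names here are assumptions.

```python
import json
import struct


def encode_long(n: int) -> bytes:
    """Avro int/long: zig-zag encode, then write 7-bit groups, low-order first."""
    z = (n << 1) ^ (n >> 63)  # zig-zag maps small negatives to small positives
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Avro string: byte length as a long, followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_double(x: float) -> bytes:
    """Avro double: 8 bytes, little-endian IEEE 754."""
    return struct.pack("<d", x)


def encode_reading(device_id: str, temperature: float) -> bytes:
    """A record is the concatenation of its fields in schema order (hypothetical record)."""
    return encode_string(device_id) + encode_double(temperature)


binary = encode_reading("dev-1", 21.5)
text = json.dumps({"deviceId": "dev-1", "temperature": 21.5}).encode("utf-8")
print(len(binary), "bytes as Avro binary,", len(text), "bytes as JSON text")
```

The binary form of this record takes 14 bytes: one length byte, five bytes for the string, and eight for the double.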

Design

esthesis communicates with external sources using the esthesis Line Protocol (eLP). eLP is a simple, human-readable protocol that can be produced and parsed with minimal programming effort and resources, which makes it convenient for resource-constrained devices.

However, as a text-based protocol, eLP is not the most efficient format once data has arrived at esthesis and needs further processing by the various Dataflow components of esthesis CORE. For this reason, esthesis CORE uses Avro serialization internally.

Here is how it works:

[Diagram: an external source and the esthesis device agent send eLP to a data broker (MQTT, or any other broker). Inside esthesis CORE, dataflow receivers read the eLP from the broker and publish Avro to Kafka, from which Dataflow #1, Dataflow #2, and data persistence exchange Avro messages.]

As depicted above, external data sources send data to a data broker using eLP. The esthesis device agent uses MQTT; other custom external data sources may use any other protocol, provided a dataflow is available to receive the data.

Once the dataflow receiver receives the data, it parses it as eLP and converts it to Avro. The converted Avro data is then sent to a Kafka topic, where it is available for further processing by other dataflow components.

When a dataflow component needs to put a processed message back on Kafka for another dataflow to pick up, it also uses Avro.
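The receive, parse, convert, and publish sequence described above can be sketched roughly as follows. This is an illustration under loud assumptions: the line shape "category name=value[,name=value...] [timestamp]" is assumed here and is not the authoritative eLP grammar, the Reading class is a hypothetical stand-in for the real Avro record, and a plain Python list stands in for a Kafka topic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    """Hypothetical structured record; the real structure is defined by the .avsc schemas."""
    category: str
    values: dict
    timestamp: Optional[str] = None


def parse_line(line: str) -> Reading:
    """Parse one text line of the ASSUMED shape: category k=v[,k=v...] [timestamp]."""
    parts = line.strip().split(" ")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected line: {line!r}")
    category, pairs = parts[0], parts[1]
    # Values are kept as strings; type handling is out of scope for this sketch.
    values = dict(pair.split("=", 1) for pair in pairs.split(","))
    timestamp = parts[2] if len(parts) == 3 else None
    return Reading(category, values, timestamp)


def receive(line: str, topic: list) -> None:
    """Stand-in for a dataflow receiver: parse the text, then hand off a structured record.

    A real receiver would Avro-encode the record and publish it to a Kafka topic.
    """
    topic.append(parse_line(line))


topic = []  # stand-in for a Kafka topic
receive("env temperature=21.5,humidity=40 2025-05-06T10:00:00Z", topic)
print(topic[0])
```

The point of the split is that text parsing happens once, at the edge; every downstream component then works with structured records rather than re-parsing text.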

Specification

esthesis uses three different Avro schemas, presented below.

esthesis Data Message

The Avro schema used to serialize data received from external sources. It is defined in the esthesis-data-message.avsc file.

esthesis Command Request Message

The Avro schema used to serialize command request messages. It is defined in the esthesis-command-request-message.avsc file.

esthesis Command Reply Message

The Avro schema used to serialize command reply messages. It is defined in the esthesis-command-reply-message.avsc file.

Helpers

An eLP-to-Avro helper, used by our own Dataflows, is available in AvroUtils. If you are creating a new Dataflow, you can use this helper to convert eLP to Avro and save yourself some time.

Last modified: 06 May 2025