Telemetry Consumers
Technical Integration Guide
This guide is intended for those who operate telemetry processing applications that need to consume data from the Data Interchange. Before proceeding with integration, you should have credentials that allow you to log in to the Data Interchange web interface.
As a short overview of the process: the Data Interchange publishes telemetry data via the Apache Kafka messaging protocol. To consume data, applications authenticate against the Data Interchange using API key pairs, and access is restricted to the topics that the Data Owner has shared with the Telemetry Consumer. For data privacy, all connections to the Data Interchange are encrypted with SSL.
Each message retrieved from Kafka includes a required set of Headers and a message Body; a message may also include one or more optional Headers. For additional detail on message Headers, review the Message Headers section. With respect to the message Body, the Data Interchange does not publish (or enforce) standards on what types of data are exchanged. It is therefore the responsibility of Telemetry Providers to document the structure and details of any data that they publish to the Data Interchange, and to publish these details to the Data Interchange via a Data Dictionary at a regular interval.
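As an illustration, the following minimal Java sketch (using the standard Apache Kafka client library) prints the Headers and Body of a single record. The assumption that the Body is UTF-8 text is illustrative only; the actual format is whatever the Telemetry Provider documents in its Data Dictionary.

import java.nio.charset.StandardCharsets;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.header.Header;

public class RecordPrinter {
    // Dump the Headers and Body of one record pulled from the Data Interchange.
    // Header names vary by Telemetry Provider, so this prints whatever is
    // present rather than assuming specific keys.
    public static void print(ConsumerRecord<String, byte[]> record) {
        for (Header header : record.headers()) {
            System.out.printf("header %s = %s%n",
                    header.key(), new String(header.value(), StandardCharsets.UTF_8));
        }
        // The Data Interchange does not mandate a Body format; this sketch
        // assumes the Telemetry Provider publishes UTF-8 text such as JSON.
        System.out.println("body: " + new String(record.value(), StandardCharsets.UTF_8));
    }
}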
Key concepts
Telemetry Consumers will connect to one or more Kafka topics to pull data from the Data Interchange. A topic is a named location where data is published and stored. Data is stored in an immutable, array-like data structure that is split into multiple pieces called partitions. A consumer pulls data from partitions by specifying an offset to read from. Offsets are usually stored in Kafka and are associated with a consumer group. Kafka manages which partitions and offsets are assigned to each consumer in the consumer group. A much more detailed description of these concepts is available at Confluent Intro to Kafka.
The group id that a Telemetry Consumer can use when connecting as a consumer to Kafka is restricted by a namespace prefix. For example, if the Telemetry Consumer is assigned the prefix dx.consumer.my-consumer., then it can join with a group id that begins with that prefix, such as dx.consumer.my-consumer.data-warehouse.
Consumer groups allow for parallel processing by assigning a subset of partitions to each consumer in a consumer group. They also enable recording which offsets have already been consumed (which is important when a consumer rejoins to consume from a topic). Within a consumer group, each partition is assigned to at most one consumer, so two consumers in the same group never read from the same partition in parallel; if there are more consumers than partitions, the extra consumers remain idle until a rebalance assigns them work.
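To observe this assignment behavior, a consumer can register a rebalance listener when subscribing. A minimal Java sketch, with the group id chosen here as a hypothetical example under the assigned prefix:

import java.time.Duration;
import java.util.Collection;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class AssignmentWatcher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "{{ BROKER_ENDPOINT }}");
        // SASL_SSL settings omitted for brevity; the full set appears in the
        // dedicated consumer sketch later in this guide.
        props.put("group.id", "dx.consumer.my-consumer.assignment-watcher"); // must begin with the assigned prefix
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("{{ TOPIC_NAME }}"), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    // Each partition listed here is owned exclusively by this
                    // consumer until the next rebalance.
                    System.out.println("assigned: " + partitions);
                }

                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    System.out.println("revoked: " + partitions);
                }
            });
            while (true) {
                consumer.poll(Duration.ofSeconds(1)); // rebalance callbacks fire inside poll()
            }
        }
    }
}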
Creating an API key
Telemetry Consumers can create an API key by logging into the Data Interchange and visiting the Organization section. A managing agent can use the Add Key button to generate a new API key. The API secret is not stored by the Data Interchange, so make sure to save its value somewhere secure before accepting the new key. If new credentials are needed, a managing agent can use the Replace Key button to generate a new key. A Telemetry Consumer can only have one active key at a time.
Consuming telemetry data from the console
During the integration process, it is very helpful to establish connectivity to the Data Interchange from your workstation's console or command line. The following steps will enable you to consume telemetry data from the Data Interchange from the command line.
1. Install kafkacat or use a Docker image.
2. Log in to the Data Interchange, navigate to the Organization section, and identify the BROKER_ENDPOINT, API_KEY, and API_SECRET, if they are not already known.
3. Create a config.properties file with the following content in a secure location on your workstation:
bootstrap.servers={{ BROKER_ENDPOINT }}
ssl.endpoint.identification.algorithm=https
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.username={{ API_KEY }}
sasl.password={{ API_SECRET }}
4. Log in to the Data Interchange, navigate to the Destinations section, and look up the destination TOPIC_NAME from which your organization is authorized to consume data.
5. Use the following command to consume telemetry messages from the Data Interchange:
$ kafkacat -F config.properties -C -t {{ TOPIC_NAME }} -o beginning -c 10 -e
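The -C flag selects consumer mode, and -o beginning -c 10 -e reads the first ten messages and then exits, without committing any offsets. To exercise consumer-group behavior from the console as well, kafkacat's high-level consumer mode (-G) can be used; a sketch, assuming the same config.properties file and a hypothetical group id under your assigned prefix:

$ kafkacat -F config.properties -G dx.consumer.my-consumer.console-test {{ TOPIC_NAME }}

In this mode kafkacat joins the consumer group, receives a partition assignment, and commits offsets, so re-running the command resumes from where the previous run stopped.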
Consuming telemetry data from a dedicated consumer application
For ongoing operation, the best practice is to run a dedicated application to consume telemetry data from the Data Interchange. The Kafka message protocol is supported in a wide variety of languages (see: Kafka clients and Code Examples). In addition, New Horizon has developed a Java reference consumer (TBA) to provide a quickstart option for new Telemetry Consumer organizations.
Consuming with a dedicated client is similar to the console example above, with the additional requirement of a group id, which allows Kafka to store offsets for the group and to manage its consumers. The group id must begin with the GROUP_PREFIX, which is found on the Telemetry Consumer Organization view.
If multiple instances of the consumer application are reading from the same topic, they should all join with the same group id. It is recommended to append a name to the end of the prefix to support multiple groups. For example, if the GROUP_PREFIX were dx.consumer.widgetmaker., then all instances of app1 might join with a group id of dx.consumer.widgetmaker.app1 and any instances of app2 might join with dx.consumer.widgetmaker.app2.
Keep in mind that a consumer will usually fetch more than one message from Kafka at a time. It is also recommended to keep the consumer application's processing loop quick and simple. For example, a consumer application might immediately store the Kafka messages in an object store or database and then acknowledge that the messages were consumed, as in the sketch below.
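Putting these recommendations together, here is a minimal Java sketch of such a processing loop. Note that the Java client expresses the SASL credentials via sasl.jaas.config rather than the sasl.username and sasl.password properties used by kafkacat above; the group id follows the earlier widgetmaker example, and the storeRawMessage helper is hypothetical.

import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class TelemetryConsumerApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "{{ BROKER_ENDPOINT }}");
        props.put("ssl.endpoint.identification.algorithm", "https");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"{{ API_KEY }}\" password=\"{{ API_SECRET }}\";");
        // Group id must begin with the GROUP_PREFIX from the Organization view.
        props.put("group.id", "dx.consumer.widgetmaker.app1");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("enable.auto.commit", "false"); // commit only after messages are safely stored

        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("{{ TOPIC_NAME }}"));
            while (true) {
                // poll() typically returns a batch of records, not just one.
                ConsumerRecords<String, byte[]> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, byte[]> record : records) {
                    storeRawMessage(record); // keep the loop quick: persist first, process later
                }
                consumer.commitSync(); // acknowledge the batch once it is stored
            }
        }
    }

    // Hypothetical helper: write the raw message to an object store or database.
    private static void storeRawMessage(ConsumerRecord<String, byte[]> record) {
        // ... implementation depends on your storage system ...
    }
}

Disabling auto-commit and calling commitSync() only after the batch is stored gives at-least-once delivery: if the application crashes, any uncommitted messages are fetched again on restart.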
It is beyond the scope of this document to provide detailed integration guides for each different language and client.