EnOS™ Stream Analytics targets to meet the real-time data processing requirements of devices and assets, and also the requirements of processing data that is integrated from the offline message channel. Powered by Apache Spark™ Streaming with Envision customization and optimization, the EnOS Stream Analytics engine offers high scalability, high throughput, and high fault-tolerance.
Broadly speaking, the generation of data can be considered as a series of discrete events. When drawing these discrete events on a time axis, an event stream or data stream is formed. Stream data consist of these endless event streams. Stream data is generated continuously by a lot of data sources. However, the size of the stream data is normally smaller than that of the offline data. The common sources of stream data can be the devices connected to a data center, the telemetry data of devices, and the log files generated by mobile or web applications.
EnOS Stream Analytics can be used in the following scenarios:
- Aggregating and calculating asset raw data
- In most business scenarios, you might need to filter the raw data received from devices, aggregate the data by a certain algorithm, and save the aggregated data for further processing.
- Computation of device states
- In some business scenarios, you might need to obtain certain state parameters of a device to confirm its status. The stream analytics framework maintains the state of devices and sites internally. The system updates both the measuring point values and the device connection status to the latest state.
- Real-time and unbounded stream data processing
- The computation the engine processes is real-time and streaming, and the data streams are subscribed and consumed by stream computing in chronological order. Because the data is generated continuously, the data streams are integrated to the streaming system continuously. For example, website access log is a type of stream data, the log continuously records data as long as the website is on line. Thus, the stream data is always real-time and unbounded.
- Continuous and efficient computation
- The computation models of stream computing are “event triggered”. The trigger is the unbounded stream data mentioned in the previous section. Once new stream data is sent to the system, the system immediately initiates and performs a computation task. Therefore, stream computing is a continuous process.
- Stream and real-time data integration
- The result of stream computing triggered by stream data is recorded directly into the destination data storage. For example, the data can be directly written into the time series database (TSDB) for report rendering. Therefore, the computing results of the stream data are continuously recorded into the target data storage.
Data Processing Flow¶
The procedure of EnOS Stream Analytics is as follows:
- Processing of raw data
- Original measuring point data is sent to Kafka through the EnOS connection layer. The messages received are analyzed by the stream computing process of measuring points. Before processing, the data is filtered by the specified threshold. Data exceeding the threshold will be processed by interpolation algorithm.
- Performing calculation
- In this step, data is computed by the defined algorithm in the processing strategy.
- Output of computation
- The data from the streaming module flows into In-memory Database (IMDB) and Kafka, and the downstream continues to subscribe all data from Kafka and record them to time series database (TSDB) or other storage systems. The stored data can be used as the data source for offline data processing and further processing and analysis.
Data Processing Templates¶
EnOS Stream Analytics provides the following template for developers to configure stream data processing jobs quickly: