A New IoT Data Store


Restricting developers and data engineers to preconceived notions of how data should be shaped only stifles innovation. Data warehouses should be adaptable and able to handle dynamic schemas, and it should be easy to manage data from multiple devices that may be unrelated to one another. New data from devices should be available immediately, without extensive design and implementation lead times.


The Issue


The underlying issue was IT's limited ability to turn around schema changes efficiently. Once this was identified, a solution could be designed that met the needs of the business more effectively.

 

One particular organization had decades of hardware generations, with some years producing more than 25 different device models. While the company had made an effort to standardize a specific set of schema elements, many devices also required unique elements, and the older IoT devices had been built before the standardization effort began. This prevented the company from having a single unified system for analytics across multiple devices and seriously limited experimental device data collection. As their competency improved, business analysts and data engineers could reconfigure devices to send different data payloads faster than the IT department could roll out new schemas across its systems.

The implementation focused on the need for dynamic schemas for current and future devices and was designed around NoSQL technology. Using NoSQL allowed each record to carry its own schema, tailored to the payload delivered. This allowed the effort to concentrate on a generic ingestion system that could still support legacy devices, and on developing field-mapping techniques that enable any payload to be stored.
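As a rough illustration of the idea, the sketch below shows generic ingestion with field mapping in Python. It is not the organization's actual implementation: the field names, device model names, the LEGACY_FIELD_MAP table, and the in-memory DocumentStore class (a stand-in for a real NoSQL collection) are all assumptions made for the example.

```python
from typing import Any, Dict, List

# Hypothetical mapping from legacy field names to standardized names.
LEGACY_FIELD_MAP: Dict[str, str] = {
    "dev": "device_id",
    "ts": "timestamp",
    "tmp": "temperature_c",
}


def normalize_payload(payload: Dict[str, Any], field_map: Dict[str, str]) -> Dict[str, Any]:
    """Rename known legacy fields and keep every other field unchanged."""
    return {field_map.get(key, key): value for key, value in payload.items()}


class DocumentStore:
    """In-memory stand-in for a NoSQL collection (assumption, not a real database)."""

    def __init__(self) -> None:
        self.records: List[Dict[str, Any]] = []

    def insert(self, record: Dict[str, Any]) -> None:
        # No schema check: each record carries whatever fields its payload had.
        self.records.append(record)


def ingest(store: DocumentStore, model: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Tag the payload with its device model, map legacy fields, and store it."""
    record = {"model": model, **normalize_payload(payload, LEGACY_FIELD_MAP)}
    store.insert(record)
    return record


store = DocumentStore()

# A legacy device using pre-standardization field names.
legacy = ingest(store, "gen1-sensor", {"dev": "A17", "ts": 1700000000, "tmp": 21.5})

# A newer device sending standardized fields plus one unique, experimental field.
modern = ingest(
    store,
    "gen7-sensor",
    {"device_id": "B42", "timestamp": 1700000060, "temperature_c": 22.1, "vibration_hz": 50},
)
```

In this sketch, adding a new field to a device payload requires no change to the store. Only the mapping table needs an entry, and only when a legacy name should be aligned with a standardized one.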

The Result

By moving responsibility for data comprehension and schema to the business owners and data engineers, the IT department was able to relieve itself of labor-intensive schema design. This paved the way for handling many different IoT device types with different schemas, and for a much faster turnaround on new data collection, since amendments no longer required IT involvement. Analysts could now query multiple generations of devices in a unified way. Data engineers who customized data payloads no longer needed IT to design a schema simply to store a device-specific payload. Innovation could proceed rapidly and regularly, as designs could be tested at will, without delay. Cross-cutting analytics that had been virtually impossible because of disparate storage systems were now possible with the unified datastore.
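To make the idea of unified querying concrete, here is a minimal Python sketch of a cross-generation query over schemaless records. The sample records, field names, and the helper functions find and average are hypothetical, and a plain list stands in for the unified datastore.

```python
from typing import Any, Dict, List, Optional

# Hypothetical records from different device generations, stored side by side.
# Each record has its own set of fields; only some fields are shared.
records: List[Dict[str, Any]] = [
    {"model": "gen1-sensor", "device_id": "A17", "temperature_c": 20.0},
    {"model": "gen7-sensor", "device_id": "B42", "temperature_c": 22.0, "vibration_hz": 50},
    {"model": "gen7-valve", "device_id": "C03", "pressure_kpa": 101.3},
]


def find(items: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Return the records whose fields match every given criterion."""
    return [r for r in items if all(r.get(k) == v for k, v in criteria.items())]


def average(items: List[Dict[str, Any]], field: str) -> Optional[float]:
    """Average a field over the records that have it; skip records that do not."""
    values = [r[field] for r in items if field in r]
    return sum(values) / len(values) if values else None


# One query spans every generation that reports the field.
mean_temperature = average(records, "temperature_c")

# Filtering by model works the same way for old and new devices.
gen7_sensors = find(records, model="gen7-sensor")
```

Records that lack a field are skipped rather than rejected, which is what lets one query cover devices with different schemas.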

Regardless of its form, data should be storable for later use, and schemas can be created dynamically, allowing for wide-ranging ingestion. By seeking to understand the root processes that inhibit innovation, it is possible to develop technology systems that remove the hindrance and then focus on the business solution to the perceived problems.

Understanding and redefining who is responsible for each element of the process can make previously unviable solutions possible. Novel thinking led to a system in which business users could make changes and receive immediate feedback, eliminating the long lead times inherent in schema changes made through IT.