Last week in New York City I attended Tech In Motion's panel on IoT trends. The panel featured Ed Maguire, Senior Analyst & Managing Director at CLSA; Jamyn Edis, founder and CEO of Dash; Ted Ullrich, founder of Tomorrow Lab; and Thomas Gilley, founder and CTO of wot.io. It was a great event, held in the Verizon building in New York's Financial District.
The panel covered a wide variety of IoT topics. One of my questions was about the sheer volume of data streams coming in, and that is where the discussion turned to edge processing.
This week we are in Las Vegas at the SAP TechEd 2015 conference. In Steve Lucas' keynote, as well as in the majority of the IoT and HANA Cloud Platform presentations, edge processing has come up again (and was shown in the connected charging station demonstration). It's a recurring theme across discussions such as these, and for good reason.
The main point is that the number of data streams coming into an enterprise has increased dramatically: sensors are now cheap enough that we are ingesting data at unprecedented volumes and speeds. The effect?
This creates new challenges for enterprises everywhere. For a few years now, much of the strategy was simply to create a Hadoop-based "data lake" and throw all the data in there. Figuring out what to do with it quickly became an afterthought. Much of the time, though, the data is simply telling us that things are okay. The engines are running fine. The freezer didn't break. The bank transaction is legitimate. There may be data science reasons to keep all that "I'm okay" information on-premise, but assuming that we will automatically throw it into Hadoop and deal with it later isn't necessarily the wisest choice. It creates a lot of excess data that we might not need, which increases management and processing complexity and costs. Much in a similar vein to the movie Jurassic Park...
This is the kind of situation where edge processing comes into play. The idea is to go beyond connected devices that merely stream data back to the event handling platform (such as SAP Smart Data Streaming (SDS), SAP HANA Cloud Platform, etc.) and give them the ability to perform complex event processing (CEP) tasks right on the devices themselves. This is a feature of SAP HANA SPS10 called Streaming Lite.
Streaming Lite is a tiny, lightweight CEP engine based on SDS, designed to run on a small Linux machine (RHEL) or a Raspberry Pi (currently the two supported platforms). The key here is that it runs as a standalone Linux process on these smaller machines and can execute the same CEP tasks that SDS does. The purpose is to perform time-based queries and filter out the "I'm okay" messages (or any other events deemed unnecessary to send back to the main platform) directly at the source, instead of having to either write custom code to do so or send all data back up to the platform. Because it executes the same scripts as SDS, you can also leverage the same skill set without introducing additional complexity to your software development lifecycle.
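To make the idea of filtering at the source concrete, here is a minimal sketch in plain Python. It is not Streaming Lite code and does not use any SAP API; it only illustrates the general pattern of a time-based window query that suppresses "I'm okay" readings and forwards the exceptions. The freezer scenario, the Reading type, the edge_filter function, and the threshold and window values are all hypothetical names and numbers chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Reading:
    """One hypothetical sensor event from a connected freezer."""
    timestamp: float       # seconds, on any monotonically increasing clock
    device_id: str
    temperature_c: float


def edge_filter(
    readings: Iterable[Reading],
    window_seconds: float = 60.0,   # assumed window length
    max_avg_c: float = -15.0,       # assumed alert threshold
) -> Iterator[Tuple[Reading, float]]:
    """Yield (reading, rolling_average) only when the average temperature
    over the last `window_seconds` rises above `max_avg_c`.

    Everything else is an "I'm okay" event and is dropped on the device.
    """
    window: Deque[Reading] = deque()
    for reading in readings:
        window.append(reading)
        # Evict readings that have fallen out of the time window.
        while reading.timestamp - window[0].timestamp > window_seconds:
            window.popleft()
        average = sum(r.temperature_c for r in window) / len(window)
        if average > max_avg_c:
            yield reading, average


if __name__ == "__main__":
    stream = [
        Reading(0.0, "freezer-1", -20.0),
        Reading(10.0, "freezer-1", -20.0),
        Reading(20.0, "freezer-1", -5.0),
        Reading(30.0, "freezer-1", -5.0),
        Reading(100.0, "freezer-1", -20.0),
    ]
    for reading, average in edge_filter(stream):
        # In a real deployment this would be sent upstream to the platform.
        print(reading.device_id, reading.timestamp, average)
```

With this sample stream, only one of the five readings would be sent upstream; the other four never leave the device. In Streaming Lite the equivalent logic would be expressed in the same scripts that SDS runs rather than in custom code like this, which is exactly the custom-code burden the product is meant to remove.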
I think it's important to keep all this in mind when designing any sort of IoT project. The SAP HANA platform in conjunction with Hadoop gives customers the flexibility to store every single event if there is a business need to do so. On the other side of the spectrum, if the business requirements point towards edge processing, customers can use the SAP platform to achieve that goal as well, within a unified environment.