Who’s building your IoT data pipeline?

IoT data pipelines require labor too

There is a lot of excitement around IoT systems for companies, institutions, and governments that sense, aggregate, and publish useful, previously unknown information via dashboards, visualizations, and reporting. In particular, there has been much focus on the IoT endpoints that sense energy and environmental quantities and qualities in a multitude of ways. Similarly, everybody and their brother has a dashboard to sell you. And while visualization tools are not yet as numerous, they too are growing in number and availability. What is easy to miss in IoT sensor and reporting deployments, though, is the pipeline of data collection, aggregation, processing, and reporting/visualization. How does the data get from all of those sensors, through the processing steps along the way, to the point where it can be reported or visualized? This part can be more difficult than it initially appears.

Getting from the many Point A’s to Point B

While ‘big data’, analysis, and visualization have been around for a few years and continue to evolve, the new possibilities brought on by IoT sensing devices have been the more recent news. Getting an individual’s IoT data from the point of sensing to a dashboard or similar reporting tool is generally not an issue. This is because data is coming from only one (or a few) sensing points and will (in theory) only be used by that individual. However, for companies, institutions, and governments that seek to leverage IoT sensing and aggregating systems to bring about increased operating effectiveness and ultimately cost savings, this is not a trivial task. Reliably and continuously collecting data from hundreds, thousands, or more points is not a slam dunk.

Companies and governments typically have pre-existing network and computing infrastructure systems. For these new IoT systems to work, the many sensing devices (IoT endpoints) need to be installed by competent and trusted professionals. Further, that data needs to be collected and aggregated by a system or device that knows how to talk to the endpoints. After that (possibly before), the data needs to be processed to handle sensing anomalies across the array of sensors. From there, the creation of operational/system health data and indicators is highly desirable so that the new IoT system can be monitored and maintained. Finally, data analysis and massaging are completed so that the data can be published in a dashboard, report, or visualization. The question is, who does this work? Who connects these dots and maintains that connection?
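To make those stages concrete, here is a minimal sketch in Python of the pipeline described above: collect, clean up sensing anomalies, derive health indicators, and summarize for a dashboard. Everything in it is an assumption for illustration (the `Reading` type, the function names, the plausible-range thresholds, and the in-memory list standing in for real endpoints). A real deployment would replace each function with something far more involved, which is exactly where the labor goes.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional


@dataclass
class Reading:
    sensor_id: str
    value: Optional[float]  # None models a sensor that failed to report


def collect(raw):
    """Stage 1: gather readings from the endpoints (here, a list of tuples)."""
    return [Reading(sensor_id, value) for sensor_id, value in raw]


def clean(readings, low, high):
    """Stage 2: drop missing readings and values outside a plausible range."""
    return [r for r in readings if r.value is not None and low <= r.value <= high]


def health(readings, cleaned):
    """Stage 3: derive health indicators for the sensing system itself."""
    all_ids = {r.sensor_id for r in readings}
    ok_ids = {r.sensor_id for r in cleaned}
    return {
        "sensors_total": len(all_ids),
        "sensors_reporting_ok": len(ok_ids),
        "sensors_needing_attention": sorted(all_ids - ok_ids),
    }


def summarize(cleaned):
    """Stage 4: reduce the cleaned data to what a dashboard would show."""
    values = [r.value for r in cleaned]
    return {"count": len(values), "median": median(values) if values else None}


def run_pipeline(raw, low=-40.0, high=60.0):
    # The default range is a hypothetical one for temperature sensors in Celsius.
    readings = collect(raw)
    cleaned = clean(readings, low, high)
    return {"health": health(readings, cleaned), "summary": summarize(cleaned)}


if __name__ == "__main__":
    sample = [("t1", 21.5), ("t2", None), ("t3", 999.0), ("t4", 22.5)]
    print(run_pipeline(sample))
```

Even in this toy version, two of the four sensors end up flagged as needing attention, and somebody has to go and look at them.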

who supplies the labor to build the pipeline?

The supplier of the IoT endpoint devices won’t be the one to build this data pipeline. Similarly, the provider of the visualization/reporting technology won’t build the pipeline. Building and maintaining that data pipeline will probably default to the company or government that is purchasing the new IoT technology. In turn, meeting that additional demand means the labor will need to be contracted out or diverted from internal resources, both of which incur costs, whether direct costs or opportunity costs.

Patching IoT endpoint devices – Surprise! It probably won’t get done

Additional effects of implementing large numbers of IoT devices and maintaining the health of the same include:

  1. the requirement to patch the devices, or
  2. accept the risk of unpatched devices, or
  3. some hybrid of the two.

Unless the IoT endpoint vendor supplies the ability to automatically patch endpoint devices in a way that works on your network, those devices probably won’t get patched.  From a risk management point of view, probably the best approach is to identify the highest risk endpoint devices and try to keep those patched. But I suspect even that will be difficult to accomplish.
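As a rough illustration of "identify the highest risk endpoint devices and try to keep those patched," here is a small Python sketch that ranks endpoints by a toy risk score and returns the few that fit a limited patching budget. The `Endpoint` fields, the weights, and the function names are all assumptions made up for this example. They are not a standard scoring method, and a real risk assessment would weigh many more factors.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    internet_facing: bool       # reachable from outside the network?
    days_since_patch: int       # age of the installed firmware
    touches_critical_net: bool  # shares a segment with critical systems?


def risk_score(ep):
    """Toy score; the weights are illustrative assumptions, not a standard."""
    score = 0
    score += 3 if ep.internet_facing else 0
    score += 3 if ep.touches_critical_net else 0
    # One point per full 90 days unpatched, capped at 4.
    score += min(ep.days_since_patch // 90, 4)
    return score


def patch_queue(endpoints, budget):
    """Return the names of the `budget` highest-risk endpoints, riskiest first."""
    # Ties are broken alphabetically so the ordering is deterministic.
    ranked = sorted(endpoints, key=lambda ep: (-risk_score(ep), ep.name))
    return [ep.name for ep in ranked[:budget]]


if __name__ == "__main__":
    fleet = [
        Endpoint("camera", True, 400, True),
        Endpoint("thermostat", False, 30, False),
        Endpoint("printer", True, 100, False),
        Endpoint("badge-reader", False, 200, True),
    ]
    print(patch_queue(fleet, budget=2))
```

The point of the `budget` parameter is that the labor is finite: with hundreds or thousands of devices, only the top of this list is likely to get attention.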

Also, as endpoint devices become increasingly complicated and have richer feature sets to remain competitive, they have increased ability to do more damage to your network and assets or those of others on the Internet. Any one of the above options increases cost to the organization and yet that cost typically goes unseen.

Labor leaks

Anticipating labor requirements along the IoT data pipeline is critical for IoT system success. However, this part is often not seen and leaks away. We tend to get caught up in the IoT devices at the beginning of the data pipeline and the fancy dashboards at the end and forget about the hard work of building a quality pipeline in the middle. In our defense, this is where the bulk of the marketing and sales efforts are — at each end of the pipeline. This effort of building a reliable, secure, and continuous data pipeline is a part of the socket concept that I mentioned in an earlier post, Systems in the Seam – Shortcomings in IoT systems implementation.

With rapidly evolving new sensing technologies and new ways to integrate and represent data, we are in an exciting time and have the potential to be more productive, profitable, safe, and efficient. However, it’s not all magical — the need for labor has not gone away. The requirement to connect the dots, the devices and points along the pipeline, is still there. If we don’t capture this component, our ROI from IoT systems investments will never be what we hoped it would be.


Side effect of IoT growth – more attack platforms


Rapid growth brings many good things, but also drives how we manage risk. [Image: theconnectivist.com http://bit.ly/1owv1dp]

The rapid growth of the Internet of Things (IoT) phenomenon, along with its corresponding rapid growth in device count, has been the talk of the town over the past year or so. While IoT promises many good things, more conversation is being directed toward the risk brought about by the Internet of Things. Often this takes the form of warnings that someone will hack your web cams, steal your FitBit health information, hijack your routers and printers, or monkey with your thermostat remotely. While these are all important risks and concerns, I think that the bigger IoT risk has more to do with the sheer number of devices.

IoT devices as attack enablers

In all of the hoopla and coolness and excitement of the Internet of Things, we can sometimes forget the subtle and amazing underlying fact that these are all networked computing devices, many with well-known and well-understood operating systems. So, for a moment, forget that cool thing that the IoT device does in its local environment (capture video, audio, biometric authentication information, health information, temperature, humidity, refrigerator status, air composition, etc.) and just remember that they are networked computing devices, many of them with substantial computing resources.

What this means is that IoT devices are not just targets themselves, but can also act as attack enablers or attack platforms. This can occur via direct hack or by unwitting participation in a botnet.


Baku-Tbilisi-Ceyhan (BTC) pipeline near the eastern Turkish city of Erzincan on Aug. 7, 2008.

From this recent analysis of a 2008 Turkish pipeline hack and sabotage:

“As investigators followed the trail of the failed alarm system, they found the hackers’ point of entry was an unexpected one: the surveillance cameras themselves.

The cameras’ communication software had vulnerabilities the hackers used to gain entry and move deep into the internal network, according to the people briefed on the matter.

Once inside, the attackers found a computer running on a Windows operating system that was in charge of the alarm-management network, and placed a malicious program on it. That gave them the ability to sneak back in whenever they wanted.”

So, the networked computing presence of the cameras themselves was used as a stepping stone (aka attack point) into the larger network. Vulnerabilities in the cameras’ communication software provided a point of entry (‘vector’ in geek speak) into the pipeline’s operational network.

Big numbers

So, if we look at the growth in the number of IoT devices and consider them, for now, only as networked computing devices capable of being compromised, that’s a lot of new stepping stones for attacks.

This growing number of devices can enable & assist attacks by:

1) providing many more attack platforms, which …
2) provide more opportunities for indirection in attack, which …
3) make attribution more difficult

Let’s get transitive – Kauffman’s buttons

At the risk of being a little bit tangential, all this reminds me of another network phenomenon that I believe occurs with botnets. It is one that is exacerbated by the rapid increase in networked computing nodes, e.g. from IoT growth, and it has to do with how quickly the character of a network can change under fairly simple conditions.

I’ve always been intrigued with this ‘toy problem’ that Stuart Kauffman describes in his book, At Home in the Universe. He says to imagine that you have a bunch of buttons on the floor and some pieces of thread. You arbitrarily pick two buttons and then connect them with a piece of thread, a button at each end. Then you arbitrarily pick two more buttons and connect those two. (The original buttons are not excluded; they are still contenders.) Keep doing this. While doing so, create a graph and plot the ratio of threads to buttons on the X axis and the size of the largest cluster on the Y axis.


Not too much happens at first. Early on, the largest button cluster stays pretty small. Then, at a certain point, the size of the largest cluster leaps. Logically, it’s not surprising. You can see how it happens. However, I still find myself staring at that big jump. That’s a real phase change for at least one aspect of that button network.


Quite a leap — https://keychests.com/media/bigdisk/pdf/16096.pdf
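If you would rather watch the leap than take it on faith, the button experiment is easy to simulate. The Python sketch below is my own minimal take on it, not Kauffman's code: it ties random pairs of buttons together and tracks the largest cluster with a union-find structure. The function name, the parameters, and the choice of union-find are assumptions for illustration. For random pairings like this, the jump is known to set in at around 0.5 threads per button.

```python
import random


def largest_cluster_curve(num_buttons, num_threads, seed=0):
    """Tie random pairs of buttons together and track the largest cluster.

    Returns a list of (threads_per_button, largest_cluster_size) points,
    one per thread added.
    """
    rng = random.Random(seed)
    parent = list(range(num_buttons))  # union-find forest: each button starts alone
    size = [1] * num_buttons           # cluster size, valid at each cluster root
    largest = 1
    curve = []

    def find(x):
        # Follow parent links to the cluster root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for t in range(1, num_threads + 1):
        a, b = rng.sample(range(num_buttons), 2)  # two distinct buttons
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Merge the smaller cluster into the larger one.
            if size[root_a] < size[root_b]:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a
            size[root_a] += size[root_b]
            largest = max(largest, size[root_a])
        curve.append((t / num_buttons, largest))
    return curve


if __name__ == "__main__":
    n = 10_000
    curve = largest_cluster_curve(n, n)
    for ratio in (0.25, 0.5, 0.75, 1.0):
        threads = int(ratio * n)
        print(f"threads/buttons = {ratio:.2f}  largest cluster = {curve[threads - 1][1]}")
```

Running it with different seeds changes the exact numbers, but the shape stays the same: a largest cluster that stays tiny at a ratio of 0.25 and covers most of the buttons by a ratio of 1.0.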


I think a similar thing happens with some botnets, particularly P2P botnets, as they grow in size. We can make the reasonable assumption that some botnet sizes are more effective than others at carrying out their varied nefarious tasks, e.g. 1,000 bots are probably better than 10. While individual bots in botnets do not connect to all of the other bots on the network, they do connect to many.

IoT growth => More buttons

In this environment, I think Kauffman’s toy problem still applies. Namely, at some point, the largest cluster size grows very rapidly. Maybe not with the near-vertical drama of Kauffman’s problem where everything can be connected, but still with a significant acceleration in growth of the largest cluster once a critical point is reached. And if the largest cluster size suddenly meets or exceeds that putative optimal botnet size, well then, we’ve got ourselves an effective botnet.

So if the rapid growth in IoT provides many more buttons, then there are also many more buttons/potential botnet participants for the network. And the fact that these botnets can fairly suddenly (aka seemingly arbitrarily) reach their optimal effectiveness adds another air of uncertainty and difficult-to-predictness to the whole thing.

Not gloom & doom, but evolving risk picture

The sky is not falling and the Internet of Things holds much promise, but the way we look at risk will need to change. The advent and rapid growth of the Internet of Things will change some of the math on the Internet. More botnets will come online and they will do so in unpredictable ways. I’m not saying the end is near, but rather the way we look at risk will have to change.