Tag Archives: ics

FTC IoT guideline describes complexity, nuance of IoT

FTC IoT development guidelines http://1.usa.gov/1LeGOpX


The Federal Trade Commission (FTC) has issued a guideline to companies developing Internet of Things (IoT) products and services. The guideline addresses security, privacy, encryption, authentication, permission control, testing, default settings, patch/software update planning, customer communication and education, and others.

IoT irony

The irony is that the comprehensiveness of the document, the things to plan for and look out for when developing IoT devices and systems, is the same thing that makes me think that the preponderance of device manufacturers will never do most of the things suggested. At least not in the near term. Big companies that have an established brand (eg Microsoft, Cisco, Intel, and others) will have the motivation (and capacity) to follow most of these recommendations. However, the bulk of the companies, and likely the bulk of the total IoT device/system marketplace entries, will come from the long tail of companies and businesses.

These companies are the smaller companies and startups that are just trying to get into the game. They won’t have an established brand across a large consumer base. This can also be read as, ‘they don’t have as much to lose’. Their risk and resource allocation picture does not include an established brand that needs to be protected. They don’t have a brand yet. Most of these startups and small companies will view their better play to be:

  • throw our cool idea out there
  • get something on the market
  • if we get a toehold & start to establish some brand, then we’ll start to worry about being more comprehensive with the FTC suggestions


Again, to be clear, I am appreciative of the FTC guideline for manufacturers and developers of Internet of Things devices. It’s a needed document and is thoughtful, well-written, and thorough. However, the same document can’t help but illustrate all of the variables and complexities of networked computing regarding privacy and security concerns — the same privacy and security concerns that most companies will have insufficient resources and motivation to address.

We’re in for a change. It’s way more complicated than just ‘bad or good’. Where we help protect and manage risk for our organizations, we’re going to have to change how we approach things in our risk management and security efforts. No one else is going to do it for us.

Cerealboxing Shodan data

In 2010, Steve Ocepek did a presentation at DefCon where he introduced an idea that he called ‘cerealboxing’. In it, he made a distinction between visibility and visualization. He suggested that visualization uses more of our ability to reason, while visibility is more peripheral and taps into our human cognition. He references Spivey and Dale in their paper Continuous Dynamics in Real-Time Cognition in saying:

“Real-time cognition is best described not as a sequence of logical operations performed on discrete symbols but as a continuously changing pattern of neuronal activity.”

Thinking on the back burner

Steve’s work involved building an Arduino device that provides an indication of the source country of spawned web sessions while doing normal web browsing. The idea was that as you do your typical browsing work, the device, via the numbers and colors of illuminated LEDs, would give an indication of how many web sessions were spawned on any particular page and where those sessions were sourced from. I built the device myself, ran it, and it was enlightening (no pun intended).

Using Steve’s device while focused on something else (my web browsing), I had an indication out of the corner of my eye that I processed somewhat separately from my core task of browsing. Without even trying or ‘thinking’, I was aware when a page lit up with many LEDs and many colors (indicating many sessions from many different countries). I also became aware when I was seeing many web pages, regardless of my activity, that came from Brazil, for example.


Steve named this secondary activity ‘cerealboxing’, after the way you mindlessly read a cereal box at breakfast. From one of his presentation slides:

  • Name came from our tendency to read/interpret anything in front of us
  • Kind of a “background” technology, something that we see peripherally
  • Pattern detection lets us see variances without digging too deep
  • Just enough info to let us know when it’s time to dig deeper

Back to excavating Shodan data

As I mentioned in my last post, Shodan data offers a great way to characterize some of the risk on your networks.  The challenge is that there is a lot of data.

One of the things that I want to know is what kinds of devices are showing up on my networks. What are some indicators? What words from ‘banner grabs’ indicate web cams, Industrial Control Systems, research systems, environmental control systems, biometrics systems, and others on my networks? I started with millions of tokens. How could I possibly find the interesting or relevant ‘tokens’ or key words in all of these?

To approach this, I borrowed the cerealboxing idea and wrote a script that continuously displays this data in a window (or two) on my computer. And then I just let it run while I’m doing other things. It may sound odd, but I found myself occasionally glancing over and catching an interesting word or token that I probably would not have seen otherwise.


unordered tokens

So, in a nutshell, I approached it this way:

  • tokenize all of the banners in the study
  • I studied banners from my organization as well as peer organizations
  • do some token reduction with stoplists & regular expressions, eg 1 & 2 character tokens, known printers, frequent network banner tokens like ‘HTTP’, days of the week, months, info on SSH variants, control characters that made the output look weird, etc
  • scroll a running list of these in the background or on a separate machine/screen
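The steps above could be sketched in Python roughly as follows. This is a minimal illustration, not the actual script used here: the stoplist entries, the regular expression, and the function names are all assumptions made for the example.

```python
import re
import time

# Hypothetical stoplist entries; real lists would be built up while reviewing banners.
STOPLIST = {"http", "server", "date", "mon", "jan"}

# Keep only word-like runs of printable characters, which also drops control characters.
TOKEN_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.\-]*")


def tokenize(banner):
    """Split one banner string into lowercase tokens."""
    return [tok.lower() for tok in TOKEN_RE.findall(banner)]


def reduce_tokens(tokens, stoplist=STOPLIST):
    """Drop 1 and 2 character tokens and anything on the stoplist."""
    return [tok for tok in tokens if len(tok) > 2 and tok not in stoplist]


def scroll(tokens, per_line=6, delay=1.0):
    """Print a few tokens per line with a pause, so the list drifts past slowly."""
    for start in range(0, len(tokens), per_line):
        print("  ".join(tokens[start:start + per_line]))
        time.sleep(delay)
```

Running something like scroll(reduce_tokens(tokenize(banner))) in a spare terminal window would give a slow background display of the kind described above.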

I also experimented with sorting by length of the tokens to see if that was more readable:


sorted by length; this section shows tokens (words) of 5 characters in length

In the course of doing this, I update a list of related tokens.  For example, some tokens related to networked cameras:


And some related to audio and videoconferencing:


This evolving list of tokens will help me identify related device and system types on my networks as I periodically update the sample.

It takes a fair amount of work to get this data, but once the process is identified and the scripts are written, it’s not so bad. Besides, with over 50 billion networked computing devices online in the next five years, what are you gonna do?

Excavating Shodan Data


A shovel at a time

The Shodan data source can be a good way to begin to profile your organization’s exposure created by Industrial Control Systems (ICS) and Internet of Things (IoT) devices and systems. Public IP addresses have already been scanned for responses to known ports and services and those responses have been stored in a searchable web accessible database — no muss, no fuss. The challenge is that there is A LOT of data to go through and determining what’s useful and what’s not useful is nontrivial.

Data returned from Shodan queries are results from ‘banner grabs’ from systems and devices. ‘Banner grabs’ are responses from devices and systems that are usually in place to assist with installing and managing the device/system. Fortunately or unfortunately, these banners can contain a lot of information. These banners can be helpful for tech support, users, and operators for managing devices and systems. However, that same banner data that devices and systems reveal about themselves to good guys is also revealed to bad guys.

What are we looking for?

So what data are we looking for? What would be helpful in determining some of my exposure? There are some obvious things that I might want to know about my organization. For example, are there web cams reporting themselves on my organization’s public address space? Are there rogue routers with known vulnerabilities installed? Industrial control or ‘SCADA’ systems advertising themselves? Systems advertising file, data, or control access?

The Shodan site itself provides easy starting points for these by listing and ranking popular search terms on its Explore page. (Again, this data is available to both good guys and bad guys.) However, there are so many new products and systems and associated protocols for Industrial Control Systems and Internet of Things that we don’t know what they all are. In fact, they are so numerous and growing that we can’t know what they all are.

So how do we know what to look for in the Shodan data about our own spaces?


My initial approach to this problem is to do what I call excavating Shodan data. I aggregate as much of the Shodan data as I can about my organization’s public address space. Importantly, I also research the data of peer organizations and include that in the aggregate as well. The reason for this is that there probably are some devices and systems that show up in peer organizations that will eventually also show up in mine.

Next, using some techniques from online document search, I tokenize all of the banners. That is, I chop up all of the words or strings into single words or ‘tokens.’ This results in a very large number of tokens for my current data set (roughly 1.5 million). The next step is to compute the frequency of each, then sort in descending order, and finally display some number of those discovered words/tokens. For example, I might say, ‘show me the 10 most frequently occurring tokens in my data set’:


Top 10 most frequently occurring words/tokens — no big surprises — lots of web stuff

I’ll eyeball those and then write those to a stoplist so that they don’t occur in the next run. Then I’ll look at the next 10 most frequently occurring. After doing that a few times, I’ll dig deeper, taking bigger chunks, and ask for the 100 most frequently occurring. And then maybe the next 1000 most frequently occurring.
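As a rough sketch of this layer-by-layer process, assuming the banners have already been tokenized into a flat list of strings, the counting and skimming could look like the Python below. The function names and layer sizes are hypothetical, and unlike the manual process described above, this version adds every token in a layer to the stoplist automatically rather than after eyeballing it.

```python
from collections import Counter


def top_tokens(tokens, n=10, stoplist=frozenset()):
    """Return the n most frequent tokens that are not already on the stoplist."""
    counts = Counter(tok for tok in tokens if tok not in stoplist)
    return counts.most_common(n)


def excavate(tokens, layer_sizes=(10, 10, 100, 1000)):
    """Yield successive layers of frequent tokens, stoplisting each layer before the next."""
    stoplist = set()
    for size in layer_sizes:
        layer = top_tokens(tokens, size, stoplist)
        if not layer:
            break
        yield layer
        # Everything just shown is skimmed off so the next layer can surface.
        stoplist.update(tok for tok, _count in layer)
```

Each layer is a list of (token, count) pairs in descending order of frequency, so the first layer corresponds to the ‘top 10’ view and later layers show what sits underneath.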

This is the excavation part, gradually skimming the most frequently occurring off the top to see what’s ‘underneath’. Some of the results are surprising.

‘Password’ frequency in top 0.02% of banner words

Just glancing at the top 10, not much is surprising: a lot of web header stuff. Taking a look at the top 100 most frequently occurring banner tokens, we see more web stuff, NetBIOS revealing itself, some days of the week and months, and others. We also see our first example of third party web interface software with Virata-EmWeb. (Third party web interface software is interesting because a vulnerability here can cross into multiple different types of devices and systems.) Slicing off another layer and going deeper by 100, we find the token ‘Password’ at approximately the 250th most frequently occurring position. Since I’m going through 1.5 million words (tokens), that means that ‘Password’ frequency is in the top 0.02% or so of all tokens (250 / 1,500,000 ≈ 0.017%). That’s sort of interesting.

But as I dig deeper, say the top 1500 or so, I start to see Lantronix, a networked device controller, showing up. I see another third party web interface, GoAhead-Webs. Blackboard often indicates Point-of-Sale devices such as card swipers on vending machines. So even looking at only the top 0.1% of the tokens, some interesting things are showing up.


Digging deeper — Even in the top 0.1% of tokens, interesting things start to show up

New devices & systems showing up

But what about the newer, less frequently occurring, banner words (tokens) showing up in the list? Excavating like this can clearly get tedious, so what’s another approach for discovery of interesting, diagnostic, maybe slightly alarming words in banners on our networks? In a subsequent post, I’ll explain my next approach that I’ve named ‘cerealboxing’, based on an observation and concept of Steve Ocepek’s regarding our human tendency to automatically read, analyze, and/or ingest information in our environment, even if passively.

Poor Man’s Risk Visualization II

Categorizing and clumping (aggregating) simple exposure data from the Shodan database can help communicate some risks that otherwise might have been missed. Even with the loss of some accuracy (or maybe because of loss of accuracy), grouping some data into larger buckets can help communicate risk/exposure. For example, a couple of posts ago in Poor Man’s Industrial Control System Visualization, Shodan data was used to do a quick visual analysis of what ports and services are open on publicly available IP addresses for different organizations. Wordle was used to generate word clouds and show relative frequency of occurrence where ‘words’ were actually port/service numbers.

Trading off some accuracy for comprehension

This is great for yourself or colleagues that are also fairly familiar with port numbers, the services that they represent, and what their relative frequencies might imply. However, often we’re trying to communicate these ideas to business people and/or senior management. Raw port numbers aren’t going to mean much to them. A way to address this is to pre-categorize the port numbers/services so that some of them clump together.

Yes, there is a loss of some accuracy with this approach; whenever we generalize or categorize, there is a loss of information. However, when domain-specific information is difficult or impossible to communicate to someone who does not work in that domain (with some interesting parallels to the notion of channel capacity), it’s worth the accuracy loss so that something useful gets communicated. Similar to the earlier post of port/service numbers only, one organization has this ‘port number cloud’:


A fair amount of helpful quick-glance detail consumable by the IT or security professional, but not much help to the non-IT professional

Again, this might have some utility to an IT or security professional, but not much to anyone else. However, by aggregating some of the ports returned into categories and using descriptive words instead, something more understandable by business colleagues and/or management can be rendered:


For communicating risk/exposure, this is a little more readable & understandable to a broader audience, especially business colleagues & senior management

How you categorize is up to you. I’ll list my criteria below for these examples. It’s important not to get too caught up in the nuance of the categorization. There are a million ways to categorize and many ports/services serve a combination of functions. You get to make the cut on these categories to best illustrate the message that you are trying to get across. As long as you can show how you went about it, then you’re okay.


One way to categorize ports — choose a method that best helps you communicate your situation
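For illustration only, a categorization like this could be expressed as a simple lookup table in Python. The category names and the port-to-category choices below are assumptions made for this sketch, not the criteria used for the clouds in this post; only the well-known port assignments themselves (eg 23 for telnet, 5900 for VNC, 502 for Modbus) are standard.

```python
from collections import Counter

# Hypothetical categories; pick the grouping that best supports your message.
PORT_CATEGORIES = {
    21: "FileTransfer",         # FTP
    22: "RemoteLogin",          # SSH
    23: "RemoteLogin",          # telnet
    80: "Web",                  # HTTP
    443: "Web",                 # HTTPS
    8080: "Web",                # alternate HTTP
    161: "NetworkMgmt",         # SNMP
    3389: "RemoteDesktop",      # RDP
    5900: "RemoteDesktop",      # VNC
    502: "IndustrialControl",   # Modbus
}


def categorize(ports, categories=PORT_CATEGORIES, default="Other"):
    """Count how many observed ports fall into each descriptive category."""
    return Counter(categories.get(port, default) for port in ports)


def cloud_text(counts):
    """Repeat each category word once per occurrence, ready to paste into a word cloud tool."""
    return " ".join(" ".join([word] * n) for word, n in counts.items())
```

Because the mapping is just a dictionary, it is easy to show colleagues exactly how the cut was made, or to re-run the cloud with a different grouping.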

The port number and ‘categorized’ clouds for a smaller organization with less variety are below.



A port number ‘cloud’ for a different (and smaller) organization with less variety in port/service types


The same port/service categorization as used above, but for the smaller organization, yields a very different looking word cloud

One challenge with the clearer approach is that your business colleagues or senior management might leap to a conclusion that you don’t want them to. You will need to be prepared to steer toward the course of action that you have in mind. You might need to explain, for example, that though there are many web servers in your organization, your bigger concern might be exposure of telnet and ftp access, default passwords, or all of the above.

This descriptive language categorization approach can be a useful way to demonstrate port/service exposure in your organization, but it does not obviate the need for a mitigation plan.

Poor Man’s Industrial Control System Risk Visualization

The market is exploding with a variety of visualization tools to assist with ‘big data’ analysis in general and security and risk awareness analysis efforts in particular. Who the winner is or winners are in this arena is far from settled and it can be difficult to figure out where to start. While we analyze these different products and services and try some of our own approaches, it is good to keep in mind that there can also be some simple initial value-add in working with quick and easy, nontraditional (at least in this context), visualization.

Even simple data visualization can be helpful

I’ve been working with some Shodan data for the past year or so. Shodan, created by John Matherly, is a service that scans several ports/services related to Industrial Control Systems (ICS) and, increasingly, Internet of Things sorts of devices and systems. The service records the results of these scans and puts them in a web accessible database. The results are available online or via a variety of export formats including CSV, JSON, and XML (though XML is deprecated). In his new site format, Matherly also makes some visualizations of his own available. For example, here’s one depicting ranked services for a particular subset of IP ranges that I was analyzing:


One of the builtin Shodan visualizations — Top operating systems

Initially, I wanted to do some work with the text in the banners that Shodan returns, but I found that there was some even simpler stuff that I could do with port counts (number of times a particular port shows up in a subset of IP addresses) to start. For example, I downloaded the results from a Shodan scan, counted the occurrences for each port, ran a quick script to create a file of repeated ‘words’ (actually port numbers), and then dropped that into a text box on Wordle.
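A minimal sketch of that counting step in Python might look like the following. It assumes each scan result has already been loaded as a dictionary with a 'port' key; that record shape and the function names are assumptions for the example, not a description of the original script.

```python
from collections import Counter


def count_ports(records):
    """Count how many times each port shows up across the scan records."""
    return Counter(rec["port"] for rec in records)


def repeated_words(counts):
    """Write each port number as a 'word' repeated once per occurrence, most frequent first."""
    words = []
    for port, n in counts.most_common():
        words.extend([str(port)] * n)
    return " ".join(words)
```

The string returned by repeated_words is the kind of text that can be pasted straight into a word cloud tool, since the tool sizes each ‘word’ by how often it is repeated.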

Inexpensive (free) data visualization tools

Wordle is probably the most popular web-based way of creating a word cloud. You just paste your text in here (repeated ports in our case):


Just cut & paste ports into Wordle

Click create and you’ve got a word cloud based on the number of ports/services in your IP range of interest. Sure you could look at this in a tabular report, but to me, there’s something about this that facilitates increased reflection regarding the exposure of the IP space that I am interested in analyzing.



VNC much? Who says telnet is out of style?

[For some technical trivia, I did this by downloading the Shodan results into a JSON file, used Python to import, parse, and upload to a MySQL database, and then ran queries from there. Also, Wordle uses Java so it didn’t play well with Chrome and I switched to Safari for Wordle.]
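A rough sketch of that kind of pipeline is below. To keep the example self-contained it uses Python’s built-in sqlite3 module in place of MySQL, and it assumes the export contains one JSON object per line with 'ip_str', 'port', and 'data' fields. Both the table layout and those field names are assumptions for illustration and should be checked against the actual export.

```python
import json
import sqlite3


def load_banners(lines, conn):
    """Parse one JSON record per line and store ip, port, and banner text in a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS banners (ip TEXT, port INTEGER, data TEXT)"
    )
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        rec = json.loads(line)
        rows.append((rec.get("ip_str"), rec.get("port"), rec.get("data", "")))
    conn.executemany("INSERT INTO banners VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def port_counts(conn):
    """Return (port, count) pairs, most frequent port first."""
    cur = conn.execute(
        "SELECT port, COUNT(*) AS n FROM banners GROUP BY port ORDER BY n DESC, port"
    )
    return cur.fetchall()
```

With a real export, the lines argument could simply be an open file handle, and the connection could point at a file on disk instead of an in-memory database.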

In addition to quickly eyeball-analyzing an IP space of interest, it can also make for interesting comparisons between related IP spaces. Below are two word clouds for organizations that have very similar missions and staff makeup. You would (I did, anyway) expect their relative port counts and word clouds to be fairly similar. As the results below show, however, they may be very different.


Organization 1’s most frequently found ports/services


Organization 2’s most frequent ports/services — same mission and similar staffing as Org 1, but network (IP space) has some significant differences

Next steps are to explore a couple of other visualization ideas of using port counts to characterize IP spaces and then back to the banner text analysis. Hopefully, I’ll have a post on that up soon.

If you’re doing related work, I would be interested in hearing about what you’re exploring.

Shodan creator opens up tools and services to higher ed



The Shodan database and web site, famous for identifying and cataloging the Internet for Industrial Control Systems and Internet of Things devices and systems, is now providing free tools to educational institutions. Shodan creator John Matherly says that “by making the information about what is on their [universities] network more accessible they will start fixing/ discussing some of the systemic issues.”

The .edu package includes over 100 export credits (for large data/report exports), access to the new Shodan maps feature which correlates results with geographical maps, and the Small Business API plan which provides programmatic access to the data (vs web access or exports).

It has been acknowledged that higher ed faces unique and substantial risks due in part to intellectual property derived from research and Personally Identifiable Information (PII) issues surrounding students, faculty, and staff. In fact, a recent report states that US higher education institutions are at higher risk of security breach than retail or healthcare. The FBI has documented multiple attack avenues on universities in their white paper, Higher Education and National Security: The Targeting of Sensitive, Proprietary and Classified Information on Campuses of Higher Education.

The openness and sharing and knowledge propagation mindset of universities can be a significant component of the risk that they face.

Data breaches at universities have clear financial and reputation impacts to the organization. Reputation damage at universities not only affects the ability to attract students, it also likely affects the ability of universities to recruit and retain high producing, highly visible faculty.

This realm of risk of Industrial Control Systems combined with Internet of Things is a rapidly growing and little understood sector of exposure for universities. In addition to research data and intellectual property, PII data from students, faculty, and staff, and PHI data if the university has a medical facility, universities can also be like small to medium sized cities. These ‘cities’ might provide electric, gas, and water services, run their own HVAC systems, fire alarm systems, building access systems and other ICS/IoT kinds of systems. As in other organizations, these can provide substantial points of attack for malicious actors.

Using tools such as Shodan to identify, analyze, and prioritize exposure and to develop mitigation plans is important for any higher education organization. Even if the resources are not immediately available to mitigate identified risk, at least university leadership knows it is there and has the opportunity to weigh that risk along with all of the other risks that universities face. We can rest assured that bad guys, whatever their respective motivations, are looking at exposure and attack avenues at higher education institutions. Higher ed institutions might as well have the same information as the bad guys.

Managing the risk of everything else (and there’s about to be more of everything else)

see me, feel me, touch me, heal me


In our organizations, whether companies, government, or education, when we talk about managing information risk, it tends to be about desktops and laptops, web and application servers, and mobile devices like tablets and smartphones. Often, it’s challenging enough to set aside time to talk about even those. However, there is a new, rapidly emerging risk that generally hasn’t made it to the discussion yet. It’s the everything else part.

The problem is that the everything else might become the biggest part.


Everything else

This everything else includes networked devices and systems that are generally not workstations, servers, and smart phones. It includes things like networked video cameras, HVAC and other building control, wearable computing like Google Glass, personal medical devices like glucose monitors and pacemakers, home/business security and energy management, and others. The popular term for these has become Internet of Things (IoT) with some portions also sometimes referred to as Industrial Control Systems (ICS).

There are a couple of reasons for this lack of awareness. One is simply the relative newness of this sort of networked computing. It just hasn’t been around that long in large numbers (but it is growing fast). Another reason is that it is hard to define. It doesn’t fit well with historical descriptions of technology devices and systems. These devices and systems have attributes and issues that are unlike what we are used to.

Gotta name it to manage it

So what do we call this ‘everything else’ and how do we wrap our heads around it to assess the risk it brings to our organizations? As mentioned, devices/systems in this group of everything else can have some unique attributes and issues. In addition to using the unsatisfying approach of defining these systems/devices by what they are not (workstations, application & infrastructure servers, and phones/tablets), here are some of the attributes of these devices and systems:

  •  difficult to patch/update software (& more likely, many or most will never be patched)
  •  inexpensive — there can be little barrier to entry to putting these devices/systems on our networks, eg easy-setup network cameras for $50 at your local drugstore
  • large variety/variability — many different types of devices from many different manufacturers with many different versions, another long tail
  • greater mystery to hardware/software provenance (where did they come from? how many different people/companies participated in the manufacture? who are they?)
  • large numbers of devices — because they’re inexpensive, it’s easy to deploy a lot of them. Difficult or impossible to feasibly count, much less inventory
  • identity — devices might not have the traditional notion of identity, such as having a device ‘owner’
  • little precedent, with not much in the way of helpful existing risk management models and few policies or guidelines for use
  • everywhere — out-ubiquitizes (you can quote me on that) the PC’s famed Bill Gatesian ubiquity
  • most are not hidden behind corporate or other firewalls (see Shodan)
  • environmental sensing & interacting (Tommy, can you hear me?)
  • comprises a growing fraction of Industrial Control and Critical Infrastructure systems

So, after all that, I’m still kind of stuck with ‘everything else’ as a description at this point. But, clearly, that description won’t last long. Another option, though it might have a slightly creepy quality, could be the phrase ‘human operator independent’ devices and systems. (But the acronym ‘HOI’ sounds a bit like Oy! and that could be fun.)

I’m open to ideas here. Managing the risks associated with these devices and systems will continue to be elusive if it’s hard to even talk about them. If you’ve got ideas about language for this space, I’m all ears.


Chuck Benson’s Information Risk Management Video Lectures

My lectures on Information Risk Management are on deck again this week in the University of Washington & Coursera course Building an Information Risk Management Toolkit.

(Use the link above & click on Video Lectures on left & then go to Week 9.  The video “Bounded Rationality” is a good place to start. Just need e-mail & password to create a Coursera account if you don’t have one).


Slide from lectures — Building an Information Risk Management Toolkit


Satellite communication systems vulnerable

Similar to other Industrial Control System (ICS) and Internet of Things (IoT) devices and systems, small satellite dishes called VSATs (very small aperture terminals) are also exposed to compromise. These systems are often used in critical infrastructure, by banks and news media, and are also widely used in the maritime shipping industry. Some of the same problems exist with these systems as with other IoT and ICS:

  • default password in use
  • no password
  • unnecessary communications services turned on (eg telnet)

According to this Christian Science Monitor article, cybersecurity group IntelCrawler reports 10,500 of these devices being exposed globally.  Indeed, a quick Shodan search just now for ‘VSAT’ in the banner returns over 1200 devices.

The deployment of VSAT devices continues to rapidly grow with 16% growth demonstrated in 2012  (345,000 terminals sold) and 1.7 million sites in global service in 2012.  2012-2016 market growth is expected to be almost 8% for the maritime market alone.

Clearly, another area in need of buttoning up.

[Image:WikiMedia Commons]

Don’t forget the water


Grand Coulee Dam

Changes in water temperature and water availability will lead to more power disruptions in the next decades.

Ernie Hayden points out several often overlooked facts regarding electrical power generation in his Infrastructure Security blog.

Water is critical in electricity generation. Heat is generated by fueled power sources which spin the generators, whether combustion engines, coal-fired, nuclear, or other. That heat has to go somewhere. The primary coolant used in industrial power generation is water. Warmer water and diminished water flow reduce the ability to take that heat away which in turn reduces power generation.

Hayden points out a few examples of where warmer water or reduced water flow caused power degradation or complete shutdown:

  • Millstone Nuclear Plant, Connecticut, 2012: natural cooling water source (Long Island Sound) became too warm (almost 3 degrees F ambient increase since the plant’s inception in 1975). Plant shut down for 12 days
  • Browns Ferry Nuclear Plant, Alabama, 2011: shut down multiple times because water from the Tennessee River was too warm
  • Corette Power Plant, Montana, 2001: plant shut down several times due to reduced water flow from the Yellowstone River

Thermoelectric power generating capability is expected to drop by as much as 19% due to lack of cooling water. Further, incidents of extreme drops in generation capability, ie complete disruption, are expected to almost triple.

Keep up your scan


An Information Risk Management ‘scan’ can be similar to a cockpit scan

Much like flying an aircraft, we have to ‘keep up our scan’ when analyzing these system risks. These are complex interconnecting systems. We are becoming increasingly concerned about cyberattacks on electrical and smart grid systems. That attention is good and overdue, but that is only part of the puzzle. We have to train ourselves to constantly scan the whole system. Just because there’s a big fire in front of us, it doesn’t mean that there’s not another one burning somewhere else.

For want of a nail the shoe was lost.
For want of a shoe the horse was lost.
For want of a horse the rider was lost.
For want of a rider the message was lost.
For want of a message the battle was lost.
For want of a battle the kingdom was lost.
And all for the want of a horseshoe nail.


[Image 1: Wikimedia Commons: Farwestern / Gregg M. Erickson. Image 2: author’s]