
Three Key Aspects To Collecting Meaningful Data For Successful IoT

This blog, written by Nicola Thorn of ANDtr, dives into ‘the three R’s’ of collecting data for a successful Internet of Things (IoT) project.


This blog was written by Nicola Thorn, Director of AND Technology Research. With over 39 years’ experience in technology development, AND have worked on all kinds of projects, across a wide range of sectors. If you would like more information or need help getting an IoT initiative off the ground, then they would love to hear from you. Please contact Nicola Thorn for more details.

Did you know that only 26% of companies are successful in implementing IoT initiatives? The main reasons IoT projects fail are that development takes too long, insufficient value is created, expertise is lacking or, ultimately, the data just isn’t good enough [1].

With over 350 projects under our belt, we at AND Technology Research know what is important for delivering a successful IoT project. It begins with the data and ensuring that the three key attributes for data collection are covered.

The Three R’s…


Representative

Data needs to be representative of the situation it is measuring. This means it should be accurate and bias-free. In our minds accuracy is a given: equipment needs to be correctly calibrated and sensors need to capture the correct value. Freedom from bias is perhaps more difficult to achieve, but in a world of Artificial Intelligence (AI) and Machine Learning (ML) it is equally important. Machines learn patterns and trends from data sets. They therefore also learn any bias within the data, whether that be a selective bias or a collective bias.

One area that highlights this problem is facial recognition software and its bias towards male subjects with lighter skin tones. For example, studies carried out by Gendershades.org [2] showed that 95.9% of the faces misgendered by one of the top-selling facial recognition packages were those of female subjects. Further analysis of another package revealed that 93.6% of the faces misgendered were those of darker-skinned subjects.
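One simple way to look for this kind of bias is to compare error rates between groups rather than relying on a single overall accuracy figure. The snippet below is a minimal sketch of that idea, not the method used in the study cited above. The function name `error_rate_by_group`, the group labels and the sample records are all made up for illustration.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """Return {group: fraction of records where the prediction was wrong}.

    Each record is a (group, actual_label, predicted_label) tuple.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, actual, predicted in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical evaluation results, invented purely for illustration.
sample = [
    ("group_a", "f", "f"),
    ("group_a", "m", "m"),
    ("group_a", "f", "f"),
    ("group_a", "m", "m"),
    ("group_b", "f", "m"),
    ("group_b", "m", "m"),
    ("group_b", "f", "m"),
    ("group_b", "f", "f"),
]

rates = error_rate_by_group(sample)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} misclassified")
```

A large difference between groups, as in this invented sample, suggests that the training data may not be representative and is worth investigating further.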


Reliable

Disclaimer: we know that the word reliable has a scientific meaning, and so in this case, please forgive us for hijacking its meaning.

One of the biggest issues any data scientist will face is missing data. Am I missing key moments in the system? Am I getting the data when I need it so that I can act on it?

Data collection and transmission need to be reliable. Data needs to be gathered, sent and received at the right time. Missing data and events can cause all sorts of problems. Take a car, for example. The modern car has approximately 70 electronic and software processes, with over 2,000 signals travelling through the car at any one point [3]. Some processes are more critical than others, such as airbag deployment and braking systems. Failure of these signals to be generated and passed on at the right time, every time, could lead to disastrous outcomes.

List of sensors in a car

Moreover, missing data can cause bias, skewing data sets and algorithm outputs. One classic illustration of this problem is in medical surveys. Patients using a medical app to track symptoms might only engage with the app when they are feeling well, potentially affecting the conclusions of any data analysis.
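A first step towards answering "am I missing key moments?" is to check the timestamps of the readings you did receive. The snippet below is a minimal sketch that flags gaps in a stream of readings, assuming the sensor is expected to report at a fixed interval. The function name `find_gaps`, the 10-second interval and the `tolerance` factor are illustrative assumptions, not values from any particular system.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (last_seen, next_seen) pairs around each gap.

    A gap is any pair of consecutive readings that are further apart
    than tolerance * expected_interval.
    """
    ordered = sorted(timestamps)
    limit = expected_interval * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Hypothetical readings every 10 seconds, with three readings dropped.
start = datetime(2024, 1, 1, 12, 0, 0)
readings = [
    start + timedelta(seconds=10 * i) for i in range(10) if i not in (4, 5, 6)
]

gaps = find_gaps(readings, timedelta(seconds=10))
for last_seen, next_seen in gaps:
    print(f"No data between {last_seen.time()} and {next_seen.time()}")
```

Knowing when the gaps occur is only the start. The more useful question is why they occur, for example whether they line up with poor connectivity or with particular user behaviour.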

Scoping out all technologies before development can save a lot of time and money down the line. In our experience, particularly in IoT systems, one of the main causes of missing data is connectivity. It is important to make sure the right connectivity options are chosen for the system. Despite the name ‘Internet of Things’, the internet isn’t the only option, and it often isn’t the right one for the system or its intended environment. Wired communication, such as the CAN system used within cars, can be more reliable and is often selected over other protocols in critical systems.

Diagram of protocols which can transfer data

Robust

Technology is constantly evolving, and that includes systems that have already been deployed. It is therefore paramount that the correct provisions are in place from the beginning to cater for updates and ongoing maintenance. Any successful IoT system will be robust to change, both external and internal, whether that be a new security update, interoperability with external tools, or the remote addition of new functionality.

Systems fail when technologies or requirements change but the devices used to collect the data cannot adapt quickly enough, or at a low enough cost. Building flexibility into the system will reduce the need to recall deployed devices.
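One small example of building in flexibility is a message format that tolerates change. The snippet below is a minimal sketch, assuming devices send JSON readings: the receiver treats a missing version number as version 1 and reports fields it does not recognise instead of rejecting the message. The field names (`device_id`, `temperature_c`, `schema_version`, `humidity_pct`) and the function `decode_reading` are hypothetical.

```python
import json

# Hypothetical fields this version of the receiver understands.
KNOWN_FIELDS = {"schema_version", "device_id", "temperature_c"}


def decode_reading(raw):
    """Decode a JSON reading, tolerating older and newer message versions.

    Returns (reading, unknown_fields). Unknown fields are reported rather
    than treated as errors, so newer devices do not break this receiver.
    """
    message = json.loads(raw)
    reading = {
        # Older devices may not send a version; assume version 1.
        "schema_version": message.get("schema_version", 1),
        "device_id": message["device_id"],
        "temperature_c": message.get("temperature_c"),
    }
    unknown_fields = sorted(set(message) - KNOWN_FIELDS)
    return reading, unknown_fields


old_message = '{"device_id": "a1", "temperature_c": 21.5}'
new_message = (
    '{"schema_version": 2, "device_id": "a1", '
    '"temperature_c": 21.5, "humidity_pct": 40}'
)

old_reading, old_unknown = decode_reading(old_message)
new_reading, new_unknown = decode_reading(new_message)
print(old_reading, old_unknown)
print(new_reading, new_unknown)
```

The same principle applies on the device side: if firmware can be updated remotely and messages carry a version, new functionality can be added without touching every deployed unit by hand.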

The smart meter roll-out is a prime example [4]: some meters are not able to cope with consumers changing suppliers, which has left customers with “dumb” meters that are unable to integrate into different systems.

How do you ensure the three R’s within your system?

  • Define the problem you are trying to solve, and quantify the system before you begin.
    Make sure you fully understand the problem you are trying to solve. And perhaps more importantly, know what you aren’t trying to solve. This way you can quantify what your system will achieve. From there, you can begin to break the work required into smaller chunks.
  • Don’t just focus on the tech. Factor in people.
    IoT isn’t just about tech. People are key. Listen to the end users and make sure the system is delivering value. Value is only achieved when people realise it.
  • Don’t make assumptions about the data. Let it tell its own story.
    One of the easiest ways to introduce bias into your data is to make assumptions about what the data is going to tell you and to collect only data that reaffirms your preconceptions. As much as possible, let the data tell its own story.
  • Constantly review your data and think about what other data might be readily available.
    When reviewing your data, make sure you are looking for bias within your data set. At which points do you have missing data, and why? Have you trained the model on a diverse range of datasets? Is there extra data the model requires that could be collected for little extra effort, such as metadata?
  • Think about scalability and security from the start and always have life-cycle in mind.
    Mapping out your product development and journey in the initial stages will help provision for changes further down the line. Keep on top of life-cycle management and stay ahead of technology.
  • The internet isn’t the only option.
    There is a vast number of sensing and communications technologies to choose from. Explore different technologies which might be better suited; off-the-shelf solutions aren’t always the best approach.


References:

[1] https://www.slideshare.net/CiscoBusinessInsights/journey-to-iot-value-76163389

[2] http://www.gendershades.org

[3] https://www.eenewseurope.com/design-center/automotive-service-era-electronic-car-1

[4] https://eandt.theiet.org/content/articles/2019/03/over-50-per-cent-of-smart-meter-users-face-problems-when-switching-supplier/