Most machine learning projects require copious amounts of data for training and testing. While you can download pre-made datasets, I would like to demonstrate how to create your own collection system.
Artificial intelligence on the edge (Edge AI) is a growing field that combines machine learning (along with other AI techniques) and the Internet of Things (IoT). See this article if you wish to learn more about Edge AI.
Over the next few tutorials, I would like to walk you through the process of creating an end-to-end Edge AI system that can be used to detect anomalies in an electric motor. Specifically, we will use a ceiling fan to emulate a piece of industrial equipment and “anomalies” will be anything that throws the fan off-balance (e.g. touching a moving fan blade, a different speed, or weights attached to the end of a fan blade).
If you would like to watch a video explaining these concepts, see here:
Anomaly detection is an important area of focus in engineering and data science, as it can be used to save lives and potentially millions of dollars in costly repairs of machines, industrial equipment, robots, etc. If you want to dig into it further, here is a great article on anomaly detection with some Python examples.
If you think about large industrial machines, scientific test equipment, and expensive space-faring robots, repairs can be time-consuming, expensive, and sometimes impossible.
Now, think about a car for a moment. How expensive and time-consuming is it to replace the oil? What happens if you don’t replace the oil? You could potentially damage the engine, costing thousands of dollars in repairs at best or rendering the vehicle inoperable at worst. While oil changes are on a set schedule for your car, the basic idea remains: maintenance is generally cheaper than repairing something that breaks.
Scale this up to larger machines where repairs could cost hundreds of thousands or millions of dollars. Or maybe you have a robot cruising around on Mars: repairing it would be nearly impossible. What if you could monitor the health of the equipment to determine that something was close to breaking? You might be able to perform some basic maintenance before something breaks and requires repair.
One way to detect problems before they occur is to use anomaly detection algorithms. Most of these algorithms fall under machine learning, where you create a mathematical model from features of normal machine operation. If you detect something that falls outside of these “normal” parameters, then you classify it as an anomaly.
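To make the idea of “normal” parameters concrete, here is a minimal sketch in Python. It assumes a single hypothetical feature (something like RMS vibration) and a simple rule of “more than 3 standard deviations from the mean,” which is only one of many possible approaches and not necessarily the one used later in this series. The sample values are made up for illustration.

```python
import statistics

def fit_normal_model(values):
    """Summarize 'normal' feature values by their mean and standard deviation."""
    return statistics.mean(values), statistics.stdev(values)

def is_anomaly(value, model, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mean, std = model
    return abs(value - mean) > threshold * std

# Hypothetical feature values recorded during normal operation (made-up numbers)
normal_rms = [1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 0.97]
model = fit_normal_model(normal_rms)

print(is_anomaly(1.01, model))  # close to the normal values
print(is_anomaly(1.60, model))  # far from the normal values
```

The threshold of 3.0 is an assumption; in practice you would tune it against real normal and anomalous data.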
Once you know that an anomaly has occurred, you can notify operators via the Internet, shut down operation, or perform any other required action.
Steps to Create an Anomaly Detection System
To create an anomaly detection system, we need to first collect a bunch of data to characterize what “normal” operation of the system looks like. If you can emulate an anomaly, it can help to collect data from that as well for testing.
Once we have the data, we can extract features from it to train a machine learning model. You’ll want to test the model against normal and anomalous data (if you have it) to make sure everything is working before deploying it to your end system.
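As a rough illustration of what “extracting features” can mean for accelerometer data, the sketch below computes the RMS of each axis after removing its mean (which strips out the constant pull of gravity). This choice of feature is my assumption for the example; the features used for the actual model may be different.

```python
import math

def extract_features(samples):
    """Return the mean-removed RMS of each axis for a list of (x, y, z) readings."""
    n = len(samples)
    features = []
    for axis in range(3):
        values = [s[axis] for s in samples]
        mean = sum(values) / n
        # RMS of the deviation from the mean, i.e. how much this axis "shakes"
        rms = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
        features.append(rms)
    return features

# Made-up readings: x wobbles between 0 and 1, y and z stay constant
readings = [(0.0, 0.0, 9.8), (1.0, 0.0, 9.8), (0.0, 0.0, 9.8), (1.0, 0.0, 9.8)]
print(extract_features(readings))
```

A feature vector like this could then be fed to whatever model you choose to train.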
The end system can take on various forms. The easiest is to collect raw data from a microcontroller embedded in the machine and send it to a remote server that is running the anomaly detection algorithm. Another method is to embed the model and inference code in the microcontroller that sits on the machine. This would save bandwidth, as you’re not transmitting raw data, but requires more work and possibly a more powerful microcontroller.
For this tutorial, we’ll start with data collection. There are myriad types of data you can collect from an electric motor: sound, video, vibration, current draw, etc. For this demo, we’ll focus on finding anomalies in just vibration. For that, we’ll use a simple 3-axis accelerometer.
While you can save samples directly on the microcontroller or to a storage device like a microSD card, I wanted to make an IoT type of device that sends data directly to a server on a local network.
If you wish to recreate this demo, you will need the following components:
Connect the MSA301 to the ESP32 board through I2C.
We are going to use our computer as a server and the ESP32 as a node on the same network. The ESP32 will run an Arduino sketch that connects to the server and performs an HTTP GET request. The server will respond with a simple status code: 1 for ready and 0 for not ready.
If the ESP32 sees that the server is ready (received a 1), it will collect 200 measurements of all 3 axes from the accelerometer over the course of 1 second. It then bundles this data up in a JSON format and transmits it to the server as an HTTP POST request.
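The real node code is an Arduino sketch, but the bundling step can be illustrated in Python. The sketch below assumes the JSON message holds one array per axis under the keys "x", "y", and "z"; those key names are my assumption, so check the actual sketch on GitHub for the real layout. Collecting 200 readings in 1 second works out to one reading every 5 ms.

```python
import json

NUM_SAMPLES = 200                     # readings per transmission (from the text)
SAMPLE_PERIOD_S = 1.0 / NUM_SAMPLES   # 0.005 s, i.e. 5 ms between readings

def build_payload(samples):
    """Pack (x, y, z) readings into a JSON string with one array per axis."""
    return json.dumps({
        "x": [s[0] for s in samples],  # key names are hypothetical
        "y": [s[1] for s in samples],
        "z": [s[2] for s in samples],
    })

# Made-up readings standing in for real accelerometer data
fake_samples = [(0.0, 0.0, 9.8)] * NUM_SAMPLES
payload = build_payload(fake_samples)
print(len(payload), "characters of JSON")
```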
The server parses the JSON message and saves the x, y, z measurements in comma-separated values (CSV) format in a new file. The server continues to store these files as long as it is running.
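The exchange described above can be sketched with Python's built-in http.server module. This is a simplified illustration, not the server from the GitHub repository: the JSON keys ("x", "y", "z"), the numbered file names, and the AccelHandler and save_sample_csv names are all assumptions made for the example.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

def save_sample_csv(body, out_dir, index):
    """Parse a JSON body with "x", "y", "z" arrays and write one CSV file."""
    data = json.loads(body)
    path = os.path.join(out_dir, f"{index:04d}.csv")  # hypothetical naming scheme
    with open(path, "w", newline="") as f:
        for x, y, z in zip(data["x"], data["y"], data["z"]):
            f.write(f"{x},{y},{z}\n")
    return path

class AccelHandler(BaseHTTPRequestHandler):
    out_dir = "."  # hypothetical output directory
    count = 0      # number of samples saved so far

    def do_GET(self):
        # Reply with the status code: 1 means the server is ready for data
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"1")

    def do_POST(self):
        # Read exactly the number of bytes the client says it sent
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        save_sample_csv(body, self.out_dir, AccelHandler.count)
        AccelHandler.count += 1
        self.send_response(200)
        self.end_headers()

# To try it on a local network only:
# HTTPServer(("", 1337), AccelHandler).serve_forever()
```

This version always answers “ready”; a fuller server might return 0 while it is busy or after its run time has expired.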
The first code we need to write is the Arduino sketch. Make sure you have the Espressif ESP32 library for Arduino installed by following these instructions. You will also need the ArduinoJson library, which can be installed using these instructions.
Head to the GitHub page here to copy the code for the ESP32. Copy it into your Arduino IDE and change the following variables for your particular device:
Upload this program to your Huzzah32.
As long as DEBUG is set to 1 in the program, you can open a Serial Terminal to verify that the program is working. It should try to connect to your WiFi and then send a GET request to a server (which we haven’t written yet).
Use some tape to attach the Huzzah32, MSA301, and battery to the top of your ceiling fan’s motor shroud.
Next, we need to write a quick server in Python that responds to the requests from the ESP32.
WARNING: We are going to use the simple Python http.server module for this exercise. Note that it lacks some basic security features and is not recommended for production environments! So, don’t open it up to the larger Internet, but letting it run on your local network is generally OK.
In a new .py file, copy in the code found here: https://github.com/ShawnHymel/tinyml-example-anomaly-detection/blob/master/http-accel-server.py
Run the Python script with the following parameters (replace http_accel_server.py with whatever name you saved the file under):
python http_accel_server.py -d <DIRECTORY> -p <PORT> -t <TIME TO RUN SERVER>
The output directory is where you want the files to be saved, the port should match the port in your Arduino code (1337 by default), and the time is how long the server should run for.
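If you are curious how a script might accept those three parameters and stop after a set time, here is a small sketch using argparse. It is not taken from the repository: the function names, the destination names, the choice to make -d and -t required, and the 1337 default on the server side are all assumptions for illustration.

```python
import argparse
import time

def parse_args(argv=None):
    """Parse the -d, -p, and -t options described above."""
    parser = argparse.ArgumentParser(description="Collect accelerometer samples over HTTP")
    parser.add_argument("-d", dest="directory", required=True,
                        help="output directory for the CSV files")
    parser.add_argument("-p", dest="port", type=int, default=1337,
                        help="port to listen on (must match the Arduino code)")
    parser.add_argument("-t", dest="run_time", type=float, required=True,
                        help="how long to run the server, in seconds")
    return parser.parse_args(argv)

def serve_for(server, seconds):
    """Handle requests on an http.server.HTTPServer until `seconds` have elapsed."""
    server.timeout = 0.5  # lets handle_request() return so the deadline gets checked
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        server.handle_request()

args = parse_args(["-d", "fan_0_off_0_weight", "-p", "1337", "-t", "2400"])
print(args.directory, args.port, args.run_time)
```

Calling handle_request() in a loop, rather than serve_forever(), is one simple way to make the server exit on its own once the time is up.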
Try turning the fan off, setting the directory to something like fan_0_off_0_weight, and running the server for 2400 seconds. This seems to be enough time to collect a few hundred samples.
Then, turn the fan on high and collect data for fan_0_high_0_weight. Repeat this process for medium and low speeds.
You can tape a coin to the end of one of the fan blades to emulate an anomaly. Collect data again at low, medium, and high speeds with fan_0_[low, med, high]_1_weight as the directory names.
When you’re done, you should have a few thousand samples stored as CSV files. If you wish to skip the data collection process, you can download my raw data from here: https://github.com/ShawnHymel/tinyml-example-anomaly-detection/tree/master/ceiling-fan-dataset (note that you’ll probably need to download the entire repository to access them).
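Once the files are on disk, you will probably want to read them back in for analysis. The sketch below loads every CSV file in one directory using only the standard library. It assumes each row holds x, y, z values in that order and skips any row that does not parse as numbers (such as a header line); adjust it if the files are laid out differently.

```python
import csv
import os

def load_samples(directory):
    """Return one list of (x, y, z) float tuples per CSV file in `directory`."""
    samples = []
    for name in sorted(os.listdir(directory)):
        if not name.lower().endswith(".csv"):
            continue
        rows = []
        with open(os.path.join(directory, name), newline="") as f:
            for row in csv.reader(f):
                if len(row) < 3:
                    continue  # skip blank or short rows
                try:
                    rows.append(tuple(float(v) for v in row[:3]))
                except ValueError:
                    continue  # skip a header or malformed row
        samples.append(rows)
    return samples

# Example usage, assuming the directory exists:
# normal = load_samples("fan_0_low_0_weight")
# print(len(normal), "samples loaded")
```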
Resources and Going Further
Anomaly detection can have amazing benefits, from detecting credit card fraud and analyzing network traffic for malicious activity to predicting machine failure. If you’d like to read more about anomaly detection for condition monitoring of machines, see these two fantastic articles:
If you would like to work with real data from live equipment (and not ceiling fans), NASA spent a few years collecting such data for various pieces of equipment, which you can download here: https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/. Specifically, the Bearing Data Set is used in the articles listed above and can be great fun to analyze.