Remaining useful life (RUL) prediction is the study of predicting when something is going to fail, given its present state. The problem has a prophetic charm associated with it. While a soothsayer can make a prediction about almost anything (including RUL of a machine) confidently, many people will not accept the prediction because of its lack of scientific basis. Here, we will try to solve the problem with scientific reasoning.
A component (or a machine) is said to have failed when it can no longer perform its desired task to the satisfaction of the user. For example, the Li-ion battery of an electric vehicle is said to have failed when it requires frequent recharging to travel a small distance. Similarly, a bearing of a machine is said to have failed if the level of vibration produced at the bearing goes above some acceptable limit. Other examples can be thought of for different applications. The goal then is to predict beforehand when something is going to fail. Knowledge of a component's expected time of failure will help us prepare well for the inevitable. In an industrial setting, where any unplanned shutdown of a critical component has a huge monetary cost, knowing when a machine is going to fail can result in significant monetary gains.
Many techniques have been developed over the years to predict the RUL of a component. They can be broadly divided into two categories:
- Model Based Methods
- Data-Driven Methods
In model based methods, we try to formulate a mathematical model of the system under consideration and then use that model to predict the RUL of the component. Though model based methods are used in some cases, there are many other applications where formulating a full mathematical model of the system is extremely difficult. In some cases, the underlying physics is so complex that we have to make many simplifying assumptions. Whether the simplifying assumptions are justified or not is determined by collecting real data from the machine. Model based methods therefore require extensive domain knowledge and remain the territory of a select few specialists.
In contrast, in data-driven methods all information about a machine is gained from the data collected from it. With readily available sensors we can collect huge amounts of data for almost any application. By analyzing that data we can get an idea about the condition of the machine, which helps us make an informed decision about its RUL. In this process we do not need an explicit physical model of the machine. Increasingly, data-driven methods are getting better at making reliable predictions. As the name of the project suggests, we will only focus on data-driven methods for RUL prediction. The problem of RUL prediction is also known as prognosis in some fields. Some people also call it prognostics. We will only use the term RUL prediction. In the beginning, we will mainly focus on predicting RUL of mechanical components. Later we will explore other application areas.
Like my previous project on fault diagnosis, the aim of this project is to produce reproducible results for RUL prediction. RUL prediction is a broad subject that can be applied to many problems, such as RUL prediction of Li-ion batteries, machinery bearings, machine tools, etc. We will start with mechanical applications and then gradually move to other applications over time. As our aim is reproducibility, we will use publicly available datasets. Interested readers can download the data and use our code to get the exact results that we have obtained. As we will use well-known datasets, readers might observe that, in some cases, our results are in fact worse than some results reported elsewhere. Our goal is not to verify someone else's claim. If someone else claims a better result, the onus is on them to demonstrate it. Here, whatever results I have claimed can be reproduced by readers by just running the Jupyter notebooks after downloading the relevant data.
This is an ongoing project; modifications and new techniques will be added over time. Python and R are two popular programming languages that are used in machine learning applications. We will use Python to demonstrate our results. At a later stage we might add equivalent R code. To implement deep learning models, we will use TensorFlow.
Results using NASA's Turbofan Engine Degradation Dataset
We will first apply classical machine learning methods (so-called shallow learning methods) to obtain results and then apply deep learning based methods. Dataset description and preprocessing steps can be found at this link. We will use the same preprocessing steps, with minor changes, in all notebooks. We strongly encourage readers to go over the data preparation notebook before using the results notebooks. In the table below, we report Root Mean Square Error (RMSE) values. Click on the numbers in the table to view the corresponding notebooks.
Note on the last column of the following table: The last column specifies the degradation model used in the notebooks. Two degradation models are commonly used for this particular turbofan dataset: the linear degradation model and the piecewise linear degradation model. For more details about both, see this. When we use the piecewise linear degradation model, we have to assume an early RUL value. This is the value of RUL that is assumed while the component is still relatively new. In the literature, different people use different early RUL values. In our examples, when we specify a single early RUL value, we apply the same early RUL across all 4 datasets.
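To make the two degradation models concrete, here is a minimal sketch of how RUL targets could be generated for a single run-to-failure engine. The function name `rul_targets` and the convention that RUL reaches 0 at the last recorded cycle are our assumptions for illustration; the notebooks may use a slightly different convention.

```python
import numpy as np

def rul_targets(num_cycles, early_rul=None):
    """Return one RUL target per cycle for a single run-to-failure engine.

    Hypothetical helper for illustration only.
    Linear model: RUL counts down from num_cycles - 1 to 0 (assumed convention).
    Piecewise linear model: the same countdown, capped at early_rul.
    """
    linear = np.arange(num_cycles - 1, -1, -1, dtype=float)
    if early_rul is None:
        return linear
    # While the engine is still "new", RUL is held constant at early_rul.
    return np.minimum(linear, early_rul)

# An engine that ran for 200 cycles, with early RUL = 125.
targets = rul_targets(200, early_rul=125)
print(targets[:3], targets[-3:])
```

With the piecewise linear model, the target stays flat at the early RUL value for the first part of the engine's life and only then decreases linearly to zero.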
Method | FD001 | FD002 | FD003 | FD004 | Degradation Model |
---|---|---|---|---|---|
Gradient Boosting | 19.06 | 28.97 | 20.55 | 29.49 | Piecewise Linear (Early RUL = 125) |
Random Forest | 19.15 | 29.00 | 20.53 | 29.75 | Piecewise Linear (Early RUL = 125) |
Support Vector Regression (SVR) | 18.28 | 30.50 | 21.37 | 34.11 | Piecewise Linear (Early RUL: 125 (FD001, FD003), 150 (FD002, FD004)) |
Gradient Boosting | 33.24 | 29.88 | 47.94 | 40.34 | Linear |
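For readers unfamiliar with the metric reported in the table above, the following is a minimal sketch of an RMSE computation. The function name `rmse` and the numbers in the example are made up for illustration; how predictions are paired with true RUL values for each engine is defined in the notebooks, not here.

```python
import numpy as np

def rmse(true_rul, predicted_rul):
    """Root Mean Square Error between true and predicted RUL values."""
    true_rul = np.asarray(true_rul, dtype=float)
    predicted_rul = np.asarray(predicted_rul, dtype=float)
    return float(np.sqrt(np.mean((true_rul - predicted_rul) ** 2)))

# Hypothetical values for four test engines (not taken from the dataset).
true_values = [112, 98, 69, 82]
predictions = [115, 94, 69, 82]
print(rmse(true_values, predictions))
```

A lower RMSE means the predicted RUL values are, on average, closer to the true ones. Because errors are squared, large misses are penalized more heavily than small ones.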
In this section, we will apply deep learning to predict RUL on the Turbofan dataset. Due to the nondeterministic nature of operations used in deep learning and the dependence of libraries like TensorFlow on computer architecture, readers might obtain slightly different results than those in the notebooks. For reproducibility of our results, we also share the saved models of each notebook. All saved models for the Turbofan dataset can be found at this link. A notebook describing the steps to use the saved models can be found here.
Method | FD001 | FD002 | FD003 | FD004 | Degradation Model |
---|---|---|---|---|---|
LSTM | 15.16 | 27.57 | 15.54 | 28.21 | Piecewise Linear (Early RUL: 125 (FD001, FD003), 150 (FD002, FD004)) |
1D CNN | 15.84 | 30.38 | 15.78 | 32.35 | Piecewise Linear (Early RUL: 125 (FD001, FD003), 150 (FD002, FD004)) |
What is attention? We recommend Chapter 10 of this book for more details. We provide notebooks that implement GRU-based additive attention for RUL prediction. For reproducibility of our results, we share trained weights. All trained weights can be found here. We also provide separate notebooks describing the steps to use the trained weights to reproduce the exact results we obtained.
Method | FD001 | FD002 | FD003 | FD004 | Degradation Model |
---|---|---|---|---|---|
GRU + Additive Attention | 14.21 | 27.99 | 14.64 | 26.77 | Piecewise Linear (Early RUL: 125 (FD001, FD003), 150 (FD002, FD004)) |
(This table will be updated gradually.)
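As a rough illustration of the additive attention idea mentioned above, here is a minimal NumPy sketch. It is not the implementation used in the notebooks: the function name, the shapes, the random weights, and the choice of the last hidden state as the query are all our assumptions, and the random array only stands in for the outputs of a GRU.

```python
import numpy as np

def additive_attention(query, keys, values, W_q, W_k, w_v):
    """Additive attention: score_t = w_v . tanh(W_q @ query + W_k @ keys[t]).

    query: (d_q,), keys: (T, d_k), values: (T, d_v)
    W_q: (h, d_q), W_k: (h, d_k), w_v: (h,)
    Returns the context vector (d_v,) and the attention weights (T,).
    """
    features = np.tanh(W_q @ query + keys @ W_k.T)  # (T, h)
    scores = features @ w_v                         # (T,)
    scores = scores - scores.max()                  # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum() # softmax over time steps
    context = weights @ values                      # weighted sum of values
    return context, weights

rng = np.random.default_rng(0)
T, d, h = 30, 8, 16                      # time steps, hidden size, attention size
hidden_states = rng.normal(size=(T, d))  # stand-in for GRU outputs (assumption)
query = hidden_states[-1]                # last hidden state as query (assumption)
W_q = rng.normal(size=(h, d))
W_k = rng.normal(size=(h, d))
w_v = rng.normal(size=h)

context, weights = additive_attention(
    query, hidden_states, hidden_states, W_q, W_k, w_v
)
print(context.shape, weights.sum())
```

In a trained model, `W_q`, `W_k`, and `w_v` would be learned, and the context vector would typically be passed to further layers that output the RUL estimate.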
For attribution, cite this project as
@misc{Sahoo_Data-Driven_Remaining_Useful_2020,
author = {Sahoo, Biswajit},
doi = {10.5281/zenodo.5890595},
month = {9},
title = {Data-Driven Remaining Useful Life (RUL) Prediction},
url = {https://biswajitsahoo1111.github.io/rul_codes_open/},
year = {2020}
}
Readers should cite original datasets separately.