In my last blog I attempted to demystify "Big Data". In this post I'll talk about how to use data to predict the future state of a machine. This is the first in a series of posts, since the subject is broad and evolving.
First, let's just take a look at one of the easiest sources of data for making a time-to-failure call: sensor data. The reason is that this data type is structured and generated from calibrated sensors. For the most part it can be trusted, and other than differing units of measure it has low variability. It is the easiest place to start.
Over the past 10 years I have been working with this sort of data from high-horsepower diesel engines. OEMs embed sensors in modern engines to measure temperature, rpm, pressure, and so on. From a few sensor measurements, calculated parameters such as horsepower, fuel consumption, and torque can be derived. All in all, a good set of processed sensor data is available. OEMs also provide alarms. The best alarms that can be delivered use normalized data. What that means is that certain conditions need to exist before on-board data is generated; conditional normalization might mean only taking data at full load, for example. Normalization allows data points to be compared. One issue we have found with OEM alarms, and the whole reason to process more data, is that those alarms tend to come pretty close to failure. They don't have much prognostic value.
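To make the full-load idea concrete, here is a minimal sketch of conditional normalization. The field names (`load`, `coolant_temp_c`), the 95% load threshold, and the sample values are all my own assumptions for illustration, not anything an OEM actually ships:

```python
def normalize_full_load(samples, load_threshold=0.95):
    """Keep only samples taken at (near) full load.

    `samples` is a list of dicts with hypothetical keys:
    'load' (fraction of rated load, 0.0-1.0) and
    'coolant_temp_c'. Filtering to a common operating
    condition is what makes the remaining points comparable.
    """
    return [s for s in samples if s["load"] >= load_threshold]


readings = [
    {"load": 0.40, "coolant_temp_c": 78.0},  # part load: not comparable
    {"load": 0.97, "coolant_temp_c": 92.5},  # full load
    {"load": 1.00, "coolant_temp_c": 95.1},  # full load
]
full_load = normalize_full_load(readings)
# Only the two high-load samples remain, so their coolant
# temperatures can be trended point to point.
```

The same pattern extends to any gating condition (rpm band, gear, ambient temperature) before data is trended or alarmed.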
In the work we did building a prognostic solution, we started with the hypothesis that we could gain more insight into the condition of a machine component if we were able to use all the data available. We started by obtaining data sets that were collected before known physical failures. Then the data was subjected to the type of analysis I will describe in some detail in a future article. Basically, the mathematician who did the work was able to detect fine changes many days prior to the failure. In cases where we had actually missed the failure and ended up with a downed machine, had we used all the data available, the failure would not only have been prevented, it would have been detected many days in advance, giving time to plan and schedule, and turning a reactive job into one done on condition.
We did learn, however, that sensors do go out of calibration and also fail in service. This aberrant data can produce false positives, so once data is acquired it needs to be validated. Validation is the second step on the way to a prognostic determination: start with trusted data and ignore (but mark) the outliers. Predictive software has been pretty good at handling this step even from the early days. A technician builds his alarm sets and establishes ranges of acceptable data. Once data is validated it can be subjected to alarms. Basically, as I am sure you all know, an alarm is a deviation from normal.
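A minimal sketch of that validation step, assuming the technician has already established an acceptable range for the sensor; the sensor name, range, and readings here are made up for illustration:

```python
def validate(values, low, high):
    """Split readings into trusted values and flagged outliers.

    Outliers are marked and set aside rather than discarded,
    so a technician can review them later (a stuck or failed
    sensor is itself useful information). The low/high range
    comes from the technician's alarm set for that sensor.
    """
    trusted, flagged = [], []
    for v in values:
        (trusted if low <= v <= high else flagged).append(v)
    return trusted, flagged


# Hypothetical oil-pressure readings with two clearly bad points
oil_pressure_psi = [52.0, 54.3, 3.1, 55.0, 180.0]
good, outliers = validate(oil_pressure_psi, low=20.0, high=90.0)
# good     -> [52.0, 54.3, 55.0]
# outliers -> [3.1, 180.0]
```

Only the trusted list goes forward to alarming; the flagged list is kept as evidence of possible sensor trouble.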
Now this gets difficult, because normal can mean a lot of different things. Is it absolute normal, change from the mean, rate of change? A good alarm set must be able to look at all the data in different ways. I want to get into setting alarms in future blogs, as it is an interesting and somewhat misunderstood subject. However, if the data that has been processed and normalized throws an alarm, the game is on! This is where it gets fun and the value starts to develop. All an alarm tells you is that something is deviant, and maybe gives an idea of severity, that is, how far from normal. A prognostic determination cannot be made from a single alarm. What can be made, however, is a diagnosis. Diagnosis comes to us through Latin from the Greek, meaning "to know."
An alarm or group of alarms can be subjected to expert systems. An expert or group of experts can write rules that will diagnose a condition, and there is commercially available rules-based expert system software that you can buy. Yes, as you thought, in a future blog I will get right into how to write a set of rules. So let us assume that the alarm value is moving and that the data is changing. If this were a simple linear process and you had established the condemnation point, it would be easy to use a slope-intercept calculation to determine time to failure, and a lot of failures actually progress in this way. However, when disparate data sources are used and the data doesn't cooperate in a nice sloping straight line, the heavy math enters. I think we are almost at the end of this part of the series. I have had enough anyway, and if you have actually read this far, thank you. Next week we talk prognostics, from the Greek meaning an omen of death!
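For the simple linear case, the slope-intercept calculation is short enough to show in full. This is a sketch under stated assumptions: the parameter (iron in oil), its condemnation limit (the value at which the component is condemned), and the two trend points are all hypothetical:

```python
def time_to_failure(t0, v0, t1, v1, condemnation_limit):
    """Linear extrapolation of a trending parameter to its limit.

    Given two (time, value) points on a steadily rising
    parameter, project when the straight line through them
    crosses the condemnation limit. Returns the time remaining
    after t1, or None if the parameter is not trending toward
    the limit (no failure predicted by this simple model).
    """
    slope = (v1 - v0) / (t1 - t0)
    if slope <= 0:
        return None  # flat or improving: not degrading toward the limit
    return (condemnation_limit - v1) / slope


# Hypothetical: iron in oil rose from 60 to 80 ppm over 10 days,
# and the condemnation limit is 100 ppm.
remaining = time_to_failure(t0=0, v0=60.0, t1=10, v1=80.0,
                            condemnation_limit=100.0)
# slope is 2 ppm/day, so about 10 days remain
```

When the trend is not a straight line, this same idea generalizes to curve fitting and the heavier math mentioned above, but the goal is unchanged: project the trend to the condemnation point.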
Getting this part of your reliability program right lets you enter the 21st century and beat that big iron. Knowing when something is going to fail has obvious advantages, but knowing something is not going to fail is the big money maker when you move your program to on-condition maintenance. At XRT we have mastered some of this but are constantly learning. Bringing the science of predictive analytics to machinery health is the future.
"My interest is in the future because I am going to spend the rest of my life there." (Charles Kettering)
Until next time,