
    Predictive Maintenance with MATLAB: An Engine Health Case Study

    Overview

    Do you work with operational equipment that collects sensor data? In this webinar, we will showcase an aircraft engine health example to walk through how you can utilize that data for Predictive Maintenance, the intelligent health monitoring of systems to avoid future equipment failure. Rather than following a traditional maintenance timeline, predictive maintenance schedules are determined by analytic algorithms and data from sensors. With predictive maintenance, organizations can identify issues before equipment fails, pinpoint the root cause of the failure, and schedule maintenance as soon as it’s needed. 

    Highlights

    • Organizing, visualizing, and preprocessing data
    • Extracting useful features for training predictive models
    • Using machine learning to detect faults and predict remaining useful life
    • Deploying predictive algorithms in production systems and embedded devices

    About the Presenters

    Russell Graves: Russell is an Application Engineer at MathWorks focused on machine learning and systems engineering. Prior to joining MathWorks, Russell worked with the University of Tennessee and Oak Ridge National Laboratory in intelligent transportation systems research with a focus on multi-agent machine learning and complex systems controls. Russell holds a B.S. and M.S. in Mechanical Engineering from The University of Tennessee.

    Peeyush Pankaj: Peeyush Pankaj is a senior application engineer at MathWorks, where he has been promoting MATLAB products for data science. He works closely with customers in the areas of predictive maintenance, digital twins, enterprise integration, and big data. Peeyush has 10 years of industry experience with a strong background in aviation. Prior to joining MathWorks, he worked extensively on aircraft engine design, testing, and certification. He has filed 25 patents on advanced jet engine technologies and prognostic health monitoring of aircraft engines. Peeyush holds a master’s degree in advanced mechanical engineering from the University of Sussex, UK.

    Recorded: 15 Feb 2023

    Hello, everyone, and welcome to today's webinar on predictive maintenance with MATLAB, an engine health case study. My name is Rachel, and I'm joined today by my two colleagues, Peeyush Pankaj and Russell Graves. Today we're going to be taking you through the process of going from a raw data set all the way through developing two different algorithms in MATLAB.

    The first one is a machine learning algorithm. It'll be used for detecting and classifying specific faults in an aircraft engine. So we'll train our algorithm to differentiate between healthy operation and seven different types of faults. Next, we'll develop a model for estimating the remaining useful life once we start to detect that our engine is degrading.

    This way, we can use these algorithms to make smarter maintenance decisions, scheduling maintenance at just the right time, reducing downtime, and preventing unplanned failures. But if you don't work in the field of aircraft engines, don't worry. This workflow can actually apply to any type of machine-- well, as long as you have enough data.

    So to start out, there are lots of challenges that you might encounter in predictive maintenance, and our challenge today starts with the data. Our data comes from sensors from a fleet of 128 aircraft engines with seven different failure modes under various flight conditions and flight lengths. Flight after flight, our engines eventually will run to failure.

    So we're going to take this data and use it to address a few common challenges in predictive maintenance today, such as, we have a lot of data. Our data is really big. So how do we begin to understand it? Can we wrap our brains around it? Can we reduce it while maintaining the important details for training and algorithm development?

    Our data is complex and noisy, so can we clean it up? And even if you're an expert on aircraft engines, or even data science, you might not have expertise in predictive maintenance. It really requires both. And finally, once we've constructed an algorithm, how do we deploy it? How do we put it to work so that we can make these maintenance decisions in real life?

    To do this, collaboration is really key. So we're going to enlist the help of a domain expert, someone who really understands the meaning of these sensor readings and how engines operate. The domain expert will be able to apply some meaning to the data. They will know where to start and which features are meaningful for training our algorithms.

    But we'll also need a data scientist. So this is the person who understands how to work with big data, how to work with the data structure and clean it up, and how to actually build those machine learning algorithms we'll need. So today, Peeyush is our engine expert, and Russell will be our data scientist. And they'll really need to work together to be successful on this project.

    So here's the process that Russell and Peeyush are going to follow. I'll be guiding you through each of these steps today. First, they're going to explore their data, get an understanding of its structure and what it includes. Then they'll preprocess it and reduce the data to make it easier and cleaner to work with.

    Then they're going to engineer some features from the data that are useful for training a machine learning algorithm that will detect and classify different types of faults, and that will accomplish their first goal. Finally, they'll achieve their second goal by creating condition indicators that indicate the health state of the engine as it degrades in order to train a remaining useful life algorithm.

    So with all of that, let's start out by exploring the data. So Peeyush, what can you tell us about what's in the data set? What do we know about it right now?

    All right. So what we have got in the data set is 128 different aircraft engines of the same make. And this engine data is actually split into the training set, as well as the test set. More importantly, the data is actually split into eight different sub data sets. And each of these sub data sets actually correspond to a specific sub-module failure inside the engine.

    So with respect to the turbofan engine, there are submodules: the fan, low pressure compressor, high pressure compressor, high pressure turbine, and low pressure turbine. Now, this data set is more about the flow parameters-- the airflow parameters, more specifically-- across the engine length.

    And we'll certainly look into what sensor readings do we have in the data set, but I want to mention that each of these flights that the engine actually takes is being classified into a short duration, a medium duration, and a long duration flight. And this is what is represented as flight class one, two, and three in the data set.

    Now, we also have multiple different flight phases actually represented in the data. And after diving into the data, we realized that these flight phases are the climb, cruise, and descent phases. And we do not have any information about the takeoff and ground idle phases. Now with that background, let's quickly load the first file in the data set into memory by running this section of the script.

    Now, once I run the next section, basically, we can look at the number of sensor readings that we have. So we have things like the temperature values across the engine length for the airflow. Also, the pressure fields are represented, a static pressure and the total pressure both, as well as, we have got the flow parameters.

    And then we have some more information about the stall margin across the fan, low pressure compressor, and high pressure compressor modules. Apart from that, there are some operational settings represented, such as the unit number, cycle number, and the flight class information. And then we have a specific column, or a specific feature, called health state, which is a binary field.

    It represents whether the submodule degradation has actually been triggered or not. And apart from that, we have got some more information related to the altitude, Mach number, and the throttle resolver angle inside the aircraft engine. Now, when we look at the total number of rows, we have got nearly 5 million rows of data in this first file alone.

    So you can imagine that with the whole data set, we have a big data challenge at our hands.

    Yeah, thanks for the overview, Peeyush. Sounds great. But it looks like this data is in something called parquet format. So Russell, can you talk to us about what that means? And how is that data structured?

    Yeah. Thanks, Rachel. So as Peeyush mentioned, this data set's been stored in the parquet file format. Now, this is a typical format you find when you're working with large data sets, and I particularly like it for a few features that are present in MATLAB that let us really access and dive into that data a little bit more effectively and efficiently than we would otherwise.

    And when we got the data, it was stored in H5 file format, and that's fine. That's pretty typical, as well. I prefer parquet. MATLAB works with both, and I can pretty easily read and convert one format to the other. So where Peeyush read in one file, I've kind of given him this dropdown menu to be able to parse through the different files we have available.

    I can use a little asterisk, a little wild card, to read in all of our data, and we can get a higher level overview of the entire data set. So to do that, I first can use the parquet info, this little get parquet info function that I built, to extract the parquet info from one file just to kind of explore it. Again, you can see the variables that Peeyush went over. We can also see these little row groups.

    So row groups are a structure present in parquet files that lets us sort of intelligently partition the data out into individual sections. So in this case, each row group is one flight cycle. So that means that if I call read to read in one segment of the parquet file data, it would pull in one flight. This is just set up to allow us to very intuitively access and work with that data.
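
    As a rough sketch of what Russell describes, not the exact webinar script, inspecting a parquet file and reading a single row group might look like this in MATLAB (the file path is a placeholder):

        % Inspect one parquet file (hypothetical path) and read a single row group,
        % which in this data set corresponds to one flight cycle.
        info = parquetinfo("data/engines_file1.parquet");
        disp(info.VariableNames)      % sensor and operational columns
        disp(info.NumRowGroups)       % one row group per flight

        % Read just the third flight instead of the whole file
        % (RowGroups name-value argument assumed available, R2022a or later)
        oneFlight = parquetread(info.Filename, RowGroups=3);
        head(oneFlight)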

    Next, we can use what we call a data store, a data type in MATLAB that basically points MATLAB to where your data lives. In this case, we're pointing it to all of the parquet files in our data folder. So we're telling MATLAB we want to deal with all of that. And the next thing we're doing after we've established this parquet data store is to tell MATLAB that we want to treat that data store, all the data that's there, as if it's a table.

    And so what we're doing is we're calling tall, and MATLAB is going to create what's called a tall table. So a tall anything is basically indicating that it's of the data type table, but it is too big to fit entirely into memory. In this case, it's something in the order of 18 gigabytes. I might be able to read that into memory, but that might not be the most efficient or effective way to work with it.

    So you can see here I've just called head to get an idea for what that tall table looks like. And as you expect, it's got all the columns that we want. It's got all of our data coming in. It's coming in tabular format. And now, really importantly, without too much rework in my code architecture, my typical coding practices, I can now operate on that 18 gigabyte data set, again, without really altering what I'm used to.
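
    In code form, that datastore-plus-tall workflow might look something like the following sketch (the folder path is an assumption):

        % Point MATLAB at every parquet file in the data folder, then treat the
        % whole ~18 GB collection as one out-of-memory tall table.
        pds = parquetDatastore("data/*.parquet");
        tt  = tall(pds);
        firstRows = gather(head(tt));   % materialize just a preview of the first rows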

    So in this case, I'm going to use a MATLAB function called group summary, which is going to give me a roll up of some of the different columns present in the data set. In this case, I want to roll up a file name, unit, flight class, and cycle, and I want a roll up of the file name, unit, failure mode, and cycle. And when we're working out of memory data, MATLAB uses what we call deferred execution.

    So it's very important to know that when I tell it that I want the flight class summary is equal to this group summary, it's not going to immediately give me that. It's going to wait until I call this gather command. And gather is asking MATLAB to go out and execute all of the different calculations, or all of the different functions that I've asked for in relation to that tall table.

    And it's going to optimize how it goes about gathering so that it has to go through the data set the minimum number of times possible. So again, letting you worry about what you want out of the data set, and let MATLAB handle the behind the scenes of how those calculations are carried out.
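
    A minimal sketch of those two roll-ups and the deferred execution, assuming column names like filename, unit, flight_class, failure_mode, and cycle:

        % Define two group summaries on the tall table; nothing is computed yet.
        flightClassSummary = groupsummary(tt, ["filename" "unit" "flight_class" "cycle"]);
        failureModeSummary = groupsummary(tt, ["filename" "unit" "failure_mode" "cycle"]);

        % gather triggers one optimized pass over the data for both results.
        [flightClassSummary, failureModeSummary] = gather(flightClassSummary, failureModeSummary);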

    So once we've gathered our two summaries into memory, we can get an idea about the data spread. So one of the things you might want to know if you're dealing with, especially, a classification algorithm is, is my data imbalanced? Do I have one class that has a ton more samples than any other? And in this case, our data is pretty even. The most we have is this HPT and LPT failure, but I don't foresee any significant issues from the spread.

    And if we find that our algorithm is not performing as we expect when we design the fault classifier later, we can always go back and revisit this. And MATLAB has got some neat ways we can deal with the data imbalance if necessary. In this case, it shouldn't be. The next thing that Peeyush pointed out is that we have three separate flight classes. So there's short flights, there's medium flights, and there's long flights.

    And one thing we might want to consider, especially when we're trying to maybe come back and normalize this data later, is the different length of flight, the different flight durations-- do those have an impact on how we go about normalizing the data? And in this case, since we have a pretty even spread of short, medium, and long flights, it might make sense to, and indeed we will later, normalize this data based on the different flight classes we have.

    Now we have a pretty good understanding of the types of flight classes and failure modes we have in the data. I want to pass this back to Peeyush, but I want to give him the ability to really quickly access different failure modes, so that he can apply some domain expertise to help us understand what additional variables we might want to create from the signals we have.

    So to that end, I provided a little dropdown here that lets him select from the different failure modes. And what we're going to do is leverage another built in parquet feature called predicate push down. And what this is going to let us do is establish a row filter that's going to go through the entire parquet data store and find just the files, just the flights, that are experiencing the selected failure mode.

    So we're filtering on that failure mode, and using that predicate push down enables us to really rapidly go in and grab all the data we want while ignoring the other data that we don't really want to deal with at that time. And that's going to give Peeyush the ability to quickly and effectively analyze in more detail one or more of these failure modes.
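
    A hedged sketch of that predicate pushdown, assuming a failure_mode column and an "HPT" label (the actual column names and codes in the data set may differ):

        % Build a row filter and attach it to the parquet datastore so that only
        % flights with the selected failure mode are ever read from disk.
        pds = parquetDatastore("data/*.parquet");
        rf  = rowfilter("failure_mode");
        pds.RowFilter = rf.failure_mode == "HPT";   % pushed down into the parquet reader
        hptData = readall(pds);                     % only matching rows come back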

    Thanks, Russell. So is there anything else that we need to know about this data set before we start processing it and building our algorithms, Peeyush?

    So the first important step in data analytics would be to visualize the data and understand if the data is actually making sense to you or not. Now in this use case, we are looking at a gas turbine. And I want to look at this data set and evaluate some basic thermodynamics principles to understand whether the data is actually making sense to me or not.

    So any domain expert would know that a gas turbine is actually based on the Brayton cycle principle. So here we have a compressor section in the aircraft engine and a turbine section in the aircraft engine. And the compressor's job is to compress the air, increase its pressure, and then add some enthalpy to the compressed air in the form of combustion, and then allow the exhaust gases to expand onto the turbine, thus generating a lot of work.

    So in terms of basic thermodynamics principles, we have got the P-v diagram, and we have a very generic T-s diagram, as well. So when the air actually gets compressed in the compressor, its pressure actually increases from P2 to P3. And similarly, when the air actually expands onto the turbine, its pressure actually decreases from P4 to P5.

    And not only the pressure-- likewise, the temperature also decreases from T4 to T5. Now I want to look at this data set. And since it is more about different submodule failures-- so if I'm looking at some module, for example, HPT, the high pressure turbine-- what I want to understand is, as the engine actually ages, or the high pressure turbine actually ages, does the temperature drop?

    That is, T4 to T5 on this T-s diagram. Does this temperature drop start to decrease over a period of time, for the same amount of inlet temperature, T4? So that is, for example, what I'm doing in this particular section of the script. I'm just evaluating, what is the temperature drop with respect to similar levels of speed and temperature? And then plotting it in the form of a scatter plot.

    So let's see, what is the output of this particular section? So here we can see that when the engine is actually new, the relative temperature drop is in this blue regime. But as the engine actually starts to age and it starts to hit the end of life, we can definitely see that the temperature drop also starts to reduce, meaning that less work is actually generated out of the turbine module.
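
    As an illustration of the derived feature Peeyush describes, a sketch with hypothetical column names (T_HPT_in, T_HPT_out, cycle) might look like this:

        % Relative temperature drop across the high pressure turbine, plotted
        % against flight cycle; color shows engine age (blue = new engine).
        oneEngine = parquetread("data/engines_file1.parquet");   % hypothetical file
        dT_HPT = (oneEngine.T_HPT_in - oneEngine.T_HPT_out) ./ oneEngine.T_HPT_in;

        scatter(oneEngine.cycle, dT_HPT, 6, oneEngine.cycle, "filled")
        xlabel("Flight cycle"), ylabel("Relative HPT temperature drop"), colorbar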

    And likewise, for different submodules in the engine, we can consider things like pressure ratios and also the change in flow parameters, as well. So these would be some domain-related features that we would like to add to the data set going forward.

    Great. Yeah, it sounds like the team now has a good understanding of what's actually in the data set, and the data set that's in this parquet format. So let's just recap. So the team has now explored their data. They've explored some relationships that they might want to investigate based on Peeyush's domain expertise that will be helpful for building those predictive algorithms, and they've really dug into this big data.

    So next, the team will actually need to preprocess the data and make sure that it's clean and useful for extracting features. This data set is very big. So Russell, how are you planning on managing this data? Is there really any way to make the process faster?

    We can definitely speed this up, Rachel. So in cases where we have large amounts of data, it often benefits us to start small. We can prototype on a small subset of the data all of our preprocessing steps, and get everything in order before we go to scale up.

    So in the case of this data set, I want to give Peeyush and myself the ability to drill down through a single file, to a single engine unit, and then finally, to a single flight cycle of data. So here we have one flight of data loaded into MATLAB, ready for us to prototype our preprocessing steps on.

    So the first step in data preprocessing would be to label the different flight phases that we have in the data set. So across all the operational cycles, we will actually be labeling these different flight phases, because when we are actually monitoring the health state of different submodules, it is much easier to deal with steady-state data compared to transient data.

    And that is where one would understand that the climb mode operation and the descent mode operation is more likely to be a transient operation where you are continuously having a change in altitude. Along with the change in altitude, you have the ambient air present around the engine itself changing, meaning that the pressure and temperature values will be different at different altitudes.

    Along with that, your rotor speeds will also be changing as you are climbing or as you are descending in a flight mission, whereas something like a cruise mode operation is more likely to be a steady-state operation. And that is where, with the steady-state portion of the data, we can track the health degradation of any particular submodule more accurately.

    And then I also want to provide Russell an option to actually pick and choose which mode of operation, or which flight phase of operation, he would like to use when going for feature engineering or for the classification approach later on. So with that, let's label these flight phases.

    So what I'm doing over here is actually monitoring the change in altitude with respect to the different timestamps that we have in the data set and storing it in the parameter called altChange. Now, you also want to smooth the data, because we have some transient portions present in the data. So that is where, in MATLAB, we have a lot of live tasks around data preprocessing.

    And from this gallery, I am picking and choosing this data smoothing live task. And basically, it actually gives me a UI window where I can just select my input data-- and that is altChange for me-- and select my smoothing method. So I am picking something very simple called moving mean.

    And once I run this particular section, you will see that my output is actually overlaid on top of the input data. So I did not write any code to actually plot these charts. It is all automated. So you can see that, at the beginning, my altitude change is mostly positive. That is indicating the climb mode. Then in the middle, I have the cruise mode, where the altitude change is near about zero.

    And then in the descent mode, I have mostly some negative altitude change. So looking at this chart, I can formulate some very simple data filtering operations around altitude change. So we can mention maybe that if your altitude change is actually greater than a value of 2, then it is most likely a climb operation. If it is less than a value of minus 2, then it is a descent operation.

    And somewhere in the middle, you have a steady-state operation, or a cruise mode operation. So let's use this logical filtering option. And then we can see that the algorithm is able to label the flight phases really well. Now instead of just labeling the flight phases on one cycle, we can see how the algorithm actually performs on a larger number of cycles.
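
    The flight-phase labeling logic Peeyush describes, written out as a sketch (the altitude column name, the smoothing window, and the +/-2 thresholds follow the narration; exact names are assumptions):

        % Change in altitude per timestamp, smoothed with a moving mean
        altChange = [0; diff(oneFlight.alt)];
        altChange = smoothdata(altChange, "movmean", 15);   % window size assumed

        % Simple logical filtering on the smoothed altitude change
        phase = repmat("cruise", height(oneFlight), 1);
        phase(altChange >  2) = "climb";     % mostly positive change
        phase(altChange < -2) = "descent";   % mostly negative change
        oneFlight.flight_phase = categorical(phase);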

    So here I am picking a random set of cycles-- 60 cycles-- and running them through a for loop. And here is the output of that for loop. We can see that, mostly, this algorithm is actually doing a really good job of classifying the different flight phases, except that maybe here and there, there is some mislabeling.

    But I am really satisfied with this operation already. And wherever there are some mixed results, that is where throwing in some more parameters, like rotor speed changes and some sensor value changes, could really help the algorithm improve. So I think this is a pretty good result already. So with that, I will hand it back over to Russell to carry out some more data preprocessing operations.

    Yeah, Peeyush, that flight phase extraction algorithm looks pretty cool. So before we scale it up to the full data set, one other thing to consider is, how we're going to make sure that the footprint of this data remains reasonable, even as we continue to either collect data from the sensor arrays we have, or generate artificial data.

    Whichever direction we're moving in, we've already got 18 gigabytes, and this could quickly balloon. So one thing we might want to do is downsample this data. So a typical method for doing this might be just to take every 10 samples and retain those, or every four samples. But in this case,

    I'm afraid that if we did that, we might throw out some really useful information or some really key data that we need to differentiate between different failure modes, or identify the remaining useful life. So rather than go about it that way, what I'm going to try to do here is use a change points detection live task to extract change points from each signal.

    So in this case, I can see here that this is T30, and I've kind of tailored the thresholds using the live task again to get between 40 and 60 change points for each of these different signal variables. Then I can combine the time indices of these change points and extract that data and use it as our reduced data set.

    So in this way, we're trying to retain key pieces of information from each signal and make sure that we maintain enough information in our reduced data set to be able to really effectively perform our fault classification and our remaining useful life detection.
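
    A sketch of that change-point-based reduction, using findchangepts (Signal Processing Toolbox) in place of the Find Change Points live task; the 60-change-point cap approximates the threshold tuning described above:

        % Pool change-point indices across every numeric signal column, then keep
        % only those rows as the reduced, information-rich subset of the flight.
        sigTable = oneFlight(:, vartype("numeric"));
        keepIdx  = [];
        for v = string(sigTable.Properties.VariableNames)
            ipt = findchangepts(sigTable.(v), MaxNumChanges=60);
            keepIdx = union(keepIdx, ipt);
        end
        reducedFlight = oneFlight(keepIdx, :);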

    So when we go to scale this up, what I did first is I actually went over here and converted these tasks, both Peeyush's task and mine, to editable code, and refactored them. Because when we go to apply this to the full data set after we've prototyped at small scale, we're going to use something called a data store transform.

    And in that workflow, what we're doing is we're telling MATLAB about our two data stores. Again, we're pointing it to where the data lives, and then we're establishing a transform function. So this is where those refactored live tasks come into play. If I actually go and open this data store transform function I've created, we can see the first half of this is our flight phase identification that Peeyush put together, and the second half of this is our change points reduction that I was working on.

    With both of those in the same transform function, I can now tell MATLAB that I want to transform the existing data store based on that function, and then I can call readall with the UseParallel flag set to true, and MATLAB is going to handle all the back-end stuff, go out and execute that transformation pipeline, use that function to manipulate and change my data, and then give me the result.

    I can then write that data back out, in this case, to another parquet file, as my reduced data set. And we go from a footprint of about 18 gigabytes down to a footprint of 7 gigabytes. So we're cutting that by a little more than half.
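
    Scaling that up with a datastore transform might look like the following sketch (the transform function name and output file path are placeholders):

        % labelPhasesAndReduce would contain the refactored live-task code:
        % flight-phase labeling followed by change-point reduction.
        pds = parquetDatastore("data/*.parquet");
        tds = transform(pds, @labelPhasesAndReduce);

        reducedData = readall(tds, UseParallel=true);   % runs the pipeline in parallel
        parquetwrite("reduced/engineDataReduced.parquet", reducedData);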

    All right. Thanks, Russell. So let's take a minute to recap what's been done so far. So the team started out by preprocessing a small subset of the data, and then they scaled up to the full data set. Peeyush actually smoothed out the noise in the data set, and then separated it out by flight phase.

    Then Russell extracted change points from each of the flight trajectories, which drastically reduced the size of the data set, while still maintaining the key information needed for training an algorithm. So now with this newly reduced data set, the team can start extracting features and training a fault classification algorithm. So Russell, can you talk us through this process?

    Absolutely. Feature engineering is one of the most important steps in any AI or machine learning design workflow. So in this case, we've got our signal variables, which are coming from the data set. We know that we're trying to distinguish our labels between these different failure modes. And we have our derived variables. So these are some variables that Peeyush, our domain expert, pointed us towards.

    For instance, we should look at things like the temperature difference across the high pressure turbine that he explored earlier. So what we're going to do is load in our data set, and then I'm quickly, again, using this sort of preprocess-small-and-then-move-to-the-full-data-set workflow. I'm going to start by identifying one flight of data, so one flight cycle.

    I'm going to go for the cruise phase, again, using domain expertise here. Peeyush let me know that the cruise phase is probably the best place to look when we're trying to distinguish between the different failure modes. So I'm pulling that flight phase data out, and I'm going to quickly hijack this function called group stats, which allows me to give MATLAB a list of summary statistics that I want to extract for each flight, for each of those signal variables, and for each of the derived variables.

    And you can see here the table that comes out at the bottom. We've got about 360 different statistic features. So the next thing we're going to do is scale this up and extract these features for the cruise phase for the entire data set. And you see the head here, and we can see some of it. Here's that high pressure turbine temperature difference, and all the different statistic features associated with it.
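
    The per-flight feature extraction Russell describes could be sketched with grpstats from Statistics and Machine Learning Toolbox (the grouping variables, statistics, and data variable names here are assumptions):

        % Keep only cruise-phase rows, then compute summary statistics per engine
        % unit and flight cycle for each signal and derived variable.
        cruise = reducedData(reducedData.flight_phase == "cruise", :);
        cruiseFeatures = grpstats(cruise, ["unit" "cycle"], ...
            ["mean" "std" "min" "max" "range"], ...
            DataVars=["T30" "dT_HPT"]);   % only T30 is named in the webinar; others assumed
        head(cruiseFeatures)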

    I've then gone in and I've made our labels a little bit more legible or readable for both us and MATLAB. We're leveraging something called a categorical variable here. So I'm able to specify different categories such as healthy, high pressure turbine failure, low pressure turbine failure, so on and so forth, so that I can easily read that, and MATLAB also understands what we're talking about.

    We're kind of speaking the same language. And now that we've got all these features and we have all of our labels, the next thing we want to do is rank them. So in this case, the data set came with a training or dev set and a testing set. There are methods that we have for splitting this up, or for partitioning the data set, if it's not done already.

    But in this case, we can go ahead and jump into another one of these apps. These are really powerful graphic interfaces for enabling you to move through pretty complicated workflows, basically just by filling in the blanks. So in this case, I'm going to use the Classification Learner app, and we're going to investigate which classifier style or type is going to be the most effective for this particular data.

    So once we open the app, we can start a new session up here. And we can use data that's present in the workspace. So I'm going to use my cruise dev data. So this is the cruise data that I extracted, and I'm going to tell MATLAB that I want to try to differentiate between different failure modes. And I can go down and manually exclude certain data from this data set.

    Like, I don't need unique ID, or remaining useful life, or flight class, and I can start a session here. And then directly in this app, if I haven't already reduced the number of features that I'm looking at, I can move into feature selection here. It's critically important to understand which features you need to retain.

    For example, if I'm trying to load this onto an embedded device, I might only want to retain the top 30 features here, because that might be all I have space for, or compute power to deal with, on that edge device. And I don't want to have all 361 features to deal with. So once I remove all but 30 of those features, I can then come up here and ask MATLAB to please train every one of these classifiers.

    In that dropdown, we have a ton of different built-in classification algorithms. I might not know which one offhand is going to perform the best. So I can hit this with a big stick and train all of them, and just see what shakes out. So I'm just going to hit this button, and MATLAB is going to use my existing parallel resources-- just my local machine in this case-- to speed up the training for each one of these different classifiers.

    And so once everything's finished training, it kind of looks like this. And we can sort this by accuracy. So in this case, we're using the validation accuracy. And we can see here that the top performer is our medium neural network. What we could do next is we can go in and investigate each of these individually. We can bring up things like typical plots you might find in classification, like a confusion matrix.

    So for those of you that might not be familiar, across the x-axis here we have what the algorithm thought the failure mode was. In this case, in the second column, it thought it was experiencing an HPT failure. On the y-axis, we have what the true class was. In this case here, we can see that 11 times it was healthy, but we thought it was an HPT failure.

    The next thing we might want to do is to actually test these algorithms. So feed them some data they haven't seen before, and see if the performance is the same. In this case, we've got this test table, which is in an identical format to the cruise data table that we used to train.

    And MATLAB is going to recognize that, and it's going to automatically select our failure mode. And if we scroll down to the bottom here, I can just show you that it de-selected the same signals that we didn't include the first time. So everything's identical. We hit import, and now we're ready to test directly in the app, again, without leaving your sort of one stop shop for designing a classification learner.

    And we can just hit test all, and that's going to run through all of these, again, using parallel. And it's going to test them all now with the unseen, or previously unseen, data. Now that the test is concluded, we can see that the accuracy values have switched from accuracy validation to accuracy test. And both of the neural networks, medium and narrow, are performing in that same 94% band.

    In general, once you're done in the app and you're happy with where this classification algorithm is, then we can come up here to our export options. We can either export the model directly, so the trained classifier, get that out into MATLAB to do more things with it, export it for deployment, et cetera.

    Or we can generate a function, which is called trainClassifier. We could feed it new training data if we had more data-- if we're continuing to either simulate data or acquire it from sensors. Then we can feed that new data into this trainClassifier function and get the exact same medium neural network, for example, back out of that.
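
    For reference, the function the app generates follows a standard template, so using it might look roughly like this sketch (the table names are assumptions):

        % Retrain the same medium neural network on (possibly new) training data
        [trainedModel, validationAccuracy] = trainClassifier(cruiseDevData);

        % The returned struct carries a predictFcn handle for scoring new data
        predictedFailureMode = trainedModel.predictFcn(cruiseTestData);
        confusionchart(cruiseTestData.failure_mode, predictedFailureMode)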

    At this point, the team has achieved their first goal. They developed a fault classification algorithm that can identify faults with about 93% accuracy. But now, they still want to understand how the engines degrade over time to be able to predict the remaining useful life of the engines. This way, they can schedule maintenance before the failures occur. So Peeyush, can you talk us through how to do this?

    So before we go ahead and predict the remaining useful life, let's first talk about what remaining useful life is. So we understand that any particular component actually goes through a performance deterioration over a period of time. And remaining useful life is nothing but, from the current operational state, how many operational cycles you have left before that component is no longer in a usable condition.

    So your remaining useful life could be represented as a time value. It could be represented as a distance value, as well, or you could just represent it as the number of operational cycles you have left. So with that background, let's bring in the feature table that Russell had generated in the feature engineering section.

    And in here we are bringing both the training data, as well as the test data. And what you will see, in this feature table, we have a lot of statistical information from the time series signal. And then we have some domain engineered features, such as pressure ratios, temperature drops, flow ratios, et cetera. So that is what is there in the training set and the test set.

    Now, in this particular data set, we have multiple flight classes also present. And it is important to understand that engines which are running in the field for longer durations-- that is, flight class three-- are likely to fail much faster compared to the engines in flight class one, just because they are spending more hours in the field.

    So that is where we will normalize the training data set with respect to the flight class information, by using a very simple command in MATLAB called normalize. And the mean and standard deviation information from our training set could be used to normalize our test data, as well. So here we have normalized both the training data and the test data.
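
    A sketch of that per-flight-class normalization, reusing the training mean and standard deviation on the test set (table and column names are assumptions):

        % Feature columns = everything except identifiers and labels (assumed names)
        featureVars = setdiff(string(trainFeatures.Properties.VariableNames), ...
            ["unit" "cycle" "flight_class" "failure_mode" "RUL"]);

        for fc = 1:3   % short, medium, and long flight classes
            trainRows = trainFeatures.flight_class == fc;
            testRows  = testFeatures.flight_class  == fc;

            % z-score the training rows, capturing the mean (C) and std (S) used
            [normTrain, C, S] = normalize(trainFeatures{trainRows, featureVars});
            trainFeatures{trainRows, featureVars} = normTrain;

            % apply the same training statistics to the test rows
            testFeatures{testRows, featureVars} = ...
                normalize(testFeatures{testRows, featureVars}, "center", C, "scale", S);
        end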

    And now, let's look at the different failure labels that are actually present in the data set. So we would see that there are 1,300 instances, or 1,300 cycles, of healthy data captured in our training set. And then there is a good mix of multiple other failure modes also captured in our data set. So this looks to be a pretty well-balanced data set to me.

    And that is where we can go ahead and use it for our RUL model training. Now for remaining useful life estimation, it is important to tag a health indicator corresponding to the failure that you are trying to predict. So in the data exploration script, we had established that the HPT module temperature drop could be used as one of the features for monitoring the health state of the HPT module, right?

    Likewise, bringing in my domain expertise, I went ahead and actually used the fan stall margin, the high pressure compressor relative temperature increase across the module, the LPC stall margin, and the LPT relative temperature drop as the health indicators for each of these submodule failures.

    Now once I plot these degradation curves with respect to the number of operational cycles, one would notice that for the HPT, fan, HPC, and LPC modules, we see a very clear trend towards the end of life. But nothing is actually making sense for the LPT degradation. And that is where, so far, I have used my domain expertise and just used one particular feature from the entire feature table in order to get to these degradation curves.

    But let's say that you are not a domain expert. That is where you can actually use something called a trendability analysis from our Predictive Maintenance Toolbox. What it does is it assigns a score, a trendability score, to multiple different features actually present in the data set. And the higher the score, the more trending that particular parameter in the data set is.

    So you can see that the highest scores here are assigned to the HPT module pressure ratio, even though the data that we have pulled in for trendability analysis corresponds only to the LPT module failure. So that is where we can go ahead and pick this HPT pressure ratio as the parameter for estimating the remaining useful life for the LPT module.
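
    A hedged sketch of that trendability ranking from Predictive Maintenance Toolbox (the input table layout, lifetime variable name, and candidate feature names are assumptions):

        % One trendability score per candidate feature; higher = stronger trend
        % with operating cycles, so a better health indicator candidate.
        candidateFeatures = ["HPT_pressure_ratio" "LPT_dT" "LPC_stall_margin"];  % assumed names
        scores = trendability(lptRunToFailureData, "cycle", candidateFeatures);
        disp(scores)   % pick the highest-scoring feature, e.g. the HPT pressure ratio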

    And as I plot the degradation curve, you will notice that now it is making much more sense here, our data is actually much more trending compared to what we had got earlier for the LPT module degradation. Now in this data set, or in this use case, I'll be demonstrating how to predict the remaining useful life for only one specific submodule failure, and that is going to be the high pressure turbine module, HPT module.

    But the same effort can actually be duplicated on other modules, as well. So here, I am actually extracting the health indicator values for our test data as well, corresponding to the HPT failure. And now, before I go ahead and train our RUL model, I would like to pull up this HPT degradation sample.

    So this is the training data set that we'll be feeding to the RUL estimator model. And you will notice that in this particular data set, we have got information from 19 different engines. And corresponding to each engine, we have the number of cycles and the corresponding health state value-- that is, the HPT temperature drop that we have kept as the health indicator.

    So this is what is there in the training data set. Now, MATLAB actually provides you three different families of RUL estimator models. Those are similarity-based models, survival models, and degradation models. And it really depends upon what kind of data you have at hand for choosing and training one specific family of model.

    So if you have a complete run-to-failure history of the data, you can pick something like a similarity-based model. If you have just the survival data, that is, just the failure data, that is where you can train a survival model. If you have some operational data along with knowledge of the safety threshold, that is where you can train a degradation family model.

    Now, in each of these families, we have multiple RUL estimator models. You can go ahead and read more about those in our documentation. In here, it now seems a pretty obvious choice that we can go ahead and pick a similarity-based model, just because we have a complete run-to-failure history.

    But it is also important to note that in this data set, we have multiple different types of failures actually happening simultaneously, meaning that both HPT and LPT failures could be happening at the same time.

    And that is where, when I plot this HPT failure and LPT failure data for the same engines across the number of operational cycles, you would notice that some of these engines are actually failing within the safety threshold zone for the HPT failure, whereas some modules may have witnessed only partial failures and then are actually failing for the LPT module. So that is where picking something like a similarity-based model is something that we can rule out, and we go ahead with exponential degradation model training.

    Now, training an RUL model is very straightforward in MATLAB. We need to mention the model name. That is the exponential degradation model that we are picking over here. And I have just given some model definitions, like my lifetime unit is captured in a number of cycles.

    And then, I'm executing a fit command over here, where I'm throwing in the HPT degradation sample to this model for training, along with providing the information in terms of number of cycles and the degradation values stored in this particular column. And then for predicting the remaining useful life, we have used the lower extreme of the safety threshold that we had seen previously.

    So this is a value of minus 3.6. And then for predicting the remaining useful life, it is one very simple function in MATLAB. That is this predict RUL. So this predict RUL takes in the trained model and then gives you the remaining useful life estimates, along with the confidence interval and the probability density function. So let's go ahead and run this section, and it will bring up a very nice animation for us.
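
    Putting those steps together as a sketch (table and column names are assumptions; the -3.6 threshold follows the narration):

        % Configure and fit the exponential degradation model on the training histories
        mdl = exponentialDegradationModel(LifeTimeVariable="cycle", LifeTimeUnit="cycles");
        fit(mdl, hptDegradationSample, "cycle", "HPT_temp_drop");

        % At run time, update with each new health-indicator observation, then
        % estimate RUL against the lower safety threshold.
        threshold = -3.6;
        update(mdl, newObservation);   % table row with "cycle" and "HPT_temp_drop"
        [estRUL, ciRUL, pdfRUL] = predictRUL(mdl, threshold);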

    So in this animation, you would see that there are three different subplots. On the top left, you have the health indicator value, which will degrade over a period of time. So right now, it is running flat, and the remaining useful life estimate is just staying constant-- that is the solid line that you see over here, at about 70 cycles.

    Once the health indicator actually crosses a value of zero, that is where the remaining useful life estimation actually starts. And you see that my remaining useful life estimate starts moving pretty close to the ground truth value. And that is what is being represented on the top right chart over here, where the orange curve is the remaining useful life estimate, and the blue one is the ground truth.

    So you can see that in just a couple of cycles, the model was able to learn the degradation rate. And then from there on, it has been pretty close to the ground truth value. And that is how you can actually predict the remaining useful life for any subcomponent degradation. While I showed you an example for HPT module failure, it can be duplicated for all of the submodule failures in this use case, as well.

    Great. Thanks, Peeyush. So let's just recap what happened there. So just to accomplish that second goal of training a remaining useful life algorithm, Peeyush started by exploring just one degradation mode at a time. And he actually used his expertise with aircraft engines to identify useful condition indicators, the ones from earlier, that had good trendability to train that exponential degradation model with some known failure thresholds.

    And in the end, he was able to visualize the results using an animation that maybe he could take and deploy to a dashboard or something that could be shown to a maintenance crew in real time to see how much life an engine has left. So working together, the team has satisfied both of their goals for today.

    They developed a fault classification algorithm using machine learning, and developed an algorithm to estimate remaining useful life. Ultimately, you probably want these algorithms to be used to make real maintenance decisions. So that means putting them into operation. And there are many options when it comes to deploying algorithms in operation from MATLAB.

    So you could deploy directly to, say, an embedded system by automatically translating the high-level MATLAB code into, say, C++, if you need to run on hardware like a PLC or microcontroller, something with limited memory. Or for other applications, you might generate CUDA code for GPUs, or HDL code for FPGAs. There are a lot of options here.

    You could also compile your algorithms and apps directly to desktop, web, or enterprise environments, and then integrate those as custom software components, such as .NET or Python packages, within your existing systems. So really, at the end of the day, MATLAB is flexible and allows you a lot of options to get your algorithms up and running.
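
    As one concrete example of the deployment options mentioned above, generating C/C++ code from a prediction entry-point function with MATLAB Coder might look like this sketch (the entry-point function predictFaultClass and its input size are hypothetical):

        % The entry-point function would typically load a compact model saved with
        % saveLearnerForCoder and call predict on an incoming feature vector.
        featureVector = zeros(1, 30);   % example input: 30 selected features
        codegen predictFaultClass -args {featureVector} -config:lib -report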

    And if you aren't quite sure how to get started, we can actually help. We know it can be challenging to do predictive maintenance, and our application engineering team-- people like Russell and Peeyush-- can help get you started with guided evaluations, working directly with your data.

    And we offer lots of professional training courses in topics like MATLAB fundamentals, signal processing, app development, and predictive maintenance. Also, particularly for predictive maintenance, our consulting team can work with you to help you really achieve those results a lot faster by ramping up your team and making sure that you're becoming self-sufficient with the tools.

    But really, the bottom line is, we have a lot of resources to help you. So please feel free to reach out. We're here to help you be successful. So thank you for attending today, and we really hope you learned something from this webinar. And we will now take questions.