
    Low Code Data Analysis with MATLAB

    Overview

    Learn how you can analyze and model data using interactive tools in MATLAB. Through live demonstrations and examples, you will see how you can solve many steps in a data analysis workflow without writing any code yourself. The interactive tools can then generate the MATLAB code you need to programmatically reproduce your work. This session is aimed at those who are new to MATLAB. However, experienced MATLAB users will also benefit, as the presenter covers new tools, tips, and tricks from the latest releases of MATLAB.

    Highlights

    • Accessing data from many sources (files, other software, hardware, etc.)
    • Using interactive tools for data visualization, cleaning, and modeling
    • Automatically generating the code to replicate your interactive work
    • Capturing your work in easy-to-write scripts and functions
    • Sharing your results with others by automatically creating reports
    • Growing your programming skills beyond the basics

    About the Presenter

    Adam Filion is a Senior Product Marketing Manager at MathWorks where he focuses on building demonstration and teaching materials for the MATLAB platform. You can also find him teaching the Practical Data Science with MATLAB specialization on Coursera and in many other MathWorks videos. He has a BS and MS in Aerospace Engineering from Virginia Tech.

    Recorded: 15 Feb 2024

    Hello, and welcome to today's webinar on low code data analysis with MATLAB. My name is Adam Filion, and I'll be taking you through the presentation today. I'll start with a little bit of my background.

    I have been with MathWorks for 12 years now in many different roles, most of them focused on data science. These days, I'm in our product marketing group, where I spend most of my time building demonstration and teaching materials to help people learn MATLAB. In fact, you might know me from our specialization on Coursera called Practical Data Science with MATLAB, where I was one of the instructors.

    In today's presentation, we'll be talking about low code data analysis. And in just a minute, I'll talk more about what exactly I mean by that. But first, I wanted to give you an idea of what it is we're building towards in today's demonstration.

    So I'm going to tab over to a report that was automatically generated for us by MATLAB. In this report, you can see different sections for the kinds of analysis we're going to do today. You can see things like importing data from Excel. You can see some MATLAB code, as well as the results, including a number of charts that we generate. And we'll even get into a bit of machine learning.

    So our end goal is to get to the point where we can click one button in MATLAB, generate a report from the work that we've done, and along the way, write practically no code ourselves. So that's the goal we're working towards. But before we dive into MATLAB, I'm going to go back to the slides for just a minute.

    So a brief agenda-- I'm going to spend a few minutes talking in more detail about what we mean by low code data analysis. Then we'll hop over to MATLAB, where we will spend most of our time today. And at the end, we'll wrap up with some learning resources, so you know where to go after today's presentation.

    So the biggest idea I want you to walk away with is that MATLAB simplifies the data analysis workflow with low code tools. So what do we mean by that? Well, we think of the data analysis workflow as happening in three big steps. And the first step is access-- just getting access to your data wherever it lives.

    Today, we'll be working with an Excel file. But your data could be coming from anywhere. It could be coming from other file types like Parquet. It could be living in the cloud. It may be coming from databases or being read directly from hardware. But wherever it is, your first step is to just read it into MATLAB.
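
    For instance, if your data were sitting in a Parquet file instead of Excel, a single call would bring it in. A minimal sketch, assuming a hypothetical file named sensors.parquet:

        % Read a Parquet file into a MATLAB table
        % (parquetread ships with base MATLAB)
        T = parquetread("sensors.parquet");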

    Once you've brought it in, the second step is explore and discover. So this is ideally where you want to be spending the majority of your time, doing things like generating lots of different plots, running different analyses, maybe building some models, or even building a nice graphical front end to your application. But eventually, you'll want to share the results that you've built with other people. Today, we'll look at an example of building a report. But you can also take MATLAB and hook it directly into downstream design software, or even take the work that you've done in MATLAB and package it up so that people who don't have access to MATLAB can still make use of it.

    Now, this slide may look a little bit like a linear workflow. But if you've ever done this kind of work before, this is anything but linear. You will want to import lots of different data from different sources, make lots of different plots and generate lots of different models, and share your results with many different people in different forms. So along the way, we're also going to look for opportunities to automate our work, both the individual steps, as well as the larger workflow.

    So that's a bit on the data analysis workflow. But today, we're specifically talking about how MATLAB's low code tools help to simplify this. So what do we mean by low code? Well, low code is a term from the software industry that describes tools that allow you to rapidly build software without writing a lot of code yourself. And we'll look at a lot of examples today, but I wanted to show you a quick one up front.

    So if you've used MATLAB at all before, I'm sure you've used the plot command. But you may not know the commands for how to customize plots. So if you wanted to add some labels or grid lines, you could do that interactively in MATLAB. And then MATLAB will actually generate the code you need to do the same thing and let you insert that back into your program. So the end result is we have all the code we need to programmatically do what we want, but we didn't have to write much of it ourselves.
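
    For reference, the code MATLAB generates in a case like this typically looks something like the following sketch (the data here is illustrative, not from the webinar):

        % Some illustrative data to plot
        x = 0:0.1:10;
        y = sin(x);

        % The plot, plus the kind of code MATLAB generates after
        % interactively adding labels and grid lines
        plot(x, y)
        xlabel("Time (s)")
        ylabel("Amplitude")
        grid on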

    This is an example of a low code tool. And there are a lot of benefits to low code tools. First, they have a shallow learning curve, which makes it very easy to get started.

    They can also teach you how to code. If you don't know how to write the code to do something, you could look at the documentation. Or you could open up a low code tool, do it the interactive way, and then have the tool generate the code you need to go do the same thing. They also allow you to focus on solving your task first, and figuring out the code later.

    So for these reasons and more, these tools are not just for beginners. Even as an advanced programmer myself-- I have been writing MATLAB code almost every day for almost 20 years-- I still use these tools all the time because they allow me to get my job done faster, and that's the thing I really care about.

    So with that in mind, let's take a look at our case study. We will be looking at some commercial aircraft data, which is provided by NASA. Our objective is a common one from the world of engineering. The idea is we have some physical asset-- in this example, an aircraft-- and we want to deploy this asset out into the real world.

    After we deploy it, there is some part of it, some state, that we want to continue to monitor. But after deployment, this state becomes either impossible or very expensive to observe directly. So the idea is we want to create a virtual sensor to estimate what the state is based on other, more readily available sensors. So that's the objective for today. With that, let's go ahead and hop over to MATLAB.

    So this is the MATLAB desktop for those of you who haven't seen it. I will introduce the various aspects as they become relevant. I'm going to start today over here on the left with the Current Folder browser. This shows us the directory MATLAB is currently looking at and the various files in it. And you'll see an Excel file here, which is the one we'll be working with today. And to start, I'm just going to open this up in Excel. So you can see this is just a table of data, as you would commonly see in Excel. We have some column headers. The first column is a timestamp for when this data was recorded, and then we have the various sensor values.

    So our first step is access. We need to figure out how to get this data from Excel into MATLAB. If you're a MATLAB user, you may already know the preferred command for doing this, which is readtable. But if you want the low code way, we can come up here and open the import tool.
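
    For reference, the programmatic route is a single line; a sketch, assuming a hypothetical file name:

        % Read an Excel sheet into a MATLAB table with default settings
        FlightData = readtable("flightdata.xlsx");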

    So the import tool is a great environment for importing data from common tabular formats like CSV and Excel. It shows us a preview of how the data will be imported and allows us to customize this. So we can see it's by default going to read this in as a table of data.

    We can see the name it will give it. It has figured out that the first row is not actually data-- that's the names of the various columns in our table-- and that the first column is a datetime.

    And if I want to customize this, I can easily, let's say, ignore some of these sensors if I don't care about them, or even set up custom rules for what to do with unimportable cells. So what if there's a blank inside of our Excel sheet? But eventually, I'll get this set up how I like, and we can import the data into MATLAB.

    We'll see our table pop up down here in our workspace, which shows us the data we have available. And if we open that up in our Variables editor, you'll see this looks very much like it just did in Excel. Only now, we have the data in MATLAB.

    So that's all for the first step of access. But of course, we don't want to have to come back to this tool every time. We want to find a way to automate this.

    So the import tool can also generate either a live script, a plain script, or a function to capture what we just did. We'll see a live script a little later. I'm going to generate a function.

    So you can write your own functions in MATLAB. This is a way to create your own custom commands to do whatever you want. Here, the import tool has automatically generated a function for us called importfile that will read in an Excel file using all the customizations that we just set up without us having to figure out how to write the code to do that. This green text here is just comments explaining how this function works, and it will even give us an example. So I can actually copy-paste this over here to the command line, and we'll just give it the name of our file.

    But it looks like we forgot to save it. So in order for MATLAB to know that this is a new function, we need to save this file. So we're going to save it just in our current directory.

    We'll give it the same name as the name up here. MATLAB tried to take a guess as to what command it thought we were using. But of course, this command didn't exist yet.

    Now you can see, we can read in our data programmatically with just this one line. So now we've been able to solve that first step of access. We have a function we can use to do this over and over again whenever we get similar data.
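
    In other words, the call now looks something like this (the file name is a placeholder; importfile is the function the import tool just generated for us):

        % One line to re-import the data with all our custom settings baked in
        FlightData = importfile("flightdata.xlsx");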

    And if we wanted to, we could scroll through this function. We could look at the code for how to do this. But let's say for today, we don't really care; we have the code we need, and we're ready to move on.

    So we entered this command here at the command window, which is a good environment for quick throwaway commands where you don't care about saving the results. But here, I do care about saving these results. So I'm going to open up what's called a live script. So a live script is a Notebook-type interface where you can create sections and add comments.

    So let's say we want to import our raw data where we're going to read data from Excel. And we can interleave with this text our actual MATLAB code. And when we run this code, the results will get captured in line so that we have our richly formatted descriptions, our MATLAB code, and our results, all right here in the same place. So this is a great way to create executable documents.

    So now we've been able to read our data in. What's the next thing we might want to do? There's a lot of places we could go now, but my favorite is always to just do some very quick plotting. I find visualizations are the fastest, easiest way to figure out what's going on in my data.

    So if you're a MATLAB user, you're probably familiar with the plot command. But there is a huge library of different visualizations available to you. So a great way to explore them is to insert something called a live task.

    So you can see these live tasks here for different types of steps in our workflow-- data preprocessing, control systems, signal processing. I'm going to just add the Create Plot live task. So this will insert a small UI into my live script that allows me to interactively customize this step.

    For Create Plot, you can see I've got a gallery of different types of visualizations that I can choose from. Let's say I want to start with just the basic plot command. And then I can pick out the X and Y data from my Data table.

    Let's say I want to put Time on the X-axis, and then put TrueAirSpeed on the Y. I think I forgot to mention this earlier, but TrueAirSpeed is the thing we are trying to predict here. So that will create a plot of TrueAirSpeed versus Time.
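
    Behind the scenes, the Create Plot live task generates code along these lines (our table is named FlightData, as we'll see later):

        % Plot TrueAirSpeed against Time from the imported table
        plot(FlightData.Time, FlightData.TrueAirSpeed)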

    And because this is MATLAB, this is not just a static image. We can easily zoom and pan to get a better sense of what's going on in our chart. And as I do so, I want you to notice two things.

    So first, you can see MATLAB again generating code for us to insert back into our script. So if we wanted to hard code these zoom limits, we could do that. But also notice the TrueAirSpeed starts out at 0 for a while.

    And if you pan over to the end, it ends at 0 as well. And this actually makes sense, because aircraft start on the ground, stationary. And if we want to build a virtual sensor to predict what the airspeed is during flight, well, we probably want to remove this from our data.

    So we've identified something here-- an artifact in our data, that we need to clean up at some point. But for now, let's say I want to look at a few other sensors. Let's say I want to look at oil pressure.

    So we can see that here. And you can see there's something strange happening here. It looks like the sensor is dropping out periodically, where it's going from a normal value down to 0, and right back up.

    And this is actually a very common problem in real-world sensors. And we'll even see this in some of the others, like OilTemperature, which you can see here, it's experiencing the same sort of sensor dropouts. So these are some other anomalies in our data that we'll have to clean up at some point.

    So it's great to be able to view these sorts of sensors versus time. But this is a flight. We have latitude and longitude coordinates. I'd like to be able to see those.

    So let's say we want to look at longitude on the X and the latitude on the Y. And that is technically doing what I asked it to do, but I really want to see this on a map. So we could go scrolling through our library of charts, or we could just search for something which is geographic and open up a geoplot. And it looks like I put our latitude and longitude in backwards, so let's flip those around. And there we go.
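
    The generated call is roughly the following (the column names are assumptions; note that geoplot expects latitude first):

        % Plot the flight track on a map: latitude first, then longitude
        geoplot(FlightData.Latitude, FlightData.Longitude)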

    So now we can see our flight. It went from Halifax to Detroit. But you can see, at the very end, it's going all the way over here to 0, 0 latitude, longitude.

    And this is actually not uncommon for data loggers. Many data loggers will have a certain default value that they output for a short amount of time immediately after being turned on or before being turned off. So this is another anomaly in our data that we'll have to clean up.

    So using the Create Plot live task, we were able to very quickly generate lots of different visualizations, explore our data, and find some anomalies in it. But the entire time, what it's really been doing is generating code for us. So if we don't want to look at these controls anymore, we can easily hide them and just look at the code, or hide the code as well. Or if we really just want the code, we can convert this back into plain editable code, no different than if we had written it ourselves from scratch.

    So that live task is great for exploring lots of different plots, and it exposes the most common ways to customize plots. But you probably know that MATLAB has a huge variety of ways to customize visualizations to meet your needs. If you aren't sure how to do that for any particular plot, I encourage you to look at the documentation for that command.

    So the documentation in MATLAB is one of the things we pride ourselves on. Every function in MATLAB has a documentation page that follows the same format. You can see the syntax, different ways you can use the command, and then lots of examples. And the examples are my favorite part, because for me this is the easiest way to learn how to use a new command. And as we scroll through these examples, you might notice some that are interesting, like this one, which is a lot more colorful.

    So the default geoplot uses this two-tone background. But if you want to have a more colorful map, you can see in this example that this uses something called geobasemap. So this is how we can modify the background of the map. And if we want to incorporate that, we can just copy-paste this over into our script.

    And now when we rerun this section, we'll now have that more colorful background. If we wanted to do some light editing of this code and change the options ourselves, as I start typing, MATLAB is going to give me suggestions for how to complete this. So let's say I want to make that line a bit thicker, so it's easier to see, and maybe I want to change the color to something that stands out against that background a bit more-- maybe red.
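
    Put together, the customized chart code looks something like this sketch (the basemap name is an assumption; MATLAB ships several, such as "colorterrain"):

        % Thicker red flight track on a more colorful basemap
        geoplot(FlightData.Latitude, FlightData.Longitude, ...
            "LineWidth", 2, "Color", "red")
        geobasemap("colorterrain")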

    So we can start with the low code tools-- things like the live task, do a lot of interactive work there. And then if we want to make further customizations, we can do some very light coding. And MATLAB has a lot of tools and help documentation to help us along the way.

    So let's say we're happy with our visualizations. Now we need to turn our attention to some of those anomalies in our data that we saw earlier. And maybe we want to start with the true airspeed.

    We saw that it started and ended at 0, and we need to cut that off. Well, up here in this table output in our live script, we can go over to the TrueAirSpeed column. And for any one of these columns, we can pop it out and see a distribution of the values it takes, and start to threshold on which values we want to keep and which we want to throw away.

    So let's say I want to have a lower limit of 10. This is 10 knots. And here, MATLAB is again generating the code for me to do that. It's also giving me the option for what to do about missing data-- NaNs, or Not-a-Numbers. It's telling me there are zero rows missing right now, but we want to think ahead to future data where we might have missing values.

    Now, there's no right answer for what to do with missing data. But for our example, we want to predict true airspeed. And if true airspeed is missing, well, then that data isn't useful for making predictions. So I probably don't want to include missing data here. And now that I've made those modifications, I can look at the code MATLAB generated for me to see how to do that myself.

    If I want to get just a subset of my table, I can start with the table name FlightData. And then inside of parentheses, I can index into it. I can select the subset that I want to keep. I'll specify the rows first. We'll look at our FlightData table, look at the TrueAirSpeed column, and we'll keep anywhere where this column is greater than or equal to 10.

    In the second input, we'll tell it which columns we want to keep. And I want to keep all of them, so we'll use the colon to denote that. And if I want to keep this, I can just click Update Code, put this back into my script, and rerun it. And now MATLAB will import my flight data, and then immediately throw out any rows where the TrueAirSpeed is less than 10. And if we regenerate our plot, we'll notice this solved our other problem at the same time.
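
    So the generated thresholding line looks like this:

        % Keep only rows where TrueAirSpeed is at least 10 knots;
        % the colon keeps every column
        FlightData = FlightData(FlightData.TrueAirSpeed >= 10, :);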

    So this type of thresholding is a really common task. But this value of 10 is one that I just picked out of thin air. Ideally, we really want to experiment with this threshold to see what is really the best value to pick for this application. So we could write a bunch of code to do that. But I'm instead going to introduce some custom interactivity into our script.

    So I'm going to turn this value of 10 into a numeric slider. So let's configure this: we'll go from 0 to, let's say, 400 knots in steps of 10. And whenever the value changes, we are going to run all sections. So we'll just rerun the whole script.

    So let's say I want to crank this all the way up to 400. So any row where the true airspeed is under 400 knots, we're going to throw that away, and then we're going to regenerate our plot. And we can see that this threw out the vast majority of the flight. It's just one small area over Canada, where the flight was near its top speed, that it was over 400.

    So we probably want to back this down to something more reasonable-- maybe around 100 knots or 120. And this will again rerun our script. We'll bring the data back in and filter down to just 120. And this looks a lot more reasonable. So by using these live controls, we can start to introduce our own interactivity into the script, so that we can iterate and experiment and figure out what's going to work best without writing a lot of code.

    So far, we've looked at accessing our data. We looked at visualizing it. We did some basic cleanup with some simple thresholding.

    But now we need to think about those sensor dropouts that we saw. Those might need some more sophisticated techniques to clean up. So to deal with that, I'm going to go back over to the Home tab, and we're going to open up what we call the Data Cleaner.

    So as the name implies, this is an app designed for cleaning data. And it is especially good for this type of columnar numeric data, as you often see with sensors from equipment. So we're going to import our FlightData table, and the Data Cleaner app will allow me to pick out which of these sensors we want to look at.

    Let's say I just want to bring up all of them because I don't know what's in this table. And for each sensor, it will generate a chart, as well as descriptive statistics. So I can at a glance just scroll through and get a sense for what's going on with all the various sensors in my table.

    For Time, since it's just a datetime, we get a simple histogram, and we can see that the time looks regularly sampled throughout. We can see in our oil sensors the sensor dropouts. And for some of our other sensors, there's maybe a little bit of noise that we may or may not worry about today.

    But let's say we want to start by cleaning up these sensor dropouts. So we're going to focus just on the oil pressure and temperature sensors. And then in order to clean those up, we'll come up to this catalog up here of different common cleaning operations we can do in the app. So we can deal with things like missing data, outliers, smoothing noise, normalizing data.

    We can re-time sensors so that they are all on a regular sample rate, or even stack and unstack tables, which is a little bit like a pivot. So here, we're going to deal with outlier data. This will open up a new view of our data, where we can start to customize how we're going to detect, remove, and fill in outliers.

    Over on the right, we can see the various options available to us. In this case, we want to fill in our outliers with something new rather than deleting them. Let's say we want to fill them with a modified Akima cubic interpolation.

    And then we have different settings for how to define an outlier, because even with the exact same dataset, if we change the analysis that we're doing, we may change our definition of what counts as an outlier. So in this case, let's say we want to stick with a moving median and keep a threshold factor of 3 median absolute deviations. And we'll define our window as centered, and let's say we want to put 100 data points in that window.

    The charts then show us what is the effect of applying this algorithm to our data. And let's zoom in on one of these outliers. Now, this chart down here shows us a lot of information, but I find it's a bit easier to read if we turn off some of these for now. So let's say we want to turn off everything, except for the input data and what's called the outlier thresholds.

    So what's happening here is the raw data is in light blue. And then we have these outlier thresholds in gray. So as this algorithm is being applied, as this moving median window is passing through our sensor, it's determining the thresholds in this gray line for what counts as an outlier. So each data point that falls outside of those bounds is getting flagged as an outlier.

    And then it is getting replaced and filled in with a new data value based on the Fill method that we picked. And the result is that we have nicely cleaned-up sensor data at the end, with none of those big, spiky outliers. So this type of tool is really critical for cleaning up your data, because there is no way to know ahead of time which settings over here are going to clean your data best until you just try them out and visualize the effect on your dataset.

    So we could iterate here to our heart's content. Let's say the default is good enough for today. We'll accept this step. And you'll see it get added back to our cleaning steps.

    And this is a really important part of the app, because data cleaning is not just a single step. It's really a workflow. You may need to experiment with individual steps, such as cleaning outliers-- have we really cleaned up all the outliers? Once I hide the original data and just look at the cleaned data, does it look reasonable now?

    But you may also need to experiment with just the larger order of operations. And if we had more time today, we would get into more steps of data cleaning, and experiment with just dragging and dropping and reordering the steps. But let's say for today, this one step of cleaning outlier data is good enough. Well, just like the import tool that we saw earlier, we can export those results to a function.

    And this will generate a new function for us that does all our customized cleaning behavior. In this case, since we just did one step, it's just one command-- filloutliers. And here, it's showing us how to apply that outlier-filling algorithm to our data-- using modified Akima interpolation, a moving median with 100 data points in the window, and so on.
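
    Stripped of its comments, the core of that generated function is a single filloutliers call, roughly like this sketch (the variable names and the exact column list are assumptions):

        % Fill outliers found by a centered 100-point moving median,
        % replacing them via modified Akima cubic interpolation;
        % apply only to the oil sensors
        FlightDataClean = filloutliers(FlightData, "makima", "movmedian", 100, ...
            "DataVariables", ["OilPressure", "OilTemperature"]);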

    So let's give this a better name. It'll be mycleaning. And we'll remember to save our function this time before we use it. And now, back in my live script, I can call this by assigning a new output variable.

    Let's call it FlightDataClean. This is going to be equal to mycleaning function, and we'll pass in the flight data as input. And there we can see the results.

    So now we've gone through, we've accessed our data, and visualized it. We've cleaned it up. Now we're ready to move on to building a model, so we can actually make some predictions about that state, the TrueAirSpeed. And for this, I'm going to come over to the Apps tab.

    So the Apps tab contains a gallery of all the various apps from MATLAB and across the entire MathWorks product line. And there are hundreds of them in here, for applications as varied as machine learning and deep learning, optimization, control systems, robotics, automotive, signal processing, image processing, connecting with hardware, and connecting with databases. These tools all follow the same kind of workflow that you've already seen from the import tool and the Data Cleaner: do things the easy way once, interactively explore, experiment, figure out what works for you, and then the tool will generate the code you need to programmatically go do it again.

    For today, we're going to open up the Regression Learner app. So this is an app for building regression models, which are machine learning models where the thing you're trying to predict is a continuous output. We'll import our data from the workspace.

    So here, we're going to remember to pick our cleaned data. And we're going to pick the output that we want to predict. So that is the TrueAirSpeed. We can then go through and decide which of the other columns in our table we want to include as predictors, as inputs. The Time column is a datetime that doesn't really make sense to include.

    We also probably want to turn off latitude and longitude. Hopefully, our model would figure out for itself that latitude and longitude are not great predictors of airspeed. But if we know they are not good predictors, we should exclude them from the start. You also see some options over here for Validation and Test. I'm going to just keep the default of five-fold cross-validation.

    This will then import the data into our app, and allow us to try out different models with our data to see what is going to work best. And you can see, there's a wide variety of models-- things like linear regression, decision trees, SVMs, shallow neural networks, ensembles, and so on. But let's say I want to just try a bunch of things that are all relatively quick. So I can say, give me all the quick-to-train models. And then we're going to train everything.

    So this will queue up a number of different models that are relatively quick to train. And we'll even train them in parallel across my multiple CPUs, because training each model is independent of the others. And so we can actually train multiple ones at the same time.

    For each model, once it completes, we can then see a chart of how well it performed. So in blue is the correct answer, and in yellow is what the model predicted. And these prediction results, this is the result of what's called cross-validation, where it's actually trying to predict on data it didn't see while it was being trained.

    So the linear regression here does OK. It gets the general shape. But its RMSE value means it's off by about 13 knots on average. The best-performing model of the ones we tried is called a fine decision tree. And you can see that it gets the overall shape much better.

    Now, the ability to try out lots of different models like this, very quickly, is really important. In machine learning, there's something called the no free lunch theorem, which basically means there's no way to know what kind of model is going to work best on your particular data until you just try a bunch of them out and see what happens. And the Regression Learner app makes that very easy.

    You could try all of the various linear models or all of the regression trees, or just literally everything in the catalog. If you do that, maybe go get some lunch. That takes a little while. That's a lot of models to try.

    There are also a lot of very advanced options available right through the app, such as the optimizable models. These use an advanced technique called Bayesian hyperparameter optimization to pick out the best optional arguments for the best possible performance. So there's a lot more that we could do here in the app.

    Since I've done this before, though, I'm just going to go pick a model that I know works really well, even with just the default settings, which is something called a bagged decision tree. If you go reading the machine learning literature and you look at common applications, you'll see bagged decision trees are one of the most used kinds of models. And you can see that it again matches the shape very closely, and it gets a bit better RMSE than any of the other ones we tried.

    So there's a lot more that we could do here in the app for things like Feature Selection, hyperparameter tuning. We could look at other kinds of results, like residuals, and look at explainable AI. But eventually, we want to take what we've done and export it outside of this app.

    So just like we've seen a couple of times already today, we can generate a function to take the work that we've done in the app and create a new custom MATLAB command for us. So we can see that this command will take in our training data, the data we want to use to train the model, and will give us back the model itself, as well as some statistics about how this model did. We can see the comments explaining how it works, as well as examples for how to train the model, and even how to make predictions with new data.

    So again, let's save our new function. And we'll copy over this command to our script. We just need to replace the name of the input table to our FlightDataClean.

    So this will go through and retrain that bagged decision tree, but it will do it with any new set of data that we have. And that's the great thing about now having this command. If we get new data, or if we get more data and we just want to vertically concatenate it, we can now take that new dataset and train another model just by passing it to this command here.
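
    So retraining becomes one line, roughly like this (trainRegressionModel is the default name Regression Learner gives the generated function; yours may differ):

        % Retrain the bagged-tree model on cleaned data; the second
        % output holds the cross-validation RMSE
        [trainedModel, validationRMSE] = trainRegressionModel(FlightDataClean);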

    So we can see that the output that we get here, trainedModel, this is something called a struct. And a structure is just a data container. It's a way to hold different pieces of information together.

    Earlier, we were working with a table in MATLAB. This is also a data container-- one that is great when you have columns of data.

    A struct is a more general data container, where the different pieces don't have to all be the same type. So you can see inside of the struct there is something called predictFcn-- the function we can use to make predictions-- the required variables, the actual model itself, and helpfully something called HowToPredict, which tells us exactly how to use this to make predictions. So if I want to make a prediction, I can take my trained model, use the predict function, and pass in the input data.

    Now, in most machine learning applications, you would want to predict here with new data the model has never seen before. But for our purposes today, we'll just pass in the same dataset again. And you can see that we are now at a place where we can take our trained machine learning model and make predictions on new data.
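
    That prediction step looks something like this sketch (in practice you would pass genuinely unseen data):

        % predictFcn is stored in the exported struct; it returns a
        % predicted TrueAirSpeed for each row of the input table
        predictedAirSpeed = trainedModel.predictFcn(FlightDataClean);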

    So now we've completed our task. We started with data in Excel. We brought it in, we explored it, cleaned it up, built a model.

    And now we're ready to make predictions. So if we wanted to take the work that we've done and share it with someone else as a report, we'd first need to save our live script. So we'll call this whatever we want.

    And then once we save it, we can take it and export it to a PDF, Word doc, HTML, or LaTeX-- any one of these formats. Let's say you want to export this to a PDF. This will take my report, capture it, and export it to a standalone document that I can now give to anyone.
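
    If you ever want to script this step too, live scripts can be exported programmatically in recent MATLAB releases; a sketch, assuming the script was saved as MyAnalysis.mlx:

        % Export a live script to a standalone PDF report
        export("MyAnalysis.mlx", "FlightReport.pdf");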

    And you can see that this is exactly how I generated the report earlier. This one looks a little uglier because we didn't add a lot of comments, and we just printed out this huge table. We probably don't need to do that.

    So if we really wanted to hand this off to somebody else, we should do them a favor: clean it up, add more comments, and not print out huge tables. To save ourselves a bit of time, I've already done that today in this live script over here.

    So this is all the same work that we did earlier. We just added more comments to explain what's going on, and generally cleaned it up. And so then if we take this live script and we again export it to PDF, then you can see that this is how we generated that report originally.

    OK, so that finishes the demonstration for today. We were able to solve that workflow from end to end, starting with Excel and ending with the report. Let's hop back to the slides to wrap up.

    So again, what I hope you walk away with from today is that MATLAB simplifies the data analysis workflow with low code tools. Today, we started with some data in Excel. We did some basic data cleaning, generated some plots, did some machine learning.

    And then we finally shared our results by generating a PDF report. And if you're curious, the products we used along the way, most of it was just base MATLAB. But we also used the Statistics and Machine Learning Toolbox for the Regression Learner app.

    However, we really just scratched the surface of what's possible today. Within the access step, we saw the import tool, which is great for text, CSV, and Excel. But if your data is coming from SQL databases, we also have the Database Explorer app from Database Toolbox, which gives a very similar experience, but now for data coming in from databases.

    And if you are ever connecting with hardware or working with industrial data, we have a large catalog of apps for connecting with and reading data from many different sources. And if you're willing to write a bit of code yourself, you can really get data from anywhere into MATLAB, whether that's time series data, video, N-D data like you might find in climatology. And the data could be living anywhere-- databases, the cloud, HDFS. Anywhere you have it, you can get it into MATLAB.

    For the explore and discover phase, as we saw, there are over 100 low code tools for data analysis, engineering, and AI. For data analysis, we saw the Data Cleaner today. But there are many others for statistics and optimization. I'm a big fan of the Optimize live task you see here, which is a great tool for setting up complicated optimization problems.

    In the world of engineering, we have many apps for control systems, signal processing, and image processing and computer vision. My personal favorite is the Signal Analyzer, which you see here, which is a fantastic tool for exploring signals in both the time domain and the frequency domain. And for machine learning and AI, we have a family of apps for tasks like ground truth labeling, designing networks, training and validating them, such as with the Deep Network Designer seen here, and even quantizing the networks and getting them to run directly on embedded hardware.

    In the share step, you saw the Live Editor today. But there's actually much more to it than we saw. You could also add rich text, equations, images, hyperlinks.

    You can include animations with controls, and even export them. There's many different ways to control how your report is going to look. And of course, in the end, you can export it to several different formats.

    So we talked a lot about low code tools today. But MATLAB is much more than just a low code platform. And as your needs grow, the MATLAB language grows with you. You can start very simply. MATLAB actually first became famous for making it very easy to write code to do math, such as solving systems of linear equations with just five characters.
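
    Those five characters are the backslash solve. For example:

        % Solve the linear system A*x = b in five characters: x=A\b
        A = [2 1; 1 3];
        b = [3; 5];
        x = A\b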

    You can then build up to creating the kinds of scripts that we saw today. MATLAB has many programming aids to make this easier for you. You can also create your own reusable functions, and even scale all the way up to object-oriented programming if that's something you need to do.

    You can also grow from coding by yourself to coding in a team. Projects are a great way to keep track of large collections of files, both MATLAB and otherwise, that all relate to the same project. MATLAB integrates very nicely with source control.

    It has a robust software testing suite, as well as integrations with continuous integration and continuous deployment systems. And it has external interfaces to other languages and environments. So if there are other people in your team who are working in a different environment, like Java or Python, you can interface those and use them together with your MATLAB application.

    You can also take your work in MATLAB and deploy it to other people who don't have access to MATLAB. There are two main paths for this. One is using what's called MATLAB Compiler, which generates a lightweight wrapper around your MATLAB code that you can then give to anybody else, and they can use it as something like an executable, a web app, or an Excel add-in.

    There is also an extension, MATLAB Compiler SDK, which will put a wrapper around your code to turn it into a software library for environments like C++, Java, Python, and .NET. And if you're a Simulink user, there's even a Simulink Compiler. But all of these Compiler products, because they are just putting a wrapper around your MATLAB code, require the end user to have something called the MATLAB Runtime, which is a freely distributable version of the MATLAB engine.

    But some environments, like embedded processors, can't handle the MATLAB Runtime. So there are also the Coder products, which do a full language translation from MATLAB into pure C, C++, HDL, CUDA, or Structured Text. So you have many different paths to take the work you built in MATLAB and share it with others who don't have access to MATLAB.

    And with that, let's wrap up with some learning resources. So I encourage all of you to check out the MATLAB Central community. This is a place where MATLAB and Simulink users from all over the world come together. There are over two million monthly active users there. You can find things like MATLAB Answers, an online Q&A forum; the File Exchange, where people exchange code and tools that they've built; as well as blogs and many other interesting areas.

    MathWorks also provides support to our customers in many different forms. We have fantastic technical support. That's actually where I started at the company. We also provide on-site workshops, guided evaluations, full on-site training, as well as full consulting services.

    If you're new to MATLAB, I strongly encourage you to check out the MATLAB Fundamentals training. This is a three-day course that will get you from step one to being a competent beginner MATLAB user. It covers importing, analyzing, and exporting data, writing programs to automate things, doing calculations, creating visualizations-- really, a lot of the stuff that you saw today. But it's going to be hands-on with an instructor.

    And if you want to try before you buy, there are free Onramp courses online. These are two-hour courses that run entirely in your browser. Anybody can run them at any time for topics ranging from core MATLAB to image processing, deep learning, and a number of others. And with that, that's all I have for you today. Thank you very much for tuning in.