Different results from STL decomposition in MATLAB and Python

Yiming Wang on 19 Nov 2024 at 9:50
Edited: Pavl M. on 21 Nov 2024 at 13:55
I used the trenddecomp function in MATLAB and the STL function in Python to decompose a time series, and the results are quite different between the two tools. Is there something different in how these functions process the data?
Here are the code scripts.
MATLAB
% Read the two-column time series (index, data)
data = readtable('stldata.txt');
index = data.index;
datas = data.data;
% STL decomposition with seasonal period 12
[LT,ST,R] = trenddecomp(datas,'stl',12);
% Plot all components in a single axes
figure(1);
plot(index,LT)
hold on
plot(index,ST)
plot(index,R)
plot(index,datas)
legend('Long term','Seasonal','Residual','Original')
Python
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import STL

# Read the whitespace-delimited time series
file_path = './stldata.txt'
data = pd.read_csv(file_path, delim_whitespace=True)

# STL decomposition with seasonal period 12 (statsmodels defaults otherwise)
stl = STL(data['data'], period=12)
result = stl.fit()

# statsmodels draws each component in its own subplot
fig = result.plot()
fig.set_size_inches(10, 6)
plt.show()
Result of Python

Answers (1)

Pavl M. on 19 Nov 2024 at 16:37
Edited: 21 Nov 2024 at 9:30
Are they really that "pretty" different? Your plots look fine and the curves are not very different; rescale the y-axis in Python, or plot all three or four curves in one figure in different colours as in MATLAB, and compare.
So the differences you perceive are mainly due to the different plot types in MATLAB and Python (MATLAB puts all curves in one axes, while Python draws one subplot per curve, each with its own y-axis scaling). They are also partly due to Python's STL(...) applying less low-pass filtering in its smoothing steps (less rejection of high-frequency components, i.e. a higher cut-off frequency) than MATLAB's internal STL implementation, which smooths more (rejects more high-frequency components, i.e. a lower cut-off frequency). A quick way to test this is sketched below.
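If you want to test the filtering explanation directly, you can force a smoother trend in Python by lengthening STL's trend window. A minimal sketch, assuming the same stldata.txt file; the window of 51 is an illustrative value, not a documented equivalent of MATLAB's internal setting:
import pandas as pd
from statsmodels.tsa.seasonal import STL

data = pd.read_csv('./stldata.txt', delim_whitespace=True)

# Default trend window (chosen automatically from period and seasonal)
res_default = STL(data['data'], period=12).fit()

# Longer (odd) trend window -> heavier low-pass filtering of the trend,
# closer in spirit to MATLAB's smoother long-term component
res_smooth = STL(data['data'], period=12, trend=51).fit()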
The differences in this question come down to: 1) how your initial input data are sampled (spaced); the sampling needs to be uniform for MATLAB; 2) there may be two seasonal trends, and you have found only one so far; 3) scaling/zoom of the plots: in Python's matplotlib.pyplot, try a different figure initialization (fig, ax = plt.subplots(layout='constrained')), vertical y-axis sizing and scaling with fig.set_size_inches(), subplot mosaics, and ax.set_yscale(...); see the sketch below.
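To rule out the layout effect itself, a minimal sketch that overlays all Python components in one axes, mirroring the MATLAB figure:
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import STL

data = pd.read_csv('./stldata.txt', delim_whitespace=True)
result = STL(data['data'], period=12).fit()

# Plot all components in a single axes, as the MATLAB script does,
# so both tools are compared on the same y-scale
fig, ax = plt.subplots(layout='constrained')
ax.plot(result.trend, label='Long term')
ax.plot(result.seasonal, label='Seasonal')
ax.plot(result.resid, label='Residual')
ax.plot(data['data'], label='Original')
ax.legend()
plt.show()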
Should I look up the original STL and SSA algorithm implementations in MATLAB and Python for comparison? Are they implemented in MATLAB as Fortran or C++ code, or as already-built libraries?
In Python, the classical seasonal_decompose is implemented with a moving-average (convolution) filter, while STL itself is based on iterated LOESS smoothing. What about MATLAB?
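For reference, a minimal sketch contrasting the two statsmodels decomposers on the same series; the different filtering is one source of divergent trend estimates:
import pandas as pd
from statsmodels.tsa.seasonal import STL, seasonal_decompose

data = pd.read_csv('./stldata.txt', delim_whitespace=True)

# LOESS-based STL (iterated locally weighted regressions)
stl_res = STL(data['data'], period=12).fit()

# Classical decomposition (centered moving-average / convolution filter)
ma_res = seasonal_decompose(data['data'], model='additive', period=12)

# Compare the two trend estimates (the moving-average trend has NaNs at the edges)
trends = pd.DataFrame({'stl': stl_res.trend, 'moving_average': ma_res.trend})
print(trends.describe())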
Hope this helps.
Can you accept my answer?
  4 comments
Yiming Wang on 21 Nov 2024 at 11:36
Edited: 21 Nov 2024 at 11:37
Thanks for your update. I have now used your Python code to run the decomposition on the same source data.
stl = STL(data['data'], period=12, seasonal=9, trend=None, low_pass=19, seasonal_deg=1, trend_deg=1, low_pass_deg=1, robust=True, seasonal_jump=2, trend_jump=2, low_pass_jump=2)
Below is the result. If you look at the long-term trend, you may find that it increases as x ranges from 0 to ~20, decreases from ~20 to ~210, and finally increases from ~210 to 240. But in the MATLAB result, the long-term trend continuously decreases from 0 to ~200 and then increases from ~200 to the end. So if I were a new reader, I would draw different conclusions from these two results. I think this may be a critical problem, especially when I want to use STL in scientific research work.
I believe that different parameters may be used in MATLAB and Python. But which set is more reasonable?
I read the original paper on the STL algorithm, and the Python result seems more similar in style to the original paper than the MATLAB one.
Thank you again for your effort on this topic.
Pavl M. on 21 Nov 2024 at 13:40
Edited: 21 Nov 2024 at 13:55
Hm...
OK, this is a valid concern for a scientific manuscript, of course, and a helpful point for Ph.D. and M.Sc. workflows.
For x from 0 to 25, Python treated the fluctuations as an upward trend, while MATLAB merged them into the stronger downward trend that follows. You are right. It depends on the resolution/scale at which one needs to catch and distinguish trendlines.
I could do more with it (sweep resolutions, tune the algorithms, develop new components), but that would be contract work in this specific scientific data-processing niche; contact me privately if you want to pursue legitimate employment terms.
It is also both subjective and objective.
How do you treat the residuals: as noise (a disturbance), or are there useful insights you expect to extract from them?
Also, since MATLAB's trenddecomp(...) does not yet expose direct control over the internal STL algorithm, we could override it or construct a wrapper around it, pre-processing the input so the result comes closer to a chosen reference or to specific stakeholder requirements.
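The same wrapper idea can be sketched on the Python side: pre-smooth the input before decomposing, so the extracted trend is pushed toward longer-term behaviour. A minimal sketch, assuming a centered rolling mean as the pre-filter; the window of 13 is purely illustrative:
import pandas as pd
from statsmodels.tsa.seasonal import STL

data = pd.read_csv('./stldata.txt', delim_whitespace=True)

# Crude low-pass pre-filter: centered rolling mean over roughly one season
smoothed = data['data'].rolling(window=13, center=True, min_periods=1).mean()

# Decompose the pre-smoothed series; the trend now reacts less to
# short fluctuations, closer to MATLAB's smoother output
result = STL(smoothed, period=12).fit()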
It's all right. I have also found very long-term predictions with optimal control and AIXI (many of my valuable findings are undisclosed and legitimately for sale; better results require more investment). Looking ahead, we should appreciate and build on each other's work rather than disagree with and tear down each other's path.
So which refinements do you need to the trenddecomp function? Please let me know.
In order to help correctly:
1. Regarding the STL(...) parameters you used and the trend differences you observed (quoted above):
I see. MATLAB simply filtered more heavily toward low frequencies to capture more of the long-term trend. That is not so bad, especially in historical data analysis needed for accurate future predictions.
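One way to make "filtered more heavily" concrete is to compare the roughness of the two trend estimates, e.g. the variance of their first differences. A sketch under the assumption that both trends have been exported to CSV; the file names trend_python.csv and trend_matlab.csv are hypothetical:
import numpy as np
import pandas as pd

# Hypothetical exports: the statsmodels trend and the LT column from
# MATLAB's trenddecomp (saved e.g. with writematrix)
trend_python = pd.read_csv('trend_python.csv').squeeze()
trend_matlab = pd.read_csv('trend_matlab.csv').squeeze()

# Smaller first-difference variance = smoother (more strongly low-passed) trend
print('Python trend roughness:', np.diff(trend_python).var())
print('MATLAB trend roughness:', np.diff(trend_matlab).var())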
I can construct a typical input of the form
typical_input = f1(x1,x2)*f2(x1,x2)*f3(x1,x2) + f3(x1,x2)*f4(x1,x2)*f5(x1,x2) + f6(x1,x2)*f7(x1,x2)*f8(x1,x2) = F(x1,x2)
If you give me more details, I can deliver more. Let me know which script or code you need (I can deliver it securely).
MATLAB is better for SSA, while Python also has MSTL and more control over STL through the function argument list I included above.
Real-world input is nonlinear, with both multiplicative and additive terms in a multivariate space-time setting, so it is a good reference point for well-founded future scientific research. STL, by contrast, handles additive, univariate analysis; see the sketch below.
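If your series is multiplicative, or has a second seasonal cycle, two standard workarounds in statsmodels are sketched here (the log transform assumes strictly positive data, the second period of 120 is purely illustrative, and MSTL needs a recent statsmodels version):
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL, MSTL

data = pd.read_csv('./stldata.txt', delim_whitespace=True)

# Multiplicative -> additive: decompose log(y) instead of y
log_res = STL(np.log(data['data']), period=12).fit()

# Two seasonal components at once with MSTL
mstl_res = MSTL(data['data'], periods=(12, 120)).fit()
print(mstl_res.seasonal.head())  # one column per seasonal period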
For a better-quality manuscript, you just need to connect the right components.
Which conclusions can you draw from the results?
What do you get when you run the following? (statsmodels requires the seasonal and low_pass windows to be odd, with low_pass larger than the period, and LOESS degrees of 0 or 1)
stl = STL(data['data'], period=12, seasonal=25, trend=None, low_pass=41, seasonal_deg=1, trend_deg=1, low_pass_deg=1, robust=True, seasonal_jump=12, trend_jump=24, low_pass_jump=48)
It also depends on how your mindset is tuned. When writing my initial answer to your question, I was focused on forecasting and prediction of multi-dimensional, multivariate data, so from that standpoint I weighed the tradeoff between prediction horizon and accuracy of anticipation, and focused on more robust detection of long-term trends, which requires amplifying low frequencies and attenuating high frequencies, in the spirit of Occam's razor.
See the explanation in my original update: Python's STL(...) applied less low-pass filtering (less rejection of high-frequency components, a higher cut-off frequency) than MATLAB's internal STL implementation, which is smoother (more high-frequency components rejected, a lower cut-off frequency).
SaS,SaP


Version

R2022b
