APPLICATION OF THE FOREST CLASSIFIER METHOD FOR DESCRIPTION OF MOVEMENTS OF AN OSCILLATOR FORCED BY A STOCHASTIC SERIES OF IMPULSES

The article discusses the analysis of the motion of an oscillator forced by a sequence of stochastic impulses using decision tree algorithms and a random forest classifier. The aim of this paper is to verify the accuracy of distinguishing distributions in the desired time period and to check whether the length of the time interval affects the accuracy of data classification. Moreover, the statistical parameters directly influencing the classification of distributions are presented. The analysis was performed in the Python environment; the data were obtained by computer simulation. The results of classification for two classification algorithms with regard to two divisions of the test and training set sizes are presented. In the case of the decision tree classifier, it was observed that for each time interval the algorithm classifies the data with a high level of accuracy, but it selects different statistics for each time period, which makes it impossible to determine unequivocally which statistic influences the recognition of the distribution. In the case of the random forest classification algorithm, the importance and influence of the parameters on distinguishing between the three distributions are the same in both the 5-minute and the 10-minute intervals. The differences between the significance of the parameters depending on the length of the interval are not significant.


Introduction
The theory of random dynamical systems is an interdisciplinary domain (Sobczyk, 1983; Iwankiewicz and Kotulski, 2009; Zembaty, 2009; Socha and Soong, 1991; Liu et al., 2023; Banks et al., 2023). It is applied in mechanics (Litak et al., 2008; Bozzoni et al., 2011; Hračov and Náprstek, 2017; Weber et al., 2021; Smolnicki et al., 2013; Rączka et al., 2013) where, as in the present paper, the idea of the research refers to non-deterministic mechanics. The state of a system observed at any given moment does not uniquely determine the states of the system at consecutive moments, which results from the random character of the excitation. It is worth mentioning that in Poland the studies on random dynamic systems flourished in the third quarter of the 20th century (Piszczek, 1982; Skalmierski and Tylikowski, 1972) when, by means of a complex mathematical apparatus, a series of stochastic equations describing various kinds of dynamic mechanical systems was determined. Having reached a certain level of knowledge, the studies were not continued.
The research on systems forced by a random series of impulses also started in the second half of the 20th century (Roberts, 1966, 1972; Roberts and Spanos, 2003) and has been continued ever since. The first attempts at verifying the properties of a stochastic model by means of simulation methods were presented by Iwankiewicz (1993). In Poland, such research was also conducted by Mazur-Śniady and Śniady (1986) and by Jabłoński and Ozga (2008).
By introducing machine learning into the analysis of dynamic mechanical systems, we are starting the next level of research. In this original approach, thousands of samples were analyzed for different parameters of random forcing in order to examine the usefulness of the developed algorithms.
In this study, the developed algorithms are aimed at solving an inverse problem, namely recognition of the distribution of the size of the impulses forcing vibrations of an oscillator. Although there exists a mathematical model that allows one to calculate the impulse distributions, its applicability is limited. Thus, other solutions of this problem are sought, and this stage of the research is described in this paper.

Mathematical model of an oscillator forced by a random series of impulses
The random forest classification discussed in this paper was carried out for a one-dimensional physical system (Jabłoński and Ozga, 2013), the state of which was described by the random variable $x(t)$. The equation of vibrations of an oscillator with damping, Eq. (2.1), was written in a dimensionless form. At this stage of research into the possibility of recognizing random distributions of excitation impulses, it is difficult to assess whether the oscillator will take the form of a mechanical or an electronic system. Since the value of $x(t)$ is taken to be dimensionless, the units of the coefficients $a$ and $b$ and of $f(t)$ refer to time only. The forcing $f(t)$, Eq. (2.2), is a series of random impulses with random strength $\eta_i$ occurring at random instants of time $t_i$, where $\delta(t - t_i)$ is the Dirac distribution.
The time intervals between impulses, $\tau_i = t_i - t_{i-1}$, are independent continuous random variables whose probability density function, Eq. (2.3), is exponential, with the constant $\lambda$ being the impulse occurrence frequency. For the random variable described by Eq. (2.3), the mean distance between impulses is $1/\lambda$ s, the standard deviation also amounts to $1/\lambda$ s, while the median interval between impulses equals $\ln 2/\lambda$. The values of the impulses $\eta_i$ are independent discrete random variables with a finite expected value. When the intervals between the impulses and the strengths of the impulses are independent random variables, the solution of this problem for zero initial conditions is given by Eq. (2.4), where $c = \sqrt{a^2 - b^2}$. The article examines $m_k(t)$, the estimators of the $k$-th stochastic raw moments of the random variable $x(t)$, calculated using Eq. (2.5), where $h$ is the sampling period and $t$ is time.
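For reference, Eqs. (2.1)-(2.5) referred to in this Section can be written out as follows. This is a reconstruction from the definitions given in the text ($c = \sqrt{a^2 - b^2}$, exponentially distributed waiting times with rate $\lambda$, zero initial conditions), not a verbatim copy of the displayed formulas; in particular, the normalization of the moment estimator in (2.5) is an assumption.

```latex
\begin{align}
\ddot{x}(t) + 2b\,\dot{x}(t) + a^{2}x(t) &= f(t), \tag{2.1}\\
f(t) &= \sum_{t_i < t} \eta_i\,\delta(t - t_i), \tag{2.2}\\
g(\tau) &= \lambda e^{-\lambda\tau}, \qquad \tau \ge 0, \tag{2.3}\\
x(t) &= \frac{1}{c}\sum_{t_i < t} \eta_i\, e^{-b(t - t_i)} \sin\bigl(c\,(t - t_i)\bigr), \tag{2.4}\\
m_k(t) &= \frac{h}{t}\sum_{j=1}^{t/h} x^{k}(jh). \tag{2.5}
\end{align}
```

Each term of the sum in (2.4) is the response of the homogeneous form of (2.1) to a unit velocity jump at $t_i$, scaled by $\eta_i$, which is why (2.4) follows from (2.1)-(2.2) by superposition.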

Designing of experimental studies using the qualitative method of analysis
Studies on the stochastic model described by Eqs. (2.1)-(2.5) should be designed so that step changes do not occur in the computed raw moments some time after the start. Earlier analyses have shown (Jabłoński and Ozga, 2010, 2012; Ozga, 2019) that oscillators with strong damping should be used, and that impulses should occur frequently enough for the values of the estimators of raw moments calculated from Eq. (2.5) to change as little as possible. The natural frequency of the oscillator should be matched to the impulse occurrence frequency $\lambda$. The last step consists in checking whether the value of the random impulses is sufficient to force vibrations of the oscillator. The research is carried out in order to solve the inverse problem, namely to discern the distribution of the impulses forcing vibrations of an oscillator in the shortest possible time.
There is an infinite number of possible cases for which simulation or experimental studies could be developed. Taking into account previous experience as well as the time necessary to generate one trial, and applying the principle that simple systems allow for a clear presentation of the solution of the research problem, the authors selected an oscillator with damping $b = 10$ and frequency of vibrations $c = 20$ for $\lambda = 10$ and $h = 10^{-3}$ (Fig. 1). In the presented simulation investigations, vibrations of the oscillator are evoked by three distributions $\Phi_i$ of the pseudo-random variable. The parameters of the distributions were selected so that:
• the expected value of the distributions forcing vibrations was the same in all three cases and amounted to 75,
• the distribution $\Phi_1$ was characterized by two impulses $\eta_1$ and $\eta_2$ of a similar force acting on the oscillator,
• the distributions $\Phi_2$ and $\Phi_3$ were characterized by two events with different forces of impact: the value of $\eta_1$ represents an impulse with a large force of impact, while the value of $\eta_2$ represents an impulse with a small force of impact on the oscillator,
• the differences in the statistics (Table 1) of the three discrete random variables are distributed as follows: between $\Phi_1$ and the remaining distributions the differences are significant, while the parameters of the distributions $\Phi_2$ and $\Phi_3$ are similar.
Figure 1 shows a single realization of the motion $x(t)$ and the raw moments computed on the basis of this single realization. Thanks to the strong damping in the system, some time after the start (see Fig. 1, after 500 s) subsequent impulses do not cause step changes in the computed moments. The distributions were selected so that the mean value was the same in all three cases, hence the estimators of the first raw moment have similar values after a certain time. It should be emphasized, however, that the presented results were obtained in a simulation that was organized in a specific way. In all three cases, the impulses acted at the same random times and, what is more, if the strongest impulse was randomly chosen in the first distribution, the impulses selected in the second and third distributions were also the strongest ones. The same applies to the weakest impulses.
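The simulation setup described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the impulse values (100 and 50 with equal probability, giving the expected value 75) are placeholders because the actual parameters of Table 1 are not reproduced here, and impulses are snapped to the sampling grid, which is an approximation. Between samples the free oscillator is propagated with its exact one-step transition matrix.

```python
import numpy as np


def random_impulses(rng, T, lam, values, probs):
    """Draw impulse times (exponential gaps, rate lam) and strengths on [0, T)."""
    n_max = int(3 * lam * T) + 50            # generous upper bound on the impulse count
    t = np.cumsum(rng.exponential(1.0 / lam, n_max))
    t = t[t < T]
    eta = rng.choice(values, size=t.size, p=probs)
    return t, eta


def simulate_oscillator(t_imp, eta, T, b=10.0, c=20.0, h=1e-3):
    """Return x(k*h), k = 0..T/h-1, for impulses eta[i] acting at times t_imp[i]."""
    n = int(round(T / h))
    a2 = b * b + c * c                       # a^2, since c = sqrt(a^2 - b^2)
    # Assumption: each impulse is assigned to the grid step in which it falls.
    idx = np.floor(np.asarray(t_imp, dtype=float) / h).astype(int)
    keep = idx < n
    kick = np.bincount(idx[keep],
                       weights=np.asarray(eta, dtype=float)[keep],
                       minlength=n)
    # Exact transition of (x, v) over one step h for the unforced damped oscillator.
    e, s, co = np.exp(-b * h), np.sin(c * h), np.cos(c * h)
    A = e * np.array([[co + b / c * s, s / c],
                      [-a2 / c * s, co - b / c * s]])
    x = np.empty(n)
    state = np.zeros(2)                      # zero initial conditions
    for k in range(n):
        state[1] += kick[k]                  # a Dirac impulse is a jump in velocity
        x[k] = state[0]
        state = A @ state
    return x


def raw_moment(x, k):
    """Running estimator of the k-th raw moment: mean of x**k over all samples so far."""
    return np.cumsum(x ** k) / np.arange(1, x.size + 1)


rng = np.random.default_rng(0)
t_imp, eta = random_impulses(rng, T=5.0, lam=10.0,
                             values=[100.0, 50.0], probs=[0.5, 0.5])
x = simulate_oscillator(t_imp, eta, T=5.0)
m1, m2 = raw_moment(x, 1), raw_moment(x, 2)
```

To mimic the coupled setup described in the text, the same `t_imp` and the same random draws of "strong" versus "weak" would be reused for all three distributions, with only the two impulse values changed.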
It should be taken into account that there exists an infinite number of possible movements, and the possibility of differentiating between the distributions when impulses occur at different random times and have different random values should be checked. In order to visualize this problem, a thousand samples were generated for each of the discussed distributions. As already mentioned, it is the estimators of the second raw moments computed from Eq. (2.5) that are used in the analysis. The calculations include all values of $x(t)$ from the very beginning until the moment when the values are recorded in the file. Movement (2.4) and moments (2.5) were determined with a sampling frequency of $10^{3}$ s$^{-1}$ (sampling period $h = 10^{-3}$ s), and the values of the moments were saved in the file every second. The analyzed time series of the second-order raw moments were presented as tunnels (Fig. 2) covering all recorded samples. The tunnel was created by determining, separately for each second, the maximum and minimum values over the thousand samples. The mean value of all the calculated estimators was also computed.
Visualization of the research in the form of tunnels shows that for the second moment the samples generated for the distribution $\Phi_1$ differ significantly from the others. In further analysis, this distribution will act as a control group, and the research questions will concern the distributions $\Phi_2$ and $\Phi_3$.
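The tunnel construction described above amounts to a per-second minimum, maximum and mean across samples. A minimal sketch, assuming the recorded moments are stored as an array with one row per sample and one column per second (the function names and the small arrays are illustrative only):

```python
import numpy as np


def tunnel(moments):
    """Per-second lower bound, upper bound and mean over all samples.

    moments: array of shape (n_samples, n_seconds), one row per simulated sample.
    """
    moments = np.asarray(moments, dtype=float)
    return moments.min(axis=0), moments.max(axis=0), moments.mean(axis=0)


def tunnels_overlap(lo_a, hi_a, lo_b, hi_b):
    """Boolean mask that is True at the seconds where two tunnels still overlap."""
    return (lo_a <= hi_b) & (lo_b <= hi_a)


# Two tiny made-up sets of samples (2 samples x 3 seconds each).
samples_a = np.array([[1.0, 2.0, 3.0],
                      [2.0, 3.0, 4.0]])
samples_b = np.array([[1.5, 5.0, 6.0],
                      [3.0, 6.0, 7.0]])
lo_a, hi_a, mean_a = tunnel(samples_a)
lo_b, hi_b, mean_b = tunnel(samples_b)
overlap = tunnels_overlap(lo_a, hi_a, lo_b, hi_b)
```

The first second at which `overlap` becomes permanently False is one way to read off when two tunnels split.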
Based on Fig. 2, we can also state that we are dealing with three time-dependent phenomena. During the start, up to the 600th second, the distributions $\Phi_2$ and $\Phi_3$ are distinguishable in approximately 50% of cases. Between the 600th and the 1800th second, the number of samples in which the time series have similar statistical parameters decreases. After the 1800th second, all three distributions are distinguishable. The time at which the tunnels formed from the generated samples for two similar distributions split up is an approximate value, since it depends on how the samples happened to be randomly distributed. It should be expected, however, that the time series of subsequent samples will split up within a similar time interval. These initial exploratory investigations allow for posing two research questions:
1. How precisely can the distributions $\Phi_2$ and $\Phi_3$ be differentiated before the 1800th second?
2. Does the duration of the analyzed time interval influence the accuracy of classification using the decision tree algorithm and the random forest classifier?

Fig. 2. The second stochastic raw moment calculated from the location x(t) for a thousand different samples, presented as tunnels
In order to answer the research questions posed above, further exploratory investigations should be carried out. It is necessary to check the distributions for single samples in two intervals, from the 600th to the 2000th second and after the 2000th second, presenting the values that occur in the time series in the form of a frequency distribution. The conducted analysis shows that, depending on the time interval, the statistical parameters describing the discrete distributions are different: they differ in the mean value, mode, variance, etc. All distributions represented in Fig. 3 are multimodal.
On the basis of Fig. 3, it can be assumed that for any interval several minutes long starting after the 1800th second, all three distributions will be distinguishable, no matter whether we take the mean value, the median or the mode. We can take into consideration either all three statistical parameters or just one of them. Before the 1800th second, the situation is more complicated, hence it seems necessary to use supervised learning algorithms to discern the distributions which force the vibrations on the basis of an analysis of one sample. Moreover, neither the parameters that could be used for classification of the distribution nor the length of the time interval that should be taken into consideration are known. From the point of view of application in engineering designs, the sooner we know what distribution we are dealing with, the better. Therefore, the nine time intervals presented in Fig. 4 were assigned for further analysis. The intervals with odd indices are five minutes long, while the intervals with even indices are ten minutes long. We will answer the research questions using two supervised machine learning algorithms: the decision tree classifier and the random forest classifier, later referred to as DTC and RFC, respectively. For the indicated time intervals, the analysis was performed using previously prepared code. At least about 300 impulses were randomly selected during five-minute time intervals, which for the exponential distribution given by Eq. (2.3) allowed one to obtain the $\lambda$ imposed in the simulations with an accuracy of 0.1%. Tests carried out for shorter time intervals gave less accurate classification results than those presented in the next Section.

Data analysis using the decision tree classifier and random forest classifier algorithms
Machine learning (ML) algorithms find patterns and links between pieces of information in data sets, and then help in making decisions and forecasts based on the shared data. Logistic regression, naive Bayes, k-nearest neighbors, decision tree, random forest and support vector machine algorithms were considered. The decision tree classifier and random forest algorithms (Szeliga, 2019) were chosen. The first one, the decision tree classifier, is a supervised learning algorithm dedicated to classification problems. It works like a flow chart: it divides the data points into a finite number of categories. This algorithm automatically selects the variables that best differentiate the classes.
The second one is the random forest algorithm. It is an extension of the decision tree classifier. First, it builds multiple decision trees, each on a random subset of the training data. Then every new observation from the test set is passed through all the trees, and the forest combines their individual predictions, by majority vote or by averaging the class probabilities, into a single class label. Random forest models are useful because they reduce the tendency of a single decision tree to overfit, that is, to force data points into a category unnecessarily.

Determination of the most important statistical parameters describing the time series of the second raw moment estimators
The analysis started with the calculation of basic statistical parameters (Bąk et al., 2020) such as the amplitude, percent beyond one standard deviation, minimum, mean and maximum value, maximum slope, percent close to the median, median, median absolute deviation, skewness, standard deviation and weighted mean. The statistical parameters were determined for the second raw moment, separately for each of the considered samples. Using the same algorithm, the influence of a given parameter on the classification of distributions was determined for the two interval lengths (Fig. 5).
Using machine learning methods, it was possible to determine the basic statistics that make it possible to recognize the distribution. Figure 5 shows these statistics for nine time periods. In addition, a parallel classification was devised for divisions into test and training sets in the ratios 30/70 and 50/50. The 30/70 ratio is the typical division into test and training sets (Szeliga, 2019). One thousand trials are used in the classification. To obtain as many test cases as possible in the area belonging to both tunnels (Fig. 4), the 50/50 division was also made.
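The listed statistics can be computed for one time series along the following lines. This is only a sketch: the paper does not give the exact definitions, so the ones below (for example, amplitude as half the range, and "close to the median" as within 10% of the range) are assumptions modeled on common time-series feature conventions, and the function name is hypothetical.

```python
import numpy as np


def basic_statistics(y, t=None, weights=None):
    """Twelve summary statistics of one time series y (definitions are assumptions)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size, dtype=float) if t is None else np.asarray(t, dtype=float)
    mean, std, med = y.mean(), y.std(), np.median(y)
    span = y.max() - y.min()
    return {
        "amplitude": 0.5 * span,
        # fraction of points farther than one standard deviation from the mean
        "percent_beyond_1_std": np.mean(np.abs(y - mean) > std),
        "minimum": y.min(),
        "mean": mean,
        "maximum": y.max(),
        # largest absolute rate of change between consecutive points
        "max_slope": np.max(np.abs(np.diff(y) / np.diff(t))),
        # fraction of points within 10% of the range around the median
        "percent_close_to_median": np.mean(np.abs(y - med) < 0.1 * span),
        "median": med,
        "median_absolute_deviation": np.median(np.abs(y - med)),
        # population skewness; undefined for a constant series (std == 0)
        "skewness": np.mean((y - mean) ** 3) / std ** 3,
        "standard_deviation": std,
        # with weights=None this coincides with the plain mean
        "weighted_mean": np.average(y, weights=weights),
    }


stats = basic_statistics([1.0, 2.0, 3.0, 4.0, 10.0])
```

Applying such a function to the second raw moment of every sample within a given time interval yields one feature row per sample, which is the input expected by the classifiers.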
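A sketch of the two divisions and of reading off the influence of each statistic, assuming scikit-learn is the library used (the paper states only that Python was used). The feature matrix below is synthetic stand-in data with made-up class separation, not the simulated moments, and the column names are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_per_class = 200
names = ["standard_deviation", "skewness", "amplitude", "minimum"]

# Synthetic stand-in for the table of statistics: one row per sample.
X = rng.normal(size=(3 * n_per_class, len(names)))
y = np.repeat([1, 2, 3], n_per_class)        # labels for Phi_1, Phi_2, Phi_3
X[:, 0] += 5.0 * (y - 1)                     # strongly informative column (assumed)
X[:, 1] += 1.0 * (y == 3)                    # weakly informative column (assumed)

results = {}
for test_size in (0.3, 0.5):                 # the 30/70 and 50/50 divisions
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=0)
    models = {
        "DTC": DecisionTreeClassifier(random_state=0),
        "RFC": RandomForestClassifier(n_estimators=100, random_state=0),
    }
    for label, model in models.items():
        model.fit(X_tr, y_tr)
        # test accuracy and the share of each statistic in the classification
        results[(label, test_size)] = (model.score(X_te, y_te),
                                       model.feature_importances_)

for (label, test_size), (acc, importances) in results.items():
    ranked = sorted(zip(names, importances), key=lambda p: -p[1])
    print(label, test_size, round(acc, 3), ranked[0][0])
```

Comparing `feature_importances_` across time intervals is what reveals whether a classifier relies on the same statistics each time.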
Focusing on Fig. 5, it should be noted that there is a problem with determining the basic statistics that allow one to distinguish the distributions. For both the odd (300 s) and the even (600 s) intervals, the DTC algorithm describes each time interval shown in Fig. 5 (marked as $TP_x$, where $x$ is the index of the time interval) by different statistics. Moreover, it should be observed that the significance of the statistics changes for each time period. For this reason, this algorithm is not suitable for this type of analysis.
A similar verification was conducted for the same intervals with the use of the random forest classifier (Fig. 6).
From this analysis (Fig. 6), it can be seen that for the intervals with a duration of 300 s, the standard deviation (24%), skewness (20%), amplitude (15%) and median absolute deviation (11%) have the highest impact on classification. The minimum value (1%) and the maximum slope (1%) have the lowest impact. For the 600 s time intervals, the largest shares were those of skewness (29%), standard deviation (19%) and median absolute deviation (15%). The smallest share was that of the minimum value (2%). Additionally, in the case of the RFC, the percent close to the median and the mean value are not used in the classification. The statistics presented in Fig. 6 are the same for both divisions. The importance and influence of the parameters on distinguishing between the three distributions are the same in both the 5-minute and the 10-minute intervals. The differences between the significance of the parameters depending on the length of the interval are not significant.
Since the statistical parameters responsible for the classification remain unchanged across the time intervals only in the case of the random forest algorithm, only this algorithm will be used for the research presented in the following Sections of this article.

Definition of the best RFC parameters
For the time series generated during simulation tests, the selection of hyperparameters was performed using the grid search function.
Hyperparameters are parameters that are not learned directly within the estimators; they have to be set before training. It is recommended to search the hyperparameter space for the indicated time intervals to obtain the best cross-validation result. This allows one to find the best parameter values for a given estimator and the parameters which can be used to define the classification algorithm. Defining these parameters reduces the time of the classification process.
In the case of the DTC algorithm, out of attributes such as splitter, max_depth, min_samples_split, min_samples_leaf, max_features, max_leaf_nodes and class_weight, we decided to define max_depth, min_samples_leaf, min_samples_split and splitter. The max_depth attribute defines the maximum depth of the tree. The min_samples_leaf attribute is the minimum number of samples required to be at a leaf node; this parameter may affect the smoothing of the model. The min_samples_split attribute determines the minimum number of samples required to split an internal node.
In the case of the RFC algorithm, the available parameters include n_estimators, criterion, max_depth, min_samples_split, min_samples_leaf, max_features, max_leaf_nodes, random_state, verbose and max_samples. For the analysis we chose the following parameters: n_estimators, that is, the number of trees in the random forest classifier, and max_depth.
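A grid search over the two RFC hyperparameters named above could look as follows, assuming scikit-learn's `GridSearchCV` is the "grid search function" in question. The candidate values in `param_grid`, the number of folds and the synthetic data are illustrative assumptions; the paper does not list the grids that were searched.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic three-class data with 12 features, standing in for the statistics table.
X, y = make_classification(n_samples=300, n_features=12, n_informative=4,
                           n_classes=3, random_state=0)

# Hypothetical candidate values for the two tuned hyperparameters.
param_grid = {
    "n_estimators": [10, 50],      # number of trees in the forest
    "max_depth": [3, 5, None],     # None lets the trees grow until leaves are pure
}

search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)             # best combination found by cross-validation
print(round(search.best_score_, 3))    # its mean cross-validated accuracy
```

The fitted `search.best_estimator_` is already refitted on the whole training data and can be used directly for prediction.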

Evaluation of the classification algorithms
In the first step, classification of the loaded data was conducted and classifier evaluation measures, such as precision, recall, F1-score and support, were determined.
Precision describes how reliably a class is recognized: it is the fraction of the elements predicted as belonging to a given class that actually belong to it. Figure 7 shows the precision for the nine time intervals for the two divisions, 30/70 and 50/50 (test set/training set). It should be noted how this parameter changes depending on the recognized distribution. In the case of the division with 70% of the data in the training set, the precision is 100% for the first distribution in all time intervals, which for these specific data may be an indicator of proper operation of the algorithm (Fig. 8). There are slight differences when the test set is 50%: the level is between 98% and 100%, which means that for these variables the test set should have more than five hundred samples. As expected, difficulties arise with the second and third distributions. For the 30/70 division, the precision is flawless for all three distributions only in the 8th and 9th time intervals. For the 50/50 division, it was not possible to obtain 100% precision in any of the analyzed time intervals.

Fig. 7. Precision parameter for RFC for two splits

Fig. 8. Recall parameter for RFC for two splits

The next parameter was recall. This parameter, called sensitivity or true positive rate in some publications, indicates what fraction of the elements of a given class was correctly recognized.
The recall on the tested splits was also at a high level. We can see that in the case of the 30/70 split, the first and the third to seventh time periods did not reach the 100% recall level for the three distributions. The 100% recall level was reached only for the eighth and ninth time periods.
For the 50/50 split, no time interval except the eighth one achieved the 100% recall level. The last parameter is the $F_1$-score, determined according to the equation

$$F_1 = 2\,\frac{\mathrm{precision}\cdot\mathrm{recall}}{\mathrm{precision}+\mathrm{recall}}.$$

In the statistical rating of classification, it is the harmonic mean of precision and recall. Figure 9 presents the $F_1$-score for the nine time periods for the 30/70 and 50/50 splits.
Fig. 9. $F_1$-score parameter for RFC for two splits

All three of these parameters allow us to evaluate the RFC for the mentioned time periods.
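The three evaluation measures and the support can be computed per class directly from the true and predicted labels. A minimal sketch with a made-up set of ten predictions (the labels 1, 2, 3 stand for the three distributions; the function name is hypothetical):

```python
import numpy as np


def per_class_metrics(y_true, y_pred, labels):
    """Precision, recall, F1-score and support for each class label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    out = {}
    for c in labels:
        tp = np.sum((y_pred == c) & (y_true == c))   # correctly assigned to c
        fp = np.sum((y_pred == c) & (y_true != c))   # wrongly assigned to c
        fn = np.sum((y_pred != c) & (y_true == c))   # members of c that were missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)        # harmonic mean
        out[c] = {"precision": float(precision), "recall": float(recall),
                  "f1": float(f1), "support": int(tp + fn)}
    return out


y_true = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
y_pred = [1, 1, 1, 2, 2, 3, 3, 3, 3, 2]
metrics = per_class_metrics(y_true, y_pred, labels=[1, 2, 3])
# Class 1 is recognized perfectly; classes 2 and 3 are confused once each,
# giving precision = recall = 2/3 for class 2 and 3/4 for class 3.
```

In this toy example the confusion occurs only between the second and third classes, which mirrors the situation described for the two similar distributions.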

Summary of the classification algorithms in nine time periods
Figure 10 shows the accuracy of the RFC on the test and training sets for the 30/70 and 50/50 splits, for the three distributions. It should be emphasized that for this analysis, the search for the best hyperparameters was conducted in order to classify the appropriate class in the best possible way. The accuracy of classification using the RFC on the training sets is 100% for all time intervals and both divisions. For the 30% test set, the accuracy ranged from 98.9% to 100%. For the 50% test set, the accuracy ranged from 98% to 100%.

Conclusion
After the analysis of the distributions of the second raw stochastic moments for the indicated nine time intervals using the decision tree algorithm and the random forest classifier, the second algorithm was selected for further work on the basis of the achieved results. It can be concluded that the RFC algorithm allows for recognition of the distributions and determination of the statistical parameters which have the highest impact on distinguishing the distributions.
In answer to the question of how precisely the distributions $\Phi_2$ and $\Phi_3$ can be differentiated before 1800 s, it should be noted that high precision was achieved in fitting the data to a given distribution. For the RFC with a 30% test set, an average of 100% for $\Phi_1$, 100% for $\Phi_2$ and 97% for $\Phi_3$ was achieved for the 5-minute intervals. In the case of the 10-minute intervals, 100% was achieved for $\Phi_1$ and $\Phi_3$, and 98.5% for $\Phi_2$. Thus, the differences in the precision of classification depending on the length of the intervals are not significant.
Based on the results obtained for the decision tree algorithm, it should be stated that it is not suitable for the analysis of this type of technical problem because, despite the positive results of the classifier operation, the share of individual statistics and their significance change with each time interval.

Fig. 1. The movements x(t) of vibrations of an oscillator forced by a random series of impulses for one second; below, the first stochastic raw moment and the second one computed on the basis of x(t) for 900 s

Fig. 3. Distribution of the second raw stochastic moment calculated from the location x(t) for two samples generated from the $\Phi_2$ and $\Phi_3$ distributions

Fig. 6. Representation of basic statistics for the second raw moments determined using the RFC algorithm for the indicated time intervals, for the division of the data set in the proportion of 30/70

Fig. 10. Representation of accuracy on the training and test sets for the second raw moments using the RFC algorithm for the indicated time intervals, for the division of the data set in the proportions of 30/70 and 50/50

Table 1. Parameters of distributions of the random variables $\Phi_i$

[Figure panel residue: "DTC for 30/70", time periods $TP_1$ to $TP_9$]