Oleg I. Berngardt

h-index: 1

5 papers

8 citations

Novelty: 40%

AI Score: 35

Ranked #122,547 of 205,806 authors (top 60%); #26,730 in LG (top 63%)

5 Papers

LG · Sep 5, 2023

Superclustering by finding statistically significant separable groups of optimal gaussian clusters

Oleg I. Berngardt

The paper presents an algorithm for clustering a dataset by grouping Gaussian clusters, whose number is optimal with respect to the BIC criterion, into superclusters that are optimal with respect to their statistical separability. The algorithm consists of three stages: representing the dataset as a mixture of Gaussian distributions (clusters), whose number is determined from the minimum of the BIC criterion; using the Mahalanobis distance to estimate the distances between clusters and the cluster sizes; and combining the resulting clusters into superclusters using the DBSCAN method, with its hyperparameter (maximum distance) chosen to maximize the introduced matrix quality criterion at the maximum number of superclusters. The matrix quality criterion corresponds to the proportion of statistically significantly separated superclusters among all superclusters found. The algorithm has only one hyperparameter, the statistical significance level, and automatically detects the optimal number and shape of superclusters using a statistical hypothesis testing approach. The algorithm demonstrates good results on test datasets in both noisy and noiseless situations. An essential advantage of the algorithm is its ability to predict the correct supercluster for new data using an already trained clusterer and to perform soft (fuzzy) clustering. The disadvantages of the algorithm are its low speed and the stochastic nature of the final clustering. It requires a sufficiently large dataset for clustering, which is typical of many statistical methods.
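The second and third stages can be illustrated with a minimal sketch. It assumes 2-D Gaussian clusters that have already been fitted, a symmetrised Mahalanobis distance (an assumption, not necessarily the paper's definition), and a fixed, hypothetical maximum distance eps instead of the value the paper selects through its quality criterion. Merging everything within eps is equivalent to DBSCAN with a minimum neighbourhood size of one; all function names are illustrative.

```python
import math

def mahalanobis(point, mean, cov):
    """Mahalanobis distance of a 2-D point from a Gaussian (mean, 2x2 covariance)."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of the 2x2 covariance applied to (dx, dy), written out by hand.
    ix = (d * dx - b * dy) / det
    iy = (-c * dx + a * dy) / det
    return math.sqrt(dx * ix + dy * iy)

def cluster_distance(g1, g2):
    """Symmetrised distance (assumption): mean of the two one-sided distances."""
    return 0.5 * (mahalanobis(g2[0], *g1) + mahalanobis(g1[0], *g2))

def superclusters(gaussians, eps):
    """Merge clusters whose distance is at most eps; returns one label per cluster."""
    parent = list(range(len(gaussians)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(gaussians)):
        for j in range(i + 1, len(gaussians)):
            if cluster_distance(gaussians[i], gaussians[j]) <= eps:
                parent[find(j)] = find(i)
    roots = [find(i) for i in range(len(gaussians))]
    order = {r: n for n, r in enumerate(dict.fromkeys(roots))}
    return [order[r] for r in roots]

# Hypothetical fitted clusters: (mean, covariance) pairs.
identity = ((1.0, 0.0), (0.0, 1.0))
gaussians = [((0.0, 0.0), identity), ((1.0, 0.0), identity), ((10.0, 0.0), identity)]
labels = superclusters(gaussians, eps=2.0)
```

Here the first two clusters fall into one supercluster and the third stays separate. In the paper eps is not fixed by hand; it is searched for so that the share of statistically separable superclusters is maximal.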

LG · Apr 23, 2023

Improving Classification Neural Networks by using Absolute activation function (MNIST/LeNET-5 example)

Oleg I. Berngardt

The paper discusses the use of the Absolute activation function in classification neural networks. Examples are shown of using this activation function in simple and more complex problems. Using the LeNet-5 network for solving the MNIST problem as a baseline, the efficiency of the Absolute activation function is shown in comparison with the Tanh, ReLU and SeLU activations. It is shown that in deep networks the Absolute activation does not cause vanishing or exploding gradients, and therefore it can be used in both simple and deep neural networks. Because training networks with the Absolute activation is highly volatile, a special modification of the ADAM training algorithm is used: it estimates a lower bound of accuracy on any test dataset by analyzing the validation dataset at each training epoch, uses this value to stop training or decrease the learning rate, and re-initializes the ADAM algorithm between these steps. It is shown that solving the MNIST problem with LeNet-like architectures based on the Absolute activation makes it possible to significantly reduce the number of trained parameters in the neural network while improving the prediction accuracy.
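The gradient claim can be illustrated with a toy calculation. The sketch below stacks an activation many times with no weights at all (a strong simplification, so it only shows the activation's own contribution to the gradient) and compares the Absolute activation with Tanh. The helper names are illustrative, and the derivative of |x| at zero is set to 1 by convention.

```python
import math

def abs_grad(x):
    """Derivative of |x|; the value at x == 0 is a convention chosen here."""
    return 1.0 if x >= 0 else -1.0

def tanh_grad(x):
    """Derivative of tanh(x)."""
    return 1.0 - math.tanh(x) ** 2

def chain_gradient(x, depth, act, grad):
    """Magnitude of d(output)/d(input) for `depth` stacked activations, no weights."""
    g = 1.0
    for _ in range(depth):
        g *= grad(x)   # chain rule: multiply by the local derivative
        x = act(x)     # forward pass through one activation
    return abs(g)

abs_chain = chain_gradient(2.0, 50, abs, abs_grad)
tanh_chain = chain_gradient(2.0, 50, math.tanh, tanh_grad)
```

The local derivative of |x| always has magnitude one, so abs_chain stays at 1.0 regardless of depth, while tanh_chain shrinks with every layer. In a real network the weight matrices also scale the gradient, which this sketch deliberately leaves out.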

8.5 · AO-PH · Apr 30

Physically-Informed Fuzzy Clustering of Vertical Sounding Ionograms

Oleg I. Berngardt, Sergey N. Ponomarchuk

This paper presents a physically-informed fuzzy clustering of vertical sounding ionograms that automatically separates the ionogram into tracks suitable for further interpretation and determines their optimal number. The model is designed for use not only in conditions where the number of tracks is known, but also in disturbed ionospheric conditions where the number of tracks is not known in advance. The method is based on an expectation-maximization algorithm, used for clustering, and on parametrically specified distributions of distances from points to parametrically specified curves. The curves used as track models are close to the model tracks of the parabolic ionospheric layer model. The resulting model of each track has six parameters: three standard ones (the critical frequency, the lower boundary of the layer, and its half-width) and three additional ones that account for possible underlying layer effects. By sequentially increasing the number of tracks and optimizing their parameters, the model finds the optimal number of tracks on the ionogram by minimizing a modified Bayesian information criterion. The Sequential Least Squares Quadratic Programming algorithm is used to find the parameters of a single track. The width of each track is assumed to be an unknown constant found during the fitting process. To improve the quality of ionogram clustering, automatic adaptive noise filtering is performed before clustering. This filtering is based on a combination of the DBSCAN and Gaussian Mixture algorithms. In addition, to improve clustering quality on ionosondes without hardware separation of the ordinary and extraordinary components, points belonging to the extraordinary mode are approximately removed beforehand.
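The model selection step can be sketched as follows. The abstract refers to a modified Bayesian information criterion whose exact form is not given here, so this example uses the standard BIC, k ln(n) - 2 ln(L), as a stand-in. The six parameters per track come from the abstract; the log-likelihood values and the number of points are made-up numbers for illustration only.

```python
import math

PARAMS_PER_TRACK = 6  # six parameters per track, as stated in the abstract

def bic(log_likelihood, n_tracks, n_points):
    """Standard BIC (stand-in for the paper's modified criterion)."""
    k = PARAMS_PER_TRACK * n_tracks
    return k * math.log(n_points) - 2.0 * log_likelihood

def best_track_count(log_likelihoods, n_points):
    """Pick the number of tracks with the smallest BIC."""
    scores = {m: bic(ll, m, n_points) for m, ll in log_likelihoods.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fitted log-likelihoods for 1..4 tracks on a 1000-point ionogram.
fits = {1: -1500.0, 2: -1200.0, 3: -1190.0, 4: -1188.0}
best, scores = best_track_count(fits, n_points=1000)
```

With these illustrative numbers the second track improves the fit a lot, while the third and fourth add parameters faster than they add likelihood, so the criterion settles on two tracks.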

LG · May 23, 2024

Minimum number of neurons in fully connected layers of a given neural network (the first approximation)

Oleg I. Berngardt

This paper presents an algorithm for finding the minimum number of neurons in the fully connected layers of an arbitrary network solving a given problem, without training the network repeatedly with different numbers of neurons. The algorithm is based on training an initial wide network using cross-validation over at least two folds. Then a truncated singular value decomposition autoencoder is inserted after the studied layer of the trained network, and the minimum number of neurons is searched for with the network in inference-only mode. It is shown that the minimum number of neurons in a fully connected layer can be interpreted not as a network hyperparameter tied to the other hyperparameters of the network, but as an internal (latent) property of the solution, determined by the network architecture, the training dataset, the layer position, and the quality metric used. The minimum number of neurons can therefore be estimated for each hidden fully connected layer independently. The proposed algorithm is a first approximation for estimating the minimum number of neurons in a layer: on the one hand, it does not guarantee that a neural network with the found number of neurons can be trained to the required quality, and on the other hand, it searches for the minimum number of neurons within a limited class of possible solutions. The solution was tested on several datasets in classification and regression problems.
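The inference-only search can be sketched on a toy case. The example assumes a single hidden layer with recorded activations H followed by a linear output with weights w_out, uses the mean squared error against the full network's outputs as the quality metric, and replaces a library truncated SVD with plain power iteration so that it runs on the standard library alone. All names, data, and the tolerance are hypothetical, and cross-validation is omitted.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def principal_directions(H, k, iters=200):
    """Top-k right singular vectors of H via power iteration with deflation."""
    d = len(H[0])
    G = [[sum(r[i] * r[j] for r in H) for j in range(d)] for i in range(d)]  # H^T H
    dirs = []
    for c in range(k):
        v = [1.0 / (i + 1 + c) for i in range(d)]  # fixed start vector
        for _ in range(iters):
            w = [dot(row, v) for row in G]
            for u in dirs:  # remove components along directions already found
                p = dot(w, u)
                w = [a - p * b for a, b in zip(w, u)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                return dirs  # remaining directions carry no signal
            v = [a / norm for a in w]
        dirs.append(v)
    return dirs

def bottleneck(H, dirs):
    """Project each activation row onto span(dirs) and back (linear autoencoder)."""
    out = []
    for h in H:
        codes = [dot(h, u) for u in dirs]
        out.append([sum(c * u[i] for c, u in zip(codes, dirs)) for i in range(len(h))])
    return out

def min_neurons(H, w_out, targets, tol):
    """Smallest bottleneck width that keeps the output MSE within tol."""
    for k in range(1, len(H[0]) + 1):
        preds = [dot(h, w_out) for h in bottleneck(H, principal_directions(H, k))]
        mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)
        if mse <= tol:
            return k
    return len(H[0])

# Hypothetical activations of a 3-neuron layer; the third neuron equals the sum
# of the first two, so only two independent directions are actually used.
H = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0], [2.0, 1.0, 3.0]]
w_out = [1.0, -1.0, 0.5]
targets = [dot(h, w_out) for h in H]
k_min = min_neurons(H, w_out, targets, tol=1e-6)
```

No retraining happens anywhere in the loop: the wide layer's activations are compressed and expanded, and only the downstream quality is checked. As the abstract notes, such an estimate does not guarantee that a network with k_min neurons could be trained to the same quality.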

GEO-PH · Jan 15, 2022

Wrapped Classifier with Dummy Teacher for training physics-based classifier at unlabeled radar data

Oleg I. Berngardt, Oleg A. Kusonsky, Alexey I. Poddelsky et al.

The paper describes a method for automatic classification of signals received during 2021 by the EKB and MAGW ISTP SB RAS coherent scatter radars (8-20 MHz operating frequency). The method is suitable for automatic physical interpretation of the resulting classification of the experimental data in real time. We call this algorithm Wrapped Classifier with Dummy Teacher. The method is trained on an unlabeled dataset and is based on training an optimal physics-based classification using clustering results. The approach is close to an optimal embedding search, where the embedding is interpreted as a vector of probabilities for soft classification. The approach makes it possible to find an optimal classification algorithm based on physically interpretable parameters of the received data, both obtained from physics-based numerical simulation and measured experimentally. The Dummy Teacher clusterer used for labeling the unlabeled dataset is a Gaussian mixture clustering algorithm. For the algorithm to function, we extended the parameters obtained by the radar with additional parameters calculated by simulating radio wave propagation using ray tracing and the IRI-2012 and IGRF models for the ionosphere and the Earth's magnetic field, respectively. For clustering by the Dummy Teacher we use the whole set of available parameters (measured and simulated). For classification by the Wrapped Classifier we use only parameters with a clear physical interpretation. As a result, we trained the classification network and found 11 classes in the available data that are well interpretable from a physical point of view. Five other classes found are not physically interpretable, demonstrating the importance of taking radio wave propagation into account for correct classification.
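The teacher and student split can be sketched in a few lines. This is a deliberately simplified stand-in: the teacher here is a two-cluster k-means rather than the Gaussian mixture used in the paper, the student is a nearest-centroid rule rather than a classification network, and the three features per record (two "interpretable" ones and one "simulated" one) are invented for illustration.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_teacher(rows, iters=20):
    """Dummy teacher stand-in: 2-means over ALL features (the paper uses a Gaussian mixture)."""
    centers = [list(rows[0]), list(rows[-1])]  # deterministic initialisation
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min((0, 1), key=lambda c: sq_dist(r, centers[c])) for r in rows]
        for c in (0, 1):
            members = [r for r, l in zip(rows, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fit_student(rows, labels, keep):
    """Wrapped classifier stand-in: class centroids over the interpretable features only."""
    centroids = {}
    for c in set(labels):
        members = [[r[i] for i in keep] for r, l in zip(rows, labels) if l == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def predict(centroids, row, keep):
    """Classify a record using only the features listed in `keep`."""
    x = [row[i] for i in keep]
    return min(centroids, key=lambda c: sq_dist(x, centroids[c]))

# Hypothetical records: (interpretable_1, interpretable_2, simulated_extra).
rows = [(0.1, 0.2, 5.0), (0.2, 0.1, 5.2), (0.0, 0.3, 4.9),
        (3.0, 2.9, 9.0), (3.1, 3.2, 9.1), (2.9, 3.0, 8.8)]
keep = (0, 1)  # the student never sees the third feature
teacher_labels = kmeans_teacher(rows)
student = fit_student(rows, teacher_labels, keep)
```

The teacher labels the unlabeled data using every available feature, and the student then learns to reproduce those labels from the interpretable subset alone, for example with predict(student, (2.8, 3.1, 0.0), keep).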