Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

University of Texas at El Paso

Full-Text Articles in Physical Sciences and Mathematics

Towards A Psychologically Natural Relation Between Colors And Fuzzy Degrees, Victor L. Timchenko, Yuriy P. Kondratenko, Olga Kosheleva, Vladik Kreinovich, Nguyen Hoang Phuong Aug 2023

Departmental Technical Reports (CS)

A natural way to speed up computations -- in particular, computations that involve processing fuzzy data -- is to use the fastest possible communication medium: light. Light consists of components of different colors. So, if we use optical color computations to process fuzzy data, we need to associate fuzzy degrees with colors. One of the main features -- and one of the main advantages -- of fuzzy techniques is that the corresponding data has an intuitive natural meaning: this data comes from words from natural language. It is desirable to preserve this naturalness as much as possible. In particular, it is desirable …


Design And Development Of Transition Metal-Based Electrocatalysts For Environmentally Friendly And Efficient Hydrogen Evolution Reactions (HER), Navid Attarzadeh Aug 2023

Open Access Theses & Dissertations

Hydrogen fuel is a clean energy source primarily because it emits no carbon dioxide (CO2). Sustainable energy alternatives have attracted the scientific community and policymakers as concerns over global warming and depletion of fossil fuels have increased significantly. Substituting H2 gas as a primary source for our daily energy consumption under the guideline of the hydrogen economy concept has not progressed as anticipated because of inadequate efficiency associated with the generation (electrolyzer) and utilization (fuel cell) devices. However, there are challenges associated with hydrogen that must be overcome for it to become a truly sustainable and widespread energy source. The …


Single-Index Multinomial Model For Analyzing Crime Data, Kwabena Gyamfi Duodu Aug 2023

Open Access Theses & Dissertations

We develop a flexible single-index multinomial model for analyzing crime data. In addition to the number of crimes reported, the data also includes covariates such as location, time of day, weather, and other demographic factors. We provide an estimation algorithm and develop R code for the single-index multinomial model. Using simulations, we evaluate the performance of the proposed estimation algorithm. When applied to crime data, the single-index multinomial model provides important insights into crime trends and risk variables, assisting in the development of tailored crime prevention programs. Policymakers and law enforcement organizations can use the model's projections to more efficiently allocate …


Increasing The Efficiency And Accuracy Of Collective Intelligence Methods For Image Classification, Md Mahmudulla Hassan Aug 2023

Open Access Theses & Dissertations

Collective intelligence has emerged as a powerful methodology for annotating and classifying challenging data that pose difficulties for automated classifiers. It works by leveraging the concept of "wisdom of the crowds" which approximates a ground truth after aggregating experts' feedback and filtering out noise. However, challenges arise when certain applications, such as medical image classification, security threat detection, and financial fraud detection, demand accurate and reliable data annotation. The unreliability of experts due to inconsistent expertise and competencies, coupled with the associated cost and time-consuming judgment extraction, presents additional challenges.

Input aggregation is the process of consolidating and combining multiple …


Robust Mahalanobis K-Means Algorithm In Comparison With Other Existing Clustering Methods, Eleazer Tabi Serebour Aug 2023

Open Access Theses & Dissertations

This study enhances K-means Mahalanobis clustering using Density Power Divergence (DPD) for outlier handling and detection. Through the utilization of simulations and the analysis of real-world data, our approach consistently outperforms standard K-means, Mahalanobis K-means, Fuzzy C-means, and others in clustering datasets with outliers. While our method performs similarly to others on spherical datasets, it ranks second to DBSCAN for arbitrary shapes. We showcase its superiority on real-life datasets (Iris flower and wheat seed), demonstrating resilient outlier identification. By navigating various structures and cluster characteristics, our Modified Mahalanobis K-means method proves adaptable and robust, offering insights into diverse clustering scenarios. …


How To Propagate Interval (And Fuzzy) Uncertainty: Optimism-Pessimism Approach, Vinícius F. Wasques, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

In many practical situations, inputs to a data processing algorithm are known with interval uncertainty, and we need to propagate this uncertainty through the algorithm, i.e., estimate the uncertainty of the result of data processing. Traditional interval computation techniques provide guaranteed estimates, but from the practical viewpoint, these bounds are too pessimistic: they take into account highly improbable worst-case situations when all the measurement and estimation errors happen to be strongly correlated. In this paper, we show that a natural idea of having more realistic estimates leads to the use of so-called interactive addition of intervals, a technique that has already …
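As a rough illustration of why the traditional bounds are called pessimistic, the sketch below implements only standard worst-case interval addition, in which the endpoints are simply added. The names `add_intervals` and `width` are hypothetical helpers chosen for this example; the interactive addition discussed in the abstract is not shown here.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower endpoint, upper endpoint)


def add_intervals(x: Interval, y: Interval) -> Interval:
    """Standard (worst-case) interval addition: add the matching endpoints."""
    return (x[0] + y[0], x[1] + y[1])


def width(x: Interval) -> float:
    """Width of an interval, used here as a simple measure of uncertainty."""
    return x[1] - x[0]


# Illustrative data (an assumption): four measurements, each known to be
# within [0.9, 1.1].
total: Interval = (0.0, 0.0)
for _ in range(4):
    total = add_intervals(total, (0.9, 1.1))

# The width of the sum is the sum of the widths: it grows linearly with the
# number of terms, which corresponds to all errors being extreme at once.
print(total, width(total))
```

The guaranteed enclosure comes at the price of assuming that every error reaches its bound in the same direction, which is the improbable situation the abstract refers to.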


How To Combine Probabilistic And Fuzzy Uncertainty: Theoretical Explanation Of Clustering-Related Empirical Result, László Szilágyi, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

In contrast to crisp clustering techniques that assign each object to a class, fuzzy clustering algorithms assign, to each object and to each class, a degree to which this object belongs to this class. In the most widely used fuzzy clustering algorithm -- fuzzy c-means -- for each object, degrees corresponding to different classes add up to 1. From this viewpoint, these degrees act as probabilities. There exist alternative fuzzy-based clustering techniques in which, in line with the general idea of the fuzzy set, the largest of the degrees is equal to 1. In some practical situations, the probability-type fuzzy …


Which Fuzzy Implications Operations Are Polynomial? A Theorem Proves That This Can Be Determined By A Finite Set Of Inequalities, Sebastia Massanet, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

To adequately represent human reasoning in computer-based systems, it is desirable to select fuzzy operations that are as close to human reasoning as possible. In general, every real-valued function can be approximated, with any desired accuracy, by polynomials; it is therefore reasonable to use polynomial fuzzy operations as the appropriate approximations. We thus need to select, among all polynomial operations that satisfy corresponding properties -- like associativity -- the ones that best fit the empirical data. The challenge here is that properties like associativity mean satisfying infinitely many constraints (corresponding to infinitely many possible triples of values), while most …


Methodological Lesson Of Pythagorean Triples, Julio C. Urenda, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

There are many right triangles in which all three sides a, b, and c have integer lengths. The triples (a,b,c) formed by such lengths are known as Pythagorean triples. Since ancient times, it has been known how to generate all Pythagorean triples: we can enumerate primitive Pythagorean triples -- in which the three numbers have no common divisors -- by considering all pairs of natural numbers m>n in which m and n have no common divisors, and taking a = m^2 − n^2, b = 2mn, and c = m^2 + n^2. Multiplying all elements of a triple by the same …
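The enumeration described above can be sketched as follows. This is a minimal illustration, not code from the report: the function name `primitive_triples` and the bound `max_m` are hypothetical, and the sketch adds the standard extra condition that m and n have opposite parity, which the abstract does not state but which is needed for the resulting triple to be primitive.

```python
from math import gcd
from typing import List, Tuple


def primitive_triples(max_m: int) -> List[Tuple[int, int, int]]:
    """Enumerate primitive Pythagorean triples from pairs m > n >= 1, m <= max_m.

    Uses a = m^2 - n^2, b = 2mn, c = m^2 + n^2 for coprime m and n.
    Assumption added here: m - n must be odd, otherwise a, b, c are all even.
    """
    triples = []
    for m in range(2, max_m + 1):
        for n in range(1, m):
            if gcd(m, n) == 1 and (m - n) % 2 == 1:
                triples.append((m * m - n * n, 2 * m * n, m * m + n * n))
    return triples


for a, b, c in primitive_triples(4):
    # Each generated triple satisfies a^2 + b^2 = c^2.
    print(a, b, c, a * a + b * b == c * c)
```

Every other Pythagorean triple is obtained by multiplying a primitive one by a common integer factor.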


Why 6-Labels Uncertainty Scale In Geosciences: Probability-Based Explanation, Aaron Velasco, Julio C. Urenda, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

To describe uncertainty in geosciences, several researchers have recently proposed a 6-labels uncertainty scale, in which one of the labels corresponds to full certainty, one label to the absence of any knowledge, and the remaining four labels correspond to the degrees of confidence from the intervals [0,0.25], [0.25,0.5], [0.5,0.75], and [0.75,1]. Tests of this 6-labels scale indicate that it indeed conveys uncertainty information to geoscientists much more effectively than previously proposed uncertainty schemes. In this paper, we use probability-related techniques to explain this effectiveness.


Fuzzy Mathematics Under Non-Minimal "And"-Operations (T-Norms): Equivalence Leads To Metric, Order Leads To Kinematic Metric, Topology Leads To Area Or Volume, Purbita Jana, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

Most formulas analyzed in fuzzy mathematics assume -- explicitly or implicitly -- that the corresponding "and"-operation (t-norm) is the simplest minimum operation. In this paper, we analyze what happens if instead, we use other "and"-operations. It turns out that for such operations, a fuzzification of a mathematical theory naturally leads to a more complex mathematical setting: fuzzification of equivalence relation leads to metric, fuzzification of order leads to kinematic metric, and fuzzification of topology leads to area or volume.


Complex Numbers Explain Why In Chinese Tradition, 4 Is Bad But 8 Is Good, Luc Longpre, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

In the traditional Chinese culture, 4 is considered to be an unlucky number, while the number 8 is considered to be very lucky. In this paper, we show that both "badness" and "goodness" can be explained if we take into account the role of complex numbers in the analysis of general dynamical systems.


Why Resilient Modulus Is Proportional To The Square Root Of Unconfined Compressive Strength (UCS): A Qualitative Explanation, Edgar Daniel Rodriguez Velasquez, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

The strength of the pavement is determined by its resilient modulus, i.e., by its ability to withstand (practically) instantaneous stresses caused by the passing traffic. However, the resilient modulus is not easy to measure: its measurement requires special, expensive equipment that many labs do not have. So, instead of measuring it, practitioners often measure the easier-to-measure Unconfined Compressive Strength (UCS) -- which describes the effect of a continuously applied force -- and estimate the resilient modulus based on the result of this measurement. An empirical formula shows that the resilient modulus is proportional to the square root of the Unconfined …


How To Estimate Unknown Unknowns: From Cosmic Light To Election Polls, Talha Azfar, Vignesh Ponraj, Vladik Kreinovich, Nguyen Hoang Phuong Jul 2023

Departmental Technical Reports (CS)

In two different areas of research -- in the study of space light and in the study of voting -- the observed value of the quantity of interest is twice as large as what we would expect. That the observed value is larger makes perfect sense: there are phenomena that we do not take into account in our estimations. However, the fact that the observed value is exactly twice as large deserves explanation. In this paper, we show that the Laplace Indeterminacy Principle leads to such an explanation.


We Can Always Reduce A Non-Linear Dynamical System To Linear -- At Least Locally -- But Does It Help?, Orsolya Csiszar, Gábor Csiszar, Olga Kosheleva, Vladik Kreinovich, Nguyen Hoang Phuong Jul 2023

Departmental Technical Reports (CS)

Many real-life phenomena are described by dynamical systems. Sometimes, these dynamical systems are linear. For such systems, solutions are well known. In some cases, it is possible to transform a nonlinear system into a linear one by appropriately transforming its variables, and this helps to solve the original nonlinear system. For other nonlinear systems -- even for the simplest ones -- such transformation is not known. A natural question is: which nonlinear systems allow such transformations? In this paper, we show that we can always reduce a nonlinear system to a linear one -- but, in general, it does not …


What Was More Frequently Used -- "And" Or "Or": Based On Analysis Of European Languages, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

Traditional logic has two main connectives: "and" and "or". A natural question is: which of the two is more frequently used? This question is easy to answer for the current usage of these connectives -- we can simply analyze all the texts, but what can we say about the past usage? To answer this question, we use the known linguistic fact that, in general, notions that are more frequently used are described by shorter words. It turns out that in most European languages, the word for "and" is shorter than -- or of the same length as -- the word for …


Why Bump Reward Function Works Well In Training Insulin Delivery Systems, Lehel Dénes-Fazakas, László Szilágyi, György Eigner, Olga Kosheleva, Vladik Kreinovich, Nguyen Hoang Phuong Jul 2023

Departmental Technical Reports (CS)

Diabetes is a disease in which the body can no longer properly regulate blood glucose level, which can lead to life-threatening situations. To avoid such situations and regulate blood glucose level, patients with a severe form of diabetes need insulin injections. Ideally, the system should automatically decide when best to inject insulin and how much to inject. To find the optimal control, researchers applied machine learning with different reward functions. It turns out that the most effective learning occurred when they used the so-called bump function. In this paper, we provide a possible explanation for this empirical result.


Fuzzy Techniques Explain The Effectiveness Of ReLU Activation Function In Deep Learning, Julio C. Urenda, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

In the last decades, deep learning has led to spectacular successes. One of the reasons for these successes was the fact that deep neural networks use a special Rectified Linear Unit (ReLU) activation function s(x) = max(0,x). Why this activation function is so successful is largely a mystery. In this paper, we show that common sense ideas -- as formalized by fuzzy logic -- can explain this mysterious effectiveness.


Why Deep Learning Is Under-Determined? Why Usual Numerical Methods For Solving Partial Differential Equations Do Not Preserve Energy? The Answers May Be Related To Chevalley-Warning Theorem (And Thus To Fermat Last Theorem), Julio C. Urenda, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

In this paper, we provide a possible explanation for two seemingly unrelated phenomena: (1) that in deep learning, under-determined systems of equations perform much better than the over-determined ones -- which are typical in data processing, and (2) that usual numerical methods for solving partial differential equations do not preserve energy. Our explanation is related to the intuition of Fermat behind his Last Theorem and of Euler about more general statements, intuition that led to the proof of the Chevalley-Warning Theorem in number theory.


How To Best Retrain A Neural Network If We Added One More Input Variable, Saeid Tizpaz-Niari, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

Often, once we have trained a neural network to estimate the value of a quantity y based on the available values of inputs x1, ..., xn, we learn to measure the values of an additional quantity that has some influence on y. In such situations, it is desirable to re-train the neural network, so that it will be able to take this extra value into account. A straightforward idea is to add a new input to the first layer and to update all the weights based on the patterns that include the values of the new input. The problem with …


Topological Explanation Of Why Complex Numbers Are Needed In Quantum Physics, Julio C. Urenda, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

In quantum computing, we only use states in which all amplitudes are real numbers. So why do we need complex numbers with non-zero imaginary part in quantum physics in general? In this paper, we provide a simple topological explanation for this need, an explanation based on the Second Law of Thermodynamics.


How To Make Decision Under Interval Uncertainty: Description Of All Reasonable Partial Orders On The Set Of All Intervals, Tiago M. Costa, Olga Kosheleva, Vladik Kreinovich Jul 2023

Departmental Technical Reports (CS)

In many practical situations, we need to make a decision while for each alternative, we only know the corresponding value of the objective function with interval uncertainty. To help a decision maker in this situation, we need to know the (in general, partial) order on the set of all intervals that corresponds to the preferences of the decision maker. For this purpose, in this paper, we provide a description of all such partial orders -- under some reasonable conditions. It turns out that each such order is characterized by two linear inequalities relating the endpoints of the corresponding intervals, and …


Natural Color Interpretation Of Interval-Valued Fuzzy Degrees, Victor L. Timchenko, Yury P. Kondratenko, Vladik Kreinovich, Olga Kosheleva Jun 2023

Departmental Technical Reports (CS)

Intuitively, interval-valued fuzzy degrees are more adequate for representing expert uncertainty than the traditional [0,1]-based ones. Indeed, the very need for fuzzy degrees comes from the fact that experts often describe their opinion not in terms of precise numbers, but by using imprecise ("fuzzy") words from natural language like "small". In such situations, it is strange to expect the same expert to be able to provide an exact number describing his/her degree of certainty; it is more natural to ask this expert to mark the whole interval (or even, more generally, a fuzzy set of possible degrees). In spite …


Logical Inference Inevitably Appears: Fuzzy-Based Explanation, Julio C. Urenda, Olga Kosheleva, Vladik Kreinovich, Orsolya Csiszar Jun 2023

Departmental Technical Reports (CS)

Many thousands of years ago, our primitive ancestors did not have the ability to reason logically and to perform logical inference. This ability appeared later. A natural question is: was this appearance inevitable -- or was this a lucky incident that could have been missed? In this paper, we use fuzzy techniques to provide a possible answer to this question. Our answer is: yes, the appearance of logical inference is inevitable.


Which Activation Function Works Best For Training Artificial Pancreas: Empirical Fact And Its Theoretical Explanation, Lehel Dénes-Fazakas, László Szilágyi, György Eigner, Olga Kosheleva, Martine Ceberio, Vladik Kreinovich Jun 2023

Departmental Technical Reports (CS)

One of the most effective ways to help patients with dangerous levels of diabetes is an artificial pancreas, a device that constantly monitors the patient's blood sugar level and injects insulin based on this level. A patient's reaction to insulin is highly individualized, so the artificial pancreas needs to be trained on each patient. It turns out that the best training results are attained when instead of the usual ReLU neurons, we use their minor modification known as Exponential Linear Units (ELU). In this paper, we provide a theoretical explanation for the empirically observed effectiveness of ELUs.
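For readers unfamiliar with the two activation functions being compared, here is a small sketch of their usual textbook definitions. The function names and the default `alpha = 1.0` are assumptions made for this illustration; the abstract does not say which parameter value the authors used.

```python
import math


def relu(x: float) -> float:
    """Rectified Linear Unit: max(0, x)."""
    return max(0.0, x)


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential Linear Unit in its usual form.

    Equal to x for positive x and to alpha * (exp(x) - 1) otherwise.
    alpha = 1.0 is an illustrative default, not a value from the report.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    # The two functions agree for x >= 0 and differ only for negative x.
    print(x, relu(x), elu(x))
```

For negative inputs ReLU returns exactly 0, while ELU returns a small negative value that approaches -alpha, which is the sense in which ELU is a minor modification of ReLU.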


Why Fuzzy Control Is Often More Robust (And Smoother): A Theoretical Explanation, Orsolya Csiszar, Gábor Csiszar, Olga Kosheleva, Martine Ceberio, Vladik Kreinovich Jun 2023

Departmental Technical Reports (CS)

In many practical situations, practitioners use easier-to-compute fuzzy control to approximate the more-difficult-to-compute optimal control. As expected, for many characteristics, this approximate control is slightly worse than the optimal control it approximates. However, with respect to robustness or smoothness, the approximating fuzzy control is often better than the original one. In this paper, we provide a theoretical explanation for this somewhat mysterious empirical phenomenon.


Dialogs Re-Enacted Across Languages, Version 2, Nigel G. Ward, Jonathan E. Avila, Emilia Rivas, Divette Marco Jun 2023

Departmental Technical Reports (CS)

To support machine learning of cross-language prosodic mappings and other ways to improve speech-to-speech translation, we present a protocol for collecting closely matched pairs of utterances across languages, a description of the resulting data collection and its public release, and some observations and musings. This report is intended for:

  • people using this corpus
  • people extending this corpus
  • people designing similar collections of bilingual dialog data.

Change Notes. This version supersedes UTEP-CS-22-108. There is some new information and numerous clarifications, mostly arising from our experiences diversifying our corpus and helping a vendor to use this protocol.


Selecting The Most Adequate Fuzzy Operation For Explainable AI: Empirical Fact And Its Possible Theoretical Explanation, Orsolya Csiszar, Gábor Csiszar, Martine Ceberio, Vladik Kreinovich Jun 2023

Departmental Technical Reports (CS)

A reasonable way to make AI results explainable is to approximate the corresponding deep-learning-generated function by a simple expression formed by fuzzy operations. Experiments on real data show that out of all easy-to-compute fuzzy operations, the best approximation is attained if we use an operation a + b − 0.5 (limited to the interval [0,1]). In this paper, we provide a possible theoretical explanation for this empirical result.
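The operation mentioned above can be sketched in a few lines. This illustration assumes that "limited to the interval [0,1]" means clipping the result to that interval; the function name `shifted_sum` is hypothetical and does not come from the report.

```python
def shifted_sum(a: float, b: float) -> float:
    """Compute a + b - 0.5, clipped to the interval [0, 1].

    Clipping is this sketch's reading of "limited to the interval [0,1]".
    """
    return min(1.0, max(0.0, a + b - 0.5))


# Sample degrees chosen for illustration only.
for a, b in [(0.5, 0.5), (0.0, 0.25), (0.75, 0.5), (1.0, 1.0)]:
    print(a, b, shifted_sum(a, b))
```

Note that 0.5 acts as a neutral element: combining any degree a in [0,1] with 0.5 returns a itself.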


Is Fully Explainable AI Even Possible: Fuzzy-Based Analysis, Miroslav Svitek, Olga Kosheleva, Vladik Kreinovich Jun 2023

Departmental Technical Reports (CS)

One of the main limitations of many current AI-based decision-making systems is that they do not provide any understandable explanations of how they came up with the produced decision. Taking into account that these systems are not perfect, that their decisions are sometimes far from good, the absence of an explanation makes it difficult to separate good decisions from suspicious ones. Because of this, many researchers are working on making AI explainable. In some application areas -- e.g., in chess -- practitioners get an impression that there is a limit to understandability, that some decisions remain inhuman -- not explainable. …


Why Softmax? Because It Is The Only Consistent Approach To Probability-Based Classification, Anatole Lokshin, Vladik Kreinovich Jun 2023

Departmental Technical Reports (CS)

In many practical problems, the most effective classification techniques are based on deep learning. In this approach, once the neural network generates values corresponding to different classes, these values are transformed into probabilities by using the softmax formula. Researchers tried other transformations, but they did not work as well as softmax. A natural question is: why is softmax so effective? In this paper, we provide a possible explanation for this effectiveness: namely, we prove that softmax is the only consistent approach to probability-based classification. In precise terms, it is the only approach for which two reasonable probability-based ideas -- Least …
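The softmax transformation referred to above, in its standard form, maps values z_1, ..., z_n to probabilities exp(z_i) / (exp(z_1) + ... + exp(z_n)). The sketch below is a generic implementation for illustration, not code from the report; subtracting the maximum before exponentiating is a common numerical-stability step that does not change the result.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Turn real-valued class scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical network outputs for three classes.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

Larger scores always receive larger probabilities, and adding the same constant to every score leaves the output unchanged.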