Class-imbalanced multiclass classification is an important machine learning task: real-world data sets are often large and diverse, and some classes contain far more samples than others. Benchmark data sets are sometimes imbalanced artificially in order to simulate such real-world problems. When a classifier is evaluated on imbalanced data, the way per-class metrics are averaged can change the reported score considerably.
In this lesson, we will examine the difference between micro and macro averages and calculate both for precision and recall. Start with the term micro. You may think micro means small: to understand the little things, you have to look through a microscope. Microeconomics, for example, studies individual units; the microeconomics of a firm examines the prices of individual products and market structures.
Macro means dealing with aggregates: it is an examination of the whole. Macroeconomics looks at general economic phenomena such as the unemployment rate, economic growth, price levels and gross domestic product (GDP).
A few quick examples: Suppose Johnson loses his job during an economic recession. If we want to study how this drop in income affects his consumption, this question will be studied at the micro level. We look at a person’s actions. In contrast, if we assume that during the same recession unemployment rose to 11%, then this aggregate unemployment and its effect on aggregate demand is examined in macroeconomics.
The difference between macro and micro averages is that with macro averages, each class is equally weighted, while with micro averages, each sample is equally weighted. If you have the same number of samples for each class, macro and micro will yield the same result.
The macro average calculates the metric independently for each class and then averages the results, so that all classes are treated equally, while the micro average pools the contributions of all classes to calculate the metric.
Suppose you have a multi-class classification system with three classes, A, B and C, and the following numbers. The classes are imbalanced:
Micro-average precision
Micro-average precision is the sum of all true positives divided by the sum of all true positives plus the sum of all false positives. In other words, you divide the number of correct predictions by the total number of predictions. For the three classes A, B and C, the micro average is calculated as:
Micro-P = (TP_A + TP_B + TP_C) / (TP_A + TP_B + TP_C + FP_A + FP_B + FP_C)
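As a minimal sketch of the pooling step described above, the Python snippet below sums true and false positives over all classes before dividing. The counts and the helper name `micro_precision` are hypothetical: the original table for this example is not available, so the counts are assumptions chosen only for illustration.

```python
# Hypothetical (true positives, false positives) per class; illustrative only,
# not the original figures from this article.
counts = {"A": (10, 4), "B": (10, 90), "C": (8, 6)}


def micro_precision(counts):
    """Pool TP and FP over all classes, then compute a single precision."""
    total_tp = sum(tp for tp, _ in counts.values())
    total_fp = sum(fp for _, fp in counts.values())
    return total_tp / (total_tp + total_fp)


print(round(micro_precision(counts), 2))  # 28 / 128, prints 0.22
```

Because the counts are pooled first, class B, which accounts for most of the predictions in this sketch, dominates the result.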
Macro-average precision
You can easily see that PrA = 0.71 and PrB = 0.1, while PrC = 0.57. The macro average is then calculated as:
Macro-P = (PrA + PrB + PrC) / 3 = (0.71 + 0.1 + 0.57) / 3 = 0.46
These are completely different precision values. Intuitively, in the macro average, the good precision of classes A and C (0.71 and 0.57) keeps the overall precision at a decent level (0.46). Although this result is technically correct (the average precision across classes is 0.46), it is somewhat misleading because a large number of examples are not classified correctly. These examples mostly belong to class B, so they contribute only 1/3 of the average, even though they represent 90% of your data.
The micro average adequately reflects this imbalance between the classes, bringing the overall precision down to 0.22, which is closer to the precision of the dominant class B (0.1).
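To make the contrast concrete, here is a small Python sketch that computes both averages from the same per-class counts. The counts are hypothetical: they are assumptions picked so that the per-class precisions come out close to the 0.71, 0.1 and 0.57 quoted above, and they should not be read as the original data behind this example.

```python
# Hypothetical (true positives, false positives) per class; assumed values,
# not the article's original table.
counts = {"A": (10, 4), "B": (10, 90), "C": (8, 6)}

# Precision of each class on its own.
per_class = {label: tp / (tp + fp) for label, (tp, fp) in counts.items()}

# Macro: unweighted mean of the per-class precisions (each class counts once).
macro_precision = sum(per_class.values()) / len(per_class)

# Micro: pool the counts first, so each prediction counts once.
total_tp = sum(tp for tp, _ in counts.values())
total_predicted = sum(tp + fp for tp, fp in counts.values())
micro_precision = total_tp / total_predicted

print({label: round(p, 2) for label, p in per_class.items()})
# {'A': 0.71, 'B': 0.1, 'C': 0.57}
print(round(macro_precision, 2), round(micro_precision, 2))
# 0.46 0.22
```

The macro figure is held up by the two small classes, while the micro figure is pulled toward the precision of the large class B.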
Micro-average recall
The micro-average recall is calculated as:
Micro-R = (TP_A + TP_B + TP_C) / (TP_A + TP_B + TP_C + FN_A + FN_B + FN_C)
Here we use false negatives instead of false positives.
Macro-average recall
The method is simple: just take the unweighted average of the per-class recall values. For the three classes in this example, the macro-average recall is:
Macro-R = (ReA + ReB + ReC) / 3
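The two recall averages can be sketched in Python starting from raw labels rather than pre-computed counts. The label lists below are hypothetical and exist only to illustrate the calculation; all variable names are assumptions of this sketch.

```python
from collections import Counter

# Hypothetical ground-truth and predicted labels; illustrative only.
y_true = ["A", "A", "B", "B", "B", "B", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "B", "A", "C", "B", "C", "B"]

tp = Counter()  # correctly classified samples, keyed by true class
fn = Counter()  # missed samples, keyed by true class
for truth, pred in zip(y_true, y_pred):
    if truth == pred:
        tp[truth] += 1
    else:
        fn[truth] += 1

classes = sorted(set(y_true))
per_class_recall = {c: tp[c] / (tp[c] + fn[c]) for c in classes}

# Micro: pool TP and FN over all classes before dividing.
micro_recall = sum(tp.values()) / (sum(tp.values()) + sum(fn.values()))

# Macro: unweighted mean of the per-class recalls.
macro_recall = sum(per_class_recall.values()) / len(classes)

print(round(micro_recall, 3), round(macro_recall, 3))  # 0.6 0.556
```

Here class B has the most samples and the highest recall, so the micro average (0.6) sits above the macro average (about 0.556).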
Micro-average and macro-average F1
The micro- or macro-averaged F-score is then simply the harmonic mean of the corresponding averaged precision and recall. Per-class F1 scores can also be averaged directly. For example, suppose that in binary classification we get an F1 score of 0.7 for class 1 and 0.5 for class 2. Using the macro average, we simply average these two scores to get an overall score of 0.6 for the classifier, which is the same regardless of the distribution of samples across the two classes.
If you weighted by samples instead, as micro averaging does, the distribution would matter. If, for example, class 1 represents 80% of your data, the formula would be 0.7×80% + 0.5×20%, which yields 0.66, because each sample is equally weighted and the result reflects the imbalance in the data. If class 1 represents 50% of your data, the formula becomes 0.7×50% + 0.5×50%, which is 0.6, the same as the macro average. (Strictly speaking, a mean of per-class F1 scores weighted by class size is usually called the weighted average; the micro-average F1 proper is computed from the pooled true positive, false positive and false negative counts.)
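The arithmetic in the two paragraphs above can be checked with a few lines of Python. The F1 scores and class shares are the ones from the text; the function names `macro_f1` and `sample_weighted_f1` are hypothetical helpers introduced for this sketch.

```python
# Per-class F1 scores taken from the example in the text.
f1 = {"class 1": 0.7, "class 2": 0.5}


def macro_f1(f1):
    """Unweighted mean: every class counts equally."""
    return sum(f1.values()) / len(f1)


def sample_weighted_f1(f1, share):
    """Mean weighted by each class's share of the samples."""
    return sum(f1[c] * share[c] for c in f1)


print(round(macro_f1(f1), 2))  # 0.6
print(round(sample_weighted_f1(f1, {"class 1": 0.8, "class 2": 0.2}), 2))  # 0.66
print(round(sample_weighted_f1(f1, {"class 1": 0.5, "class 2": 0.5}), 2))  # 0.6
```

With a 50/50 split the two averages coincide; they diverge only once the class shares differ.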
Remarks
If your data were perfectly balanced, the macro and micro averages would yield the same result.
Micro-average precision and micro-average recall are both equal to accuracy when each data point is assigned to exactly one class. The micro-average metrics differ from overall accuracy when the classification is multi-label (each data point may be assigned more than one label) or when some classes are excluded in the multi-class case.
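The single-label case can be verified with a short Python sketch. The labels are hypothetical and serve only as an illustration: every wrong prediction is a false positive for the predicted class and a false negative for the true class, so the pooled false positive and false negative totals are equal.

```python
# Hypothetical single-label data: each sample has exactly one true label
# and exactly one predicted label. Illustrative only.
y_true = ["A", "B", "B", "C", "B", "A", "B", "C"]
y_pred = ["A", "B", "A", "C", "B", "B", "B", "B"]

classes = sorted(set(y_true) | set(y_pred))
pairs = list(zip(y_true, y_pred))

# Pooled one-vs-rest counts over all classes.
tp = sum(t == c and p == c for c in classes for t, p in pairs)
fp = sum(t != c and p == c for c in classes for t, p in pairs)
fn = sum(t == c and p != c for c in classes for t, p in pairs)

micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
accuracy = sum(t == p for t, p in pairs) / len(pairs)

print(micro_precision, micro_recall, accuracy)  # 0.625 0.625 0.625
```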
Since large classes often perform better than small classes, one would expect the micro average to be higher than the macro average.
The micro average is often preferred when the classes are imbalanced, but it depends on the purpose. If you are interested in overall performance on the data and have no preference for a particular class, micro is a good choice. However, if class A is rare but very important, then macro would be the better choice because it treats all classes equally. Micro is better if we care more about accuracy in general: it stays close to accuracy, while macro differs in that it is not dominated by the majority class.
In multi-class classification, micro averaging is preferable if you suspect an imbalance between classes.
Frequently Asked Questions
What is the difference between micro averaging and macro averaging in a multi class classification problem?
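Macro averaging calculates the metric independently for each class and then takes the unweighted mean, so every class counts equally. Micro averaging pools the true positives, false positives and false negatives of all classes and calculates the metric once, so every sample counts equally.
What are micro and macro averaging, and which should be used if one class dominates the others?
Micro averaging weights each sample equally, while macro averaging weights each class equally. If one class dominates, the micro average mostly reflects performance on that class; use it when overall performance matters, and use the macro average when rare classes are as important as the dominant one.
Which metric is good for imbalanced class problems?
It depends on the purpose. The micro average reflects the class imbalance and stays close to accuracy, while the macro average treats all classes equally and is the better choice when a rare class is important.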