Properties of Normal Probability Curve | Statistics
Some of the properties are: 1. The normal
curve is symmetrical 2. The normal curve is unimodal 3. Mean, median and mode
coincide 4. The maximum ordinate occurs at the centre 5. The normal curve is
asymptotic to the X-axis 6. The height of the curve declines symmetrically,
among others.
1. The normal curve is symmetrical:
The Normal Probability Curve (N.P.C.) is symmetrical about the ordinate
of the central point of the curve. It implies that the size, shape and slope of
the curve on one side are identical to those on the other.
That is, the normal curve has bilateral symmetry. If the figure were folded along its vertical axis, the two halves would coincide. In other words, the parts of the curve to the left and right of the central point are mirror images.
2. The normal curve is unimodal:
Since there is only one point in the curve which has maximum frequency,
the normal probability curve is unimodal, i.e. it has only one mode.
3. Mean, median and mode coincide:
The mean, median and mode of the normal distribution are the same and
they lie at the centre. When the base line is marked in standard deviation
units, they are represented by 0 (zero).
[Mean = Median = Mode]
4. The maximum ordinate occurs at the centre:
The maximum height of the ordinate always occurs at the central point of
the curve, that is, at the mid-point. The ordinate at the mean is the highest
ordinate and is denoted by Y0 (the height of the curve at the mean, or
mid-point of the base line).
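As a rough illustration, the height Y0 can be computed directly. The sketch below assumes the curve is scaled to unit area, so that Y0 = 1 / (σ√(2π)); the helper name `peak_ordinate` and the choice of a standard normal curve are made up for this example, while `statistics.NormalDist` is Python's standard-library normal distribution.

```python
import math
from statistics import NormalDist

def peak_ordinate(sigma):
    """Height of a unit-area normal curve at its mean: 1 / (sigma * sqrt(2*pi))."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))

# Standard normal curve (mean 0, SD 1); these parameters are illustrative.
curve = NormalDist(mu=0.0, sigma=1.0)
y0 = curve.pdf(curve.mean)

# Ordinates sampled away from the mean are all lower than Y0.
other_heights = [curve.pdf(x) for x in (-2.0, -0.5, 0.5, 2.0)]

print(round(y0, 4))             # 0.3989
print(max(other_heights) < y0)  # True
```

A wider curve (larger σ) has a lower peak, since the same unit area is spread over a longer base line.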
5. The normal curve is asymptotic to the X-axis:
The Normal Probability Curve approaches the horizontal axis
asymptotically, i.e. the curve continues to decrease in height at both ends
away from the middle point (the maximum ordinate point), but it never touches
the horizontal axis.
It extends infinitely in both directions, i.e. from minus infinity (-∞)
to plus infinity (+∞), as shown in the figure below. As the distance from the mean
increases, the curve approaches the base line more and more closely.
6. The height of the curve declines symmetrically:
In the normal probability curve the height declines symmetrically in
either direction from the maximum point. Hence the ordinates for values of X =
µ ± K, where K is a real number, are equal.
For example, the ordinate at X = µ + 1σ is equal to the ordinate at X = µ - 1σ.
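The equality of ordinates at µ ± K can also be checked numerically. This is a minimal sketch; the mean of 50, the standard deviation of 10 and the helper `ordinate_pair` are illustrative assumptions, not values from the text.

```python
from statistics import NormalDist

# Illustrative parameters (not from the text): mean 50, SD 10.
curve = NormalDist(mu=50.0, sigma=10.0)

def ordinate_pair(k):
    """Return the curve heights at mu - k and mu + k."""
    return curve.pdf(curve.mean - k), curve.pdf(curve.mean + k)

# The two heights in each pair match, and both shrink as k grows.
for k in (2.5, 10.0, 25.0):
    left, right = ordinate_pair(k)
    print(k, round(left, 6), round(right, 6))
```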
7. The points of inflection occur at ± 1 standard deviation (± 1σ):
The normal curve changes its direction of curvature from convex to concave at a point
known as a point of inflection. If we draw perpendiculars from the two
points of inflection of the curve to the horizontal axis, they will meet the axis
at a distance of one standard deviation unit above and below the mean (± 1σ).
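One way to see the change of curvature at ± 1σ is to estimate the second derivative of the curve numerically. The sketch below uses a central-difference approximation on a standard normal curve; the step size `h` and the helper name `second_difference` are arbitrary choices for illustration.

```python
from statistics import NormalDist

curve = NormalDist(mu=0.0, sigma=1.0)  # standard normal, so 1 SD = 1 unit

def second_difference(x, h=1e-3):
    """Central-difference estimate of the curve's second derivative at x."""
    return (curve.pdf(x + h) - 2.0 * curve.pdf(x) + curve.pdf(x - h)) / (h * h)

# Curvature is negative inside +/-1 SD, positive outside,
# and approximately zero at the +/-1 SD points themselves.
for x in (0.5, 1.0, 1.5):
    print(x, round(second_difference(x), 4))
```

The sign change of the estimate on either side of x = 1 (and, by symmetry, x = -1) is what marks those points as the places where the curvature reverses.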
8. The total percentage of area of the normal curve
within the two points of inflection is fixed:
Approximately 68.26% of the area of the curve falls within the limits of ±1
standard deviation unit from the mean, as shown in the figure below.
9. Normal curve is a smooth curve:
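The normal curve is a smooth curve, not a histogram. It is moderately
peaked. The kurtosis of the normal curve is 0.263 when measured by the
percentile coefficient of kurtosis, Ku = Q / (P90 – P10), where Q is the
quartile deviation.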
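The figure quoted for the kurtosis of the normal curve appears to refer to the percentile measure of kurtosis, Ku = Q / (P90 – P10), where Q = (P75 – P25) / 2 is the quartile deviation. That reading is an assumption about the source; under it, the value can be reproduced from the percentile points of a standard normal curve:

```python
from statistics import NormalDist

curve = NormalDist()  # standard normal: mean 0, SD 1

# Percentile points of the normal curve.
p10, p25, p75, p90 = (curve.inv_cdf(p) for p in (0.10, 0.25, 0.75, 0.90))

quartile_deviation = (p75 - p25) / 2.0                   # Q
percentile_kurtosis = quartile_deviation / (p90 - p10)   # Ku

print(round(percentile_kurtosis, 3))  # 0.263
```

This measure is different from the moment-based kurtosis, on which the normal distribution scores 3.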
10. The normal curve is bilateral:
50% of the area of the curve lies to the left of the maximum central
ordinate and 50% lies to the right. Hence the curve is bilateral.
11. The normal curve is a mathematical model in
behavioural sciences:
The curve is used as a measurement scale. The measurement unit of this
scale is ± σ (the unit standard deviation).
12. Greater percentage of cases at the middle of
the distribution:
There is a greater percentage of cases at the middle of the
distribution. Between -1σ and +1σ lie 68.26% (34.13 + 34.13), nearly 2/3, of
the cases. To the right of +1σ lie 15.87% (13.59 + 2.14 + .14) of the cases,
and to the left of -1σ lie another 15.87% (13.59 + 2.14 + .14). Beyond +2σ lie
2.28% of the cases, and beyond -2σ also 2.28%.
Thus, the majority of cases lie at the middle of the distribution, and the number of cases on either side gradually decreases in fixed proportions.
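The percentages above can be recomputed from the cumulative distribution of the standard normal curve. The following is a small sketch using Python's `statistics.NormalDist`; the helper `area_between` and the variable names are invented for this example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve

def area_between(lo, hi):
    """Percentage of the total area lying between two z values."""
    return 100.0 * (z.cdf(hi) - z.cdf(lo))

within_1sd = area_between(-1.0, 1.0)
mean_to_1sd = area_between(0.0, 1.0)
from_1sd_to_2sd = area_between(1.0, 2.0)
from_2sd_to_3sd = area_between(2.0, 3.0)
beyond_1sd = 100.0 * (1.0 - z.cdf(1.0))
beyond_2sd = 100.0 * (1.0 - z.cdf(2.0))

for label, value in [
    ("within +/-1 SD", within_1sd),
    ("mean to +1 SD", mean_to_1sd),
    ("+1 SD to +2 SD", from_1sd_to_2sd),
    ("+2 SD to +3 SD", from_2sd_to_3sd),
    ("beyond +1 SD", beyond_1sd),
    ("beyond +2 SD", beyond_2sd),
]:
    print(label, round(value, 2))
```

Computed this way, the area within ±1σ rounds to 68.27%; the commonly quoted 68.26% comes from doubling the rounded table value 34.13.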
13. The scale of X-axis in normal curve is generalised by Z deviates
14. The normal curve is based on elementary principles of probability and the other name of the normal curve is the ‘normal probability curve’.
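Property 13 refers to expressing the X-axis in Z deviates, that is, distances from the mean measured in standard deviation units: z = (X – mean) / SD. A minimal sketch follows; the scores, mean and standard deviation are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

def z_deviate(x, mean, sd):
    """Distance of x from the mean, expressed in standard-deviation units."""
    return (x - mean) / sd

# Hypothetical test scores with mean 50 and SD 10 (illustrative values only).
mean, sd = 50.0, 10.0
scores = [35.0, 50.0, 60.0]
z_values = [z_deviate(x, mean, sd) for x in scores]
print(z_values)  # [-1.5, 0.0, 1.0]

# Proportion of cases below each score, read off the standard normal curve.
below = [NormalDist().cdf(z) for z in z_values]
print([round(p, 4) for p in below])
```

Once scores are converted to z values, a single standard normal curve can be used for any mean and standard deviation.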
Skewness and Kurtosis
“Skewness essentially measures the
symmetry of the distribution, while kurtosis determines the heaviness of the distribution
tails.”
Understanding the shape of data is a crucial
step. It helps to show where most of the information lies and to analyze
the outliers in a given dataset. In this article, we’ll learn about the shape of
data, the importance of skewness and kurtosis, and the types of skewness and
kurtosis, and analyze the shape of the data in a given dataset.
Skewness
If the values of a specific independent
variable (feature) are skewed, then, depending on the model, the skewness may violate
model assumptions or make feature importance harder to interpret.
In statistics, skewness is the degree of
asymmetry observed in a probability distribution, that is, how far it deviates from the
symmetrical normal distribution (bell curve) in a given set of data.
The normal distribution serves as the reference for skewness. In a normal distribution, the data are symmetrically distributed. A symmetrical distribution has zero skewness because all measures of central tendency lie in the middle.
When data are symmetrically distributed, the left-hand side and the right-hand side contain the same number of observations. (If the dataset has 90 values, the left-hand side has 45 observations and the right-hand side has 45 observations.)
But what if the data are not symmetrically distributed? Such data are called asymmetrical, and that is where skewness comes into the picture.
Types of skewness
1. Positively skewed or right-skewed
In symmetrically distributed data, all measures of
central tendency (mean, median, and mode) equal each other. In a positively skewed
distribution these measures spread apart: the tail of the distribution extends
to the right, and the coefficient of skewness is positive rather than negative or
zero.
In positively skewed data, the mean is greater than the median. Most
observations are bunched on the left-hand (lower) side, while a few large values
stretch the tail to the right and pull the mean upward. The median is the middle
value and the mode is the most frequent value (the peak of the curve), so
typically mode < median < mean.
Extreme positive skewness is not desirable in a
distribution, as a high level of skewness can cause misleading results.
Data transformations can bring skewed data closer to a
normal distribution. For positively skewed distributions, the best-known
transformation is the log transformation, which replaces each value in the
dataset with its natural logarithm. It can be applied only to positive values.
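The effect of the log transformation can be sketched on a small synthetic dataset. Everything here is an assumption made for illustration: the data are exponentiated normal quantiles (a log-normal shape), and skewness is measured with the moment-based coefficient m3 / m2^1.5, which the text has not introduced; the helper name `moment_skewness` is likewise hypothetical.

```python
import math
from statistics import NormalDist, fmean

def moment_skewness(values):
    """Third standardized moment: m3 / m2**1.5 (population form)."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5

# Deterministic right-skewed sample: exponentiated normal quantiles.
# The sample size and construction are illustrative only.
n = 1000
std_normal = NormalDist()
raw = [math.exp(std_normal.inv_cdf((i + 0.5) / n)) for i in range(n)]

# Log transformation: natural logarithm of every (positive) value.
logged = [math.log(v) for v in raw]

print(round(moment_skewness(raw), 2))     # clearly positive
print(round(moment_skewness(logged), 2))  # close to zero
```

Real data will rarely become perfectly symmetrical after the transformation; this sample does only because it was built from a log-normal shape.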
2. Negatively skewed or left-skewed
A negatively skewed distribution is the
direct opposite of a positively skewed distribution. In statistics, a negatively
skewed distribution is one in which more values are
plotted on the right side of the graph and the tail of the distribution
spreads out on the left side.
In negatively skewed data, the mean is less than the median. A few small
values stretch the tail to the left and pull the mean downward, and the
coefficient of skewness is negative rather than positive or zero.
The median is the middle value and the mode is the most frequent value, so in this unbalanced distribution the median will be higher than the mean (typically mean < median < mode).
Calculate the skewness coefficient of the sample
Pearson’s first coefficient of skewness
Subtract the mode from the mean, then divide
the difference by the standard deviation:
Sk1 = (Mean – Mode) / Standard deviation
Pearson’s coefficients of skewness should not be confused with Pearson’s correlation coefficient. The correlation coefficient ranges from -1 (perfect negative linear relationship) to +1 (perfect positive linear relationship), with a value of 0 indicating no linear relationship, because dividing the covariance by the product of the two standard deviations scales it into that range. Dividing by the standard deviation likewise makes the skewness coefficient unit-free, but it does not confine it to the range -1 to +1.
Pearson’s first coefficient of skewness is useful
if the data have a strong mode. But if the data have a weak mode or several modes,
Pearson’s first coefficient is not preferred, and Pearson’s second coefficient
may be superior, as it does not rely on the mode.
Pearson’s second coefficient of skewness
Subtract the median from the mean, multiply the difference by 3, and divide the
product by the standard deviation:
Sk2 = 3 × (Mean – Median) / Standard deviation
- If the skewness is between -0.5 & 0.5, the data are nearly symmetrical.
- If the skewness is between -1 & -0.5 (negatively skewed) or between 0.5 & 1
(positively skewed), the data are slightly skewed.
- If the skewness is lower than -1 (negatively skewed) or greater than 1
(positively skewed), the data are extremely skewed.
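Both Pearson coefficients and the rule-of-thumb bands above can be sketched in a few lines. The data are made up, the function names are hypothetical, and the population standard deviation (`statistics.pstdev`) is used as an assumption, since the text does not say which form of the standard deviation is intended.

```python
from statistics import mean, median, mode, pstdev

def pearson_first(values):
    """(mean - mode) / standard deviation."""
    return (mean(values) - mode(values)) / pstdev(values)

def pearson_second(values):
    """3 * (mean - median) / standard deviation."""
    return 3 * (mean(values) - median(values)) / pstdev(values)

def describe(skew):
    """Apply the rule-of-thumb bands listed above."""
    if abs(skew) <= 0.5:
        return "nearly symmetrical"
    if abs(skew) <= 1:
        return "slightly skewed"
    return "extremely skewed"

# Made-up right-skewed data: mode 1, median 3, mean 4.
data = [1, 1, 1, 2, 3, 3, 4, 5, 8, 12]

sk1 = pearson_first(data)
sk2 = pearson_second(data)
print(round(sk1, 3), describe(sk1))
print(round(sk2, 3), describe(sk2))
```

In this sample the two coefficients happen to coincide because the mean exceeds the mode by exactly three times its excess over the median; on other data they can differ and may even fall into different bands.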
Kurtosis
Kurtosis refers to the degree of presence of
outliers in the distribution. It is a statistical measure of whether the
data are heavy-tailed or light-tailed relative to a normal distribution.
Excess Kurtosis
The excess kurtosis is used in
statistics and probability theory to compare the kurtosis coefficient with that
of the normal distribution. Excess kurtosis can be positive (Leptokurtic
distribution), negative (Platykurtic distribution), or near to zero (Mesokurtic
distribution). Since normal distributions have a kurtosis of 3, excess kurtosis
is calculated by subtracting 3 from the kurtosis.
Excess kurtosis = Kurt – 3
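As a sketch of the calculation, kurtosis can be taken as the fourth standardized moment, m4 / m2², which equals 3 for a normal distribution. The population (uncorrected) form is assumed here, and the sample values and function names are invented for illustration.

```python
from statistics import fmean

def kurtosis(values):
    """Fourth standardized moment: m4 / m2**2 (population form)."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m4 = fmean([(v - m) ** 4 for v in values])
    return m4 / m2 ** 2

def excess_kurtosis(values):
    """Kurtosis measured relative to the normal distribution's value of 3."""
    return kurtosis(values) - 3.0

# Small made-up sample, just to exercise the functions.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print(kurtosis(sample))         # 2.78125
print(excess_kurtosis(sample))  # -0.21875
```

Statistical packages often apply small-sample corrections, so their results may differ slightly from this uncorrected form.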
Types of excess kurtosis
- Leptokurtic or heavy-tailed
distribution (kurtosis more than normal distribution).
- Mesokurtic (kurtosis same as the
normal distribution).
- Platykurtic or short-tailed
distribution (kurtosis less than normal distribution).
Leptokurtic (kurtosis > 3)
A leptokurtic distribution has long, heavy
tails, which means there is a greater chance of outliers. Positive values
of excess kurtosis indicate that the distribution is peaked and possesses thick tails. An
extreme positive kurtosis indicates a distribution where more of the values
are located in the tails than would be the case for a normal distribution.
Platykurtic (kurtosis < 3)
A platykurtic distribution has thinner tails and is stretched
around the centre, which means most of the data points lie in close proximity
to the mean, with few extreme values. A platykurtic distribution is flatter (less peaked) when compared
with the normal distribution.
Mesokurtic (kurtosis = 3)
A mesokurtic distribution has the same kurtosis as the normal distribution, which means
its excess kurtosis is near to 0. Mesokurtic distributions are moderate in breadth,
and their curves have a medium peak height.
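The three types can be illustrated with one example of each shape. This sketch builds deterministic samples from the quantiles of a uniform, a normal and a Laplace (double-exponential) distribution; the choice of these distributions, the sample size, the tolerance used for "near zero" and all function names are assumptions for illustration.

```python
import math
from statistics import NormalDist, fmean

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (population form)."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m4 = fmean([(v - m) ** 4 for v in values])
    return m4 / m2 ** 2 - 3.0

def classify(excess, tolerance=0.1):
    """Label a shape by its excess kurtosis; the tolerance is an arbitrary choice."""
    if excess > tolerance:
        return "leptokurtic"
    if excess < -tolerance:
        return "platykurtic"
    return "mesokurtic"

def laplace_quantile(p):
    """Inverse CDF of the standard Laplace (double-exponential) distribution."""
    return math.log(2.0 * p) if p < 0.5 else -math.log(2.0 * (1.0 - p))

n = 20000
probs = [(i + 0.5) / n for i in range(n)]  # evenly spaced probabilities
std_normal = NormalDist()
shapes = {
    "uniform": probs,                                  # flat, light tails
    "normal": [std_normal.inv_cdf(p) for p in probs],  # reference shape
    "laplace": [laplace_quantile(p) for p in probs],   # sharp peak, heavy tails
}

labels = {}
for name, values in shapes.items():
    ex = excess_kurtosis(values)
    labels[name] = classify(ex)
    print(name, round(ex, 2), labels[name])
```

The uniform sample comes out platykurtic, the normal sample mesokurtic and the Laplace sample leptokurtic, matching the three headings above.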