1 added
2 removed
Original
2026-01-01
Modified
2026-02-28
1
-
<p>1639 Learners</p>
1
+
<p>1675 Learners</p>
2
<p>Last updated on<strong>November 22, 2025</strong></p>
2
<p>Last updated on<strong>November 22, 2025</strong></p>
3
<p>Outliers are extreme values and an essential part of a dataset. Outliers provide valuable insights into data and can significantly impact the results of analysis. Let us now learn more about an outlier.</p>
3
<p>Outliers are extreme values and an essential part of a dataset. Outliers provide valuable insights into data and can significantly impact the results of analysis. Let us now learn more about an outlier.</p>
4
<h2>What is an Outlier?</h2>
4
<h2>What is an Outlier?</h2>
5
<p>Outliers are <a>data</a> points that stand out because they are much higher or lower than the rest <a>of</a> the data. Statistical measures such as the <a>mean</a>, <a>standard deviation</a>, and <a>regression</a> models are sensitive to extreme values, so outliers can disproportionately affect them, skewing results and leading to misguided conclusions.</p>
5
<p>Outliers are <a>data</a> points that stand out because they are much higher or lower than the rest <a>of</a> the data. Statistical measures such as the <a>mean</a>, <a>standard deviation</a>, and <a>regression</a> models are sensitive to extreme values, so outliers can disproportionately affect them, skewing results and leading to misguided conclusions.</p>
6
<p>For example, a teacher records the marks of 10 students in a <a>math</a> test: \(45, 48, 50, 47, 49, 46, 50, 48, 47, 95\). All the scores are between 45 and 50 except 95, which is much higher than the rest. So, 95 is an outlier because it is unusually high compared to the other data points.</p>
6
<p>For example, a teacher records the marks of 10 students in a <a>math</a> test: \(45, 48, 50, 47, 49, 46, 50, 48, 47, 95\). All the scores are between 45 and 50 except 95, which is much higher than the rest. So, 95 is an outlier because it is unusually high compared to the other data points.</p>
7
<h2>How to Identify an Outlier in a Dataset</h2>
7
<h2>How to Identify an Outlier in a Dataset</h2>
8
<p>Finding outliers is an integral part of data analysis because unusual values can heavily influence the results. Outliers can be identified using two main approaches: visualization techniques and statistical methods.</p>
8
<p>Finding outliers is an integral part of data analysis because unusual values can heavily influence the results. Outliers can be identified using two main approaches: visualization techniques and statistical methods.</p>
9
<p><strong>Using Visualization Methods</strong>Visualization involves presenting data in graphical form, making it easier to spot patterns and detect unusual values. Two commonly used visual tools are:</p>
9
<p><strong>Using Visualization Methods</strong>Visualization involves presenting data in graphical form, making it easier to spot patterns and detect unusual values. Two commonly used visual tools are:</p>
10
<p><strong>1. Box Plot</strong>A<a>box plot</a>displays the minimum, first quartile (Q₁),<a>median</a>, third quartile (Q₃), and maximum. Any point that lies outside the “whiskers” (beyond\(Q_1 - 1.5 \times \mathrm{IQR} \quad \text{or} \quad Q_3 + 1.5 \times \mathrm{IQR} \)) is considered an outlier. Example: Data: 12, 14, 15, 16, 17, 18, 40 In the box plot, 40 appears far outside the upper whisker, making it an outlier.</p>
10
<p><strong>1. Box Plot</strong>A<a>box plot</a>displays the minimum, first quartile (Q₁),<a>median</a>, third quartile (Q₃), and maximum. Any point that lies outside the “whiskers” (beyond\(Q_1 - 1.5 \times \mathrm{IQR} \quad \text{or} \quad Q_3 + 1.5 \times \mathrm{IQR} \)) is considered an outlier. Example: Data: 12, 14, 15, 16, 17, 18, 40 In the box plot, 40 appears far outside the upper whisker, making it an outlier.</p>
11
<p><strong>2. Scatter Plot</strong> A <a>scatter plot</a> displays individual data points on a graph. Outliers appear as points that lie far away from the general cluster. Example: If you plot students’ study hours against exam scores and most points form a cluster, a point such as 10 hours of study with only 5 marks lies far away from that cluster, so it is an outlier.</p>
11
<p><strong>2. Scatter Plot</strong> A <a>scatter plot</a> displays individual data points on a graph. Outliers appear as points that lie far away from the general cluster. Example: If you plot students’ study hours against exam scores and most points form a cluster, a point such as 10 hours of study with only 5 marks lies far away from that cluster, so it is an outlier.</p>
12
<h2>Identifying Outlier Using Box Plot</h2>
12
<h2>Identifying Outlier Using Box Plot</h2>
13
<p>A box plot is a statistical chart that provides a visual summary of how data is distributed. Outliers in a box plot can be identified using these steps.</p>
13
<p>A box plot is a statistical chart that provides a visual summary of how data is distributed. Outliers in a box plot can be identified using these steps.</p>
14
<p><strong>Step 1: </strong>Sort the data in<a>ascending order</a>and determine the median.</p>
14
<p><strong>Step 1: </strong>Sort the data in<a>ascending order</a>and determine the median.</p>
15
<p><strong>Step 2: </strong>Calculate the Interquartile Range (IQR), which represents the central 50% of the dataset.</p>
15
<p><strong>Step 2: </strong>Calculate the Interquartile Range (IQR), which represents the central 50% of the dataset.</p>
16
<p><strong>Step 3: </strong>Determine the lower and upper bounds (also called fences) using the IQR.</p>
16
<p><strong>Step 3: </strong>Determine the lower and upper bounds (also called fences) using the IQR.</p>
17
<p><strong>Step 4: </strong>Any data point that lies below the lower bound or above the upper bound is considered an outlier.</p>
17
<p><strong>Step 4: </strong>Any data point that lies below the lower bound or above the upper bound is considered an outlier.</p>
18
<h3>Explore Our Programs</h3>
18
<h3>Explore Our Programs</h3>
19
-
<p>No Courses Available</p>
20
<h2>Identifying Outlier Using Scatter Plot</h2>
19
<h2>Identifying Outlier Using Scatter Plot</h2>
21
<p>A scatter plot is used to visualize the relationship between two continuous<a>variables</a>, with each point represented as a dot. In this plot, any points far from the central cluster of data are considered outliers.</p>
20
<p>A scatter plot is used to visualize the relationship between two continuous<a>variables</a>, with each point represented as a dot. In this plot, any points far from the central cluster of data are considered outliers.</p>
22
<p><strong>Using Statistical Methods</strong>To detect the outliers numerically, statistical techniques are used. Some commonly used methods include Z-score, DBSCAN, and the Isolation Forest algorithm.</p>
21
<p><strong>Using Statistical Methods</strong>To detect the outliers numerically, statistical techniques are used. Some commonly used methods include Z-score, DBSCAN, and the Isolation Forest algorithm.</p>
23
<p><strong>Identifying Outliers Using the Z-Score</strong>The Z-score method measures how many standard deviations a data point is from the mean. It is calculated using the<a>formula</a>:</p>
22
<p><strong>Identifying Outliers Using the Z-Score</strong>The Z-score method measures how many standard deviations a data point is from the mean. It is calculated using the<a>formula</a>:</p>
24
<p>\(Z = \frac{X - \mu}{\sigma}\)</p>
23
<p>\(Z = \frac{X - \mu}{\sigma}\)</p>
25
<p>Where:</p>
24
<p>Where:</p>
26
<p>X = the data point</p>
25
<p>X = the data point</p>
27
<p>μ = mean of the dataset</p>
26
<p>μ = mean of the dataset</p>
28
<p>σ = standard deviation of the dataset</p>
27
<p>σ = standard deviation of the dataset</p>
29
<p>A data point is considered an outlier if its Z-score is<a>greater than</a>+3 or<a>less than</a>-3.</p>
28
<p>A data point is considered an outlier if its Z-score is<a>greater than</a>+3 or<a>less than</a>-3.</p>
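<p>The Z-score rule can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name <code>z_score_outliers</code>, the choice of the population standard deviation, and the 20 sample scores are assumptions made for this example.</p>

```python
from statistics import mean, pstdev


def z_score_outliers(values, threshold=3.0):
    """Return the values whose Z-score is above +threshold or below -threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [x for x in values if abs((x - mu) / sigma) > threshold]


# Hypothetical marks for 20 students; 95 is far from the rest.
scores = [45, 48, 50, 47, 49, 46, 50, 48, 47, 49,
          51, 46, 48, 50, 47, 49, 48, 52, 46, 95]
print(z_score_outliers(scores))
```

<p>The sample has 20 values on purpose. With ten or fewer values and the population standard deviation, no Z-score can exceed 3, so the ±3 rule would not flag anything in a very small dataset.</p>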
30
<h2>Identifying Outlier Using Isolation Forest Algorithm</h2>
29
<h2>Identifying Outlier Using Isolation Forest Algorithm</h2>
31
<p>The Isolation Forest algorithm is an anomaly detection technique that uses decision trees to separate data points. It works by randomly partitioning the dataset. Points that are isolated in fewer steps are considered outliers because unusual values stand out and are easier to separate from the rest.</p>
30
<p>The Isolation Forest algorithm is an anomaly detection technique that uses decision trees to separate data points. It works by randomly partitioning the dataset. Points that are isolated in fewer steps are considered outliers because unusual values stand out and are easier to separate from the rest.</p>
32
<p>For example, consider this dataset representing daily sales (in units): 50, 52, 55, 53, 58, 60, 54, 300</p>
31
<p>For example, consider this dataset representing daily sales (in units): 50, 52, 55, 53, 58, 60, 54, 300</p>
33
<p>Most values are close to each other, between 50 and 60, but 300 is far away from them.</p>
32
<p>Most values are close to each other, between 50 and 60, but 300 is far away from them.</p>
34
<p>When the Isolation Forest algorithm creates random partitions:</p>
33
<p>When the Isolation Forest algorithm creates random partitions:</p>
35
<ul><li>The values around 50-60 need many splits to isolate because they are very similar.</li>
34
<ul><li>The values around 50-60 need many splits to isolate because they are very similar.</li>
36
<li>The value 300 gets isolated very quickly because it stands out from the rest. </li>
35
<li>The value 300 gets isolated very quickly because it stands out from the rest. </li>
37
</ul><p>Since 300 requires fewer splits, the algorithm marks it as an outlier. </p>
36
</ul><p>Since 300 requires fewer splits, the algorithm marks it as an outlier. </p>
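<p>The splitting idea can be imitated with a small one-dimensional simulation. This is only a sketch of the principle under simplifying assumptions, not the full Isolation Forest algorithm: the helper names <code>isolation_depth</code> and <code>average_depth</code>, the number of trials, and the fixed seed are all hypothetical choices for this example.</p>

```python
import random


def isolation_depth(values, target, rng):
    """Count the random splits needed until target is alone in its partition."""
    subset = list(values)
    depth = 0
    while len(subset) > 1:
        low, high = min(subset), max(subset)
        if low == high:
            break  # only duplicates remain; they cannot be separated
        split = rng.uniform(low, high)
        # Keep only the side of the split that contains the target.
        if target < split:
            subset = [v for v in subset if v < split]
        else:
            subset = [v for v in subset if v >= split]
        depth += 1
    return depth


def average_depth(values, target, trials=500, seed=0):
    """Average the isolation depth of target over many random runs."""
    rng = random.Random(seed)
    return sum(isolation_depth(values, target, rng) for _ in range(trials)) / trials


sales = [50, 52, 55, 53, 58, 60, 54, 300]
print("300:", average_depth(sales, 300))
print("54:", average_depth(sales, 54))
```

<p>On average, 300 is isolated after far fewer splits than 54, which is the signal the algorithm uses to mark a point as an outlier.</p>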
38
<h2>How to Calculate Outliers</h2>
37
<h2>How to Calculate Outliers</h2>
39
<p>Outliers can be identified using different methods depending on the complexity of the data, the time available, and the level of<a>accuracy</a>needed. Here are four commonly used methods, along with the steps:</p>
38
<p>Outliers can be identified using different methods depending on the complexity of the data, the time available, and the level of<a>accuracy</a>needed. Here are four commonly used methods, along with the steps:</p>
40
<ul><li><strong>Sorting Method</strong><strong>Step 1:</strong>Arrange the data in<a>ascending</a>order.<p><strong>Step 2:</strong>Look for values that appear unusually high or low compared to the rest.</p>
39
<ul><li><strong>Sorting Method</strong><strong>Step 1:</strong>Arrange the data in<a>ascending</a>order.<p><strong>Step 2:</strong>Look for values that appear unusually high or low compared to the rest.</p>
41
<p><strong>Step 3:</strong>Verify by checking whether these extreme points are significantly different from the dataset's general pattern.</p>
40
<p><strong>Step 3:</strong>Verify by checking whether these extreme points are significantly different from the dataset's general pattern.</p>
42
<p><strong>Step 4:</strong> Mark these extreme values as outliers. For example, in the data 10, 12, 13, 14, 15, 80, the value 80 is clearly separated from the rest and is the outlier.</p>
41
<p><strong>Step 4:</strong> Mark these extreme values as outliers. For example, in the data 10, 12, 13, 14, 15, 80, the value 80 is clearly separated from the rest and is the outlier.</p>
43
</li>
42
</li>
44
<li><strong>Using Visualization</strong><strong>Step 1:</strong>Plot the data using a scatter plot or box plot.<p><strong>Step 2:</strong>Observe the data distribution.</p>
43
<li><strong>Using Visualization</strong><strong>Step 1:</strong>Plot the data using a scatter plot or box plot.<p><strong>Step 2:</strong>Observe the data distribution.</p>
45
<p><strong>Step 3:</strong>Identify points that are distant from the central cluster (scatter plot) or outside the whiskers (box plot).</p>
44
<p><strong>Step 3:</strong>Identify points that are distant from the central cluster (scatter plot) or outside the whiskers (box plot).</p>
46
<p><strong>Step 4:</strong> Mark these distant points as outliers. For example, in a scatter plot, if one dot lies far away from the cluster, it is an outlier.</p>
45
<p><strong>Step 4:</strong> Mark these distant points as outliers. For example, in a scatter plot, if one dot lies far away from the cluster, it is an outlier.</p>
47
</li>
46
</li>
48
<li><strong>Statistical Outlier Detection (Z-Score Method)</strong><strong>Step 1:</strong>Calculate the mean (μ) and standard deviation (σ) of the dataset.<p><strong>Step 2:</strong>Use the formula,</p>
47
<li><strong>Statistical Outlier Detection (Z-Score Method)</strong><strong>Step 1:</strong>Calculate the mean (μ) and standard deviation (σ) of the dataset.<p><strong>Step 2:</strong>Use the formula,</p>
49
<p>\(Z = \frac{X - \mu}{\sigma}\)</p>
48
<p>\(Z = \frac{X - \mu}{\sigma}\)</p>
50
<p><strong>Step 3:</strong>Compute the Z-score for each data point.</p>
49
<p><strong>Step 3:</strong>Compute the Z-score for each data point.</p>
51
<p><strong>Step 4:</strong>If a Z-score is greater than +3 or less than -3, that data point is an outlier.</p>
50
<p><strong>Step 4:</strong>If a Z-score is greater than +3 or less than -3, that data point is an outlier.</p>
52
<p>For example, if a score has Z = 3.2, it is an outlier because 3.2 is greater than 3.</p>
51
<p>For example, if a score has Z = 3.2, it is an outlier because 3.2 is greater than 3.</p>
53
</li>
52
</li>
54
<li><strong>Interquartile Range (IQR) Method</strong><strong>Step 1:</strong>Arrange the data in ascending order.<p><strong>Step 2:</strong>Find Q₁ (first quartile) and Q₃ (third quartile).</p>
53
<li><strong>Interquartile Range (IQR) Method</strong><strong>Step 1:</strong>Arrange the data in ascending order.<p><strong>Step 2:</strong>Find Q₁ (first quartile) and Q₃ (third quartile).</p>
55
<p><strong>Step 3:</strong> Calculate the \( IQR = Q₃ - Q₁\).</p>
54
<p><strong>Step 3:</strong> Calculate the \( IQR = Q₃ - Q₁\).</p>
56
<p><strong>Step 4:</strong>Compute the outlier fences:</p>
55
<p><strong>Step 4:</strong>Compute the outlier fences:</p>
57
<p>Lower Fence = \(Q₁ - 1.5 × IQR\) Upper Fence =\( Q₃ + 1.5 × IQR\)</p>
56
<p>Lower Fence = \(Q₁ - 1.5 × IQR\) Upper Fence =\( Q₃ + 1.5 × IQR\)</p>
58
</li>
57
</li>
59
<li>Any data point below the lower fence or above the upper fence is classified as an outlier. For example, if the upper fence is 40 and one value is 55, then 55 is an outlier.</li>
58
<li>Any data point below the lower fence or above the upper fence is classified as an outlier. For example, if the upper fence is 40 and one value is 55, then 55 is an outlier.</li>
60
</ul><h2>Calculating Outlier by Sorting Method</h2>
59
</ul><h2>Calculating Outlier by Sorting Method</h2>
61
<p>In this method, the data is <a>sorted</a> in ascending order and then scanned visually for extreme values.</p>
60
<p>In this method, the data is <a>sorted</a> in ascending order and then scanned visually for extreme values.</p>
62
<p><strong>Step 1:</strong> Arrange the data in ascending order, that is, from smallest to largest.</p>
61
<p><strong>Step 1:</strong> Arrange the data in ascending order, that is, from smallest to largest.</p>
63
<p><strong>Step 2:</strong> Any value that is much higher or much lower than the other values is considered an outlier.</p>
62
<p><strong>Step 2:</strong> Any value that is much higher or much lower than the other values is considered an outlier.</p>
64
<p><strong>Calculating Outlier by Statistical Outlier Detection (Z-score Method)</strong></p>
63
<p><strong>Calculating Outlier by Statistical Outlier Detection (Z-score Method)</strong></p>
65
<p>The z-score is calculated using the formula \( z = \frac{X - \mu}{\sigma} \),</p>
64
<p>The z-score is calculated using the formula \( z = \frac{X - \mu}{\sigma} \),</p>
66
<p>Here, X is the data point</p>
65
<p>Here, X is the data point</p>
67
<p>μ is the mean of the data<a>set</a></p>
66
<p>μ is the mean of the data<a>set</a></p>
68
<p>σ (sigma) is the standard deviation.</p>
67
<p>σ (sigma) is the standard deviation.</p>
69
<p>If the z-score is greater than +3 or less than -3, then the value is an outlier. That is, an outlier lies more than 3 standard deviations away from the mean.</p>
68
<p>If the z-score is greater than +3 or less than -3, then the value is an outlier. That is, an outlier lies more than 3 standard deviations away from the mean.</p>
70
<h2>Calculating Outlier by Interquartile Range (IQR) Method</h2>
69
<h2>Calculating Outlier by Interquartile Range (IQR) Method</h2>
71
<p>The interquartile range (IQR) is the difference between the third quartile (Q3) and the first quartile (Q1); it covers the middle 50% of the dataset. In this method, we find the outliers by following these steps:</p>
70
<p>The interquartile range (IQR) is the difference between the third quartile (Q3) and the first quartile (Q1); it covers the middle 50% of the dataset. In this method, we find the outliers by following these steps:</p>
72
<p><strong>Step 1:</strong> Arrange the data in ascending order, that is, from low to high.</p>
71
<p><strong>Step 1:</strong> Arrange the data in ascending order, that is, from low to high.</p>
73
<p><strong>Step 2:</strong> Find Q1 and Q3. Q1 is the middle value of the lower half and Q3 is the middle value of the upper half.</p>
72
<p><strong>Step 2:</strong> Find Q1 and Q3. Q1 is the middle value of the lower half and Q3 is the middle value of the upper half.</p>
74
<p><strong>Step 3:</strong> Calculate the IQR: \(IQR = Q3 - Q1\).</p>
73
<p><strong>Step 3:</strong> Calculate the IQR: \(IQR = Q3 - Q1\).</p>
75
<p><strong>Step 4:</strong> Find the lower and upper bounds: lower bound = \(Q1 - 1.5 × IQR\) and upper bound = \(Q3 + 1.5 × IQR\). Any value below the lower bound or above the upper bound is an outlier.</p>
74
<p><strong>Step 4:</strong> Find the lower and upper bounds: lower bound = \(Q1 - 1.5 × IQR\) and upper bound = \(Q3 + 1.5 × IQR\). Any value below the lower bound or above the upper bound is an outlier.</p>
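<p>The four steps can be sketched in Python as follows. This is a minimal example under stated assumptions: the function name <code>iqr_outliers</code> is hypothetical, and the quartiles are taken as the medians of the lower and upper halves (leaving out the overall median when the count is odd), which is the convention used in the worked problems on this page. Other software may use a different quartile convention and give slightly different fences.</p>

```python
from statistics import median


def iqr_outliers(values):
    """Return ((lower_bound, upper_bound), outliers) using the 1.5 x IQR rule."""
    data = sorted(values)                      # Step 1: ascending order
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values to form two halves")
    half = n // 2
    lower_half = data[:half]
    # For an odd count the overall median is left out of both halves.
    upper_half = data[half + 1:] if n % 2 else data[half:]
    q1 = median(lower_half)                    # Step 2: quartiles
    q3 = median(upper_half)
    iqr = q3 - q1                              # Step 3: interquartile range
    lower_bound = q1 - 1.5 * iqr               # Step 4: bounds (fences)
    upper_bound = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower_bound or v > upper_bound]
    return (lower_bound, upper_bound), outliers


bounds, outliers = iqr_outliers([12, 14, 15, 16, 17, 18, 40])
print(bounds, outliers)
```

<p>For the data 12, 14, 15, 16, 17, 18, 40, this gives Q1 = 14 and Q3 = 18, so the IQR is 4, the bounds are 8 and 24, and 40 is the only outlier.</p>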
76
<h2>Tips and Tricks to Master Outliers</h2>
75
<h2>Tips and Tricks to Master Outliers</h2>
77
<p>Here are some tips and tricks to help you master outliers.</p>
76
<p>Here are some tips and tricks to help you master outliers.</p>
78
<ul><li>Outliers can be used to spot unusual credit card transactions by<a>comparing</a>them against a customer’s typical spending behavior.</li>
77
<ul><li>Outliers can be used to spot unusual credit card transactions by<a>comparing</a>them against a customer’s typical spending behavior.</li>
79
<li>In healthcare, outliers help identify abnormal vital signs in a patient’s records, which may indicate conditions that require immediate attention.</li>
78
<li>In healthcare, outliers help identify abnormal vital signs in a patient’s records, which may indicate conditions that require immediate attention.</li>
80
<li>Outliers are useful in detecting unexpected spikes in website traffic from certain locations or unusual user behavior, which could signal a bot attack or viral activity.</li>
79
<li>Outliers are useful in detecting unexpected spikes in website traffic from certain locations or unusual user behavior, which could signal a bot attack or viral activity.</li>
81
<li>In cybersecurity, outliers help flag suspicious login attempts, irregular access times, or abnormal data transfers that may point to hacking or malware.</li>
80
<li>In cybersecurity, outliers help flag suspicious login attempts, irregular access times, or abnormal data transfers that may point to hacking or malware.</li>
82
<li>In manufacturing, outliers help identify defective products or irregular machine performance, ensuring consistent production quality.</li>
81
<li>In manufacturing, outliers help identify defective products or irregular machine performance, ensuring consistent production quality.</li>
83
<li>Parents can encourage children to notice patterns and identify values that “don’t belong,” helping them build intuitive understanding.</li>
82
<li>Parents can encourage children to notice patterns and identify values that “don’t belong,” helping them build intuitive understanding.</li>
84
<li>Teachers can use box plots and scatter plots during lessons to visually demonstrate how outliers appear in a dataset.</li>
83
<li>Teachers can use box plots and scatter plots during lessons to visually demonstrate how outliers appear in a dataset.</li>
85
</ul><h2>Common Mistakes and How to Avoid Them in Outlier</h2>
84
</ul><h2>Common Mistakes and How to Avoid Them in Outlier</h2>
86
<p>Now let’s look at a few common mistakes that students tend to repeat when working with outliers. By learning about these, students can master outliers.</p>
85
<p>Now let’s look at a few common mistakes that students tend to repeat when working with outliers. By learning about these, students can master outliers.</p>
87
<h2>Real-Life Applications of Outlier</h2>
86
<h2>Real-Life Applications of Outlier</h2>
88
<p>Outlier detection is used in different fields such as finance, environmental monitoring, cybersecurity, and so on. Let’s learn a few real-life applications of outliers.</p>
87
<p>Outlier detection is used in different fields such as finance, environmental monitoring, cybersecurity, and so on. Let’s learn a few real-life applications of outliers.</p>
89
<ul><li>In finance, outlier detection is used to spot fraud: analyzing a customer's spending patterns makes unusual credit card transactions stand out.</li>
88
<ul><li>In finance, outlier detection is used to spot fraud: analyzing a customer's spending patterns makes unusual credit card transactions stand out.</li>
90
<li>In health monitoring, outlier detection is used to find abnormal vital signs in a patient's medical records. This is helpful because such values could indicate health issues that need immediate attention.</li>
89
<li>In health monitoring, outlier detection is used to find abnormal vital signs in a patient's medical records. This is helpful because such values could indicate health issues that need immediate attention.</li>
91
<li>In web analytics, outlier detection is used to identify unusual spikes in website traffic from a particular location or unusual user behavior.</li>
90
<li>In web analytics, outlier detection is used to identify unusual spikes in website traffic from a particular location or unusual user behavior.</li>
92
</ul><ul><li>Cybersecurity Threat Detection: Outliers help in detecting unusual login attempts, irregular access times, or abnormal data transfers in networks, which could indicate hacking or malware attacks.</li>
91
</ul><ul><li>Cybersecurity Threat Detection: Outliers help in detecting unusual login attempts, irregular access times, or abnormal data transfers in networks, which could indicate hacking or malware attacks.</li>
93
</ul><ul><li>Manufacturing and Quality Control: In production, outliers can highlight defective items or abnormal machine behavior, helping industries quickly address errors and maintain<a>product</a>quality.</li>
92
</ul><ul><li>Manufacturing and Quality Control: In production, outliers can highlight defective items or abnormal machine behavior, helping industries quickly address errors and maintain<a>product</a>quality.</li>
94
</ul><h3>Problem 1</h3>
93
</ul><h3>Problem 1</h3>
95
<p>A teacher records the ages of students in a class: 12, 13, 14, 15, 12, 13, 14, 12, 13, 27. Find the outlier in the dataset.</p>
94
<p>A teacher records the ages of students in a class: 12, 13, 14, 15, 12, 13, 14, 12, 13, 27. Find the outlier in the dataset.</p>
97
<p>The outlier is 27. </p>
96
<p>The outlier is 27. </p>
98
<h3>Explanation</h3>
97
<h3>Explanation</h3>
99
<p>Arranging the data: 12, 12, 12, 13, 13, 13, 14, 14, 15, 27</p>
98
<p>Arranging the data: 12, 12, 12, 13, 13, 13, 14, 14, 15, 27</p>
100
<p>The data set has 10 numbers </p>
99
<p>The data set has 10 numbers </p>
101
<p>Here, Q1 is 12</p>
100
<p>Here, Q1 is 12</p>
102
<p>Q3 is 14</p>
101
<p>Q3 is 14</p>
103
<p>So,\( IQR = Q3 - Q1 = 14 - 12 = 2\)</p>
102
<p>So,\( IQR = Q3 - Q1 = 14 - 12 = 2\)</p>
104
<p>Lower bound =\( Q1 - 1.5 × IQR = 12 - 1.5 × 2 = 9\)</p>
103
<p>Lower bound =\( Q1 - 1.5 × IQR = 12 - 1.5 × 2 = 9\)</p>
105
<p>Upper bound = \(Q3 + 1.5 × IQR = 14 + 1.5 × 2 = 17\)</p>
104
<p>Upper bound = \(Q3 + 1.5 × IQR = 14 + 1.5 × 2 = 17\)</p>
106
<p>Any value below 9 or above 17 is the outlier. Here the outlier is 27.</p>
105
<p>Any value below 9 or above 17 is the outlier. Here the outlier is 27.</p>
108
<h3>Problem 2</h3>
107
<h3>Problem 2</h3>
109
<p>A runner records his daily running distance (in miles) over 7 days: 3, 4, 3.5, 3.8, 4.2, 3.9, 10. Identify the outlier.</p>
108
<p>A runner records his daily running distance (in miles) over 7 days: 3, 4, 3.5, 3.8, 4.2, 3.9, 10. Identify the outlier.</p>
111
<p>The outlier is 10. </p>
110
<p>The outlier is 10. </p>
112
<h3>Explanation</h3>
111
<h3>Explanation</h3>
113
<p>Sorting the data: 3, 3.5, 3.8, 3.9, 4, 4.2, 10</p>
112
<p>Sorting the data: 3, 3.5, 3.8, 3.9, 4, 4.2, 10</p>
114
<p>Here the median is 4th value: 3.9</p>
113
<p>Here the median is 4th value: 3.9</p>
115
<p>The lower half is 3, 3.5, 3.8. So, \(Q1 = 3.5\)</p>
114
<p>The lower half is 3, 3.5, 3.8. So, \(Q1 = 3.5\)</p>
116
<p>The upper half is 4, 4.2, 10. So, \(Q3 = 4.2\)</p>
115
<p>The upper half is 4, 4.2, 10. So, \(Q3 = 4.2\)</p>
117
<p> So, \(IQR = Q3 - Q1 = 4.2 - 3.5 = 0.7\)</p>
116
<p> So, \(IQR = Q3 - Q1 = 4.2 - 3.5 = 0.7\)</p>
118
<p>Finding the lower bound,</p>
117
<p>Finding the lower bound,</p>
119
<p>Lower bound = \(Q1 - 1.5 × IQR \)</p>
118
<p>Lower bound = \(Q1 - 1.5 × IQR \)</p>
120
<p>= \(3.5 - 1.5 × 0.7 = 2.45\)</p>
119
<p>= \(3.5 - 1.5 × 0.7 = 2.45\)</p>
121
<p>Finding the upper bound,</p>
120
<p>Finding the upper bound,</p>
122
<p>Upper bound = \(Q3 + 1.5 × IQR\)</p>
121
<p>Upper bound = \(Q3 + 1.5 × IQR\)</p>
123
<p>= \(4.2 + 1.5 × 0.7 = 5.25 \)</p>
122
<p>= \(4.2 + 1.5 × 0.7 = 5.25 \)</p>
124
<p>Any number below 2.45 or above 5.25 is an outlier.</p>
123
<p>Any number below 2.45 or above 5.25 is an outlier.</p>
125
<p>Here the outlier is 10.</p>
124
<p>Here the outlier is 10.</p>
127
<h3>Problem 3</h3>
126
<h3>Problem 3</h3>
128
<p>A bakery records daily cupcake sales: 25, 30, 28, 35, 27, 500, 32. Find the outlier.</p>
127
<p>A bakery records daily cupcake sales: 25, 30, 28, 35, 27, 500, 32. Find the outlier.</p>
130
<p>The outlier is 500. </p>
129
<p>The outlier is 500. </p>
131
<h3>Explanation</h3>
130
<h3>Explanation</h3>
132
<p>Sorting the data: 25, 27, 28, 30, 32, 35, 500</p>
131
<p>Sorting the data: 25, 27, 28, 30, 32, 35, 500</p>
133
<p>The 4th value is the median, so the median is 30</p>
132
<p>The 4th value is the median, so the median is 30</p>
134
<p>The lower half is 25, 27, 28. So, Q1 is 27</p>
133
<p>The lower half is 25, 27, 28. So, Q1 is 27</p>
135
<p>The upper half is 32, 35, 500. So, Q3 is 35</p>
134
<p>The upper half is 32, 35, 500. So, Q3 is 35</p>
136
<p>\(IQR = Q3 - Q1 \)</p>
135
<p>\(IQR = Q3 - Q1 \)</p>
137
<p>So, \(IQR = 35 - 27 = 8\)</p>
136
<p>So, \(IQR = 35 - 27 = 8\)</p>
138
<p>Now let’s find the lower bound,</p>
137
<p>Now let’s find the lower bound,</p>
139
<p>Lower bound = \(Q1 -1.5 × IQR = 27 - 1.5 × 8 = 15\)</p>
138
<p>Lower bound = \(Q1 -1.5 × IQR = 27 - 1.5 × 8 = 15\)</p>
140
<p>Upper bound = \(Q3 +1.5 × IQR = 35 + 1.5 × 8 = 47\)</p>
139
<p>Upper bound = \(Q3 +1.5 × IQR = 35 + 1.5 × 8 = 47\)</p>
141
<p>Any value below 15 or above 47 is an outlier, so the outlier here is 500.</p>
140
<p>Any value below 15 or above 47 is an outlier, so the outlier here is 500.</p>
143
<h3>Problem 4</h3>
142
<h3>Problem 4</h3>
144
<p>A group of friends records their heights in inches: 60, 61, 62, 63, 64, 65, 90. Identify the outlier.</p>
143
<p>A group of friends records their heights in inches: 60, 61, 62, 63, 64, 65, 90. Identify the outlier.</p>
146
<p>The outlier here is 90.</p>
145
<p>The outlier here is 90.</p>
147
<h3>Explanation</h3>
146
<h3>Explanation</h3>
148
<p>Sorting the data in ascending order: 60, 61, 62, 63, 64, 65, 90</p>
147
<p>Sorting the data in ascending order: 60, 61, 62, 63, 64, 65, 90</p>
149
<p>Here the median is the 4th value, which is 63</p>
148
<p>Here the median is the 4th value, which is 63</p>
150
<p>Therefore, the lower half is 60, 61, 62. So, Q1 is 61</p>
149
<p>Therefore, the lower half is 60, 61, 62. So, Q1 is 61</p>
151
<p>The upper half is 64, 65, 90. So, Q3 is 65</p>
150
<p>The upper half is 64, 65, 90. So, Q3 is 65</p>
152
<p>\(IQR = Q3 - Q1 = 65 - 61 = 4\)</p>
151
<p>\(IQR = Q3 - Q1 = 65 - 61 = 4\)</p>
153
<p>Lower bound = \(Q1 - 1.5 × IQR\) </p>
152
<p>Lower bound = \(Q1 - 1.5 × IQR\) </p>
154
<p>\(= 61 - 1.5 × 4 = 61 - 6 = 55\)</p>
153
<p>\(= 61 - 1.5 × 4 = 61 - 6 = 55\)</p>
155
<p>Upper bound = \(Q3 + 1.5 × IQR\)</p>
154
<p>Upper bound = \(Q3 + 1.5 × IQR\)</p>
156
<p>= \(65 + 1.5 × 4 = 65 + 6 = 71\) </p>
155
<p>= \(65 + 1.5 × 4 = 65 + 6 = 71\) </p>
157
<p>Any value below 55 or above 71 is an outlier. As \(90 > 71\), it is the outlier.</p>
156
<p>Any value below 55 or above 71 is an outlier. As \(90 > 71\), it is the outlier.</p>
159
<h3>Problem 5</h3>
158
<h3>Problem 5</h3>
160
<p>A company records the number of employees working overtime each week: 5, 7, 6, 8, 6, 50, 7. Identify the outlier.</p>
159
<p>A company records the number of employees working overtime each week: 5, 7, 6, 8, 6, 50, 7. Identify the outlier.</p>
162
<p>The outlier here is 50. </p>
161
<p>The outlier here is 50. </p>
163
<h3>Explanation</h3>
162
<h3>Explanation</h3>
164
<p>Sorting the data in ascending order: 5, 6, 6, 7, 7, 8, 50</p>
163
<p>Sorting the data in ascending order: 5, 6, 6, 7, 7, 8, 50</p>
165
<p>Here the median is the 4th value, which is 7</p>
164
<p>Here the median is the 4th value, which is 7</p>
166
<p>Therefore, the lower half is 5, 6, 6. So Q1 is 6</p>
165
<p>Therefore, the lower half is 5, 6, 6. So Q1 is 6</p>
167
<p>The upper half is 7, 8, 50. So, Q3 is 8</p>
166
<p>The upper half is 7, 8, 50. So, Q3 is 8</p>
168
<p>\(IQR = Q3 - Q1 = 8 - 6 = 2\)</p>
167
<p>\(IQR = Q3 - Q1 = 8 - 6 = 2\)</p>
169
<p>Lower bound = \(Q1 - 1.5 × IQR\)</p>
168
<p>Lower bound = \(Q1 - 1.5 × IQR\)</p>
170
<p> \(= 6 - 1.5 × 2 = 3\)</p>
169
<p> \(= 6 - 1.5 × 2 = 3\)</p>
171
<p>Upper bound = \(Q3 + 1.5 × IQR\)</p>
170
<p>Upper bound = \(Q3 + 1.5 × IQR\)</p>
172
<p>= \(8 + 1.5 × 2 = 11\)</p>
171
<p>= \(8 + 1.5 × 2 = 11\)</p>
173
<p>Any value below 3 or above 11 is an outlier. As \(50 > 11\), it is the outlier.</p>
172
<p>Any value below 3 or above 11 is an outlier. As \(50 > 11\), it is the outlier.</p>
175
<h2>FAQs of Outlier</h2>
174
<h2>FAQs of Outlier</h2>
176
<h3>1.What is the 1.5 IQR rule for outliers?</h3>
175
<h3>1.What is the 1.5 IQR rule for outliers?</h3>
177
<p>The \(1.5 × IQR\) rule is used to identify outliers in a dataset. Any value that lies more than 1.5 times the IQR below Q1 or above Q3 is treated as an outlier, which gives the formulas lower bound = \(Q1 - 1.5 × IQR\) and upper bound = \(Q3 + 1.5 × IQR\).</p>
176
<p>The \(1.5 × IQR\) rule is used to identify outliers in a dataset. Any value that lies more than 1.5 times the IQR below Q1 or above Q3 is treated as an outlier, which gives the formulas lower bound = \(Q1 - 1.5 × IQR\) and upper bound = \(Q3 + 1.5 × IQR\).</p>
178
<h3>2.How many deviations is an outlier?</h3>
177
<h3>2.How many deviations is an outlier?</h3>
179
<p>Under the Z-score method, a data point is considered an outlier if it is more than 3 standard deviations away from the mean.</p>
178
<p>Under the Z-score method, a data point is considered an outlier if it is more than 3 standard deviations away from the mean.</p>
180
<h3>3.What does IQR stand for?</h3>
179
<h3>3.What does IQR stand for?</h3>
181
<p>IQR stands for interquartile range, which is the spread of the middle 50% of the data, that is, \( IQR = Q3 - Q1\), where Q1 is the middle value of the lower half and Q3 is the middle value of the upper half.</p>
180
<p>IQR stands for interquartile range, which is the spread of the middle 50% of the data, that is, \( IQR = Q3 - Q1\), where Q1 is the middle value of the lower half and Q3 is the middle value of the upper half.</p>
182
<h3>4.Is the z-score an outlier?</h3>
181
<h3>4.Is the z-score an outlier?</h3>
183
<p>No, the z-score itself is not an outlier. It is a measure used to find outliers: if a value's z-score is greater than +3 or less than -3, then that value is an outlier.</p>
182
<p>No, the z-score itself is not an outlier. It is a measure used to find outliers: if a value's z-score is greater than +3 or less than -3, then that value is an outlier.</p>
184
<h3>5.How to eliminate outliers?</h3>
183
<h3>5.How to eliminate outliers?</h3>
185
<p>An outlier can be eliminated by identifying it and removing it from the dataset, or by replacing it with the mean, median, or <a>mode</a> of the dataset calculated without the outlier.</p>
184
<p>An outlier can be eliminated by identifying it and removing it from the dataset, or by replacing it with the mean, median, or <a>mode</a> of the dataset calculated without the outlier.</p>
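<p>As a rough sketch of the replacement option, the snippet below swaps each flagged value for the median of the remaining data. The function name <code>replace_outliers_with_median</code> is hypothetical, and the list of outliers is assumed to have been found beforehand with one of the methods above.</p>

```python
from statistics import median


def replace_outliers_with_median(values, outliers):
    """Replace each flagged value with the median of the non-flagged values."""
    flagged = set(outliers)
    kept = [v for v in values if v not in flagged]
    # median() raises StatisticsError if every value was flagged.
    fill = median(kept)
    return [fill if v in flagged else v for v in values]


data = [10, 12, 13, 14, 15, 80]
print(replace_outliers_with_median(data, [80]))
```

<p>Here the median of 10, 12, 13, 14, 15 is 13, so 80 is replaced by 13. Whether to remove or replace an outlier depends on why the value is unusual, so this step is a judgment call rather than a fixed rule.</p>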
186
<h2>Jaipreet Kour Wazir</h2>
185
<h2>Jaipreet Kour Wazir</h2>
187
<h3>About the Author</h3>
186
<h3>About the Author</h3>
188
<p>Jaipreet Kour Wazir is a data wizard with over 5 years of expertise in simplifying complex data concepts. From crunching numbers to crafting insightful visualizations, she turns raw data into compelling stories. Her journey from analytics to education ref</p>
187
<p>Jaipreet Kour Wazir is a data wizard with over 5 years of expertise in simplifying complex data concepts. From crunching numbers to crafting insightful visualizations, she turns raw data into compelling stories. Her journey from analytics to education ref</p>
189
<h3>Fun Fact</h3>
188
<h3>Fun Fact</h3>
190
<p>She compares datasets to puzzle games: the more you play with them, the clearer the picture becomes!</p>
189
<p>She compares datasets to puzzle games: the more you play with them, the clearer the picture becomes!</p>