Talk:Outlier

Renee Masse

What's a Renee Masse? (In the first sentence).

—Preceding unsigned comment added by 212.183.70.147 (talk) 15:04, 15 October 2007 (UTC)

Adopted orphan redirects for Google: inner fence, outer fence, mild outlier, Extreme outlier

In sans-serif font, 1.5 IQR looks like a division. Patrick 11:15 Dec 23, 2002 (UTC)

I did it that way so I wouldn't have to use * or x or × for multiplication. What would you prefer? dcljr 13:53 Dec 23, 2002 (UTC)

When using multi-letter variables a multiplication sign avoids ambiguity, and in this case coincidentally there was even a little more ambiguity.

Any of the three is fine with me, × is neatest, but more cumbersome to write. - Patrick 14:26 Dec 23, 2002 (UTC)

more

More info but clearer answers, please. I don't get a thing and my exam is tomorrow!

Outliers as 2 s.d.s away from mean

Is there not a second definition of outliers, as lying more than two standard deviations away from the mean? Or am I mixing other things up? I am a physicist, and it is a long time since I did "real" statistics... Batmanand | Talk 09:48, 28 September 2006 (UTC)

This is a poor definition of outliers because it changes under repeated application: the standard deviation itself depends heavily on the outliers. See boxplot for a simple, easy-to-understand definition that does not depend on the distribution.
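
A minimal Python sketch of the recursion problem described above, using only the standard library. The function names, the sample data, and the use of the sample standard deviation are assumptions made for illustration; none of them come from the article. Reapplying the two-standard-deviation rule to the trimmed data flags a point that passed the first round:

```python
from statistics import mean, stdev

def two_sd_outliers(data, k=2.0):
    """Return the points lying more than k sample standard deviations from the mean."""
    m = mean(data)
    s = stdev(data)
    return [x for x in data if abs(x - m) > k * s]

def strip_repeatedly(data, k=2.0):
    """Apply the rule again and again, removing flagged points each round."""
    rounds = []
    remaining = list(data)
    while len(remaining) > 2:
        flagged = two_sd_outliers(remaining, k)
        if not flagged:
            break
        rounds.append(flagged)
        remaining = [x for x in remaining if x not in flagged]
    return rounds, remaining

# Hypothetical sample: ten regular values plus two large ones.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50]

first_pass = two_sd_outliers(sample)        # [50]
rounds, remaining = strip_repeatedly(sample)  # rounds == [[50], [20]]
```

On the first pass only 50 is flagged, because it inflates the standard deviation enough to hide 20. Once 50 is removed, 20 becomes an outlier under the same rule, so the set of "outliers" depends on how many times the rule is applied.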

Boats

Isn't an outlier (German: Ausleger, Swedish: utliggare) also a supporting 2nd keel for a canoe or sailing boat that makes it almost a catamaran? Hmm... apparently this is called an outrigger on an outrigger canoe in English. Other languages would use "rig" for things that have sails. --LA2 23:04, 1 August 2007 (UTC)

-No, Ausleger is not outlier, that's a false friend. Outrigger, as you say, is the right term. —Preceding unsigned comment added by 212.183.70.147 (talk) 15:07, 15 October 2007 (UTC)

Definition

I'm opposed to the "Mathematical definition" section in the article. Determining whether an observation is an outlier is ultimately a subjective decision, and any definition based on measures such as standard deviation or interquartile range is completely arbitrary.

If this section needs to be kept, perhaps it could be renamed? The term "mathematical" implies a logical certainty, which doesn't apply in this case. -3mta3 (talk) 11:49, 14 April 2008 (UTC)

Methods for identifying outliers

What is the source of the method that uses the Interquantile range mentioned in the text? Is it just a rule of thumb, or does it have a more objective explanation?--Forich (talk) 21:33, 9 June 2008 (UTC)

It is a popular method, known as "Interquartile range" (IQR), where k = 3 is supposed to identify extreme outliers, and k = 1.5 mild outliers. This method has no scientific basis; it belongs to the category of Mumbo Jumbo methods in statistics. --Lambiam 04:51, 10 June 2008 (UTC)
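
For reference, the rule discussed in this thread can be sketched in Python as follows. This is only an illustration: the function name `classify_by_fences` and the sample data are hypothetical, and the quartiles come from `statistics.quantiles` with `method="inclusive"`, which is one of several quartile conventions and can give different fences than other software.

```python
from statistics import quantiles

def classify_by_fences(data, k_mild=1.5, k_extreme=3.0):
    """Label points outside Q1 - k*IQR .. Q3 + k*IQR as mild (k=1.5) or extreme (k=3)."""
    # Quartile convention is an assumption; other conventions shift the fences slightly.
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    inner = (q1 - k_mild * iqr, q3 + k_mild * iqr)        # inner fences
    outer = (q1 - k_extreme * iqr, q3 + k_extreme * iqr)  # outer fences
    mild, extreme = [], []
    for x in data:
        if x < outer[0] or x > outer[1]:
            extreme.append(x)   # beyond the outer fences
        elif x < inner[0] or x > inner[1]:
            mild.append(x)      # between inner and outer fences
    return {"inner": inner, "outer": outer, "mild": mild, "extreme": extreme}

# Hypothetical sample: ten regular values plus two large ones.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50]
result = classify_by_fences(sample)
```

With this sample the quartiles are 3.75 and 9.25, so the inner fences are (-4.5, 17.5) and the outer fences are (-12.75, 25.75): 20 is labelled a mild outlier and 50 an extreme one. The multipliers 1.5 and 3 are exposed as parameters because, as noted above, they are conventions and not derived quantities.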

i hate wiki —Preceding unsigned comment added by 71.250.133.163 (talk) 21:54, 9 September 2008 (UTC)

A more thorough discussion of methods for identifying outliers should be added, for example Rosner's test. See e.g. Barnett, V., and T. Lewis (1995): Outliers in Statistical Data. Agnerf (talk) 09:09, 24 February 2022 (UTC)

Pointing a citation to its source

The second citation

"2. ^ Grubbs, F. E.: 1969, Procedures for detecting outlying observations in samples. Technometrics 11, 1–21."

I googled it and found it on jstor at http://www.jstor.org/pss/1266761

Would it be a good idea to link directly to the article in the reference section? I didn't know how and I didn't know if that was appropriate or not. It seems appropriate though.

Your thoughts?

--Ted Wheeland (talk) 21:38, 3 January 2010 (UTC)

Pronunciation?

How is it pronounced?

Like "Out Lier" or "ootlee-er"? 109.150.237.200 (talk) 09:46, 6 December 2012 (UTC)

Former. Hear it at http://www.merriam-webster.com/dictionary/outlier Glrx (talk) 22:21, 7 December 2012 (UTC)

Identifying Outliers

Much of this section is directly plagiarized from A Survey of Outlier Detection Methodologies (2004) by Hodge & Austin. For example,

"Type 1 - Determine the outliers with no prior knowledge of the data. This is essentially a learning approach analogous to unsupervised clustering. The approach processes the data as a static distribution, pinpoints the most remote points, and flags them as potential outliers." is word-for-word identical, as are the definitions of the following 2 types. At the absolute minimum their work should be cited. — Preceding unsigned comment added by 208.105.82.93 (talk) 21:48, 7 March 2013 (UTC)

Outliers exclusion citation

I think the phrase "Deletion of outlier data is a controversial practice frowned on by many scientists and science instructors" needs some kind of citation or should be reformulated or removed. I mean, why is it controversial? What can happen? Are there examples of bad things that happened because of outlier exclusion? — Preceding unsigned comment added by 89.120.104.106 (talk) 10:26, 18 July 2013 (UTC)

Outlier as measurement error

The whole article refers to outliers as possible errors (e.g. "An outlier may be due to variability in the measurement or it may indicate experimental error"). This is not right, or rather it is incomplete. An outlier can also point to something real but unusual going on. As a colleague pointed out, it could be your next Nobel prize. If I remember correctly, NASA had detected the ozone hole in its data before Joe Farman published it, but had excluded those data points. So I would propose changing the tone of this article to emphasize that statistics helps to identify outliers, which are especially interesting points that may indicate measurement errors, unusual distributions (the tails), and possibly new phenomena. — Preceding unsigned comment added by Pjtverheijen (talk • contribs) 05:58, 12 March 2014 (UTC)

Grubbs' and Modified Thompson Tau tests are the same

The Modified Thompson Tau test is *exactly* the Grubbs' test for outliers, which makes things even more confusing. If it's just a different naming convention, I think it should just link to the Grubbs' test for outliers page. — Preceding unsigned comment added by Yannick Copin (talk • contribs) 16:24, 7 October 2016 (UTC)

Grubbs is already linked. Maybe simply remove the whole section on the Modified Thompson Tau test, if the only "citeable" version is Grubbs. HelpUsStopSpam (talk) 22:35, 7 October 2016 (UTC)

Cutting "good" points together with the outliers

This article does not seem to mention corrections to parameter estimates when part of the sample is cut. Remember, some "good" points may be cut together with the outliers. For example, conditional probability can be used to correct the parameter estimates. — Preceding unsigned comment added by 84.245.245.196 (talk) 14:51, 23 March 2017 (UTC)

Control Chart as method for outlier detection

Why isn't the control chart mentioned as a method for outlier detection?

Seems like a blatant omission.

2601:14F:8002:CAD2:88A7:1D26:370A:AFD6 (talk) 11:40, 28 May 2018 (UTC)

It better suits change detection (aka change point detection). Plus, the underlying model is largely that of a normal distribution, which is mentioned here. You can't put everything everywhere. Chire (talk) 21:18, 28 May 2018 (UTC)