Talk:Probability axioms

This is the talk page for discussing improvements to the Probability axioms article.
This is not a forum for general discussion of the article's subject.


Mathematics Mid‑priority

This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Mid-priority on the project's priority scale.

Statistics Mid‑importance

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Mid-importance on the importance scale.


Earliest discussion

Would it be possible to rejig this in terms of Borel sets? Just a more elegant way of expressing it IMHO; here it's as if the definition of sigma algebra comes out of probability theory. It might be more confusing for the general reader however.

It's more confusing for me, anyway. The universe S is not necessarily a topological space, so what is a Borel set in this context? Fool 23:42 Mar 11, 2003 (UTC)

"Fool" is right. Borel sets are by definition members of the sigma-algebra generated by a topology. But there need not be any topology on a probability space. Or, at least, no topology is explicitly contemplated by the conventional Kolmogorovian definition.

I have moved this article to "Probability axioms" (plural!). Usually it is better to use the singular than the plural in the title of an article; "zebra" is better than "zebras". But in the case of this article, it is colossally silly. This is not about axioms as individual things; it is about systems of axioms, or, at least, about one particular system of axioms --- the one formulated by Kolmogorov. Michael Hardy 00:41 Mar 12, 2003 (UTC)

I think Cox's axioms should be stated here in addition to Kolmogorov's; if I'm not mistaken, the Kolmogorov axioms are derivable as theorems from Cox's -- which, again IINM, was Cox's point, that the accepted laws of probability are derivable from more basic assumptions. The one bit that doesn't carry over, if we start from Cox's axioms, is countable additivity -- IINM finite additivity derives from Cox's axioms but not countable additivity. Wile E. Heresiarch 15:30, 27 Dec 2003 (UTC)

I believe Axiom 1 should be stated simply as P(E) >= 0 rather than 0 <= P(E) <=1, since P(E) <=1 is actually a consequence of P(S) = 1, P(E) >= 0 and countable additivity.
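
For reference, a short sketch of the argument behind the comment above, using S for the universe as that comment does. It assumes the complement of E is itself an event, and that finite additivity has already been obtained from countable additivity (by padding with empty sets, once P(∅) = 0 is known):

```latex
% E and its complement S \setminus E are disjoint events whose union is S
1 = P(S) = P\bigl(E \cup (S \setminus E)\bigr) = P(E) + P(S \setminus E) \geq P(E)
```

The final inequality uses only P(S \setminus E) >= 0, so the upper bound P(E) <= 1 would not need to be part of Axiom 1.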

Sigma-algebra

I'd like to rework this article to at least begin in a less technical manner, as I distinctly remember being confused by this when I learned it. But I'm not sure about the history here. I can understand why a modern mathematician working in ZFC needs this sort of thing to excommunicate non-measurable sets, but did Kolmogorov really care (or even know) about such things at all? Did his first axiom not apply to all events? Was his third axiom really countably additive and not just pairwise (therefore finitely) additive? -Dan (Fool) 03:47, 5 December 2005 (UTC)

Yes, his celebrated 1933 book does explicitly state the axiom as countable additivity. And he said that that, rather than just finite additivity, was merely for the sake of convenience. That Kolmogorov did NOT know about such things as sigma-algebras and non-measurable sets strikes me as implausible, but I haven't looked that closely. But after all, Kolmogorov's book appeared in 1933, so one should expect it to be quite modern in approach. Michael Hardy 22:11, 5 December 2005 (UTC)

Thanks. Actually, I figured it out, I was confusing him with Kronecker for some reason. Oops. -Dan 00:44, 6 December 2005 (UTC)

Proposed merge with probability theory

I propose merging this page with probability theory. Please add your comments to the proposal there. Thanks Andeggs 16:13, 24 December 2006 (UTC)

If someone can check that all of the relevant information has been moved to probability theory, we might be able to delete this page and replace it with a redirect? MisterSheik 17:12, 28 February 2007 (UTC)

I'm not for the merge anymore given that this page is linked to from probability space. And I'm not for folding probability space into probability theory given that the treatment at probability theory has a nice parallel structure with the other kinds of prob. theory, while the one at prob. space is more in depth, and clearer given no prior knowledge. Yeah, there's duplication, but there's also a lot of duplication of probability distribution, for example (which could really use a clean-up.) MisterSheik 17:41, 29 March 2007 (UTC)

Oversimplified?

From the discussion here it looks like this once discussed sigma-algebras and now doesn't. As it stands, this is just wrong - not every subset of omega can be assigned a probability. — ciphergoth 10:29, 11 November 2007 (UTC)

Question regarding third axiom

I have a question regarding the third axiom, the P(E₁ ∪ E₂ ∪ … ∪ E_N) = P(E₁) + … + P(E_N) statement. Wouldn't it suffice to state the equality for two events? Can't the above equality be derived from the equation for just two events? In fact, by Occam's razor, isn't the latter the way it is supposed to be? Rkr1991 (Wanna chat?) 13:43, 11 September 2009 (UTC)

Stating it for two events implies the equality only for finitely many events ("finite additivity"), not for countably many ("countable additivity"). It is in fact possible to reject countable additivity and get a different kind of probability theory, but it's not common. Shreevatsa (talk) 14:03, 11 September 2009 (UTC)

Why wouldn't it hold for countable additivity? Can you please explain? Rkr1991 (Wanna chat?) 04:13, 12 September 2009 (UTC)

Why would it hold? :-) It is clear how to extend it from 2 to any finite number (by induction etc.), but it's not possible to extend it to infinitely many. (BTW, if a mathematical discussion gets too far from discussing the article itself, it may be best to take it to the reference desk.) Shreevatsa (talk) 04:26, 12 September 2009 (UTC)
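
To make the induction point above concrete, here is a sketch of the step from n to n+1 events, assuming the events are pairwise disjoint and that additivity for two events is given. The second display shows the extra step that countable additivity would need:

```latex
% Induction step: E_1, ..., E_{n+1} pairwise disjoint, two-event additivity assumed
P\Bigl(\bigcup_{i=1}^{n+1} E_i\Bigr)
  = P\Bigl(\bigcup_{i=1}^{n} E_i\Bigr) + P(E_{n+1})
  = \sum_{i=1}^{n+1} P(E_i)

% Countable additivity additionally requires passing to the limit
P\Bigl(\bigcup_{i=1}^{\infty} E_i\Bigr) = \lim_{n\to\infty} \sum_{i=1}^{n} P(E_i)
```

Induction establishes the first identity for each finite n separately. The second identity is a statement about the infinite union itself, which no finite stage of the induction reaches, so it has to be assumed as part of the axiom.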

Page name

I think the name of this page is misleading, since there are other ways to formalise probability theory than Kolmogorov's (one example is Cox's approach, linked at the bottom of the article). It should be called something like "Kolmogorov's axioms for probability theory" instead. Nathaniel Virgo (talk) 13:34, 14 April 2011 (UTC)

Re-organizing First Axiom?

So...maybe I'm a horrible person, but I've been reading this page for a while and just noticed that axiom 1 kind of spells out

P(E)\in RAPE\geq 0

Feel free to disregard if it's just me...but maybe a simple rotation of the P(E) and R terms would help...? 71.197.0.228 (talk) 06:51, 2 May 2012 (UTC)

Errors in these axioms

When a reader carefully studies the sections First axiom, Second axiom, and Third axiom (the heart of this article), should he/she understand these to be building on the second paragraph of the introduction (the paragraph that begins "These assumptions can be summarised as....")? Or are they meant to be independent of that paragraph?

The reason I ask is that First axiom defines F, which has already been defined in the second paragraph. This suggests that they are meant to be independent. However, Second axiom never defines Ω. There is no indication that Ω has any relation to any of the symbols yet mentioned, nor even that Ω is a set at all. Then to the reader's surprise, a set subtraction operation Ω\E is performed later on, suggesting that Ω is a set.

Moreover, the Consequences section is wrong, or the Axioms are misstated. Consider the following example. Let F = {a, b, c}. Let Ω = {a, b}. Let P({a}) = 0.5, P({b}) = 0.5, and P({c}) = 0.5, and assume that P is fully additive. You can easily verify that this example satisfies all three axioms: in particular, every subset of F has non-negative finite probability, and P(Ω) = P({a, b}) = 1.

Yet it does not satisfy the so-called numeric bound consequence, because there is a subset of F whose probability is greater than one: specifically, P({a, b, c}) = 1.5.

One way to fix this is to explicitly state that Ω is just another name for F. Lawrence King (talk) 21:46, 9 November 2012 (UTC)

Another way to fix this issue would be to use the same symbol, $\Omega$ or $\mathcal{F}$, throughout the article. In addition, I also noticed that it might be possible to change the first axiom to read $P(E) \in \{x \in \mathbb{R} : x \geq 0\}\ \forall E \in \mathcal{F}$ instead of what it reads now.

RandomDSdevel (talk) 01:15, 1 April 2013 (UTC)[reply]

P.S.: Oh, wait: don't fix it! I just remembered that $\mathcal{F}$ is the set of all subsets (or, alternatively, all members of the power set $\mathcal{P}(\Omega)$) of the sample space $\Omega$ to which one can reasonably apply the probability measure $P$! For finite sample spaces, $\mathcal{F}$ would contain all subsets (i.e., all members of the power set $\mathcal{P}(\Omega)$) of the sample space $\Omega$, but one has to limit the $\sigma$-algebra $\mathcal{F}$ to include only the measurable subsets of the sample space $\Omega$ when it is infinite.
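
As an illustration of the finite case described in the P.S. above, here is a minimal Python sketch. The three-element sample space, the outcome weights, and the helper names `power_set` and `prob` are all assumptions made for this example and do not come from the article. It takes the event collection to be the full power set of a finite sample space and checks the three axioms together with the numeric bound P(E) <= 1:

```python
from fractions import Fraction
from itertools import chain, combinations


def power_set(omega):
    """Return every subset of a finite sample space as a frozenset."""
    items = sorted(omega)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [frozenset(subset) for subset in subsets]


# Hypothetical finite sample space with exact rational weights summing to 1.
omega = frozenset({"a", "b", "c"})
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}


def prob(event):
    """P(E) defined as the sum of the weights of the outcomes in E."""
    return sum((weights[outcome] for outcome in event), Fraction(0))


events = power_set(omega)  # the event collection F is all of P(Omega) here

# Axiom 1: non-negativity on every event.
axiom1 = all(prob(event) >= 0 for event in events)
# Axiom 2: the whole sample space has probability 1.
axiom2 = prob(omega) == 1
# Axiom 3 (finite version): additivity on every pair of disjoint events.
axiom3 = all(
    prob(e1 | e2) == prob(e1) + prob(e2)
    for e1 in events
    for e2 in events
    if not (e1 & e2)
)
# Consequence: no event has probability above 1.
numeric_bound = all(prob(event) <= 1 for event in events)

print(len(events), axiom1, axiom2, axiom3, numeric_bound)
```

In this reading every event is a subset of Ω, so a set such as {a, b, c} with Ω = {a, b} is not an event at all, which is why the numeric bound is not contradicted by the example in the earlier comment.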

Axioms being in disagreement with physics, Bell-like inequalities

The axioms from this article allow one to derive a Bell-like inequality, Mermin's inequality ("tossing 3 coins, at least 2 are equal"), for 3 binary variables A, B, C:

Pr(A=B) + Pr(A=C) + Pr(B=C) >= 1

Specifically, for any probability distribution over the 8 possibilities, $\sum_{ABC} p_{ABC} = 1$:

Pr(A=B) = p000 + p001 + p110 + p111

Pr(A=C) = p000 + p010 + p101 + p111

Pr(B=C) = p000 + p100 + p011 + p111

Pr(A=B) + Pr(A=C) + Pr(B=C) = 2p000 + 2p111 + $\sum_{ABC} p_{ABC}$ = 1 + 2p000 + 2p111 >= 1

However, the QM formalism allows this inequality to be violated, e.g. https://arxiv.org/pdf/1212.5214

Maybe it is worth mentioning that the axioms are in disagreement with physics/QM? --Jarek Duda (talk) 08:50, 7 September 2019 (UTC)
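
The classical half of the argument above, namely that any joint distribution over the 8 outcomes satisfies the inequality, can be checked numerically. The following Python sketch is only an illustration: the helper names, the random seed, and the number of trials are assumptions made for this example. It draws random joint distributions and compares the sum of the three pairwise-equality probabilities with 1 + 2p000 + 2p111:

```python
import random
from itertools import product


def random_distribution(rng):
    """Draw a random joint distribution over the 8 outcomes (A, B, C)."""
    raw = [rng.random() for _ in range(8)]
    total = sum(raw)
    outcomes = product((0, 1), repeat=3)
    return {bits: weight / total for bits, weight in zip(outcomes, raw)}


def pr(dist, predicate):
    """Probability of the event defined by predicate(a, b, c)."""
    return sum(p for bits, p in dist.items() if predicate(*bits))


def mermin_sum(dist):
    """Pr(A=B) + Pr(A=C) + Pr(B=C) under one joint distribution."""
    return (
        pr(dist, lambda a, b, c: a == b)
        + pr(dist, lambda a, b, c: a == c)
        + pr(dist, lambda a, b, c: b == c)
    )


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
sums = []
for _ in range(1000):
    dist = random_distribution(rng)
    total = mermin_sum(dist)
    # Identity from the derivation: the sum equals 1 + 2*p000 + 2*p111.
    expected = 1 + 2 * dist[(0, 0, 0)] + 2 * dist[(1, 1, 1)]
    assert abs(total - expected) < 1e-12
    sums.append(total)

min_sum = min(sums)
print(min_sum >= 1)
```

This only exercises the case where all three variables share a single joint distribution. It says nothing about the quantum scenario, which is exactly the situation in which that assumption is in question.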

I don't think that "the axioms are in disagreement with physics/QM" is a good way of stating this. It took a very long time before probability theory was put on solid mathematical ground by Kolmogorov, in the 1930s. More than thirty years after that, Bell's results and others pointed towards the fact that the numbers that are called "probabilities" in quantum mechanics behave in a strange way and cannot fit in Kolmogorov's framework. In my opinion, if this had to be mentioned, it should be in a physics article, and not a mathematical one. 2001:861:3743:6E00:2CDA:F744:71A7:97D6 (talk) 16:19, 1 September 2022 (UTC)

The above derivation used the Kolmogorov axioms to get the inequality Pr(A=B) + Pr(A=C) + Pr(B=C) >= 1, which can be violated by the quantum formalism. Besides the previously linked arXiv paper, I have only seen it in Preskill's lecture notes (lecture 6 from http://theory.caltech.edu/~preskill/ph219/ph219_2021-22.html). Jarek Duda (talk) 17:07, 1 September 2022 (UTC)

What are you suggesting we do on this Wiki page? All I'm saying is that even if "disagreement" is a symmetric relation, I think the sentence "Kolmogorov's axioms are in disagreement with QM" makes it look like a failure of Kolmogorov's axioms, whereas I think it's more accurate to say that Kolmogorov's axioms do a wonderful job for classical probability, but over the last sixty years we have discovered that quantum probabilities behave in a different way.

In short, of course Bell-like theorems prove that quantum probabilities are not classical, I agree. But do you really think it is worth mentioning on this page? Why not on a physics page? I mean, I don't think (chronologically speaking, at least) that Kolmogorov's axioms were intended to formalize quantum probability as well. 2001:861:3743:6E00:38D6:F27D:42EF:773B (talk) 07:50, 2 September 2022 (UTC)

We intuitively assume they are ultimately true, but QM shows we should be more careful. I think this article should contain this kind of warning, like "The axioms concern classical probability theory and are not necessarily true in physics governed by quantum mechanics, which allows replacing the 3rd axiom with the Born rule". If needed, here is a diagram: https://i.postimg.cc/rFKPxYGY/bell3.png Jarek Duda (talk) 09:37, 2 September 2022 (UTC)

Kolmogorov's axioms are a mathematical model and as such they aren't "true" or "false". They do however happen to be an excellent model for a large number of situations involving random phenomena, provided that the mapping from "physical outcome" to "elementary event" satisfies certain natural properties. These are not satisfied in the situation you have in mind and so "classical" probability theory (i.e. Kolmogorov's axioms) isn't a useful model there. Your suggestion regarding a "warning" does not make much sense: we wouldn't dream of putting a warning on the page for the Navier-Stokes equations regarding the fact that there are situations in which one should rather use the Allen-Cahn equations and that this would somehow make the Navier-Stokes equations "untrue"...

Regarding the Born rule, it isn't usually viewed as a replacement for the third axiom at all. Instead, it just tells you how to compute the (classical!) probabilities of various outcomes when observing a state. Violation of Bell's inequality can happen when one then makes the unfounded (and ultimately wrong) assumption that the observation of two distinct and non-commuting observables can be described by a joint probability distribution on their respective outcomes. Hairer (talk) 20:07, 2 September 2022 (UTC)

Regarding "They do however happen to be an excellent model for a large number of situations involving random phenomena": I completely agree, so shouldn't the Wikipedia article contain examples where it does apply, and where it doesn't?

Regarding the 3rd axiom vs the Born rule, we have a general question: "what is the probability of the alternative (union) of disjoint events?"

The 3rd axiom responds: it is the sum of their probabilities. In contrast, the Born rule says: it is proportional to the square of the sum of their amplitudes.

These are two essentially different answers to the same question, so that (Bell-like) inequalities implied by one of them can be violated by the other. Jarek Duda (talk) 06:26, 3 September 2022 (UTC)

I don't agree that the 3rd axiom and the Born rule are two different answers to the same question. The Born rule simply tells us how to compute probabilities (and these are perfectly normal, "classical" probabilities) of outcomes of measurements on a quantum-mechanical system, given its state. One main feature of quantum mechanics is that after such a measurement the state of the system is altered (wave function collapse). In particular, when measuring several non-commuting observables, the order of the observations matters. The violation of Bell's inequalities only arises when one asks oneself whether the observed probabilities arising from the measurements of several different non-commuting observables could possibly also arise from one single measurement of some other observable (and clearly the answer happens to be 'no'). In the example you are referring to, only two out of the three events (A=B), (B=C), and (A=C) can ever be realised as commuting observables (there's a bit of freedom since there are really two A's, two B's, etc), but not all three together. Hairer (talk) 13:38, 4 September 2022 (UTC)

So let's go back to Mermin's inequality Pr(A=B) + Pr(A=C) + Pr(B=C) >= 1 for binary variables A, B, C, intuitively "tossing 3 coins, at least 2 give the same result": obvious and trivial to derive. However, QM allows it to be violated while still satisfying axiom 1 (probabilities are real and nonnegative) and axiom 2 (the probabilities sum to 1). Doesn't that mean there is a problem with axiom 3? Once again, the diagram with the derivation, and the violation when replacing the 3rd axiom with the Born rule: https://i.postimg.cc/rFKPxYGY/bell3.png Jarek Duda (talk) 13:48, 4 September 2022 (UTC)

It is only intuitively obvious if you assume that it is possible to check that either one of the events A, B and C occurs without affecting whether any of the others occur. This assumption fails in the setting you're considering, so there's no reason for the inequality to hold (irrespective of whether you're talking about quantum mechanics or not). Hairer (talk) 15:18, 4 September 2022 (UTC)

Indeed, when measuring all 3 variables, we have a probability distribution on a space of size 2^3 = 8, for which the 3rd axiom is true, so the inequality has to be satisfied. To violate it, it is crucial to measure only 2 out of 3 (e.g. spins in an Ising sequence), replacing the 3rd axiom with the Born rule for non-distinguishable scenarios, which allows lower bounds than the classical 1: 3/4 in QM, 3/5 in Ising. Jarek Duda (talk) 17:32, 4 September 2022 (UTC)