## tepples's Journal: Correlation and Causation

Correlation implies 25% likelihood of causation. Either A causes B, B causes A, C causes A and B, or chance.

In this post, Immerman wrote:

I *hate* seeing statistics abused. A 25% likelihood of causation is *not* implied. Yes, one of the four outcomes must be the case, but you don't know the relative probabilities of each. It's like grabbing a marble out of a bag containing red, green, blue, and yellow marbles - there's only four possibilities as to which color your marble is, but for all you know I filled the bag with blue marbles and just threw in a handful of the other colors, in which case it would be preposterous to claim a 25% chance of getting a red one.
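The marble illustration can be made concrete with a rough Python sketch. The bag contents below are invented purely for illustration (the quote gives no counts), and `colour_probabilities` is a hypothetical helper name. The sketch only shows that four possible outcomes do not by themselves fix a 25% chance for each.

```python
from fractions import Fraction


def colour_probabilities(bag):
    """Return the exact chance of drawing each colour, given counts per colour."""
    total = sum(bag.values())
    return {colour: Fraction(count, total) for colour, count in bag.items()}


# Hypothetical bag: mostly blue, with a handful of the other colours.
skewed_bag = {"red": 1, "green": 1, "blue": 97, "yellow": 1}
# Hypothetical bag in which the four colours really are equally common.
even_bag = {"red": 25, "green": 25, "blue": 25, "yellow": 25}

skewed = colour_probabilities(skewed_bag)
even = colour_probabilities(even_bag)

# Same four possible outcomes in both bags, very different chances of red.
print("skewed bag, red:", skewed["red"])
print("even bag, red:", even["red"])
```

Both bags offer the same four outcomes; only knowledge of the counts tells the two cases apart.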

I'm aware of the hyperbole in my illustration. They're probably not equally probable, but absent other evidence, one has to assume so. My point is that just because the probability isn't 100 percent doesn't mean it can always be treated as 0 percent. So if you want to plead false cause more effectively, explain why they're not equally probable. Be willing to discuss what further observations would be needed to show which of the four possibilities is most likely. But don't say "correlation does not imply causation" as if it were "correlation implies lack of causation" without providing evidence, as that's close to the fallacy fallacy and the black or white fallacy.

**This discussion has been automatically archived.** Discussion continues in Daniel Dvorkin's journal.

## Still wrong. (Score:1)

## Four infinities (Score:2)

## Re: (Score:2)

I'm afraid you cannot divide one infinity by another and get back a fraction, not ever. Read up on Hilbert's Hotel for an intuitive exploration of why this is so.

## Re: (Score:2)

I'm afraid you cannot divide one infinity by another and get back a fraction, not ever.

There is a bijection between rational numbers in lowest terms and the positive integers. Therefore, they are equally infinite. Calling the cardinalities equal in the sense that their ratio is 1 is hyperbole, as I have already admitted. But to more directly address the point: How else would you recommend colorfully expressing "just because causation hasn't already been proved doesn't necessarily mean we should drop the investigation of causation"?
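One way to see such a bijection, as a minimal sketch restricted to the positive rationals: the Calkin-Wilf sequence lists every positive rational in lowest terms exactly once, so the position in the list pairs each fraction with a positive integer. This particular enumeration is an assumption of the example, not something named in the comment above, and `calkin_wilf` is a name chosen here.

```python
from fractions import Fraction
from itertools import islice


def calkin_wilf():
    """Yield the Calkin-Wilf sequence of positive rationals, starting at 1."""
    q = Fraction(1)
    while True:
        yield q
        # Newman's recurrence: next = 1 / (2*floor(q) - q + 1)
        q = 1 / (2 * (q.numerator // q.denominator) - q + 1)


# Pair the first few positive integers with their rationals.
first_terms = list(islice(calkin_wilf(), 7))
for n, q in enumerate(first_terms, start=1):
    print(n, q)

# No fraction repeats among the first 1000 terms.
sample = list(islice(calkin_wilf(), 1000))
```

The pairing shows the two sets have the same cardinality; it says nothing about a ratio between them.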

## Re: (Score:2)

I think your current sig does so pretty succinctly, actually.

## Logical Fallacy (Score:1)

I can't say I thought I'd actually learn something from clicking here to your journal, but I did. Honestly, I expected some more arguments but perhaps they'd be more interesting than the usual, as it seemed this was about an interesting topic. Instead, you've posted that excellent website which is now on my favorites. TYVM, and have a good day.

## o rly? (Score:1)

## Re: (Score:2)

## Re: (Score:1)

## Less precise than 25% yet fits in 120 characters (Score:2)

you gotta figure it out by guessing the theory and do experiments to question its validity

And one needs to do the same thing to establish a lack of causation. But a lot of the arguments I've seen take the form "You haven't *already* proved causation; therefore, working one's ass off to prove it one way or the other is futile."

Saying "given a physical correlation and a randomly picked theory that fits the phenomenon, the theory will involve physical cause with x chance." is just bad science.

So is "92.7 percent of statistics are pulled out of someone's large intestine", despite it being ironically self-demonstrating. So what should I say that's less precise than "25%" but greater than zero? The intended meaning, "not greater or less than the other possibilities u

## Re: (Score:2)

How about ...

Correlation implies one of four possibilities:

because really, that's all you can honestly say.

## Re: (Score:1)

## Perhaps the right question is burden of proof (Score:2)

## Re: (Score:1)

## qualitative versus quantitative dishonesty (Score:2)

Oh, I completely agree that the use of "correlation does not imply causation" to dismiss the possibility of causation is a *huge* fallacy, and deserves to be called out. However, it's a qualitative fallacy, whereas yours is a quantitative one. To assign a numerical probability to something when you have absolutely zero understanding of what the actual probabilities are is to be intellectually dishonest in a manner that brings nothing meaningful to the discussion and is likely to confuse the issue even furth

## some notes (Score:2)

the most obvious problem with your postulate is that it doesn't take the p-value into account. if i find correlation with p-value 0.00001, then the "likelihood" of it being chance should be lower than if the p-value was 0.1.

anyway, you're not really saying anything new. if you thought through what you are saying, you'd probably end up with bayesian inference or a more esoteric variant such as the dempster-shafer theory of evidence [wikipedia.org].

in short, you need to establish the prior probability of each of your hypothe

## Probabilities pulled from posterior (Score:2)

the most obvious problem with your postulate is that it doesn't take the p-value into account.

Anything quantitative about it (the "25%") is hyperbole, I admit. It's mostly directed at people who abuse "correlation does not imply causation" to imply "if causation has not already been proved, and if investigating it costs more than zero, then it should not be investigated". In addition, news sources that aren't paywalled tend to forget to report p-values.

anyway, you're not really saying anything new

I'm aware of that. Sometimes I have to repeat old things because new users haven't yet seen the old works.

and then evaluate the posterior probability

Which a lot of people unfamiliar with Bayes

## Re: (Score:2)

re posterior: people familiar with bayesian inference have the same objection. still, it's at least slightly better to establish prior probabilities which are then updated by seeing the evidence. what you're doing is saying that, whatever the data was, it's 25% across the board. if you ever want to get past this, you'll need something like bayesian inference or dempster-shafer.
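A minimal sketch of the prior-then-update idea, in Python. Every number here is made up for illustration: the uniform prior stands in for the "25% across the board" starting point, and the likelihoods are hypothetical values for how probable the observed data would be under each hypothesis. The hypothesis labels and the `posterior` helper are likewise assumptions of the example, and the four hypotheses are treated as mutually exclusive only for simplicity.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Apply Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalised.values())
    return {h: weight / evidence for h, weight in unnormalised.items()}


hypotheses = ["A causes B", "B causes A", "C causes A and B", "chance"]

# Uniform prior: the starting point before any data is considered.
prior = {h: Fraction(1, 4) for h in hypotheses}

# Hypothetical likelihoods: the data would be very surprising under pure chance.
likelihood = {
    "A causes B": Fraction(4, 5),
    "B causes A": Fraction(4, 5),
    "C causes A and B": Fraction(4, 5),
    "chance": Fraction(1, 100),
}

post = posterior(prior, likelihood)
for h in hypotheses:
    print(h, post[h])
```

With these invented inputs the weight on chance drops well below a quarter, while the three causal stories stay tied; separating those takes further observations.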

re causality: my only point was that you have "A causes B" and "B causes A" as mutually exclusive categories. they aren't.

in total,

## Re: (Score:2)

i see you've changed your sig to something more reasonable; thank you.

i still don't like "chance," since the whole point of statistics is to rule out certain kinds of chance. there are also details like "A causes Z which causes B," and so on, and i think "A causes B and B causes A" is also possible.

## Where can I find the original? (Score:2)

I'm here but I'm confused

Where's the original discussion that led to this thread?

Thanks in advance !!

## Read the summary (Score:2)

In this post [slashdot.org], Immerman wrote

Taco Cowboy wrote:

Where's the original discussion that led to this thread?

It was a reply to a signature, and I had installed the signature after having seen numerous abuses of "correlation does not imply causation" in Slashdot comments. I apologize that I can't provide the URLs of all these comments.

## Not enough data (Score:2)

"but absent other evidence, one has to assume so"

and that assumption has been the downfall of many papers. It's also the same argument used to prop up things like acupuncture, homeopathy, chiropractors, and perpetual motion machines. "We observe X, can't explain it, therefore our pet solution must be the answer."

You simply do not have enough data to make any percentage guess.

## Or 0 percent (Score:2)