Monday, December 15, 2014

Observations on 3-fold interactions

Sorry about how "mathy" this post is. I'm percolating about what to write about gay blood donors, but I need to think on that for another few days.

The last lecture for my epi class was about effect measure modification (interactions). Most people do it completely wrong: they put an interaction term in a statistical model (Y = a + b1X1 + b2X2 + b3X1X2) and then interpret b3 as though it's telling you something interesting. It isn't (except in extremely unusual circumstances).
What you really want to know is the degree to which being exposed to both X1 and X2 produces more disease than you'd expect if all you knew was the effect of X1 in the absence of X2 and the effect of X2 in the absence of X1.
Or, in mathy terms, let Rij be the rate of disease when X1=i and X2=j.
We want to know whether (R11-R00), the difference that both make when working together, is greater or less than (R01-R00) + (R10-R00), the sum of the differences each makes in the absence of the other.
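
To make the two-factor comparison concrete, here is a minimal Python sketch. The function name interaction_contrast and the four rates are made up for illustration (they are not from any real study); the only thing taken from the text above is the comparison of (R11-R00) against (R01-R00) + (R10-R00).

```python
def interaction_contrast(r00, r10, r01, r11):
    """Joint excess rate minus the sum of the two separate excess rates.

    Positive: the two exposures together produce more disease than their
    separate effects would add up to. Negative: less. Zero: exactly additive.
    """
    joint = r11 - r00                    # (R11-R00)
    separate = (r10 - r00) + (r01 - r00)  # (R10-R00) + (R01-R00)
    return joint - separate


# Hypothetical rates per 1,000 person-years, invented for this example.
r00, r10, r01, r11 = 1.0, 3.0, 5.0, 12.0
print(interaction_contrast(r00, r10, r01, r11))  # 11 - (2 + 4) = 5.0
```

With these invented numbers the joint excess (11) is bigger than the sum of the separate excesses (6), so the two exposures together do more than you'd expect from each alone.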

I'm going to skip right to the three-factor effect measure modification - here the idea is whether:
(R111-R000), the effect of all three together,
is comparable to the effect of each of the three in isolation:
(R100-R000) + (R010-R000) + (R001-R000).

First implication: In order to make that assessment, your study needs people with none of the exposures, all of the exposures, and exactly one of the exposures. It does not need anyone with two of the exposures, so including any such subjects would be inefficient. That's bizarre.

Second implication: The fact that those people with two exposures are irrelevant actually points to the fact that there could be four quantities of interest: first, the one comparing the sum of each of the three in isolation to the effect of all three together, and then three variations that add one in isolation to the other two in combination, i.e.
(R110-R000) + (R001-R000)
or (R101-R000) + (R010-R000)
or (R011-R000) + (R100-R000)
So, there are actually four interaction terms to compare to the joint effect: (R111-R000).

Third implication: I love how the math and the concepts circle around and inform one another. In this case, the fact that there is one comparison to make when there are two exposures, but four to make when there are three, suggests to me that our brains are not well suited to thinking about three-factor interactions, and the whole idea should ideally not be attempted at all.


But hmmm, what happens when we go to four....
(R1111-R0000) would be the joint effect of all four.
The single factors adding up would be:
(R1000-R0000) + (R0100-R0000) + (R0010-R0000) + (R0001-R0000)
Three together + one more would be:
(R1110-R0000) + (R0001-R0000)
(R1101-R0000) + (R0010-R0000)
(R1011-R0000) + (R0100-R0000)
(R0111-R0000) + (R1000-R0000)
Two together plus each of the other two alone would be:
(R1100-R0000) + (R0010-R0000) + (R0001-R0000)
(R1010-R0000) + (R0100-R0000) + (R0001-R0000)
(R1001-R0000) + (R0100-R0000) + (R0010-R0000)
(R0110-R0000) + (R1000-R0000) + (R0001-R0000)
(R0101-R0000) + (R1000-R0000) + (R0010-R0000)
(R0011-R0000) + (R1000-R0000) + (R0100-R0000)
And then two together plus the other two together would be:
(R1100-R0000) + (R0011-R0000)
(R1010-R0000) + (R0101-R0000)
(R1001-R0000) + (R0110-R0000)

Ai! 14 terms to keep in mind simultaneously.
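
The lists above can also be generated mechanically. Each sum corresponds to one way of splitting the factors into non-empty groups (a set partition), leaving out the single all-in-one group, which is the joint effect itself. The Python sketch below is my own illustration of that idea; the helper names set_partitions, label, and comparison_sums are invented for this example.

```python
def set_partitions(items):
    """Yield every way to split `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing group in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give `first` a group of its own.
        yield [[first]] + partition


def label(group, n):
    """Render a group of factor indices as a rate label, e.g. [0, 1] -> 'R1100'."""
    return "R" + "".join("1" if i in group else "0" for i in range(n))


def comparison_sums(n):
    """All the sums to compare against the joint effect of n factors."""
    baseline = "R" + "0" * n
    sums = []
    for partition in set_partitions(list(range(n))):
        if len(partition) == 1:
            continue  # one group containing everything is the joint effect itself
        terms = [f"({label(group, n)}-{baseline})" for group in partition]
        sums.append(" + ".join(terms))
    return sums


for n in (2, 3, 4):
    print(n, len(comparison_sums(n)))
# prints:
# 2 1
# 3 4
# 4 14
```

The counts 1, 4, 14 are the Bell numbers (2, 5, 15) minus one, so if I have the pattern right, five factors would mean 51 sums to keep in mind.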