It's unfortunate your discussion wasn't more productive. As a statistician, I just wanted to comment on the statistical model. The mixed effects model is a better way to analyze the data than the matrix method proposed. In the mixed effects model, the group difference actually takes into account the twin comparison. More specifically, in this case the group effect is not the comparison of post-intervention means, but rather the comparison of the mean individual change while accounting for each individual's twin. The only thing I would have done differently is using a random slope for time rather than a fixed slope. In complete data, both are unbiased, but in this case, the fixed slope increases the type 1 error rate and the type 2 error rate. One pair dropped out so that may introduce bias, but likely only minimally.
Thanks for the comments. As a non-statistician, I'm afraid I don't understand formal statistics. It is, however, often not strictly relevant. The question is what we want to know. In this, the matrix method is not a statistical measure but rather a presentation of the data. It is the eyeball test. In Gardner's experiment, we want to compare differences between twins vs. differences of all non-twin pairs. Otherwise, it is a waste of time and money to run the experiment. I don't understand mixed effects or random effects, but what Gardner presented is not what we want to know. And in any case, SD is the correct statistic for presentation. SEM always makes the data look better than it is. In any case, not presenting individual data has no justification. Without that, we don't know anything.
I agree with presenting SD and individual data. However, I disagree about the mixed effects model - it is what we want to know. I don't see a point in debating this though if you don't understand it. I would just kindly suggest that attacking an experiment (i.e. saying "a waste of time and money") is not constructive if you do not understand the analysis. I have no skin in the game so I'm just trying to promote accuracy and collegiality.
Also, "we want to compare differences between twins vs. differences of all non-twin pairs" would lead to a biased treatment effect. Since the twin is essentially a stratification variable, then the appropriate comparison is the difference within each twin group. However, I imagine you're trying to look for differences due to genetics (please correct me if I'm wrong). If so, this is not the right study design for that.
It is mostly the language that I don't understand -- great if you can explain what it is and how relevant to the current case. And I agree on keeping the rhetoric down. However, this is not a formal review and we need to look at the context, laid out in the two posts. The field is fairly contentious and I suggested dialog with Gardner because I know him and he is not particularly doctrinaire. He thought it was a good idea but it never happened. While not my original intention, somehow the twin paper published on vegan vs. omnivore diets came up (JAMA Network Open. 2023;6(11):e2344457). To be more specific, I addressed that.
In the study, there were two groups randomly assigned to either a vegan or an omnivorous diet. There were 22 pairs of twins and one twin from each pair was assigned to vegan, and the other, to the omnivore diet. This must have required a certain amount of work but it seems to offer good comparisons between diets, an “opportunity to investigate the effects of a dietary intervention while controlling for genetic and environmental factors.” (You had also mentioned genetics). Comparisons, however, were made between the two groups as a whole (all vegans vs. all omnivore). (Is this "random effects"?) Averaging the behavior of the groups undermines the whole idea behind the experimental design and hides the unique results if any. Jeez, that's exasperating. Seems like, well, “a waste of time and money.”
My analysis of the data would be to consider the individual differences which would not be evident in the group data. For example, did any of the pairs go in the other direction? (I would first have compared the differences between twins at baseline). A qualitative analysis of the data would tell whether greater precision, and of what type, would be required.
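A minimal sketch of that eyeball check, assuming per-twin changes in some outcome (LDL, say) were available. The numbers are invented for illustration and are not from the paper; `pair_differences` and `direction_counts` are hypothetical helper names.

```python
# Hypothetical data: change in an outcome (e.g. LDL, mg/dL) for each twin.
# These values are invented for illustration; they are NOT from the study.
vegan_change = [-15.0, -8.0, 3.0, -20.0, -12.0, 6.0]
omnivore_change = [-2.0, -10.0, 1.0, -5.0, -4.0, 0.0]


def pair_differences(vegan, omnivore):
    """Within-pair difference (vegan twin minus omnivore twin), one per pair."""
    if len(vegan) != len(omnivore):
        raise ValueError("each pair needs exactly one value per diet")
    return [v - o for v, o in zip(vegan, omnivore)]


def direction_counts(diffs):
    """Count pairs where the vegan twin's change was lower, higher, or equal."""
    lower = sum(1 for d in diffs if d < 0)
    higher = sum(1 for d in diffs if d > 0)
    return {
        "vegan_lower": lower,
        "vegan_higher": higher,
        "tie": len(diffs) - lower - higher,
    }


diffs = pair_differences(vegan_change, omnivore_change)
for i, d in enumerate(diffs, start=1):
    print(f"pair {i}: vegan minus omnivore = {d:+.1f}")
print(direction_counts(diffs))
```

Listing every pair this way shows directly whether any pairs went the other way, before any model is fitted. The same two functions could be run on baseline values to compare twins before the intervention.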
Gardner has published important papers -- the A-Z study, for example (oddly not included in the references). The twin study is not one of them. The design shows significant bias.
I would be glad to discuss further, including why frequentist statistics is not helpful here.
t and, in the end, claimed that he was too busy to provide data I asked for.
So, what is actually wrong here? The question of SD vs SEM is relevant -- they can be interconverted -- because from what was presented, the differences are small and the variation is obviously very large. That LDL was significantly higher on one or another diet doesn't tell us what we want to know as experimenters and, most of all, clinicians giving advice. What would be a patient's expectation if they went for the vegan diet?
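For reference, the interconversion is SEM = SD / sqrt(n), where n is the number of observations behind the mean. A small sketch follows; the sample size and SEM in the usage lines are made-up values, not figures from the paper.

```python
import math


def sem_from_sd(sd: float, n: int) -> float:
    """Standard error of the mean from the standard deviation of n observations."""
    return sd / math.sqrt(n)


def sd_from_sem(sem: float, n: int) -> float:
    """Standard deviation recovered from a reported SEM and sample size n."""
    return sem * math.sqrt(n)


# Assumed values for illustration only: a reported SEM of 5.0 with n = 22.
n = 22
reported_sem = 5.0
print(f"SEM {reported_sem:.1f} with n={n} corresponds to SD {sd_from_sem(reported_sem, n):.2f}")
```

With n around 22, the SD is between four and five times the SEM, which is why the same data look much tighter when plotted with SEM bars.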
There is a lot to unpack so I'm going to try responding point by point to the themes I see. If I miss something, please let me know.
1. “It is mostly the language that I don't understand -- great if you can explain what it is and how relevant to the current case.” – Hopefully, points 3 and 4 clarify this. If not, please let me know.
2. “The field is fairly contentious and I suggested dialog with Gardner because I know him and he is not particularly doctrinaire” but “in the end, claimed that he was too busy to provide data I asked for.” – I agree it’s unfortunate.
3. “Comparisons, however, were made between the two groups as a whole (all vegans vs. all omnivore).” – I think this is the crux of the issue. If they had performed a t-test (simply comparing the two groups), you would be correct that it’s biased. However, the model they used compares each twin pair individually (that is the random effect).
4. “My analysis of the data would be to consider the individual differences which would not be evident in the group data.” – I agree with this. Where I disagree is how to do it. A qualitative analysis could be useful, but I think a quantitative analysis is better in this case. As I indicated above, the model they used accounts for individual differences. Technically, it does not allow for any of the pairs to go in the other direction, which is why I suggested using a random slope (which would allow for this). With this caveat, the point estimate is still unbiased, but the type 1 and type 2 errors are a little off.
5. “The design shows significant bias.” – This is inaccurate as I described above.
6. “frequentist statistics is not helpful here” and “That LDL was significantly higher on one or another diet doesn't tell us what we want to know as experimenters” – I agree frequentist statistics have limitations. One is that statistical significance is not the same as clinical significance.
7. “The question of SD vs SEM is relevant” – I agree SD is better for presentation. However, this does not change the fact that their analysis was correctly done.
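One way to see the difference raised in point 3 is to compare the standard error you get when the two diet groups are treated as independent with the one you get from within-pair differences. This is a plain paired-difference calculation on simulated data, a simpler stand-in for a model with a random effect for twin pair and not the analysis from the paper. Every parameter below (shared twin component, noise, a diet effect of -10) is an assumption chosen for illustration.

```python
import math
import random
import statistics


def unpaired_se(a, b):
    """Standard error of mean(a) - mean(b), treating the groups as independent."""
    return math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))


def paired_se(a, b):
    """Standard error of the mean within-pair difference a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.stdev(diffs) / math.sqrt(len(diffs))


# Simulated data, for illustration only (all parameters are assumptions).
rng = random.Random(0)
n_pairs = 22
shared = [rng.gauss(0.0, 20.0) for _ in range(n_pairs)]    # what each twin pair has in common
vegan = [s - 10.0 + rng.gauss(0.0, 5.0) for s in shared]   # assumed diet effect of -10
omnivore = [s + rng.gauss(0.0, 5.0) for s in shared]

estimate = statistics.mean(vegan) - statistics.mean(omnivore)
print(f"estimated difference: {estimate:.2f}")
print(f"unpaired SE: {unpaired_se(vegan, omnivore):.2f}")
print(f"paired SE:   {paired_se(vegan, omnivore):.2f}")
```

The estimated difference is the same either way, since the mean of the differences equals the difference of the means. What changes is the uncertainty: when twins share a large common component, the within-pair comparison removes it and the standard error shrinks.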
Wow. Thanks.
“…the model they used compares each twin pair individually (that is the random effect).” This is clearer than the descriptions on various internet sites and even texts. (I don’t understand this because, if they’re not clear, I don’t try to figure it out -- for a chemist, I am actually middling smart). It is far outside of the analysis that I consider appropriate for this kind of experiment. As previously, it is more about getting a view of the data than coming up with a statistical conclusion. The approach you describe (I still don’t know what’s “random” or what the “effects” are) is a mathematical conclusion. It might be useful if I were trying to evaluate the different samples of my home-brew. (I know). But the data are not ready for this kind of analysis.
I think of the old joke of the biochemist, physicist and computer guy who go to the track. It’s long-winded -- the biochemist feels the horse’s hide, etc., etc. In the end, the physicist wins. They ask him how he did it. He says “First, assume a horse is a perfect sphere….”
Bayesian statistics would probably tell you that the data don’t change your belief that there’s little difference in outcomes. But what we use before quantitative calculation is SOP statistics (seat of the pants). We really need to see the raw data. The easiest way to see whether “any of the pairs … go in the other direction” is to look at it before we mess with it.
So, we agree that “statistical significance is not the same as clinical significance” and may be inappropriate here where we don’t know that there is any identifiable distribution or that any subjects have similar dependence on the diets.
If you like, discuss further (off-list, first). Send contact info.
Sure, I sent you a message now.
A perfect demonstration of GIGO!! I’ll defend anyone’s right to be a vegan, but IMHO that's a political not nutritional choice, therefore validating it with data is pointless.
Cf., —Emily Eakin (17 Aug 2002) "Holy Cow a Myth? An Indian Finds The Kick Is Real," is.gd/qD3pug
''Holy Cow: Beef in Indian Dietary Traditions,'' is a dry work of historiography buttressed by a 24-page bibliography and hundreds of footnotes citing ancient Sanskrit texts. It's the sort of book, in other words, that typically is read by a handful of specialists and winds up forgotten on a library shelf.
But when its author, Dwijendra Narayan Jha, a historian at the University of Delhi, tried to publish the book in India a year ago, he unleashed a furor of a kind not seen there since 1989, when the release of ''Satanic Verses,'' Salman Rushdie's novel satirizing Islam, provoked rioting and earned him a fatwa from Ayatollah Ruhollah Khomeini. [...]
After months of legal wrangling, Mr. Jha's lawyers succeeded in having the ban lifted this spring. And now his book has been published in Britain and the United States by Verso, with a new preface and a more provocative title: ''The Myth of the Holy Cow.''
But though copies have been shipped to India, few bookstores there are likely to stock it.
His offense? To say what scholars have long known to be true: early Hindus ate beef.
𝙸𝚕𝚕𝚎𝚐𝚒𝚝𝚒𝚖𝚒 𝚗𝚘𝚗 𝚌𝚊𝚛𝚋𝚘𝚛𝚞𝚗𝚍𝚞𝚖
Well, not only political. People like the diet (it is generally low calorie). And some people have a sincere objection to killing animals. The data are whether they thrive. Other data are, as you say, pointless.