What is the proper interpretation of the Minnesota Transracial Adoption Study?

The Minnesota Transracial Adoption Study (MTAS) was a long-running project designed to measure directly the influence of shared environment on racial gaps in intelligence. The original paper appeared in 1976, and its authors hypothesized that socialization in favorable environments would significantly reduce the racial gap in IQ (Scarr and Weinberg, 1976). At that point, the data did appear to support an environmentalist hypothesis regarding race differences. The authors revisited the study in 1992, however, and the follow-up caused much more controversy. While the authors still supported an environmentalist position (Weinberg et al., 1992), their data drew a number of replies. In this post, I am going to summarize the main arguments used by the authors and review the replies as well as the authors’ responses to those replies.

For anyone unfamiliar with the study, the authors briefly reviewed the original findings in their 1992 paper:

1. Adoptive parents and their biological children in the 101 participating families scored in the bright-average to superior range of age-appropriate IQ tests.

2. The 130 black and interracial adopted children scored above the white population average for the same U.S. region (M = 100) and were performing adequately in school. In fact, we found the average IQ of the black/interracial children adopted in the first 12 months of life to be 110, some 20 points above the average IQ for black children being reared in the black community. Nevertheless, as found by other researchers, the adopted children scored on average below the birth children of these families. This was true not only for black/interracial adoptees, but also for white and Asian/Indian adoptees.

3. We interpreted these data to indicate that: (a) putative genetic racial differences do not account for a major portion of the IQ performance difference between racial groups, and (b) black and interracial children reared in the culture of the tests and the schools perform as well as other adopted children in similar families, as reported by other researchers.

4. The personality and social adjustment of the parents, biological offspring, and adopted children (ages 4-12) in these families was, on average, quite good.

Weinberg et al. (1992)

Weinberg et al. 1992

The 1992 follow-up was conducted to see how the adoptees and birth children were doing ten years after they were first studied:

Ten years later, we restudied the adopted children (average age = 17) and the birth children (average age = 20), as well as other members of their families. After an extensive search for the 101 families, we collected data on intellectual performance and academic achievement, on personality and psychopathology, and on family members’ life adjustment. The objective of this longitudinal study was to see how these children were faring approximately a decade after they were initially studied. What was their current level of intellectual performance, school achievement, and personal and social adjustment? What might account for any change in performance or adjustment over that interval? We were also interested in the status of other family members and the quality of the families’ adaptation to their unique circumstances. We were provided the rare opportunity to explore systematically the impact of a potentially stressful situation on families at a point in their lives when adolescent problems may emerge to disrupt their adaptation.

Weinberg et al. (1992)

There is a hereditarian reason to expect the follow-up results to differ substantially from the original ones. The heritability of IQ rises with age, so IQ is less malleable to shared environmental influence by adulthood (Bouchard, 2013). In fact, shared environment plays nearly no role in the variation in IQ test scores by adulthood. This all holds within groups, however, and we want to know whether it applies at the between-group level. To check, we can correlate the racial gap at each age with the heritability of intelligence at that age. If the correlation is strong, one inference would be that the same thing (genetic variants) that affects within-group differences in IQ also affects between-group differences in IQ. I did this with data from Bouchard (above) and Dickens and Flynn (2006).

What we find is that, in a linear regression, the relationship between race gaps at different ages and heritability at those ages is very strong. This certainly does not prove the hereditarian hypothesis, but it gives more credence to it. Given this pattern, there is reason to believe the MTAS follow-up results will be strikingly different from the original ones.
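The calculation described above can be sketched roughly as follows. This is a minimal illustration, assuming one has a heritability estimate and a gap estimate for each age band; the function name `pearson_r` and the input layout are my own choices for illustration, and no values from Bouchard (2013) or Dickens and Flynn (2006) are filled in here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences.

    xs: hypothetical per-age heritability estimates
    ys: hypothetical per-age gap estimates, in the same age order
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

In a simple linear regression with one predictor, R² equals the square of this correlation. With only a handful of age bands the estimate is unstable, so a strong value should be read with caution.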

The follow-up succeeded in getting 93 of the 101 families to participate. Over 80% of the original sample (n = 426) took part in the study, and the sample the authors were able to collect was not substantially different from the original sample.

The results showed that the overall IQ scores of the new sample declined. This is rightly attributed to the restandardization of the tests and the Flynn effect. Regardless, “the follow-up mean IQ performance of black and interracial adoptees remained above the average IQ for blacks reared in the black community.” In the new results, the black/interracial adoptees scored worse on the IQ tests than the biological offspring group. There was a relationship between IQ decline and age of follow-up, but partialling it out did not get rid of the difference between biological offspring and black/interracial adoptees. White adoptees had higher IQ test performance than black/interracial adoptees.

Pre-adoptive experience had a significantly diminished effect in the re-analysis compared to the original. And while the correlation between adoptive fathers’ occupation and adoptive family income was near zero in both analyses, the correlation of adoptive parents’ education did drop in the re-analysis compared to the original analysis. This all gives more credence to the Wilson effect (the rise in the heritability of IQ with age). They also state, “The adoptive variables as a group accounted for 17% of the variance in Time 2 IQ, still a nonsignificant contribution in variance explained.”

One interesting finding from the re-analysis was that biracial adoptees had higher IQ scores than fully black adoptees. This gives some more credence to the hereditarian hypothesis. The authors’ explanation for this is very weak: in the Discussion section they say it is a result of better adoptive conditions for the black/white group than for the black/black group. But didn’t they say adoptive variables accounted for very little of the variance in IQ? They clearly can’t make up their minds about this.

The main issue this paper attempted to shed light on was race differences, their malleability, and possible genetic influences. Table 2 gives the data for this.

Their main interpretation of these data is as follows:

“Although the biological offspring achieved significantly higher IQ scores at Time 2 than adoptees, there was no difference in IQ change among the groups, suggesting that the rearing environments continued to have positive effects on children in the adoptive families. Both biological offspring and adoptee groups scored in the average range of IQ based on the new norms for IQ tests.”

So that is the summary of the 1992 re-analysis of the Minnesota Transracial Adoption Study. Two major replies came out shortly afterwards: one by Richard Lynn (1994) and another by Michael Levin (1994).

Lynn 1994

Lynn attacks many of the claims made by Weinberg et al. First, he gives this table to quickly summarize the data used by the original authors:

He says that many would argue this equates to a gain of 4 points for adopted black children. But he points out that the majority of the black adoptees came from the Northern United States, and African-Americans from the North generally have higher IQs than those in the South. He also notes that, if we correct for the Flynn effect, the adoptee IQ is below average. If we do the same for whites, we see virtually no adoption gain.

The difference between whites and black adoptees in the sample was 17 points, slightly greater than the 1 SD difference found in most studies. He also highlights a curious finding that Weinberg et al. reported yet denied in their Discussion. By their own account, “biological mother’s race the best single predictor of adopted child’s IQ when other variables are controlled”.

Lynn points out some other areas where Weinberg et al. were selective with the evidence. For example, Weinberg et al. calculate (relatively small) correlations between time in the adoptive home and IQ for the black and interracial adoptees. They take this as evidence that the adoptive home played a significant role in the supposedly greater IQs of these groups. However, they choose not to calculate the correlation for whites, which may well be the same. He is also on strong ground, from a hereditarian standpoint, when he says,

“The correlations presented are confounded with race differences because the black children had lower mean IQs, later ages at placement, and shorter times in the adoptive home, as compared with the interracial children. Thus, what appears to be an age-of-adoption effect may be only a race-differences effect. This is suggested by the multiple regression analysis, because, when race is entered first in the multiple regression, it appears as a significant predictor of adopted children’s IQs, and adoptive experience variables, entered second, make no significant contribution to children’s IQs.”

He makes some other comments on the correlations, but what is addressed here suffices.

Levin 1994

Michael Levin (who has a very underrated book on race differences) also commented negatively on the authors’ interpretation of the study. Levin’s table is slightly more complex than Lynn’s but much less intrusive than Weinberg et al.’s:

Levin, unlike Lynn, makes strong use of Cohen’s d to show how striking the differences between groups are.

Levin says, “using their data, we also see that d for the birth cohort-B/B cohort in 1975 was 1.63 and in 1986, 1.58. (These numbers are inflated by the rather small sample SDs; assuming a “true” pooled SD of 15 reduces both ds to about 1.3.) Prima facie, these rather large ds are not predicted by environmentalism.” If the authors or anyone reading wanted to make a strong environmentalist (H^2 = 0) interpretation of the data, the d should be the size of the deviation between the offspring cohort and the white adoptees. Levin also notes, as I did, that pre-adoptive experience explained very little variation in IQ scores, which is difficult to interpret in an environmentalist light.
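Levin’s parenthetical about the “true” pooled SD is just a rescaling of Cohen’s d, which is the mean difference divided by a standard deviation. The following is a rough sketch of that arithmetic, using only the d values quoted above. The function names are hypothetical, and because “about 1.3” is Levin’s rounded figure, the implied sample SD is only approximate.

```python
def cohens_d(mean_diff, sd):
    """Cohen's d: a mean difference expressed in standard-deviation units."""
    return mean_diff / sd


def rescale_d(d, sample_sd, reference_sd=15.0):
    """Re-express a d computed with sample_sd in units of reference_sd."""
    return d * sample_sd / reference_sd


def implied_sample_sd(d_sample, d_reference, reference_sd=15.0):
    """Sample SD implied by the same mean difference giving two different ds."""
    # The mean difference is the same in both cases:
    # d_sample * sample_sd == d_reference * reference_sd
    return d_reference * reference_sd / d_sample


# Using the quoted 1975 figures: d = 1.63 with the sample SD,
# about 1.3 with an assumed pooled SD of 15.
sd_1975 = implied_sample_sd(1.63, 1.3)
print(round(sd_1975, 1))  # roughly 12, well under the conventional 15
```

A sample SD of about 12 rather than 15 is what makes the raw d look larger, which is the inflation Levin is pointing to.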

Furthermore, the birth cohort group was actually genetically non-typical with regard to the general white population. The adoptive fathers averaged about 17 years of education and adoptive families were characterized as above average in socioeconomic status. So, genetically typical whites and genetically typical blacks are both raised in similar above-average environments yet end up with a large difference in IQ: by Levin’s calculation, a Cohen’s d of 1.08 to 1.28 by age 17.

Levin goes on to say,

“The hypothesis that best fits the data, rather, is that genetic variation between the races explains about 70% of the intelligence difference: H^2 is .7^2 = .5. This estimate is precisely Plomin’s estimate for h^2 (Plomin, 1990). It may however understate the H^2 to be estimated from Weinberg et al. (1992) because of the superior richness of the adoptive environments.”

Interestingly, Levin doesn’t bring up Lynn’s argument that the black adoptees in the MTAS came mostly from the North, where black IQ is above the black average. On that argument, the adoptees’ IQ is below what it should be even if they had been raised in their normal home environment. One could thus conclude a genetic determinist hypothesis about race differences in IQ on this account, but I doubt the decline in IQ is much more than a chance occurrence.

Analyzing the data with three different methods produced estimates of between-group heritability of 0.66, 0.7, and 0.59. Levin confirms the relevance of the IQ data by analyzing the data provided by Weinberg et al. on academic achievement for the groups.

Weinberg et al. 1994

Weinberg, Scarr, and Waldman (1994) decided to reply to the compelling criticisms of Levin and Lynn. The first portion of this article gives far too much attention to minor disputes. For example, they spend about a page criticizing the language of Levin regarding hereditarianism vs. environmentalism. Then they criticize Levin and Lynn for omitting the data regarding Asians for most of their replies.

The comment regarding Asians is incredibly unimportant. The sample size for Time 2 re-analysis of Asians was n=12. This can not be claimed to be representative data. Furthermore, the differences between Asians and whites are less important in terms of size as well as sociopolitical issues. Unlike black people in America, Asians don’t blame any of their failures on bad things white people did or do. The omission of Asian data from their main analysis is not a serious problem in all practicality.

Weinberg et al. are not at all critical of their own downplaying of the Asian data. We can make this claim based simply on the number of mentions of each group within their 1992 article. “Asian” is mentioned 10 times, “white” is mentioned 41 times, “interracial” is mentioned 50 times, and “black” is mentioned a whole 120 times. They themselves know the differences between whites and Asians are not all that important scientifically or practically.

Their reply to Lynn regarding white data on the relationship between adoption age and IQ is fine; they say that the sample was incredibly limited, so the correlation was excluded. This is true; however, Lynn’s much larger point was that the correlation is probably about the same for whites as it is for the other groups.

When it comes to the actual race differences and their potential for a large genetic cause, they make another very bad argument. Their argument is this: the Asian/Indian IQ mean was between that of the interracial and black adoptee groups. That’s it. This does nothing to damage the relevance of a genetic explanation for the IQ gap between whites and blacks. For example, if it were shown that the difference in IQ between Asians and whites were entirely environmental, this would say nothing about the cause of the gap between whites and blacks. Furthermore, the Indian IQ is not particularly high, so this is just a very silly argument through and through.

Their next argument is more reasonable. They go on to say,

“The most damning evidence against many of the points raised by Levin and Lynn emerges from a consideration of the influence of early adoptive experiences on adoptees’ childhood and adolescent IQ and from adoptee group differences in such experiences.”

As they themselves concede, however, time of adoption could not account for more than 17 percent of the variance in IQ. Even so, it could still be a cause of between-group differences: they show that the rank ordering of pre-adoptive experiences matches that of the racial gaps in IQ.

There are multiple points one can make in response to this. For one, the quality of life prior to adoption may not even be a valid control. Take socioeconomic status: this is not a valid control for race differences in IQ because IQ is positively associated with socioeconomic status. Because the within-group heritability of IQ is quite high for all groups (Pesta et al., 2020), pre-adoptive experiences are probably related to the genetic IQ of each group’s surroundings.

Ultimately, these variables did not account for a large proportion of the variance in IQ. Levin (1997) summarizes their argument, saying “Waldman, Weinberg, and Scarr contend that adoption experiences for the white and black cohorts did differ, but seemingly concede that controlling for that variable does not reduce the IQ gap by more than about 15%”. Since Levin asserts a 60-70% genetic explanation, adoption experiences are not going to hurt the hereditarian argument. Finally, at least with regard to the MTAS, they seem to concede that the region the black adoptees came from probably plays a role in the data.

Thomas 2017

Thomas (2017) provided a newer analysis of the data which is worth discussing as no one has really responded to it yet. Thomas starts,

“The adoptees are adoptees, and adoptees are typically raised in unrepresentative environments which tend to be more nurturing and high in socioeconomic status. Unusually wholesome environments could then explain the adoptees’ above-average IQ, rather than the adoptees’ race; race and environment would be confounded.”

This was addressed by Levin. The fact that the environments were above average actually helps the hereditarian hypothesis with regard to race differences. A 1.6 SD increase in environmental quality for black people resulted in a marginal increase in IQ scores, if any.

He goes on to say that attrition between waves could be a major methodological problem. He says,

“The third issue, attrition, is less common, affecting only longitudinal studies. Even if a longitudinal study compares adoptees against only other adoptees (eliminating the first confound) who took the same IQ test at similar times (eliminating the second confound), attrition can take place between waves. When researchers lose track of some subjects between waves of a longitudinal study, the pattern of subjects lost to follow-up can vary between subgroups of subjects, degrading the statistical comparability of those subgroups.”

This isn’t an issue for Weinberg et al. (1992). They were able to locate and re-test over 80 percent of the sample and they specifically state, “Generally, subject attrition was minimal, even though this is a problem that often plagues longitudinal research.”

Thomas goes on to say we need to adjust the data for the Flynn effect. As I said, I side with Weinberg et al. (1994) on this, so I would say he is wrong, and his method, similar to Lynn’s, isn’t very good. He attributes some of the black-white gap to attrition as well. Some of the white adoptees who were lost to follow-up seemed to be less intelligent than the rest. This effect was smaller among the black adoptees and actually ran in the opposite direction. Correcting for this, he finds a smaller 11.7 point gap. This is a fair point, and I think it is worth considering. Edit: I just read PumpkinPerson’s comments on Thomas’ paper and Thomas is probably doing too much by adjusting for attrition. See the following:

That’s not to deny that adjusting for attrition can be important in some cases, but in this study, Thomas argues attrition only increased the IQs of adopted whites and not the adopted non-whites. An effect that only affected one demographic sounds to me like random error, not a systematic bias that needs to be adjusted for. And if the error was random, one could just as easily argue the IQs of adopted whites were too low before the attrition rather than too high after the attrition.

Indeed if the adopted white sample is so easily skewed by a few kids dropping out of the study, then maybe that sample is too small to begin with, and instead we should compare the much larger sample of adopted mixeds not to the adopted whites, but to the general U.S. white population.
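To make the attrition argument concrete, here is a small sketch of how selective dropout can shift a group mean between waves. The scores and the retention pattern below are hypothetical numbers chosen for illustration, not MTAS data, and the function name `attrition_shift` is my own. It shows one simple way to gauge the effect (comparing first-wave means for everyone against first-wave means for those retained), not necessarily the adjustment Thomas used.

```python
from statistics import mean


def attrition_shift(scores, retained):
    """Compare the first-wave mean of a full group with that of its retained subset.

    scores:   first-wave scores for every subject in the group
    retained: booleans, True if the subject was re-tested at follow-up
    Returns (full_mean, retained_mean, retained_mean - full_mean).
    """
    kept = [s for s, r in zip(scores, retained) if r]
    if not kept:
        raise ValueError("no retained subjects")
    full_mean = mean(scores)
    retained_mean = mean(kept)
    return full_mean, retained_mean, retained_mean - full_mean


# Hypothetical group of six, where the two lowest scorers drop out
scores = [95, 100, 105, 110, 115, 120]
retained = [False, False, True, True, True, True]

full, kept, shift = attrition_shift(scores, retained)
print(full, kept, shift)  # 107.5 112.5 5.0
```

With a group this small, losing two subjects moves the mean by 5 points, which is the point of the quoted objection: in a small sample, a shift of this kind is hard to tell apart from random error.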

Overall, Thomas reviews a number of different environmental variables which could possibly mediate the gap. I think these are unlikely, and the typical hereditarian literature goes over all of them (e.g. Levin, 1997). He also brings up adoption quality and related factors; these are addressed earlier in this article. Thomas’ analysis is better in some areas and worse in others.


I think the Minnesota Transracial Adoption Study still generally supports the viewpoint that shared environment does not account for a significant portion of the black-white IQ gap. Lynn and Levin provide the most compelling arguments, and Weinberg et al.’s reply to them was filled with non-arguments and minor points which weren’t really important.

Thomas does make some okay arguments, which leads me to the conclusion that the Minnesota Transracial Adoption Study should only be used by hereditarians with caution. In the age of even more direct analyses, like admixture analysis, the MTAS will become less important anyway. However, it would be nice if it were redone with a much better design. Warne (2020) states some basic necessities to improve the transracial adoption design: “A new transracial adoption study that includes IQ scores for biological parents would be an excellent goal.” If one wanted to avoid attrition in a future study, one could use a contractual setup, but this would risk motivation problems with regard to the IQ scores.

As Ion Rimaru pointed out to me on Twitter, Scarr (1998) actually ended up conceding the MTAS could support either an environmentalist or a hereditarian hypothesis:

“The results of the transracial adoption study can be used to support either a genetic difference hypothesis or an environmental difference one (because the children have visible African ancestry). We should have been agnostic on the conclusions;”
