Signals of Human Polygenic Adaptation: Moving Beyond Single-Gene Methods and Controlling for Population-Specific Linkage Disequilibrium

This research aimed to identify signals of polygenic adaptation in various phenotypes – such as educational attainment, height, and schizophrenia – by employing traditional Fst enrichment tests and polygenic score differentiation tests like Qst and Qx. Fst tests offered inconclusive evidence for over-differentiation in allele frequencies, while Qst tests indicated significant differences for cognitive traits but not for height. The investigation underscores that Fst underestimates the extent of phenotypic differentiation due to additive genetic influences because it fails to account for the covariance of allelic effects across populations. The research demonstrates that Bird's (2021) analysis of the genetic IQ disparity between Africans and Europeans is based on the incorrect assumption that Fst should be equal to the phenotypic variance between populations (Qst), assuming all between-group variation results from additive genetic effects. The findings emphasize the importance of considering both Fst and Qst values when assessing population genetic differentiation. They also stress the importance of controlling for population-specific linkage disequilibrium (LD) decay. Indeed, LD decay produced a pro-European bias in polygenic scores, inflating the European mean relative to the African and East Asian means. Finally, family-based or multi-ancestry GWAS are needed to account for other sources of


Introduction
In recent years, the genomic effects of natural selection on polygenic traits, that is, traits influenced by many genes, have become a major area of study in human population genetics and ecology (Berg and Coop, 2014; Berg et al., 2021; Field et al., 2016; Gratten et al., 2014). These genomic effects can provide insights into the evolutionary history of populations and contribute to our understanding of complex traits, such as susceptibility to diseases and response to environmental factors.
Technological advancements in genome sequencing and novel analytic methods have significantly advanced the field.
One such method, the Fst enrichment test, is used to measure divergent selection pressure on single-gene traits. It involves comparing Fst, which quantifies genetic variation between populations, at the candidate gene with the Fst of the background genetic variation, which is mostly neutral. However, this method is limited in its ability to detect divergent selection when the selection signal is weak and spread across numerous loci, as is often the case with polygenic traits (soft sweeps) (Pritchard, 2010; Hollinger et al., 2019).
In contrast, Qst is employed to quantify phenotypic differentiation between populations resulting from genetic influences, particularly in the context of polygenic traits under an additive model. Qst quantifies the proportion of genetic variation in a trait that exists between populations. For polygenic traits, polygenic scores represent the additive genetic variance, which is the sum of the effects of individual genes. Qst can be calculated as the ratio of the variation between populations for polygenic scores to the overall variation of polygenic scores, which is the sum of between-population variation and twice the within-population variation.
According to the model developed by Le Corre and Kremer, allelic covariance, or the relationship between allele frequencies and their effects on a trait, can be broken down into two components: the covariance of allele frequencies and the covariance of allelic effects. In cases where polygenic traits are under divergent selection among populations, alleles with similar effects are driven to similar frequencies within populations across multiple loci. This can result in population differences in the mean of a quantitative trait due to positive covariances (that is, linkage disequilibrium) between distant variants (Latta, 1998; Le Corre and Kremer, 2003; Ma et al., 2010).
Factor analysis, a statistical method used to analyze the structure of data, has been employed to measure allelic covariance for traits such as educational attainment and height (Piffer, 2013, 2016). Studies have demonstrated that allelic covariance can help explain the observed population differences in these traits, emphasizing the importance of accounting for allelic covariance when investigating the genetics of polygenic traits under divergent selection. The relationship between allelic effects and frequencies, or the covariance of these variables, can be considered as the between-population component of linkage disequilibrium, which refers to the non-random association of alleles at different loci (Storz and Kelly, 2008; Ma et al., 2010). Selection can lead to the accumulation of intergenic disequilibrium, a phenomenon that can cause differentiation at the gene level to become uncoupled from differentiation at the trait level or in the polygenic score, which represents the cumulative effect of multiple genetic variants on a trait.
Qeios, CC-BY 4.0 · Article, July 18, 2023 · Qeios ID: HDJK5P · https://doi.org/10.32388/HDJK5P
The Fst enrichment test (Guo et al., 2018; Bird, 2021) compares genetic differentiation, measured as the average Fst across genome-wide association study (GWAS) single nucleotide polymorphisms (SNPs), with that of other randomly matched SNPs. However, this test is only capable of detecting one component of genetic differentiation resulting from divergent selection, specifically, the ratio of Fstq (Fst at GWAS SNPs) to Fst (at neutral SNPs). In many cases, as shown by Le Corre and Kremer, this component is small compared to the allelic covariance across populations, which can lead to false negatives, that is, failing to detect a difference that actually exists. Studies have compared genetic differentiation at neutral markers (which are not influenced by selection) to differentiation at candidate genes (which are potentially under selection) for various tree species. These studies have found low levels of genetic differentiation at candidate genes that are not significantly different from those observed for neutral markers. However, they have observed much higher levels of Qst (Eveno et al., 2008; Pyhäjärvi et al., 2008; Heuertz et al., 2006; Hall et al., 2007; Luquez et al., 2007; Derory et al., 2010; Namroud et al., 2008).
Indeed, Qst can be large even if Fst is very small. This situation occurs when there is little genetic differentiation between populations at individual loci, but the covariance in allele frequencies between populations creates differences in the phenotypic traits. For highly polygenic traits like height and cognition, the genetic variance is expected to be mostly attributable to the allelic covariance component, as the significance of this component increases with the number of loci implicated in the trait (Berg and Coop, 2014) [...] be under divergent selection in humans.
Transferring polygenic scores across populations has proven challenging in this field of research (Martin et al., 2019). This issue arises from the variability in the impact of causal variants and differences in linkage disequilibrium patterns between populations (Vilhjálmsson et al., 2015). These factors can lead to a misalignment, in non-GWAS populations, between the "true" causal variant and the "tag" variant (a variant linked to the causal variant that does not directly affect the trait in question) identified through GWAS, which are typically conducted in populations of European descent. The effect of different, mainly weaker, LD patterns is particularly strong in individuals of African ancestry, where the polygenic scores typically show considerably less validity than they do for other populations, such as South and East Asians (Fahed et al., 2021). In fact, a polygenic score for educational attainment showed a 50% reduction in effect size for African Americans compared to Europeans (Lee et al., 2018), though it still retained some predictive validity in a replication sample (Rabinowitz et al., 2019). In an independent sample, the effect size reduction was slightly lower (~40%, from 0.26 to 0.16) (Fuerst et al., 2023).
Differential LD patterns are probably responsible for a large portion of the limited trans-ethnic portability of GWAS results, because the effects of the "true" causal alleles remain relatively consistent across ancestries, with a correlation of 0.95 across local ancestries within African-European admixed individuals (Hou et al., 2023). This paper employs a previously published method (Piffer, 2021) to identify the influence of population-specific LD patterns on polygenic scores, and to demonstrate how eliminating the most-impacted SNPs affects pairwise differences.
We aim to examine the potential influence of divergent selection on height, educational attainment, and mental disorders. Qst will be calculated on all the traits to show the amount of population differentiation and how this is inflated by LD decay, whereas Qx will be computed only on the traits that were found not to be significantly biased by LD decay.
To investigate whether genetic factors contribute to phenotypic differences between groups, we will calculate the correlation between population-level polygenic scores and average population IQ (used as a proxy for education-related abilities), as well as average height. A strong correlation between average phenotype and polygenic scores is a signal of divergent adaptation (Turchin et al., 2012).
Partial polygenic scores will be computed for the ancestry components found in the Latino/Hispanic gnomAD sample. This will help us examine if the mixing of different ethnic groups happens randomly or not, focusing on certain characteristics like education. If individuals from one ethnic group don't choose partners from other ethnic groups randomly, especially considering certain characteristics, then the average genetic scores for these characteristics will differ from the scores of the ethnic groups they partner with. For example, if individuals from group A only choose highly educated partners from group B, then the partial group B genetic scores related to education among mixed individuals will be higher than the overall genetic scores for education among individuals of group B.
Finally, we show that Kevin Bird's analysis (Bird, 2021) rests on the fallacious assumption that Fst = Qst and that the value he computed from phenotypic data is very close to the Qst value computed using polygenic scores for education.

Materials and methods
Datasets: For polygenic score computation, we utilized summary statistics from several GWAS. The latest GWAS of height used a multi-ancestry sample of 4 million individuals, identifying 7,209 height-associated loci from 12,111 genome-wide significant regions, as defined by COJO P-value < 5 × 10^-8 in the trans-ancestry GWAS meta-analysis, with +/-35 kb flanking regions (Yengo et al., 2022). Among these, we used the SNPs (N = 3,779) that were significant in the GWAS summary statistics file for all populations. The educational attainment (EA) GWAS summary statistics were obtained from four different studies. Lee et al. (2018) used multi-trait analysis of GWAS (MTAG) to identify SNP associations with high predictive accuracy; these were used for EA3 polygenic score computation. The latest GWAS of educational attainment, with a sample size of ~3 million individuals, was used for EA4 polygenic score computation (Okbay et al., 2022). Summary statistics for a sibship (within-family) GWAS of education were retrieved from a recent meta-analysis of sibship GWAS (Howe et al., 2022). Finally, a recent, small Danish GWAS identified 4 significant SNPs correlated with the first principal component of school grades (E1), which captured overall school performance and showed the strongest genetic correlations with educational attainment (r_g = 0.90; SE = 0.03; P = 4.8 × 10^-198) and intelligence (r_g = 0.80; SE = 0.03; P = 3.3 × 10^-128) (Rajagopal et al., 2023). The PGS from this study will be referred to as DKedu (Denmark education). Trubetskoy et al. (2022) conducted the latest schizophrenia (SCZ) GWAS and identified 313 independent SNPs in the "primary" GWAS that were significant at a genome-wide level (P < 5 × 10^-8) with a linkage disequilibrium (LD) of r² < 0.1.
In the extended GWAS (hereafter "combined"), primary GWAS results were meta-analyzed with summary statistics from deCODE genetics, identifying 342 linkage-disequilibrium-independent significant SNPs.
This study was selected because it is the most recent and because it is the first large-scale trans-racial GWAS for schizophrenia, including individuals of European, East Asian, African, and Amerindian ancestry. Polygenic scores were computed using both sets of SNPs ("primary" and "combined"). The PGS derived from the larger combined-ancestry GWAS had more explanatory power than the one based on the matched-ancestry GWAS even for non-European cohorts, likely due to the smaller sample size of the latter. Hence, we did not use the ancestry-specific GWAS summary statistics.

Test of selection and genetic differentiation:
The Fst enrichment test (Guo et al., 2018), which calculates the Fstq and Fst values (for sets of randomly matched SNPs), will be performed to test for selection acting on allelic differentiation. The decoupling between Qst and Fst is caused by the allelic covariance (θB), which is the predominant component of selection at highly polygenic traits. The covariance of allelic effects and frequencies can also be thought of as the between-population component of linkage disequilibrium (Storz and Kelly, 2008; Ma et al., 2010).
Selection can lead to the accumulation of intergenic disequilibrium, which decouples differentiation at the gene and trait (or polygenic score) levels. This happens when alleles with similar effects are driven to similar frequencies within populations across multiple loci. Qst was computed using the formula Qst = σ²B / (σ²B + 2σ²W) (Leinonen et al., 2013).
Qst is defined as the level of genetically based population differentiation in quantitative traits (Li et al., 2019).
The total genetic variance is the variance of the polygenic scores across all individuals in all populations. The genetic variance within populations is the average variance of the polygenic scores within each population, weighted by the number of individuals in each population.
Qst is then calculated as the genetic variance among populations divided by the sum of the genetic variance among populations and twice the genetic variance within populations.
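The calculation described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `qst` and the input layout (a dictionary of individual polygenic scores per population) are hypothetical, and the between-population variance is assumed here to be the size-weighted variance of the population means, which the text does not specify.

```python
from statistics import fmean, pvariance


def qst(scores_by_pop):
    """Qst = var_between / (var_between + 2 * var_within).

    scores_by_pop maps a population label to the list of individual
    polygenic scores in that population (hypothetical input layout).
    """
    sizes = {pop: len(s) for pop, s in scores_by_pop.items()}
    n_total = sum(sizes.values())
    means = {pop: fmean(s) for pop, s in scores_by_pop.items()}
    grand_mean = sum(sizes[p] * means[p] for p in means) / n_total

    # Within-population variance: average of the per-population variances,
    # weighted by the number of individuals in each population.
    var_within = sum(
        sizes[p] * pvariance(scores_by_pop[p]) for p in sizes
    ) / n_total

    # Between-population variance: size-weighted variance of the population
    # means around the grand mean (an assumption of this sketch).
    var_between = sum(
        sizes[p] * (means[p] - grand_mean) ** 2 for p in sizes
    ) / n_total

    return var_between / (var_between + 2 * var_within)


# Toy example with two populations of two individuals each.
example = qst({"A": [0.0, 2.0], "B": [4.0, 6.0]})
```

In the toy example the between-population variance is 4 and the within-population variance is 1, so the ratio is 4 / (4 + 2).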
As a test of divergent selection, GWAS beta (or OR, odds ratio) were randomly flipped with a probability of 0.5. The A1 and A2 alleles were randomly shuffled with a probability of 0.5 (i.e., coin flip) to produce a null distribution of polygenic scores and calculate random Qst values.
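The sign-randomization null can be illustrated with a short sketch. Everything below is an assumption made for illustration: the function names are hypothetical, the inputs are population-level effect allele frequencies rather than individual genotypes, and the test statistic is the variance of the population mean scores, used as a simple stand-in for Qst (the within-population term is omitted).

```python
import random
from statistics import fmean, pstdev, pvariance


def mean_scores(betas, freqs_by_pop):
    """Population mean polygenic score: 2 * sum(beta * effect allele frequency)."""
    return [
        2 * sum(b * f for b, f in zip(betas, freqs)) for freqs in freqs_by_pop
    ]


def between_pop_variance(betas, freqs_by_pop):
    """Variance of the population mean scores (stand-in test statistic)."""
    return pvariance(mean_scores(betas, freqs_by_pop))


def sign_flip_test(betas, freqs_by_pop, n_perm=1000, seed=0):
    """Flip the sign of each effect size with probability 0.5 to build a null."""
    rng = random.Random(seed)
    observed = between_pop_variance(betas, freqs_by_pop)
    null = []
    for _ in range(n_perm):
        flipped = [b if rng.random() < 0.5 else -b for b in betas]
        null.append(between_pop_variance(flipped, freqs_by_pop))
    z = (observed - fmean(null)) / pstdev(null)
    # Empirical one-sided p-value with the usual +1 correction.
    p = (1 + sum(v >= observed for v in null)) / (n_perm + 1)
    return observed, z, p


# Toy data: 50 SNPs whose effect alleles are all more frequent in population 1.
toy_betas = [0.1] * 50
toy_freqs = [[0.8] * 50, [0.2] * 50]
observed, z_score, p_value = sign_flip_test(toy_betas, toy_freqs)
```

Because the toy effect alleles are all shifted in the same direction, the observed statistic lies far in the upper tail of the sign-flipped distribution, which is the pattern the test is designed to detect.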
Another measure of over-dispersion of phenotypes (or polygenic scores) closely related to Qst, Qx, will be calculated using the formula provided by Berg and Coop (2014). Qx will be much smaller than 1 for traits under stabilizing selection with the same optimum across populations, whereas diversifying selection will produce values larger than 1. P-values for the Qx statistic were computed using a randomization procedure based on randomizing the sign of the effect size estimates of the GWAS SNPs as done in Refoyo-Martinez et al. (2021).
For the Qst test, the Fst enrichment test and Qx, control variants were matched to SNP variants using vSampler (Huang et al., 2021). The effect of LD decay on mean population polygenic scores will be tested using the method described by Piffer (2021).
To investigate the variation in linkage disequilibrium (LD) patterns across populations, the SNPs were inputted into LDlink (Machiela and Chanock, 2015). Variants within a +/-500 Kb window of the query variant that had a pairwise R2 value greater than 0.01 were downloaded, using CEU (Utah residents with Northern and Western European ancestry), YRI (Yoruba in Ibadan, Nigeria), and JPT (Japanese in Tokyo, Japan) as reference populations.
The pairwise R2 values between the GWAS variant and the linked variants were then computed for CEU, YRI and JPT, and the correlation coefficient was used as a measure of differential LD decay across these populations compared to the query variant. A higher correlation between the CEU and YRI (or JPT) R2 values indicated a lower level of trans-ethnic LD decay. Genetic value scores (GVS) for CEU and YRI (or JPT) were calculated for each GWAS SNP by multiplying the frequency of the effect allele by the GWAS effect size. Other populations of interest could also be used to calculate genetic value scores in a similar manner.
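A minimal sketch of these two quantities for a single GWAS SNP is given below. The R2 values, allele frequencies and effect size are invented for illustration, and the function names are hypothetical; in practice the R2 profiles would come from the LDlink downloads described above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def genetic_value_score(effect_allele_freq, beta):
    """GVS: effect allele frequency weighted by the GWAS effect size."""
    return effect_allele_freq * beta


# Hypothetical R2 values between one query SNP and its linked variants,
# listed in the same variant order for both reference populations.
r2_ceu = [0.95, 0.60, 0.30, 0.05]
r2_yri = [0.80, 0.35, 0.10, 0.02]

# Higher correlation between the two profiles = less trans-ethnic LD decay.
ld_similarity = pearson_r(r2_ceu, r2_yri)

# Hypothetical effect allele frequencies (CEU 0.60, YRI 0.35) and beta (0.02).
gvs_diff = genetic_value_score(0.60, 0.02) - genetic_value_score(0.35, 0.02)
```

Repeating this for every GWAS SNP yields one LD-similarity value and one GVS difference per SNP, and these two vectors can then be correlated with the same `pearson_r` helper.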
To compute the correlations between polygenic scores and population IQ, we merged HGDP, 1KG and gnomAD datasets and when there were overlapping populations, the larger sample was retained. For example, the ASW and FIN in 1KG (N = 113) were replaced with the African American (N = 20,744) and Finnish (N = 5,316) gnomAD samples. The resulting dataset comprised 72 populations.
The data sources used for population average IQ and national average height were as follows: Lynn and Vanhanen (2012) for IQ data and NCD-RisC (2020) for height data; [...] (2014) for Palestine, Lynn (2010) for Sardinia, and Shibaev & Lynn (2017) for the Yakut population.

ANOVA
Polygenic scores were calculated for individuals in the four 1KG super-populations.
One-way ANOVA was run using the GWAS summary statistics for EA3, EA4, SCZ and Height 2022. Results revealed statistically significant differences between the group-level polygenic scores (Table 1), which suggest there are genetic differences between the populations studied that are associated with the traits examined.
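For reference, the F statistic of a one-way ANOVA can be computed as in the sketch below. The function name and the toy scores are hypothetical, and the p-value, which requires the F distribution, is left to a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Sum of squares between groups, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Sum of squares within groups.
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means)
    )

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy polygenic scores for three hypothetical groups.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

In the toy data the between-group mean square is 13 and the within-group mean square is 1, so F is 13 on 2 and 6 degrees of freedom.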

Pairwise differences
The results of Fst analysis for two different population pairs (EUR-EAS and EUR-AFR) are presented in Table 3. The Fst values ranged from 0.081 to 0.149, indicating moderate genetic differentiation among populations.
To assess the significance of the Fst values, we compared them with the random Fst values generated by permutation tests. The Z score (Fstq/Fst) and p-values are also reported in Table 3.
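The comparison against matched random SNP sets can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used here: per-SNP Fst is computed with Nei's heterozygosity-based formula for two equally weighted populations, the Z score is taken as the distance of the observed mean Fst from the mean of the random sets in units of their standard deviation, and the function names and allele frequencies are hypothetical.

```python
from statistics import fmean, pstdev


def fst_two_pops(p1, p2):
    """Fst = (Ht - Hs) / Ht for one biallelic SNP and two populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return 0.0 if h_total == 0 else (h_total - h_within) / h_total


def mean_fst(freq_pairs):
    """Average per-SNP Fst over a set of (freq_pop1, freq_pop2) pairs."""
    return fmean([fst_two_pops(p1, p2) for p1, p2 in freq_pairs])


def enrichment_z(gwas_pairs, matched_sets):
    """Observed mean Fst and its Z score against matched random SNP sets."""
    observed = mean_fst(gwas_pairs)
    null = [mean_fst(s) for s in matched_sets]
    return observed, (observed - fmean(null)) / pstdev(null)


# Toy allele frequencies: GWAS SNPs are more differentiated than controls.
gwas = [(0.8, 0.2)] * 3
controls = [
    [(0.5, 0.5), (0.6, 0.4)],
    [(0.5, 0.5), (0.5, 0.5)],
    [(0.6, 0.4), (0.6, 0.4)],
]
obs_fst, z_fst = enrichment_z(gwas, controls)
```

A real analysis would use many more matched sets (for example, those produced by vSampler) so that the null mean and standard deviation are estimated stably.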
For the EUR-EAS population pair, the Fst values for the polygenic scores EA3 and EA4 (LD filter 0.1) were 0.094 and 0.081, respectively. The corresponding random Fst values were 0.087 and 0.084, and the Z scores (Fst/Fst random) were 2.87 (p = 0.003) and -1.79 (p = 0.55), respectively. These results suggest significant genetic differentiation between the EUR and EAS populations for the EA3 score but not for the EA4 score.
Pairwise Qst values were calculated for two population pairs: EUR-EAS and EUR-AFR. The Qst values were computed using both real and shuffled beta weights. These values are reported in Table 4.
For the EUR-EAS population pair, the Qst value for the EA3 PGS was 0.044 based on real beta weights, and 0.05 based on shuffled beta weights. The Z score comparing these values was -0.1 with a p-value of 0.36. The Qst value for the EA4 PGS was 0.456 based on real beta weights, and 0.045 based on shuffled beta weights. The Z score comparing these values was 7.6 with a p-value of 0.001.
The Qst value for the height PGS was 0.192 based on real beta weights, and 0.057 based on shuffled beta weights. The Z score comparing these values was 1.92 with a p-value of 0.059.
For the EUR-AFR population pair, the Qst value for schizophrenia was 0.568 based on real beta weights, and 0.088 based on shuffled beta weights. The Z score comparing these values was 4.73 with a p-value of 0.001. The Qst value for the EA3 PGS was 0.619 based on real beta weights, and 0.062 based on shuffled beta weights. The Z score comparing these values was 6.83 with a p-value of 0.001. The Qst value for the EA4 PGS was 0.942 based on real beta weights, and 0.057 based on shuffled beta weights. The Z score comparing these values was 12.58 with a p-value of 0.001.
The Qst value for the height PGS was 0.013 based on real beta weights, and 0.076 based on shuffled beta weights. The Z score comparing these values was -0.67 with a p-value of 0.698.
Overall, these results provide evidence of divergent selection for some polygenic scores, particularly for EA3 and SCZ in the EUR-AFR pair and for EA4 in both the EUR-EAS and EUR-AFR pairs. [...] ranging from 0.72 to 0.75. The GVS ("genetic value score" or weighted allele frequency) was computed for each population by multiplying the effect allele frequency by the GWAS beta, and the GVS difference between CEU and YRI or CEU and JPT was then obtained. The correlation between the GVS difference and the amount of LD decay is reported in Table 6. The results are visualized in Figure 3.

Reliability of population-level polygenic scores
The results show that LD decay did not significantly impact the trans-ethnic polygenic score difference for most of the polygenic scores and population pairs. However, for EA4 in both the EUR-AFR and EUR-EAS pairs and for height in the EUR-EAS pair, we observed a significant negative correlation between the GVS difference and the amount of LD decay. A negative correlation between GVS difference and lack of LD decay implies that LD decay is inflating the European PGS relative to the other population, as SNPs with lower LD decay have smaller PGS differences.
These findings suggest that the impact of LD decay on trans-ethnic polygenic score differences may vary across different polygenic scores and population pairs.

Selecting low LDD SNPs
To select SNPs with low LD decay for the EUR-AFR and EUR-EAS pairs, a threshold of r = 0.8 was chosen and applied separately to each population pair. Because LD decay patterns vary across population pairs, different SNPs will belong to the low LD group in each pair of populations. Hence retaining a single set of SNPs (corresponding to the intersection of the different sets) would result in a much smaller number of SNPs, reducing reliability. Cohen's d values for the group differences in polygenic scores were compared to those for the full set of SNPs and are reported in Table 7.
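The filtering and effect-size steps can be sketched as below. The SNP labels, correlation values and scores are invented for illustration; whether the threshold is inclusive is an assumption, and Cohen's d is computed here with the pooled sample standard deviation, which the text does not specify.

```python
from math import sqrt
from statistics import fmean, variance


def low_ldd_snps(ld_similarity, threshold=0.8):
    """Keep SNPs whose cross-population R2 correlation meets the threshold."""
    return [snp for snp, r in ld_similarity.items() if r >= threshold]


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(a), len(b)
    pooled_var = (
        (n_a - 1) * variance(a) + (n_b - 1) * variance(b)
    ) / (n_a + n_b - 2)
    return (fmean(a) - fmean(b)) / sqrt(pooled_var)


# Hypothetical per-SNP LD similarity for one population pair.
kept = low_ldd_snps({"snp_1": 0.90, "snp_2": 0.50, "snp_3": 0.80})

# Hypothetical individual polygenic scores for two groups.
d_value = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
```

Running the filter separately for each population pair, as described above, means that `kept` will generally contain a different SNP set for EUR-AFR than for EUR-EAS.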

Qx test
The Qx test (Berg and Coop, 2014) was carried out on EA3, SCZ, the sibship EA and height PGS. EA4 was omitted from the analysis because it was found to be strongly biased by differential LD-decay.

Correlation with phenotypic means
The correlations between the average cognitive/educational polygenic scores and average population IQ were 0.87 and 0.78 for EA3 and EA4, respectively (Figures 5 and 6).
Both EA3 and EA4 were correlated with absolute latitude (r = 0.64 and 0.60, respectively). A multiple linear regression was performed with average height as the dependent variable and the height PGS, EA3, and latitude as predictors. The standardized betas were 0.62 and 0.44 for the height and EA3 PGS, respectively (Table 10). All continuous predictors were mean-centered and scaled by one standard deviation.
These findings indicate that both the height and EA3 PGS are valid predictors of average height.

Partial PGS
Partial polygenic scores were computed for the three local ancestry components (Amerindian, African, European) of the Admixed American/Latino population in gnomAD. This ethnic group is extremely heterogeneous, consisting of 5% of individuals who derive their genetic ancestry primarily from a single continental population, 60% from two continental populations, and 35% with three continental populations well-represented within their genome. The allele frequencies for the three local ancestry groups were made available by gnomAD in a VCF file. (https://gnomad.broadinstitute.org/news/2021-12-local-ancestry-inference-for-latino-admixed-american-samples-ingnomad/).
The partial and full PGS are very similar (Figure 10). The partial AFR PGS is lower than the full PGS because the latter is computed using the African/African American sample in the gnomAD dataset, which is mixed with Europeans, whereas the local ancestry is "purely" African.

Discussion
Traditional tests of population genetic differentiation based on individual loci (Fst enrichment test) offered mixed evidence for over-differentiation in allele frequencies (Tables 2 and 3, Figure 2). For EA3, the global test comprising four superpopulations attained significance only without LD clumping and with LD clumping at a threshold of R² = 0.1; the results became non-significant with a stricter LD threshold of 0.01. Conversely, there was evidence of population under-differentiation for EA4, because the GWAS Fst values were significantly lower than the average Fst of the random SNPs. On the other hand, SCZ and height had significantly higher Fst than the average Fst of random SNPs.
Tests of polygenic score differentiation such as Qst, in contrast, yielded significant results for the cognitive traits but not for height (Figure 1; Tables 2 and 4). Qst values of polygenic scores were significantly higher than those obtained from reshuffling the effect alleles.
Qst expresses the additive genetic variance between populations as a proportion of the total additive genetic variance. Qst values ranged from 0.12 for height to 0.58 for EA3 and 0.91 for EA4, indicating that a substantial proportion of variation in polygenic scores is found between populations. More importantly, Fst underestimates the amount of phenotypic differentiation due to additive genetic effects, because it is a single-gene test that does not take into account the covariance of allelic effects between populations, which can cause large differences in phenotypic means even with low Fst values. Kremer and Le Corre (2012) showed that the genetic differentiation at the level of individual loci (Fst) does not necessarily correspond to the genetic differentiation underlying phenotypic traits (Qst). This is because Qst considers the additive genetic variance between populations, while Fst only measures the allele frequency differences.
Consequently, Bird's assumption (Bird, 2021) that the Fst value estimated from GWAS-identified SNPs should equal the phenotypic variance if all between-group variation is due to additive genetic effects is theoretically flawed. His oversight when accounting for cross-population LD leads him to equate phenotypic (IQ) group differences with Fst, and to conclude that genetic differentiation cannot explain between-group variance in IQ scores because the Fst value is much lower than the phenotypic Fst (Qst) calculated using his equation 2.
Bird calculated phenotypic Fst values ranging from 0.51 to 0.6, based on an estimated EUR-AFR difference of 30.8 IQ points and h² = 0.35 or 0.5, and observed that these are much higher than the actual EUR-AFR Fst (0.11), which would instead translate to a EUR-AFR difference of 4.7 to 8.5 IQ points.
In fact, as noted in the Introduction, Qst (erroneously named "phenotypic Fst" by Bird) is often much higher than Fst, as shown by both mathematical modeling (Kremer and Le Corre, 2012) and empirical results (Berg and Coop, 2014). The equivalence between Qst and Fst (Qst = Fst) is expected under neutrality, and higher values of Qst (Qst > Fst) indicate divergent selection (Leinonen et al., 2013).
Bird's failure to acknowledge the difference between Qst and Fst leads him to expect Qst = Fst and to discard deviations from this equivalence as due to environmental factors or erroneous estimates of average IQ (Bird, 2021).
We derived a Qst value of 0.61 for EA3 concerning the EUR-AFR difference, which aligns with Bird's estimate of 0.6 for Qst derived from phenotypic IQ (misinterpreted as Fst by Bird). Indeed, Pst ("pseudo Qst", or the phenotypic equivalent of Qst) equals Qst when environmental variance is zero (Saether et al., 2007).
Divergent selection often occurs in two phases: initially capturing advantageous allelic associations at various loci in distinct populations, followed by targeting changes in allelic frequencies. This supports the idea that allelic associations contribute to rapid genetic divergence between populations more effectively than changes in allelic frequencies. The disparity between Qst and Fst becomes more pronounced in traits governed by a large number of loci experiencing strong divergent selection, and this effect is expected to be significant for traits such as educational attainment, schizophrenia, and height. This in turn reinforces the findings of Berg and Coop (2014) that the power to detect population differentiation in polygenic scores stems almost entirely from the LD-like component, and the differentiation at the individual loci (i.e., Fst) has very little impact. Indeed, the Qst values were much higher than the Fst values for the neutral alleles, with Qst/Fst ratios of 6, 10 and 4 for EA3, EA4 and SCZ, respectively (Table 2). Phenotypic traits with Qst significantly larger than the Fst estimated from neutral markers are considered as being under local adaptation, whereas Qst = Fst is the expectation under neutrality (Leinonen et al., 2013). Moreover, Qst was much higher than Fst estimated from GWAS SNPs. This "decoupling" is caused by the allelic covariance component (Berg and Coop, 2014).
In fact, the polygenic selection test carried out by Bird (2021) [...] family effect sizes failed to reach statistical significance. However, this is likely due to the small sample size of the within-family GWAS, which is much smaller than that of the population GWAS (N = 55K vs. 1 and 3 million for EA3 and EA4, respectively).
Remarkably, none of the within-family SNPs reached statistical significance (after correction for multiple testing), and only 15 SNPs passed the p < 5 × 10^-6 filter after clumping with LD < 0.1. The null effect of within-family SNPs was evident both from the Fst enrichment test (Table 2) and the tests of polygenic score overdispersion such as Qst and Qx (Tables 4 and 5, respectively). This lack of validity was corroborated by the negative Cronbach's alpha values (Table 6).
However, the significant overdispersion of education-related polygenic scores derived from the traditional between-family GWAS (Lee et al., 2018) was confirmed by the Qx test, which achieved values much higher than the null expectation (Table 5). The Qx values for the height PGS also exceeded random expectations, though only barely (p = 0.035).
SCZ had Qst values significantly higher than chance expectation (Qst = 0.57), but this was restricted to the EUR-AFR difference, with no differentiation between EUR-EAS (Table 4). This replicates earlier findings of a strong association between schizophrenia PGS and African ancestry (Curtis, 2018). The mean EUR-AFR difference was 10 times as high as the mean difference between European schizophrenia cases and controls.
Although there were cross-population differences in LD patterns, they did not significantly affect most of the polygenic score differences (Table 7). Nonetheless, LD decay did cause the European mean to be inflated compared to Africans and East Asians (Table 8). This was evident in the significant negative correlation between the GVS difference and the amount of LD decay (Figure 4). In all cases except for the height EUR-AFR difference, the bias due to varying LD patterns favored the European population. This bias results from the frequency distribution of non-causal SNPs. Exploring the origins of this bias is beyond the scope of this study and is left to future research.
The partial polygenic scores calculated using the admixed Latino population revealed a similar pattern to those computed using relatively admixed individuals from gnomAD and 1KG (Figure 10). The low score obtained by the Amerindian genetic component replicated earlier results by Piffer (2013), who observed a discrepancy between the relatively low genetic distance of Native Americans from East Asians and the large gap in polygenic scores. This finding is supported by the results of admixture analyses of different American ethnic groups, which found that Amerindian ancestry is about as negatively associated with general cognitive ability as African ancestry among African, Hispanic, and other American subsamples (Fuerst, Hu and Connor, 2021).
There was a strong correlation between the new polygenic score for educational attainment (EA4) and the old (EA3).
Remarkably, a new polygenic score of school grades showing strong genetic correlations with educational attainment (rg = 0.90) and intelligence (rg = 0.80) (Rajagopal et al., 2023) was highly correlated with EA3 and EA4 (r = 0.91 and 0.76, respectively), replicating the cross-population validity of education PGS (Piffer, 2021).
The EA3 and EA4 PGS were both correlated with latitude at r = 0.6. A regression model showed that both EA3 and latitude were significant predictors of average IQ (Table 11). This suggests that higher latitude may confer an advantage in cognitive performance via environmental factors, such as limiting the detrimental effects of heat (Piil et al., 2020).
On the other hand, both EA3 and the height PGS predicted average height (Table 10). This suggests that cognitive abilities have an impact on average height by improving economic conditions.
Finally, we introduced Cronbach's alpha (a measure borrowed from psychometrics) to assess the reliability of population polygenic scores. In psychometrics, the items of a test are supposed to gauge the same underlying construct (like anxiety, depression, intelligence, and so on). If the test is reliable, then we would expect all the items on the test to correlate highly with each other, since they all aim to measure the same thing. Cronbach's alpha quantifies the degree of intercorrelation among test items. It typically ranges from 0 to 1. A higher Cronbach's alpha (generally, above 0.7) indicates good internal consistency, meaning the items on the test are all measuring the same underlying construct.
When applied to population-level polygenic scores, the strength of the coefficient depends on the magnitude of cross-population LD ("covariance of allelic effects") and the number of SNPs. However, instead of the underlying construct, it is the divergent selection pressure that causes the inter-correlation between the items (i.e. the frequency of the GWAS effect allele weighted by the effect size).
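As a minimal sketch of how such a coefficient can be computed, the snippet below applies the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score), to a matrix in which rows are populations and columns are SNP "items" (effect allele frequency weighted by effect size). The function names, the frequencies, and the effect sizes are hypothetical and chosen only for illustration; they are not the data or the code used in this study.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def cronbach_alpha(items):
    """Cronbach's alpha for a table of item scores.

    items: list of rows, one row per population, each row holding
    k item scores (here, one weighted allele frequency per SNP).
    """
    k = len(items[0])
    if len(items) < 2 or k < 2:
        raise ValueError("need at least two populations and two items")
    # Variance of each item (SNP) across populations.
    item_vars = [variance([row[j] for row in items]) for j in range(k)]
    # Variance across populations of the summed score (the polygenic score).
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def weighted_items(freqs, betas):
    """Weight each effect-allele frequency by its GWAS effect size."""
    return [[b * p for b, p in zip(betas, row)] for row in freqs]


# Hypothetical data: 4 populations (rows) x 3 SNPs (columns).
freqs = [
    [0.60, 0.55, 0.70],
    [0.50, 0.45, 0.62],
    [0.40, 0.30, 0.50],
    [0.35, 0.33, 0.41],
]
betas = [0.02, 0.03, 0.01]  # hypothetical effect sizes

alpha = cronbach_alpha(weighted_items(freqs, betas))
print(round(alpha, 3))
```

In this toy input the three SNPs shift in the same direction across populations, so the variance of the summed score exceeds the sum of the per-SNP variances and alpha is positive. If the SNPs varied independently across populations, the two quantities would be about equal and alpha would fall toward zero.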
In summary, this study investigated genetic differentiation in various traits, such as educational attainment (EA3 and EA4), height, and schizophrenia, using traditional Fst enrichment tests and polygenic score differentiation tests such as Qst. The results revealed mixed evidence for over-differentiation in allele frequencies using Fst tests, while Qst tests yielded significant results for cognitive traits but not for height. The study also highlighted that Fst underestimates the amount of phenotypic differentiation due to additive genetic effects, as it does not account for the covariance of allelic effects between populations. This finding calls into question Bird's (2021) assumption that Fst should equal the phenotypic variance if all between-group variation is due to additive genetic effects.
The study's findings emphasize the importance of considering both Fst and Qst values in assessing population genetic differentiation, as well as the need to account for the covariance of allelic effects between populations when interpreting results. The results also demonstrate that allelic associations contribute to rapid genetic divergence between populations more effectively than changes in allele frequencies. This phenomenon is particularly pronounced in traits governed by a large number of loci experiencing strong divergent selection, such as educational attainment, schizophrenia, and height.
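The role of the covariance of allelic effects can be illustrated with a toy calculation. The sketch below is not the procedure used in this study; it assumes two equally sized populations, diploid additive scores, linkage equilibrium within populations, unit effect sizes, and the convention Qst = Vb / (Vb + 2 Vw), where Vb is the variance of population mean scores and Vw is the mean within-population additive variance. All function names and parameter values are hypothetical.

```python
def fst_single_locus(freqs):
    """Wright's Fst at one locus: var(p) / (p_bar * (1 - p_bar))."""
    n = len(freqs)
    p_bar = sum(freqs) / n
    var_p = sum((p - p_bar) ** 2 for p in freqs) / n
    return var_p / (p_bar * (1 - p_bar))


def qst_additive(freq_matrix, effects):
    """Qst of an additive score; freq_matrix[pop][locus] holds frequencies."""
    n = len(freq_matrix)
    # Mean diploid score of each population: sum of 2 * beta * p.
    means = [sum(2 * b * p for b, p in zip(effects, row)) for row in freq_matrix]
    grand = sum(means) / n
    v_between = sum((m - grand) ** 2 for m in means) / n
    # Within-population additive variance under linkage equilibrium.
    v_within = sum(
        sum(2 * b * b * p * (1 - p) for b, p in zip(effects, row))
        for row in freq_matrix
    ) / n
    return v_between / (v_between + 2 * v_within)


d = 0.05   # hypothetical frequency shift around 0.5
L = 100    # hypothetical number of loci
hi, lo = 0.5 + d, 0.5 - d

fst = fst_single_locus([hi, lo])                      # same at every locus
qst_one = qst_additive([[hi], [lo]], [1.0])           # single locus
qst_aligned = qst_additive([[hi] * L, [lo] * L], [1.0] * L)
qst_opposed = qst_additive([[hi, lo] * (L // 2), [lo, hi] * (L // 2)], [1.0] * L)

print(fst, qst_one, qst_aligned, qst_opposed)
```

Under these assumptions, per-locus Fst is identical in all three scenarios. With a single locus Qst equals Fst; when all loci shift in the same direction (positive covariance of allelic effects) Qst is far larger than Fst; and when shifts alternate in sign the population means coincide and Qst drops to zero.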
The results of the selection tests are greatly affected by the Genome-Wide Association Studies (GWAS) used to derive polygenic scores. This influence can be discerned from the disparities observed when comparing different versions of these studies, such as between EA3 and EA4, or when contrasting GWAS based on sibling data versus those relying on broader population data.
However, it is currently unfeasible to account for all possible sources of bias and inaccuracies when estimating these polygenic scores on a population-wide level. For instance, potential sources of bias might stem from the lack of representation of diverse populations in the GWAS databases, which primarily contain data from people of European ancestry. Another source of error can be the complex nature of many traits that are influenced by a multitude of genes interacting in ways that we do not fully understand yet.
Moreover, population-based GWAS results are confounded by population stratification, assortative mating and indirect genetic effects. Within-family genetic association estimates are relatively free from these sources of bias, but the studies published so far rely on small sample sizes that lack the power to detect meaningful associations. For example, the sibship EA and height GWAS relied on sample sizes of 150K and 129K individuals, respectively (Howe et al., 2022), much smaller than the population-based GWAS sample sizes of 3 and 5 million individuals (Okbay et al., 2022; Yengo et al., 2022). This results in few or no GWAS-significant SNPs, and the lack of GWAS-significant SNPs affects between-population genetic estimates more strongly than within-population genomic prediction.
Therefore, the findings drawn from these tests should be viewed as provisional and subject to revision, because new GWAS incorporating more diverse population samples and more advanced methodologies will continue to be conducted. As we refine these techniques and broaden the scope of our research, our understanding of polygenic scores and their implications will evolve, and this will likely change the outcomes of the selection tests.