Identifying and Interpreting Apparent Neanderthal Ancestry in African Individuals

Admixture has played a prominent role in shaping patterns of human genomic variation, including gene flow with now-extinct hominins like Neanderthals and Denisovans. Here, we describe a novel probabilistic method called IBDmix to identify introgressed hominin sequences, which, unlike existing approaches, does not use a modern reference population. We applied IBDmix to 2,504 individuals from geographically diverse populations to identify and analyze Neanderthal sequences segregating in modern humans. Strikingly, we find that African individuals carry a stronger signal of Neanderthal ancestry than previously thought. We show that this can be explained by genuine Neanderthal ancestry due to migrations back to Africa, predominately from ancestral Europeans, and gene flow into Neanderthals from an early dispersing group of humans out of Africa. Our results refine our understanding of Neanderthal ancestry in African and non-African populations and demonstrate that remnants of Neanderthal genomes survive in every modern human population studied to date.

Despite the methodological progress that has been made to identify introgressed hominin sequence, opportunities for further development of statistical tools abound and may result in novel insights. For example, a recent extension of the S∗ framework revealed two waves of Denisovan admixture in East Asian populations that were not previously detectable (). To this end, we describe a novel method for detecting Neanderthal ancestry in modern humans that does not require an unadmixed reference human panel, which we refer to as IBDmix. We apply IBDmix to genotype data from a large set of modern human individuals from Eurasia, America, and Africa. We make novel discoveries regarding Neanderthal ancestry in Africans and re-examine the relative levels of Neanderthal ancestry in Eurasian populations. We also replicate, extend, and discover new instances of adaptive introgression that may offer insight into human evolution and phenotypic variation in modern humans.

Moreover, a consistent observation in all studies of archaic hominin admixture is that East Asian populations have approximately 20% more Neanderthal ancestry compared to Europeans (). Numerous models have been invoked to explain this difference, including the interaction of demography and selection (), dilution by non-admixed populations (), or additional population-specific admixture events (). Accurately determining variation in Neanderthal ancestry among non-African populations has important implications for refining our understanding of admixture between modern human ancestors and Neanderthals.

The ability to identify introgressed hominin sequence in the genomes of modern humans enables inferences about the functional, evolutionary, and phenotypic significance of archaic admixture. For example, the genomic distribution of surviving Neanderthal and Denisovan lineages has been influenced by purifying selection (), which has purged introgressed sequence that was deleterious in modern humans. Indeed, some exceptionally large regions depleted of archaic ancestry (also referred to as “archaic deserts”) have been identified and may be due to selection (). There is also strong evidence that some Neanderthal and Denisovan sequences were beneficial () and were rapidly driven to high frequency in modern human populations by a process known as adaptive introgression (). In general, however, the functional impacts of introgressed sequences, how they have been shaped by selection, and how they have influenced modern human health and disease are only beginning to be explored.

A haplotype at STAT2 Introgressed from neanderthals and serves as a candidate of positive selection in Papua New Guinea.

Studies of ancient DNA are transforming our understanding of human evolutionary history and, in particular, how admixture has shaped past and present patterns of human genomic variation (). Of particular interest has been the discovery that admixture with archaic hominins occurred multiple times throughout human history (). In particular, approximately 2% of all non-African ancestry is derived from Neanderthals (), with Oceanic populations having an additional 2%–4% of ancestry attributable to gene flow with Denisovans ().

Previous analyses have identified large (>10 Mb) autosomal regions of the genome that are significantly depleted of Neanderthal ancestry in all non-African populations (). These large “deserts” of archaic introgressed sequence appear at frequencies greater than expected under neutral models. We analyzed our IBDmix call set to see if we could replicate previous findings or determine if deserts were a function of previous methodological biases. Following previously described methods to identify archaic deserts, we analyzed our IBDmix callset from both African and non-African samples ( STAR Methods ). We replicated 4 of the 6 previously reported deserts of Neanderthal sequence, including the deserts that contain FOXP2 (chr7) and ROBO1 and ROBO2 (chr3) ( Table S8 Figure S4 ). Moreover, the four replicated deserts are the same regions previously shown to also be significantly depleted of Denisovan ancestry. Thus, depletions of archaic ancestry seem to be a general feature of the data and are not likely due to methodological issues in identifying introgressed sequence. It is noteworthy that including all African samples, a subset (YRI), or none does not dramatically change the distribution of the frequencies of large deserts. This is consistent with the observation that the African Neanderthal sequence is predominantly a subset of non-African segments.

Of the 38 non-African-specific high-frequency Neanderthal haplotypes we identified, 19 were previously reported by, including well-known targets of adaptive introgression like WDR88, POU2F3, and TLR1/6/10 ( Figure 6 A and 6B ). Intriguingly, we also identified 31 high-frequency haplotypes shared by Africans and Europeans, including TRIM55 ( Figure 6 C; Table S7 ). These haplotypes would have been undetected in previous methods that relied on unadmixed reference human panels. Furthermore, we were for the first time able to detect African-specific high-frequency Neanderthal haplotypes ( Figure 6 D; Table S7 ). The 13 African-specific high-frequency Neanderthal haplotypes we identified show enrichment for genes involved in immunological function (e.g., IL22RA1 and IFNLR1) and ultraviolet-radiation sensitivity (e.g., DDB1 and IL22RA1) (). While some high frequency Neanderthal-like variants in Africans may derive from human-to-Neanderthal gene flow, only one of the high-frequency haplotypes shared by Africans and Europeans (chr3:89,587,868–90,134,709) overlaps a locus previously identified as introgressed from modern humans into the Altai Neanderthal (), and none of our detected African-specific high-frequency haplotypes do. These novel findings provide insight into the evolutionary history of these populations, the selective pressures they faced, and current variation in health and disease.

(C) An example of a high-frequency Neanderthal segment shared between Europeans and Africans at TRIM55. This haplotype, identified by IBDmix, is missed by methods that mask sequence shared by African and non-African populations.

In all plots, each row is an individual and is organized by population. Neanderthal segments called by IBDmix are plotted in dark green (EAS), orange (EUR), or purple (AFR). GWAS SNPs are shown as purple triangles and populations-specific high-frequency-derived alleles (DAF > 40%) that match the Altai reference genome are shown as red circles. In (A) and (B), examples of high-frequency introgressed segments detected in East Asian and European populations are shown for the POU2F3 and the TLR1/6/10 cluster.

Specifically, for variants that intersected identified Neanderthal segments, we calculated the differences in the derived allele frequencies between Europeans and East Asians, Africans and Europeans, and Africans and East Asians. We then took an outlier approach to identify loci with allele frequency differences in the 99percentile. We further filtered on loci where the derived allele matched the Neanderthal allele. Overall, we identified 38 non-African-specific high-frequency haplotypes and 13 African-specific high-frequency haplotypes ( Table S7 ). We compared these identified high-frequency haplotypes with previously identified high-frequency haplotypes () and the presence of previously reported GWAS SNPs.

Admixture with Neanderthals may have provided a mechanism for modern humans to acquire novel adaptive variation. Previous analyses have reported population-specific high-frequency introgressed Neanderthal haplotypes, which may be instances of adaptive introgression () or the reintroduction of alleles lost in the modern human lineage (). We examined our IBDmix callset for similar findings. We leveraged population-level derived allele frequencies of variants that overlapped calls made by IBDmix and matched the Neanderthal allele, in order to detect Neanderthal haplotypes with unusually large differences in frequency between populations.

We also examined how the reference panel size for S∗ affects Neanderthal ancestry estimates by bootstrap resampling the Yoruba samples in 1000 Genomes Project data (n = 108) and reanalyzing chromosome 1 for Europeans and East Asians ( Figure 5 C). We generated multiple reference panels based on different sample sizes and re-called Neanderthal sequence for European and East Asian individuals using the S∗-pipeline and the new reference panels. We compared the total S∗-sequence called for each sample to the average amount of S∗-sequence called for samples using a reference panel of 1 individual. Increasing the reference panel size showed a significant reduction (p < 2 × 10) in the amount of Neanderthal sequence called per individual. In addition, when comparing the amounts of Neanderthal sequence identified in Europeans and East Asians, increasing the reference panel size decreased the amount detected for both populations, but there was a greater loss in Europeans than in East Asians. Using a reference sample larger than 10 led to an apparent 20% enrichment of Neanderthal ancestry in East Asians compared to Europeans, as previously reported. Simulations of European to African back-migration using rates consistent with standard demographic models also generate a significant enrichment of Neanderthal ancestry in East Asians compared to Europeans when the data are analyzed with S∗, so long as back-migration occurs after the split of European and East Asian lineages (p < 8 × 10 Figure S3 ). Collectively, these results show that Neanderthal ancestry estimates in East Asians and Europeans were biased due to unaccounted for back-migrations from European ancestors into Africans.

Back-migration from ancestral Eurasians (left) reduces the amount of Neanderthal sequence recovered by S∗, but does not produce the apparent enrichment in East Asians when compared to Europeans, as seen in migration from ancestral Europeans (right). IBDmix is robust to both the rate and timing of migration. The level of Neanderthal ancestry is reported as an average for the population with the corresponding 95% confidence interval.

In the IBDmix callset for Africans, Europeans, and East Asians, there is a large enrichment of Neanderthal sequence shared exclusively between Africans and Europeans compared with the sequence shared exclusively between Africans and East Asians ( Figure 5 B). As a proportion of the total amount of Neanderthal sequence for each population, 7.2% of European sequence is shared exclusively with Africans, which is substantially higher than the 2% of East Asian sequence shared exclusively with Africans ( Figure 5 B). The disproportionate level of sharing between Africans and Europeans is consistent even after down-sampling the recovered Neanderthal segments in Europeans to match the total coverage of Neanderthal sequence in East Asians ( STAR Methods ). This imbalance in the proportion of exclusively shared sequence between African and non-African populations directly contributes to the biased Neanderthal ancestry estimates in previous methods that use an African reference panel.

Previous methods that have relied on unadmixed modern reference populations, like S∗, have reported >20% enrichment of Neanderthal sequence in East Asians compared to Europeans ( Figure 5 A). However, results from IBDmix show only 8% enrichment of Neanderthal sequence in East Asians compared to Europeans ( Figure 5 A). This level of enrichment is robust to changes in the segment size cutoff (30 kb, 40 kb, 50 kb) used for IBDmix calling ( Table S5 ). To better understand the discrepancy between IBDmix and previous inferences, we first removed Neanderthal sequence called by IBDmix in Europeans and East Asians that was shared with Africans (YRI) and replicated an 18% enrichment of Neanderthal ancestry in East Asians compared to Europeans ( Figure 5 A). This result shows that our observation of similar levels of Neanderthal ancestry in Europeans and East Asians is due to no longer masking Neanderthal sequence shared with Africans.

(C) Violin plot showing the decreasing amount of Neanderthal sequence identified in East Asian and European individuals by S∗ with increasing African reference-panel size.

(B) Venn diagram illustrating the amount of sequence shared among Africans and non-Africans. The bar plot shows the amount of exclusively shared sequence between Africans and non-Africans as a proportion of the total amount of sequence for each population.

(A) Violin plots showing enrichment of Neanderthal ancestry in East Asians compared to Europeans for S∗ and for IBDmix with and without masking Neanderthal sequence shared with Yoruba.

To further confirm the role of back-migration in introducing Neanderthal sequence into African populations, we examined the rate of overlap between called Neanderthal segments and non-African ancestry tracks in African samples. We hypothesized that if the Neanderthal sequence in Africans was introduced by back-migration from ancestors of contemporary Europeans, then there should be enrichment for overlap of Neanderthal segments and European ancestry segments in African samples. To test this hypothesis, we compared data from chromosome 1 for all 504 African samples in our analysis. For each individual, we identified tracks of European and East Asian ancestry using RFMix () and measured the rate of overlap with identified Neanderthal segments in the same individual ( Figure 4 A). We averaged these rates of overlap to calculate empirical rates of overlap for European ancestry and East Asian ancestry separately ( Figure 4 B). We found the rate of overlap with European ancestry to be highly significant (permutation p < 0.0001), while the rate of overlap with East Asian ancestry was not (permutation p > 0.05) ( Figure 4 B). These data are consistent with the hypothesis that back-migration contributes to the signal of Neanderthal ancestry in Africans. Furthermore, the data indicate that this back-migration came after the split of Europeans and East Asians, from a population related to the European lineage.

(B) Distributions of the mean rate of overlap from permuted data for European ancestry and East Asian ancestry, with the empirical values demarcated as dashed lines. The rate of overlap for European ancestry is highly significant (p < 0.0001), while the rate of overlap for East Asian ancestry is not (p > 0.05).

(A) Schematic of how an enrichment of European ancestry overlap was assessed. For each African individual, data from chromosome 1 were analyzed for tracks of Neanderthal and European ancestry. For each individual, the rate of overlap between Neanderthal segments and European segments was calculated, and the mean across all African individuals was taken as the empirical value.

These features are not replicated in either models with back-migration or human-to-Neanderthal gene flow alone. Specifically, while features like the distribution of segment lengths and the frequency of African segments in the African population are replicated in models with human-to-Neanderthal gene flow, only models with back-migration rates elevated in comparison to standard demographic estimates (5 × 10 −5 /generation) can replicate the enrichment of East Asian Neanderthal ancestry when masking shared African sequence. A model that combines both of these events, elevated back migration and human-to-Neanderthal gene flow, matches the empirical data best across all features. In summary, these data indicate that both pre-OOA human-to-Neanderthal gene flow and elevated historic back-migration contribute to the signal of Neanderthal ancestry detected in Africans.

In the empirical data, segments identified in Africans (YRI) that are shared with non-Africans (EAS and EUR) have a distribution of segment sizes more similar to that of non-African calls and also occur predominantly at high frequency (>10%) in the African population ( Figure 3 ). As noted previously, there is only a small enrichment (<10%) for Neanderthal ancestry in East Asians compared to Europeans without masking sequence shared with Africans. When shared sequence is masked, however, this enrichment increases to ∼18% ( Figure 3 ).

We therefore explicitly tested whether putative Neanderthal sequences identified in Africans were more likely to be explained by back-migration from non-Africans into Africa or by pre-OOA human-to-Neanderthal gene flow. To differentiate these scenarios, we compared the empirical data to simulated data, analyzing a variety of sequence characteristics ( Figure 3 ). Specifically, we simulated genotype data under a series of demographic models that included Neanderthal admixture into non-Africans, increasing levels of back-migration from Europeans into Africans, and gene flow from a pre-OOA human lineage into Neanderthals at varying time points. We then identified introgressed sequence for these models using IBDmix. We compared the empirical and simulated data across features including introgressed segment length, frequency of introgressed segments in the African population that are shared with non-Africans, and the ratio of East Asian Neanderthal ancestry to European Neanderthal ancestry before and after masking Neanderthal sequence shared between Africans and non-Africans.

Features of the empirical data were compared to data simulated under a model of back-migration, human-to-Neanderthal gene flow, and a mixture of both models (see the STAR Methods ). From left to right, the distribution of Neanderthal segment lengths, frequency of segments in Africans that segregate in Africans and non-Africans, and the ratio of East Asian to European Neanderthal ancestry with and without masking sequence shared with Africans.

We next considered two demographic models that could plausibly generate signals of Neanderthal ancestry in Africans that are detectable by IBDmix. Specifically, we studied models where non-African individuals, who carry Neanderthal sequences inherited from hybridization, migrated back to Africa and models of human-to-Neanderthal gene flow due to an early pre-out-of-Africa (pre-OOA) dispersal of modern humans (). We found that IBDmix is sensitive to both back migrations and pre-OOA gene flow from modern humans to Neanderthals ( Figure 2 C).

Given the unexpectedly large amounts of Neanderthal sequence identified in African individuals, we next performed analyses to understand their origins. To rule out systematic biases, we first called Denisovan sequence in African individuals using IBDmix ( STAR Methods ) and only identified 1.2 Mb/individual of Denisovan sequence in African samples ( Table S6 ). This is similar to the amount of Denisovan sequence called in non-African individuals (∼1Mb/individual) and considerably lower than the amount of Neanderthal sequence identified by IBDmix in African individuals. We also performed extensive simulations and found that the signal of Neanderthal ancestry in Africans was unlikely to be explained by false positives due to shared ancestry ( Figure 2 C).

We also recovered a substantial amount of Neanderthal sequence in non-African samples across populations. Notably, we found similar levels of Neanderthal ancestry in Europeans (51 Mb/individual), East Asians (55 Mb/individual), and South Asians (55 Mb/individual) ( Figure 2 A; Table S4 ). Surprisingly, we observed only a modest enrichment (8%) of Neanderthal ancestry in East Asian compared to European individuals. This contrasts with previous reports that have indicated ∼20% enrichment of Neanderthal ancestry in East Asians compared to Europeans (). The observed level of East Asian enrichment was even smaller (∼3%) when we were less conservative in our filtering methods ( Table S5 ). We compared the Neanderthal sequences in non-African individuals identified by IBDmix (merged regions) to those identified by previous methods, including S∗, diCal-admix, and CRF, for individuals shared in all these studies. Approximately 80% of the sequences overlapped between the IBDmix callset and the other callsets ( Figure S2 ).

Because IBDmix does not use a putatively unadmixed modern reference population, we were able to robustly identify regions of apparent Neanderthal sequence in African populations for the first time ( Figure 2 A). Surprisingly, we identified on average 17 Mb of Neanderthal sequence per individual in the African samples analyzed, and this value was similar across the mostly northern African subpopulations represented in the dataset (ranging from 16.4 Mb/individual in ESN to 18.0 Mb/individual in LWK; Figure 2 A; Table S4 ). Furthermore, we observed a significant overlap of sequence identified in Africans with that in non-Africans ( Figure 2 B). Specifically, of the Neanderthal sequence identified in African samples, more than 94% was shared with non-Africans.

(C) Bar plot showing the proportion of Neanderthal ancestry per individual in non-African (blue) and African (red) populations in different simulated models.

(A) Violin plots showing the amount of Neanderthal sequence called per individual across geographically diverse populations from the 1000 Genomes Project. Non-African, African admixed, and African populations are shown in blue, purple, and red, respectively. The inset figure shows the amount of Neanderthal sequence per individual for five African subpopulations.

We applied IBDmix to samples from the 1000 Genomes Project (), collected from geographically diverse populations, and used the Altai Neanderthal reference genome () to identify introgressed Neanderthal sequence in these individuals. After filtering ( STAR Methods ), we identified 110.98 Gb of Neanderthal sequence among 2,504 modern individuals. When overlapping introgressed segments are merged, this equates to 1.29 Gb of unique Neanderthal sequence.

In summary, IBDmix has higher power and lower FDR compared to S∗ and is robust to reference population biases. In the following, unless otherwise noted, we used a LOD score threshold of 4 and a minimum segment size of 50 kb, which provides a reasonable tradeoff between power and false-positive rate ( Figure S1 B).

Previous studies have identified the introgressing Neanderthal population as a sister clade of the sequenced Altai Neanderthal (). We therefore tested how IBDmix would perform when the reference archaic genome is distantly related to the introgressing archaic. We simulated models with two Neanderthal lineages representing an introgressing lineage and a sampled reference lineage (non-introgressing lineage) and varied the split time between these two populations ( STAR Methods ). We observed a small decrease in power and FPR using the non-introgressing Neanderthal as the reference genome, but overall performance measures remained consistent ( Figure S1 D).

We also tested the impact of genetic variation and mis-specification of recombination rates on IBDmix using simulated data. The performance of IBDmix improved overall with higher mutation rates ( Figure S1 C). As expected, we observed a noticeable improvement for shorter segments (FPR, FDR, and power; Figure S1 C). In testing the effect of recombination rate on IBDmix performance, we used data generated from a model with no Neanderthal introgression. We evaluated the FPR of IBDmix under models with a recombination rate equal to the genome-wide average (1cM/Mb) and models 1/10that rate (0.1cM/Mb). For larger segments (≥40 kb), we observed marginally higher false-positive rates in situations with the reduced recombination rate ( Table S3 ).

We evaluated IBDmix’s performance and operating characteristics using simulated data generated from a previously inferred realistic demographic model and compared it to results using S∗ ( STAR Methods Figure S1 ). As expected, IBDmix’s false-positive rate decreases and power increases as the introgressed segment size increases ( Figure 1 B). Compared to S∗, IBDmix has a lower false-positive rate and higher power for all introgressed segment sizes >30 kb ( Figure 1 B). Specifically, for introgressed segment sizes >30 kb, the power of IBDmix is >60% with an FDR ≤10% ( Figures 1 B and S1 B). Note that the power and FDR of IBDmix in non-African populations are not influenced by gene flow from non-Africans into Africans, whereas they do have a large effect on S∗ ( Figures 1 B and 1C). The power to detect introgressed sequence in non-African populations is particularly low for S∗ when this sequence is also found in the reference population (Africans), whereas IBDmix maintains power ( Figure 1 C). This observation implies that biases may arise in methods that use a modern human reference panel, as the power to detect introgressed sequence will be a function of its presence in the reference panel.

(A) Simplified schematic of the demographic model used for simulations evaluating the performance of IBDmix. (B) Optimizing IBDmix function parameters under the basic simulation model (A): (a) LOD score, (b) Archaic sequence error, (c) maximum sequence error in modern human, and (d) sequence error as a function of MAF in modern human. (C) Impact of genetic variation on IBDmix performance under the basic simulation model (A). IBDmix performance (FPR, FDR and Power) under the simulation models with mutation rates 2x, 5x, and 10x the default value (1.25x10 −8 per bp per generation). (D) Evaluation of IBDmix performance under the simulation models using a reference archaic genome distantly related to the introgressing archaic. In different models, the sampled reference lineage diverges from the introgressing archaic at 70 kya (blue), 100 kya (yellow), and 145 kya (red). For comparison, IBDmix performance using the introgressing archaic genome (purple) is shown.

Methods that identify introgressed Neanderthal lineages in modern humans must differentiate between sequences shared with Neanderthals because of ancient hybridization or because of a shared common ancestor. Previous approaches, such as S∗ (), CRF (), diCal-admix (), and HMM (), use an “unadmixed” modern reference panel, commonly an African population such as Yoruba (YRI), to control for false positives due to shared ancestry by “masking” putative archaic sequence present in the reference panel and the target sample. If the reference panel carries introgressed Neanderthal sequence, this will result in missing Neanderthal sequence in the target sample ( Figure 1 A). Our new method IBDmix, which is based on identity by descent (IBD), does not use a modern reference panel ( Figure 1 A). IBDmix calculates the probabilities that a variant site in a modern individual is and is not shared IBD with a reference archaic genome, while accounting for genotyping errors in the reference archaic and modern human sequences ( STAR Methods Table S1 ). The ratio of these probabilities is used to construct a single-site LOD score, where higher values indicate a greater likelihood that a modern individual’s genotype is shared IBD with the reference archaic genome. IBDmix uses a dynamic programming algorithm to sum together single-site LOD scores and maximize this score in order to identify introgressed segments ( STAR Methods ). The false-positive rate for IBDmix is controlled by the LOD score threshold and length of introgressed segments considered. Unlike existing methods that require phased sequence data, IBDmix works on unphased genotype data, making it more computationally tractable by avoiding time-consuming preprocessing and inaccuracies caused by phasing errors. It should be noted, however, that accurate estimates of allele frequency are required to calculate the probability of IBD, and so IBDmix cannot be used on individual genomes or in small sample sizes. In practice, we found that a minimum of ten individuals is sufficient for robust inferences ( STAR Methods Table S2 ).

(B and C) Comparison of IBDmix performance to S∗ using simulated data generated from models with a low back-migration rate (1.7 × 10 −5 /generation) and high back-migration rate (5 × 10 −4 /generation). In (B), power and false-positive rates are calculated for all simulated Neanderthal segments in non-Africans. In (C), we show the power to detect a Neanderthal segment in non-Africans conditional on it also being present in Africans.

Discussion

We developed a novel approach to identify an introgressed hominin sequence that persists in the genomes of modern humans, and we show that it performs well compared to existing methods. The main novelty of IBDmix is that compared to previous methods, it does not use an unadmixed reference panel. As such, we were able to make unbiased inferences about signals of Neanderthal ancestry in African populations, which are a combination of genuine introgressed Neanderthal sequences and human sequences present in the Neanderthal genome. We also demonstrate that back-migrations to Africa confounded previous estimates of variation in Neanderthal ancestry among non-African populations. Furthermore, we confirmed and refined genomic regions significantly depleted of Neanderthal ancestry, as well as putative targets of adaptive introgression, including several loci that were previously not detectable when using an African reference population.

It is important to note, however, that IBDmix has several limitations. In particular, IBDmix requires an archaic reference genome and therefore is not suitable for discovering introgressed sequence from unknown or unsequenced hominin lineages. IBDmix also requires that populations be analyzed separately, and that a sufficiently large sample size be used, in order to robustly estimate population allele frequencies, assign LOD scores, and determine IBD (simulations suggest a minimum of ten individuals; Table S2 ). Additionally, recombination rate heterogeneity across the genome and between populations can influence IBDmix segment size cutoffs. Consequently, it will be difficult to apply IBDmix to individual genomes or ancient human samples, where the sample size is limited and estimates of allele frequencies and recombination rates are imprecise. As such, IBDmix complements existing approaches for identifying introgressed sequences in modern humans.

In summary, our data show that out-of-Africa and in-to-Africa dispersals must be accounted for when interpreting archaic hominin ancestry in contemporary human populations. It is notable that Neanderthal sequences have been identified in every contemporary modern human genome analyzed to date. Thus, the legacy of gene flow with Neanderthals likely exists in all modern humans, highlighting our shared history.