CHAPTER 5. THE MOSAIC GENOME ARCHITECTURE OF INDIGENOUS
5.4 RESULTS
(98.94% on average). In addition, the AFB samples had overall alignment rates of 89.02% (BF001) and 91.24% (BF002).
After variant calling and filtering, a total of ~60 million SNPs were retained. The average missing rate and minor allele frequency of the SNPs were 0.016 and 0.120, respectively. The density of SNP loci was 22.764/kbp along the genome (Table 5.2). Overall genotype concordance between the additional genotype data and the resequencing results was 95.13% across the 72 samples, providing confidence in the accuracy of SNP calling. Three of the 72 samples showed low genotype concordance. However, the DNA of these samples was not expected to produce reliable genotypes, given their low call rates.
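The summary statistics above can be illustrated with a minimal sketch. It assumes genotypes are stored as a samples-by-SNPs matrix of alternate-allele counts (0, 1, 2) with -1 marking a missing call; this encoding and the function names `snp_summary` and `genotype_concordance` are hypothetical choices for illustration, not the pipeline used in this chapter.

```python
import numpy as np

MISSING = -1  # assumed code for an uncalled genotype


def snp_summary(geno):
    """Per-SNP missing rate and minor allele frequency.

    geno: (n_samples, n_snps) array of alternate-allele counts (0, 1, 2),
    with MISSING marking an uncalled genotype.
    """
    geno = np.asarray(geno)
    called = geno != MISSING
    n_called = called.sum(axis=0)
    missing_rate = 1.0 - n_called / geno.shape[0]
    alt_count = np.where(called, geno, 0).sum(axis=0)
    # Alternate-allele frequency among called diploid genotypes;
    # SNPs with no calls at all yield NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        alt_freq = alt_count / (2.0 * n_called)
    maf = np.minimum(alt_freq, 1.0 - alt_freq)
    return missing_rate, maf


def genotype_concordance(seq_geno, array_geno):
    """Fraction of matching genotypes over sites called in both data sets."""
    seq_geno = np.asarray(seq_geno)
    array_geno = np.asarray(array_geno)
    both = (seq_geno != MISSING) & (array_geno != MISSING)
    if not both.any():
        return float("nan")
    return float((seq_geno[both] == array_geno[both]).mean())
```

Restricting the concordance to sites called in both data sets keeps a low call rate from being counted as disagreement; samples with few shared calls still give noisy estimates.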
5.4.2 Population structure and genetic diversity of African cattle
Population structure and relationships
We performed principal component analysis, ignoring group and breed membership (Figure 5.2b). All of the African humped cattle samples (AFI, AFS, and AFZ) were located between the EUT and ASI samples along eigenvector 1, which explained ~10% of the total variation, and they were also clearly separated from AFT. Therefore, there is no evidence for recent admixture between EUT and/or ASI and African cattle. Rather, these results support ancient admixture in the African humped cattle, with unique genetic material as explained by eigenvector 2 (~2.5% of the total variation). Of the African taurine cattle breeds, N’Dama was clearly separated from the other breeds. However, Sheko, which has been referred to as an AFT breed, clustered with the African humped breeds as previously reported (Hanotte, Tawah et al. 2000, Mbole-Kariuki, Sonstegard et al. 2014), and it was even closer to them than the Sanga breed, Ankole. The African humped breeds were not separated by breed membership and did not form distinct clusters, except for the Ankole breed. A pattern identical to the PCA results was also observed in an individual-level phylogenetic tree reconstructed from genome-wide SNPs (Figure 5.2a); we could not distinguish individuals classified phenotypically as AFI, AFS, and AFZ. The phylogenetic tree showed that AFI, AFS, and AFZ clustered with ASI.
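The principal component analysis described above, and the share of variation attributed to each eigenvector, can be sketched as follows. This is a simplified illustration that only centres each SNP and assumes a complete genotype matrix with no missing calls; dedicated PCA software for genotype data typically also scales each SNP by its allele frequency. The function name `genotype_pca` is hypothetical.

```python
import numpy as np


def genotype_pca(geno, n_components=2):
    """PCA of a complete (n_samples, n_snps) matrix of allele counts 0/1/2.

    Returns the sample coordinates on the leading components and the
    fraction of total variance explained by each of those components.
    """
    geno = np.asarray(geno, dtype=float)
    # Monomorphic SNPs carry no information about structure; drop them.
    keep = geno.std(axis=0) > 0
    x = geno[:, keep]
    # Centre each SNP so that components describe deviations from the mean.
    x = x - x.mean(axis=0)
    # Left singular vectors of the centred matrix are the sample eigenvectors.
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    coords = u[:, :n_components] * s[:n_components]
    return coords, explained[:n_components]
```

Because no group or breed labels enter the computation, any clustering visible in the resulting coordinates reflects the genotypes alone.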
We then reconstructed the maximum likelihood tree of the 19 cattle populations using Treemix (Pickrell and Pritchard 2012) to address population-level relationships and to identify pairs of populations that are related to each other independently of the relationships captured by this tree. The population-level phylogeny without any migration edges explained 95% of the total variance and showed that most African populations, except N’Dama, clustered with ASI and were far closer to ASI than to EUT. Migration edges were added to the initial tree until 99.8% of the variance in ancestry between populations was explained by the tree (Figure 5.3). The 13 migration edges required to reach this level were statistically significant, and they were mostly observed among AFI, AFS, AFZ, and ASI breeds (Figure 5.4).
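The stopping rule for adding migration edges can be sketched as below. The fraction of variance is computed here from the off-diagonal entries of the observed and model covariance matrices, which is one plausible form of the statistic and may differ in detail from what Treemix itself reports. The function names and all numbers used with them are hypothetical illustrations, not results from this chapter.

```python
import numpy as np


def fraction_explained(observed_cov, model_cov):
    """Share of variance in the observed population covariances that the
    fitted model reproduces, using pairs of distinct populations only."""
    obs = np.asarray(observed_cov, dtype=float)
    mod = np.asarray(model_cov, dtype=float)
    upper = np.triu_indices_from(obs, k=1)  # pairs i < j, diagonal excluded
    w = obs[upper]
    residual = w - mod[upper]
    return 1.0 - np.sum(residual**2) / np.sum((w - w.mean()) ** 2)


def edges_needed(variance_by_m, target=0.998):
    """Smallest number of migration edges whose model reaches the target.

    variance_by_m: mapping {number_of_edges: fraction_of_variance_explained}.
    Returns None if no fitted model reaches the target.
    """
    for m in sorted(variance_by_m):
        if variance_by_m[m] >= target:
            return m
    return None
```

For example, `edges_needed({0: 0.95, 1: 0.97, 2: 0.999})` returns 2 for these toy values, since the model with two edges is the first to reach the 99.8% target.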
In addition to the relationships among cattle breeds, we performed Admixture (Alexander, Novembre et al. 2009) analysis to address admixture patterns between groups as well as breeds. A subset of the total SNPs (~14 million SNP loci) was used with K increasing from 2 to 22. Cross-validation suggested K = 6 as the most likely number of genetically distinct groups for our data (cross-validation error = 0.33). At K = 2, while both the EUT and ASI populations showed intact genetic backgrounds, all of the African cattle breeds shared genome ancestry with both taurine and indicine cattle. The taurine ancestry of the African hybrids (AFS and AFZ) showed no difference from that of AFI, except for Ankole. Sheko also displayed a similar amount of taurine ancestry to that of the other African humped breeds.
Approximately 18% of each African humped cattle genome (except Ankole) came from taurine ancestry. Among them, the Mursi population in particular showed a higher proportion than the admixed populations. At K = 3, most of the taurine ancestry seen at K = 2 was replaced by putative African taurine ancestry (light blue). This indicates that the taurine ancestry present across African humped cattle genomes originates from AFT. The admixture plots above K = 3 showed intercontinental divergence and admixture: the Asian taurine Hanwoo breed showed a distinct ancestry, and most African humped breeds showed two ancestries different from the ASI ancestry. North-western AFI, including Barka, Butana, and Kenena (Figure 5.1a), showed a pattern distinct from the other breeds (orange).
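Choosing K by cross-validation, as described for the Admixture analysis above, amounts to picking the K with the lowest cross-validation error. The sketch below assumes that the logs of the separate runs have been concatenated into one string and that each run reports a line of the form `CV error (K=3): 0.52`; the function name `best_k` and the values in the test are hypothetical.

```python
import re

# Pattern for the cross-validation line assumed to appear in each run's log.
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def best_k(log_text):
    """Return (K with the lowest CV error, {K: CV error}) from log text."""
    errors = {int(k): float(err) for k, err in CV_LINE.findall(log_text)}
    if not errors:
        raise ValueError("no cross-validation lines found in the log text")
    k_min = min(errors, key=errors.get)
    return k_min, errors
```

The lowest error is a guide rather than a definitive answer; ancestry patterns at neighbouring values of K, such as those discussed above for K = 2 and K = 3, remain informative.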
Genetic distance and diversity
To quantify genetic distances between the 22 cattle populations, pairwise Fst values were estimated (Figure 5.5). Based on these Fst values, the genetic background of EUT was clearly different from that of all other cattle breeds except N’Dama. EUT showed Fst values of ~0.2 and ~0.3 against African cattle breeds and ASI, respectively. As already shown in the PCA and phylogenetic analyses, all African breeds except N’Dama were close to each other regardless of their classification, with pairwise Fst close to zero.
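A pairwise Fst value of the kind reported above can be sketched from per-SNP allele frequencies. The text does not state which estimator was used, so the example below assumes Hudson's estimator combined across SNPs as a ratio of sums; the function name `hudson_fst` and its inputs are hypothetical.

```python
import numpy as np


def hudson_fst(p1, p2, n1, n2):
    """Hudson's Fst between two populations, combined over SNPs.

    p1, p2: per-SNP allele frequencies in population 1 and population 2.
    n1, n2: number of sampled chromosomes (twice the number of diploid
            animals) in each population.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    # Squared frequency difference, corrected for finite sample size.
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1.0 - p1) / (n1 - 1)
        - p2 * (1.0 - p2) / (n2 - 1)
    )
    # Probability that two alleles drawn from different populations differ.
    denominator = p1 * (1.0 - p2) + p2 * (1.0 - p1)
    return float(numerator.sum() / denominator.sum())
```

With this estimator, populations with near-identical allele frequencies give values close to zero (possibly slightly negative because of the sample-size correction), while fixed differences at every SNP give a value of 1.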
Genetic diversity estimated from all autosomal SNPs showed that the taurine cattle, except Sheko, had reduced levels of heterozygosity compared to all other breeds. Heterozygosity values of African humped cattle were similarly high across breeds, which suggests that genetic diversity has been consistently conserved across the Horn of Africa. Note that the ASI breeds showed a similar level of heterozygosity to the African humped breeds, and showed a higher level of heterogeneity within each breed (Figure 5.6). The degree of inbreeding measured by runs of homozygosity (ROH) showed that taurine breeds, including N’Dama, had a higher level of inbreeding than the other breeds. The ASI breeds showed a similar pattern of ROH distribution to the African humped breeds (Figure 5.7).
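The two diversity measures above can be illustrated with a short sketch: per-animal observed heterozygosity, and the fraction of the autosomes covered by ROH as an inbreeding measure. It assumes the genotype encoding 0/1/2 with -1 for missing calls and ROH segments given as (start, end) base-pair coordinates from a separate ROH detection step; the function names are hypothetical.

```python
import numpy as np

MISSING = -1  # assumed code for an uncalled genotype


def observed_heterozygosity(geno):
    """Per-sample fraction of heterozygous calls among called genotypes.

    geno: (n_samples, n_snps) array of alternate-allele counts (0, 1, 2).
    Samples with no called genotype yield NaN.
    """
    geno = np.asarray(geno)
    called = geno != MISSING
    n_het = (geno == 1).sum(axis=1)
    n_called = called.sum(axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return n_het / n_called


def f_roh(roh_segments, autosome_length):
    """Fraction of the autosomal genome covered by runs of homozygosity.

    roh_segments: iterable of (start, end) coordinates in base pairs,
    assumed non-overlapping and already filtered by minimum length.
    """
    total = sum(end - start for start, end in roh_segments)
    return total / autosome_length
```

For example, `f_roh([(0, 1_000_000), (5_000_000, 7_000_000)], 30_000_000)` evaluates to 0.1 for these toy segments: 3 Mb of ROH over a 30 Mb genome.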
Figure 5.2 Population structure of indigenous African cattle. (a) Maximum likelihood tree reconstructed for the 235 cattle samples. The size of the dot at each node indicates the bootstrap value. (b) PCA plot of the 235 cattle samples. The shape and color of the points indicate type and group information, respectively. (c) Results of the admixture analysis for K = 2 to K = 6.
Figure 5.3 Variance explained by model in Treemix analysis