Cross-species introgression can have significant impacts on phylogenomic reconstruction of species divergence events. Here, we used simulations to show how the presence of even a small amount of introgression can bias divergence time estimates when gene flow is ignored in the analysis. Using advances in analytical methods under the multispecies coalescent (MSC) model, we demonstrate that this problem can be avoided by accounting for both incomplete lineage sorting and introgression when analyzing large phylogenomic datasets. The multispecies-coalescent-with-introgression (MSci) model is capable of accurately estimating both divergence times and ancestral effective population sizes, even when only a single diploid individual per species is sampled. We characterize some general expectations for biases in divergence time estimation under three different scenarios: 1) introgression between sister species, 2) introgression between non-sister species, and 3) introgression from an unsampled (i.e., ghost) outgroup lineage. We also conducted simulations under the isolation-with-migration (IM) model and found that the MSci model, which assumes episodic gene flow, was able to accurately estimate species divergence times despite high levels of continuous gene flow. We estimated divergence times under the MSC and MSci models from two published empirical datasets with previous evidence of introgression: one of 372 target-enrichment loci from baobabs (Adansonia), and another of 1,000 transcriptome loci from fourteen species of the tomato relative Jaltomata. The empirical analyses not only confirm our findings from simulations, demonstrating that the MSci model can reliably estimate divergence times, but also show that divergence time estimation under the MSC can be robust to the presence of small amounts of introgression in empirical datasets with extensive taxon sampling.
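As a rough illustration of the first scenario (introgression between sister species), the sketch below uses a toy two-species model that is not taken from the paper: species A and B diverge at time tau, and at an assumed earlier time tau_h a sequence from B traces back into A with probability phi. Times and population size parameters (theta) are in expected mutations per site, and two sequences in a population with parameter theta coalesce after an exponential waiting time with mean theta/2. The function names, the parameter values, and the "naive" estimator (mean coalescence time minus theta/2) are all assumptions made for illustration; the full-likelihood MSC and MSci inference described above is far more involved.

```python
import math
import random


def expected_coal_time(tau, tau_h, phi, theta_a, theta_r):
    """Expected A-B coalescence time in the toy model (hypothetical helper).

    tau     : species divergence time
    tau_h   : time of the single introgression pulse (tau_h < tau)
    phi     : probability that the B sequence traces back into A at tau_h
    theta_a : population size parameter of species A
    theta_r : population size parameter of the root population
    """
    # No introgression: sequences can only coalesce in the root population.
    no_flow = tau + theta_r / 2.0
    # Introgression: both sequences share population A from tau_h back to tau.
    delta = tau - tau_h
    p_escape = math.exp(-2.0 * delta / theta_a)  # no coalescence before tau
    with_flow = (tau_h
                 + (theta_a / 2.0) * (1.0 - p_escape)
                 + p_escape * (theta_r / 2.0))
    return (1.0 - phi) * no_flow + phi * with_flow


def simulate_coal_time(tau, tau_h, phi, theta_a, theta_r, rng):
    """Draw one A-B coalescence time under the same toy model."""
    if rng.random() < phi:
        t = tau_h + rng.expovariate(2.0 / theta_a)
        if t < tau:
            return t  # coalesced inside A after introgression
    # Otherwise the sequences coalesce in the root (memoryless restart at tau).
    return tau + rng.expovariate(2.0 / theta_r)


def naive_tau(mean_time, theta_r):
    """Divergence time implied by a model that ignores gene flow (toy estimator)."""
    return mean_time - theta_r / 2.0


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    tau, tau_h, theta = 0.01, 0.005, 0.002
    rng = random.Random(1)
    for phi in (0.0, 0.05, 0.2):
        draws = [simulate_coal_time(tau, tau_h, phi, theta, theta, rng)
                 for _ in range(100_000)]
        sim_mean = sum(draws) / len(draws)
        exact = expected_coal_time(tau, tau_h, phi, theta, theta)
        print(f"phi={phi:.2f}  true tau={tau}  "
              f"naive tau (sim)={naive_tau(sim_mean, theta):.5f}  "
              f"naive tau (exact)={naive_tau(exact, theta):.5f}")
```

In this toy setting the naive value equals tau when phi is zero and falls below tau whenever phi is positive, because introgressed sequences can coalesce more recently than the species split.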
Publication:
Systematic Biology. 2023 Mar 24;syad015. doi: 10.1093/sysbio/syad015. Online ahead of print.
Authors:
George P Tiley
Department of Biology, Duke University, Durham, NC, USA
Tomás Flouri
Department of Genetics, Evolution and Environment, University College London, London, UK
Xiyun Jiao
Department of Genetics, Evolution and Environment, University College London, London, UK
Department of Statistics and Data Science, Southern University of Science and Technology, Shenzhen, Guangdong, China
Jelmer W Poelstra
Department of Biology, Duke University, Durham, NC, USA
Bo Xu
Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
Tianqi Zhu
National Center for Mathematics and Interdisciplinary Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, China
Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, China
Email: zhutq@amss.ac.cn
Bruce Rannala
Department of Evolution and Ecology, University of California, Davis, Davis, CA, USA
Anne D Yoder
Department of Biology, Duke University, Durham, NC, USA
Ziheng Yang
Department of Genetics, Evolution and Environment, University College London, London, UK