TY - JOUR
T1 - Additive Uncorrelated Relaxed Clock Models for the Dating of Genomic Epidemiology Phylogenies
AU - Didelot, Xavier
AU - Siveroni, Igor
AU - Volz, Erik M.
N1 - Publisher Copyright: © 2020 The Author(s). Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
PY - 2021/1/1
Y1 - 2021/1/1
AB - Phylogenetic dating is one of the most powerful and commonly used methods of drawing epidemiological interpretations from pathogen genomic data. Building such trees requires considering a molecular clock model which represents the rate at which substitutions accumulate on genomes. When the molecular clock rate is constant throughout the tree then the clock is said to be strict, but this is often not an acceptable assumption. Alternatively, relaxed clock models consider variations in the clock rate, often based on a distribution of rates for each branch. However, we show here that the distributions of rates across branches in commonly used relaxed clock models are incompatible with the biological expectation that the sum of the numbers of substitutions on two neighboring branches should be distributed as the substitution number on a single branch of equivalent length. We call this expectation the additivity property. We further show how assumptions of commonly used relaxed clock models can lead to estimates of evolutionary rates and dates with low precision and biased confidence intervals. We therefore propose a new additive relaxed clock model where the additivity property is satisfied. We illustrate the use of our new additive relaxed clock model on a range of simulated and real data sets, and we show that using this new model leads to more accurate estimates of mean evolutionary rates and ancestral dates.
KW - clock model
KW - dated phylogeny
KW - genomic epidemiology
UR - http://www.scopus.com/inward/record.url?scp=85099427330&partnerID=8YFLogxK
U2 - 10.1093/molbev/msaa193
DO - 10.1093/molbev/msaa193
M3 - Article
C2 - 32722797
AN - SCOPUS:85099427330
SN - 0737-4038
VL - 38
SP - 307
EP - 317
JO - Molecular Biology and Evolution
JF - Molecular Biology and Evolution
IS - 1
ER -