R/DocumentationDiscretization.R
discretization.Rd
Discretizes a continuous phase-type distribution with initial distribution
initDist
and sub-intensity rate matrix T_Mat
.
discretization(object, a = NULL, lambda = NULL)
object | a continuous phase-type distributed object of class |
---|---|
a | a constant that is larger than the maximum of all diagonal entries of the sub-intensity rate matrix. |
lambda | the positive mutation rate at the locus. |
Mogens Bladt and Bo Friis Nielsen (2017): Matrix-Exponential Distributions in Applied Probability. Probability Theory and Stochastic Modelling (Springer), Volume 81.
Depending on the input, the function returns the discretized phase-type
distribution with sub-transition probability matrix equal to either
$$P := I + 1/a * T $$
(if \(a\) is provided) or
$$P = (I-lambda^{-1} * T)^{-1}$$
(if \(\lambda\) is provided). If both \(a\) and \(\lambda\) are provided, the function
returns both distributions in a list. In all three cases, the returned objects are
of type discphasetype
.
The relation between continuous and discrete phase-type distributions is given in the following way. If \(T\) is the sub-intensity rate matrix of a continuous phase-type distribution with representation \(PH(initDist,T)\), then there exists a constant \(a>0\) such that \(P := I + 1/a * T\) is a sub-transition probability matrix and \(DPH(initDist, P)\) is a representation for a discrete phase-type distribution. This holds for any \(a\) larger than the maximum of all diagonal entries in \(T\), as all entries in a sub-transition probability matrix have to be between zero and one. It even holds that for a genealogical model where the total branch length \(\tau ~ PH(initDist, T)\) and the mutation rate at the locus is \(\lambda = \theta/2\), that the number of segregating sites \(S_{Total}\) plus one is discrete phase-type distributed with initial distribution \(initDist\) and sub-transition probability matrix \(P = (I-\lambda^{-1} * T)^{-1}\), i.e. $$S + 1 ~ DPH(initDist, P).$$
## For n=4, the total branch length is phase-type ## distributed with initial distribution initDist <- c(1,0,0,0) ## and sub-intensity rate matrix T_Mat <- matrix(c(-1.5, 1.5, 0, 0, 0, -1, 2/3, 1/3, 0, 0, -0.5, 0, 0, 0, 0, -0.5), nrow = 4, byrow = TRUE) TTotal <- contphasetype(initDist,T_Mat) ## Hence, for theta=2, the number of segregating sites plus one is ## discrete phase-type distributed with the same initial ## distribution and sub-transition probability matrix discretization(TTotal, lambda=1)$P_Mat#> [,1] [,2] [,3] [,4] #> [1,] 0.4 0.3 0.1333333 0.06666667 #> [2,] 0.0 0.5 0.2222222 0.11111111 #> [3,] 0.0 0.0 0.6666667 0.00000000 #> [4,] 0.0 0.0 0.0000000 0.66666667