Hudson-Kreitman-Aguadé test


When dealing with polymorphisms from non-coding regions at multiple loci (or from a mixture of coding and non-coding regions), the Hudson-Kreitman-Aguadé (1987) test can be used.

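Consider two closely related species, and suppose DNA samples were taken from each species at m loci. For locus i, three quantities are readily obtained from the samples:

Ki1: the number of segregating sites in the sample from species 1
Ki2: the number of segregating sites in the sample from species 2
Di: the number of nucleotide differences between a sequence sampled from species 1 and a sequence sampled from species 2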

If one can compute the mean and variance of Kij (j = 1, 2) and Di under a null hypothesis, a goodness-of-fit statistic can be constructed to test that hypothesis. The test using the statistic

$$X^2 = \sum_{i=1}^{m} \sum_{j=1}^{2} \frac{\left(K_{ij} - E(K_{ij})\right)^2}{\mathrm{Var}(K_{ij})} + \sum_{i=1}^{m} \frac{\left(D_i - E(D_i)\right)^2}{\mathrm{Var}(D_i)}$$

is known as the Hudson-Kreitman-Aguadé (HKA) test. Each term in X^2 is the square of a normalized variable whose mean and variance are approximately 0 and 1, respectively. As long as these normalized variables roughly follow the standard normal distribution, X^2 has an approximately chi-square distribution. Simulations show that the distribution of X^2 is indeed close to a chi-square distribution with 2m-2 degrees of freedom. The degrees of freedom are 2m-2 rather than 3m because m+2 parameters have to be estimated from the data, and 3m - (m+2) = 2m-2.
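The goodness-of-fit sum described above can be sketched in a few lines of Python. This is only an illustration: the function name `hka_statistic` and its argument layout are assumptions made here, and the null-hypothesis means and variances are taken as inputs because the text does not give formulas for them. The numbers in the example are made up.

```python
def hka_statistic(K, D, mean_K, var_K, mean_D, var_D):
    """Sum of squared normalized deviations over all loci.

    K[i][j]       : segregating sites at locus i in the sample from species j (j = 0, 1)
    D[i]          : between-species differences at locus i
    mean_*, var_* : means and variances under the null hypothesis,
                    with the same shapes as K and D (supplied by the caller)
    """
    x2 = 0.0
    for i in range(len(D)):
        # Two within-species terms per locus.
        for j in range(2):
            x2 += (K[i][j] - mean_K[i][j]) ** 2 / var_K[i][j]
        # One between-species term per locus.
        x2 += (D[i] - mean_D[i]) ** 2 / var_D[i]
    return x2


# Made-up numbers for m = 2 loci, for illustration only (not real data).
K = [[5, 3], [10, 6]]
D = [12, 20]
mean_K = [[4.0, 4.0], [9.0, 7.0]]
var_K = [[2.0, 2.0], [4.0, 4.0]]
mean_D = [10.0, 22.0]
var_D = [4.0, 8.0]

x2 = hka_statistic(K, D, mean_K, var_K, mean_D, var_D)
print(x2)  # 3.0
```

Each locus contributes three terms, so m loci give the 3m terms mentioned in the text.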

For the HKA test to have non-zero degrees of freedom, data from at least two loci must be available. For 2m-2 to be the correct number of degrees of freedom, the loci being studied must also be unlinked or in linkage equilibrium. Note that the HKA test implicitly assumes that the divergence time T is large enough that the common ancestral sequence of each sample is younger than T. It is also assumed that the effective population size is the same for all loci, which implies that it is not appropriate to combine data from nuclear, sex-chromosome and mitochondrial loci.
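The degrees-of-freedom bookkeeping and the chi-square approximation can be sketched as follows. The function names are hypothetical. Because 2m-2 is always even, the upper tail of the chi-square distribution has a closed form (a finite Poisson-type sum), so this sketch needs only the standard library. It assumes the test statistic has already been computed and that the loci meet the assumptions stated above.

```python
import math


def hka_degrees_of_freedom(m):
    """3m observed quantities minus (m + 2) estimated parameters."""
    if m < 2:
        raise ValueError("the HKA test needs data from at least two loci")
    return 3 * m - (m + 2)  # equals 2m - 2


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with even df.

    For df = 2k the tail is exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0 or x < 0:
        raise ValueError("requires x >= 0 and a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def hka_p_value(x2, m):
    """Approximate p-value of the test statistic x2 computed from m loci."""
    return chi2_sf_even_df(x2, hka_degrees_of_freedom(m))


# Example: a statistic of 3.0 from m = 2 loci (2 degrees of freedom).
print(hka_p_value(3.0, 2))  # about 0.223, i.e. exp(-1.5)
```

With a single locus the function raises an error, matching the requirement of at least two loci.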

