===========================================================================
Version 1.0
===========================================================================
Introduction

This stand-alone software is designed to reliably detect recent positive
selection in a population of varying size, even if you only have
polymorphism data from a single locus (i.e. a very short piece of DNA). The
method is described in Li (2011). In that paper, we proved that the MFDM
test is free from the confounding signatures of demography, including
population size expansion, bottlenecks and population structure.

It is always difficult to completely exclude the confounding effects of
demography when detecting recent positive selection, even when genome-wide
polymorphism data are considered. We hope the software will help you
determine whether your targeted nuclear genes have been affected by recent
positive selection.

The logic behind this approach is as follows. By examining tree topology,
we can distinguish selection from varying population size. By using a
simple sampling strategy (when your samples do not cover the whole species
range), we can remove the confounding effect of population structure. That
is why genome-wide polymorphism data are not always necessary here.
However, when multiple-locus data are available, they do improve the power
to detect selection. We therefore expect to release a future version of the
software that implements an extended multiple-locus test. Later versions
will also provide a user-friendly interface.

* Note: MFDM requires that the beneficial allele and the sequenced locus be
  partially linked. If there is no evidence of recombination in
  mitochondria, this method may not be suitable for detecting selection on
  mitochondria.

If you have questions, please contact Yuting Wang (wangyuting@picb.ac.cn)
or Haipeng Li (lihaipeng@picb.ac.cn).
===========================================================================
Installation

1. Unzip the MFDM_1.0.zip file. You will get four items:
   MFDMCommandLine_1.0.jar, configFile.txt, MFDM_readMe.txt and the
   testFile folder.

2. MFDM requires Java 2 Standard Edition (J2SE) 6 or higher. Only the Java
   Runtime Environment (JRE) is needed. You can download it from:
   http://www.oracle.com/technetwork/java/javase/downloads/index.html
===========================================================================
Input file requirements

MFDM processes DNA sequence files in standard FASTA format (the file suffix
must be .fa or .fasta, case-insensitive). It can analyze a single FASTA
file or multiple FASTA files under a common directory (MFDM recursively
traverses the directory and picks out the FASTA files). Each FASTA file
must be aligned beforehand and must contain the samples and at least one
outgroup.

* Note: If your sample size is n, the minimum significance level you can
  reach is 2/(n-1).

* Note: If your sampling locations do not cover the whole species
  distribution area, and the uncovered areas are separated into several
  isolated regions by natural barriers, try to collect one
  individual/chromosome from each isolated (uncovered) region. Those
  individuals will be used as migrant detectors. Usually, 2 or 5 migrant
  detectors should be enough.

MFDM can analyze three types of DNA sequence data: phased haplotypes,
unphased haplotypes and diplotypes. Below are examples of the three data
types, where n = 6.

Phased haplotype (recommended; MFDM only considers MD and calculates the
minimum number of recombination events for this file type):

>Hap1
GTGCGCGGAGGCGA
>Hap2
TTGCGCGGAGGGGA
>Hap3
GTGCGCGGAGGCGA
>Hap4
TTGCGCGGAGGCGC
>Hap5
TTGCGCGGAGGCGA
>Hap6
TTGCGCGGAGGGGC

Unphased haplotype:

>Ind1_a
GTGCGCGGAGGGGA
>Ind1_b
TTGCGCGGAGGCGA
>Ind2_a
GTGCGCGGAGGCGA
>Ind2_b
TTGCGCGGAGGCGC
>Ind3_a
TTGCGCGGAGGGGA
>Ind3_b
TTGCGCGGAGGCGC

Diplotype:

>Ind1
KTGCGCGGAGGSGA
>Ind2
KTGCGCGGAGGCGM
>Ind3
TTGCGCGGAGGSGM

Nucleotide codes:

    A    adenosine
    C    cytidine
    G    guanine
    T    thymidine
    N    A/G/C/T (any)
    U    uridine
    K    G/T (keto)
    S    G/C (strong)
    Y    T/C (pyrimidine)
    M    A/C (amino)
    W    A/T (weak)
    R    G/A (purine)
    B    G/T/C
    D    G/A/T
    H    A/C/T
    V    G/C/A
    -    gap of indeterminate length
===========================================================================
Usage

To run the program from the command line, go to the folder containing
MFDMCommandLine_1.0.jar and configFile.txt and type the following:

    java -jar "MFDMCommandLine_1.0.jar" configFile.txt

* Note: DO NOT change the format of configFile.txt! You can change the
  value below each prompt to suit your usage.

If the program runs successfully, the results file is written by default to
the current directory (the directory from which you started the program).

If you do not want to use configFile.txt, or configFile.txt does not exist,
you can simply type the following:

    java -jar "MFDMCommandLine_1.0.jar"

Then follow the prompts and type the corresponding parameters.

* Note: If a prompt says "...(default value)", you can skip that input by
  pressing the ENTER key.

You can use the bundled test files to try MFDM.
===========================================================================
References

Li H. A new test for detecting recent positive selection that is free from
the confounding impacts of demography. Mol. Biol. Evol. 2011; 28:365-375.
===========================================================================
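Example: checking an input file before running MFDM

The input file requirements above (aligned sequences, the listed nucleotide
codes, and the 2/(n-1) limit on the significance level) can be checked with
a short script before MFDM is started. The following is a minimal sketch in
Python and is not part of MFDM. The function names read_fasta,
check_alignment and min_significance are hypothetical helpers introduced
only for this illustration, and the check assumes that "aligned" means all
sequences have the same length and use only the codes listed in the table.

```python
# Hypothetical pre-flight check for an MFDM input file; not part of MFDM.
# Assumption: an aligned file has equal-length sequences that use only the
# nucleotide codes listed in this README.
ALLOWED_CODES = set("ACGTUNKSYMWRBDHV-")


def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_alignment(records):
    """Return a list of problem descriptions; an empty list means no problem found."""
    problems = []
    lengths = sorted({len(seq) for _, seq in records})
    if len(lengths) > 1:
        problems.append("sequences differ in length: %s" % lengths)
    for name, seq in records:
        unexpected = sorted(set(seq) - ALLOWED_CODES)
        if unexpected:
            problems.append("%s: unexpected characters %s" % (name, "".join(unexpected)))
    return problems


def min_significance(n):
    """Minimum reachable significance level, 2/(n-1), for sample size n."""
    if n < 2:
        raise ValueError("sample size n must be at least 2")
    return 2.0 / (n - 1)


if __name__ == "__main__":
    example = ">Hap1\nGTGCGCGGAGGCGA\n>Hap2\nTTGCGCGGAGGGGA\n"
    print(check_alignment(read_fasta(example)))
    print(min_significance(6))
```

With n = 6, as in the examples above, min_significance(6) gives 2/5 = 0.4,
so a small sample cannot reach conventional thresholds such as 0.05. The
script does not check for the presence of an outgroup, because this README
does not state how MFDM identifies outgroup sequences.
===========================================================================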