XML, exchangeable format


XML, the Extensible Markup Language, is the preferred technology in many information-transfer scenarios because it encodes information in a way that is easy to read, process, and generate. Its design goals are simplicity, portability, and flexibility, and it continues to be developed by groups drawn from industry, the developer community, and academia. Not surprisingly, applications based on XML have a promising future.

What does it look like? The first three lines of a sequence data file are

<?xml version="1.0" encoding="ISO-8859-1" standalone="no" ?>
<!DOCTYPE data SYSTEM "SequenceData.dtd">
<?xml-stylesheet type="text/xsl" href="ShowSequenceData.xsl" media="screen"?>

Do not edit or delete these lines unless you know what you are doing; just leave them as they are. There are also two files in your directory, HomeDirectory/data/SequenceData.dtd and HomeDirectory/data/ShowSequenceData.xsl, which you have to copy to your working directory. For example, if you want to keep your sequence data in C:\MyData, copy SequenceData.dtd and ShowSequenceData.xsl into C:\MyData. The first file defines the keywords allowed in the data file, and the second is the stylesheet that renders the XML for display.
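
Before opening a data file, it can help to confirm that both supporting files were copied next to it. The following is a minimal Python sketch of such a check; the function name `missing_support_files` is hypothetical and not part of the program, and only the two file names come from the text above.

```python
from pathlib import Path

# The two supporting files named in this section.
REQUIRED = ("SequenceData.dtd", "ShowSequenceData.xsl")

def missing_support_files(working_dir):
    """Return the required file names that are not present in working_dir."""
    folder = Path(working_dir)
    return [name for name in REQUIRED if not (folder / name).is_file()]

# Check the current directory; an empty list means both files are in place.
print(missing_support_files("."))
```

On Windows, the call for the example directory above would be `missing_support_files(r"C:\MyData")`.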

The content that follows them is

<data type = "sequence">
     ......
</data>

Nothing may follow "</data>", so make sure "</data>" is the last thing in the file. The sequence information, the sequences, and the authors' information go between "<data type = "sequence">" and "</data>".
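
The rule that nothing follows "</data>" can be checked with any XML parser, because text after the closing tag of the top-level element makes the file ill-formed. Here is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name `is_well_formed` is an assumption made for illustration. Note that XML itself still permits whitespace and comments after the closing tag.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as one well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = '<data type = "sequence"></data>'
bad = '<data type = "sequence"></data> extra text'

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False: text follows the closing tag
```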

How do you define a population or species?

<population name = "population1">
    ......
    (one or more sequences)
    ......
</population>

You must give the population a name, enclosed in double quotation marks.

Defining a sequence is simple. For example,

<seq name = "seq1">AAA</seq>
<seq name = "seq2" count = "2" >AAA</seq>
<seq name = "seq3" count = "2" locus = "CCR5" >AAA</seq>
<seq name = "seq4" count = "2" locus = "CCR5" comment = "Example of comment">AAA</seq>

The name of a sequence is required. If you do not define the count, it is 1. If you do not define the locus, it is "locus1". If you do not define the comment, it is empty.
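
The defaults described above can be illustrated with a short Python sketch. It is not the program's own reader, which is not shown here, and the function name `read_seq` is hypothetical; it only applies the stated defaults to a single seq element.

```python
import xml.etree.ElementTree as ET

def read_seq(element):
    """Collect one <seq> element into a dict, filling in the stated defaults."""
    name = element.get("name")
    if name is None:
        raise ValueError("a <seq> element must have a name")
    return {
        "name": name,
        "count": int(element.get("count", "1")),  # default count is 1
        "locus": element.get("locus", "locus1"),  # default locus is "locus1"
        "comment": element.get("comment", ""),    # default comment is empty
        "sequence": (element.text or "").strip(),
    }

seq1 = read_seq(ET.fromstring('<seq name = "seq1">AAA</seq>'))
seq3 = read_seq(ET.fromstring('<seq name = "seq3" count = "2" locus = "CCR5" >AAA</seq>'))
print(seq1)
print(seq3)
```

For seq1 every optional attribute falls back to its default, while seq3 keeps the count and locus it was given and only the comment is left empty.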

You can also provide the authors' information. The definition of an author looks like

<author>
    <name>author's name</name>
    <address>author's address</address>
    <e-mail>author's e-mail</e-mail>
</author>

The following is a complete example of an XML file.

<?xml version="1.0" encoding="ISO-8859-1" standalone="no" ?>
<!DOCTYPE data SYSTEM "SequenceData.dtd">
<?xml-stylesheet type="text/xsl" href="ShowSequenceData.xsl" media="screen"?>
 
<data type = "sequence">
   <population name = "population1">
      <seq name = "seq1" locus = "locus1" count = "2" comment = "Example of comment">
         AAAAAAAAAAAATTTTTTTTT
      </seq>
      <seq name = "seq2" locus = "locus2" count = "1">
         GGGGAAAAAAAATTTTTTTTT
      </seq>
   </population>
 
   <population name = "population2">
      <seq name = "seq3" locus = "locus1" count = "2">
         AAAAAAAAAAAATTTTCTTTT
      </seq>
      <seq name = "seq4" locus = "locus1" count = "1">
         GGGGAAAAAAAATTTTTTTTC
      </seq>
   </population>
 
   <author>
       <name>Haipeng Li</name>
       <address>
                 Human Genetics Center,
                 University of Texas at Houston,
                 6901 Bertner Ave.,
                 Houston, Texas 77030,
                 U.S.A.
                 &amp;
                 Kunming Institute of Zoology,
                 Chinese Academy of Sciences,
                 Kunming 650223,
                 P. R. China
       </address>
       <e-mail>hli@sph.uth.tmc.edu</e-mail>
   </author>
 
   <author>
       <name>Yun-Xin Fu</name>
       <address>
                 Human Genetics Center,
                 University of Texas at Houston,
                 6901 Bertner Ave.,
                 Houston, Texas 77030,
                 U.S.A.
       </address>
       <e-mail>fu@hgc.sph.uth.tmc.edu</e-mail>
   </author>
 
</data>

Save it with the .xml extension, and make sure the two files, SequenceData.dtd and ShowSequenceData.xsl, are in the same directory. You can then view the content in Internet Explorer, if it is installed. It should look like

[Figure: xmlRender.jpg, the example file rendered in Internet Explorer]
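
If no browser is at hand, the same structure can be inspected from a script. The sketch below, again in Python with the standard library, totals the sequence counts per population and lists the author names for a shortened copy of the example. The function name `summarize` and the shortened data are assumptions for illustration, and the DOCTYPE line is left out so that the snippet does not depend on SequenceData.dtd.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example file, without the first three lines.
EXAMPLE = """<data type = "sequence">
   <population name = "population1">
      <seq name = "seq1" locus = "locus1" count = "2">
         AAAAAAAAAAAATTTTTTTTT
      </seq>
      <seq name = "seq2" locus = "locus2" count = "1">
         GGGGAAAAAAAATTTTTTTTT
      </seq>
   </population>
   <population name = "population2">
      <seq name = "seq3" locus = "locus1" count = "2">
         AAAAAAAAAAAATTTTCTTTT
      </seq>
   </population>
   <author>
       <name>Haipeng Li</name>
   </author>
</data>"""

def summarize(xml_text):
    """Return (total count per population, list of author names)."""
    root = ET.fromstring(xml_text)
    totals = {}
    for population in root.findall("population"):
        totals[population.get("name")] = sum(
            int(seq.get("count", "1")) for seq in population.findall("seq")
        )
    authors = [a.findtext("name", default="").strip() for a in root.findall("author")]
    return totals, authors

totals, authors = summarize(EXAMPLE)
print(totals)
print(authors)
```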

