Can someone take a look at my report program? Waiting online, urgent!
alexru
1#
alexru posted on 2006-10-20 22:18
I have a file like the one below, containing many repeated <sequence...>....</sequence> blocks. I need to extract the sequence id, cloneid, and a few other fields from each block and produce a report:
[quote]
<?xml version="1.0"?>
<!DOCTYPE maxml-sequences SYSTEM "http://fantom.gsc.riken.go.jp/maxml/maxml.dtd">
<maxml-sequences>
<sequence id="G530106A19">
  <altid type="cloneid">G530106A19</altid>
  <altid type="seqid">106195</altid>
  <altid type="rearrayid">PS00034I09</altid>
  <altid type="accession">AK149923</altid>
  <altid type="estaccession">BY484353</altid>
  <seqid>106195</seqid>
  <cloneid>G530106A19</cloneid>
  <accession>AK149923</accession>
  <modified_time>Jan 31 2005</modified_time>
  <annotations>
    <annotation>
      <qualifier>cds_location</qualifier>
      <anntext>No CDS</anntext>
      <evidence>FANTOM3-Unconfirmed</evidence>
    </annotation>
    <annotation>
      <qualifier>transcript_desc_name</qualifier>
      <anntext>unclassifiable</anntext>
      <evidence></evidence>
    </annotation>
  </annotations>
</sequence>
<sequence id="D030025E18">
  <altid type="cloneid">D030025E18</altid>
  <altid type="seqid">56468</altid>
  <altid type="rearrayid">PX00180G19</altid>
  <altid type="accession">AK083478</altid>
  <altid type="estaccession">BB441051 BB655048 BB441051</altid>
  <altid type="f2seqid">56468</altid>
  <altid type="mgiclone">MGI:2418721</altid>
  <altid type="mgimarker">MGI:1345279</altid>
  <seqid>56468</seqid>
  <cloneid>D030025E18</cloneid>
  <accession>AK083478</accession>
  <modified_time>Jan 31 2005</modified_time>
  <annotations>
    <annotation>
      <qualifier>cds_location</qualifier>
      <anntext>79..1764</anntext>
      <evidence>FANTOM2</evidence>
    </annotation>
    <annotation>
      <qualifier>cds_gap</qualifier>
      <anntext>M1686</anntext>
      <evidence>FANTOM2</evidence>
    </annotation>
    <annotation>
      <qualifier>transcript_desc_name</qualifier>
      <anntext>solute carrier family 11 (proton-coupled divalent metal ion transporters), member 2 </anntext>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:1345279">MGD|MGI:1345279</datasrc>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=Nucleotide&term=NM_008732&doptcmdl=GenBank">GB|NM_008732</datasrc>
      <evidence>BLASTN, 99%, match=1689</evidence>
    </annotation>
    <annotation>
      <qualifier>transcript_desc_symbol</qualifier>
      <anntext>Slc11a2</anntext>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:1345279">MGD|MGI:1345279</datasrc>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=Nucleotide&term=NM_008732&doptcmdl=GenBank">GB|NM_008732</datasrc>
      <evidence>BLASTN, 99%, match=1689</evidence>
    </annotation>
    <annotation>
      <qualifier>transcript_desc_synonym</qualifier>
      <anntext>mk van Nramp2</anntext>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:1345279">MGD|MGI:1345279</datasrc>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=Nucleotide&term=NM_008732&doptcmdl=GenBank">GB|NM_008732</datasrc>
      <evidence>BLASTN, 99%, match=1689</evidence>
    </annotation>
    <annotation>
      <qualifier>gene_ontology</qualifier>
      <anntext>GO:0016021</anntext>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:1345279">MGD|MGI:1345279</datasrc>
      <evidence>IEA|BLASTN/MGD</evidence>
    </annotation>
    <annotation>
      <qualifier>gene_ontology</qualifier>
      <anntext>GO:0016020</anntext>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:1345279">MGD|MGI:1345279</datasrc>
      <evidence>IEA|BLASTN/MGD</evidence>
    </annotation>
    <annotation>
      <qualifier>gene_ontology</qualifier>
      <anntext>GO:0005381</anntext>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:1345279">MGD|MGI:1345279</datasrc>
      <evidence>IEA|BLASTN/MGD</evidence>
    </annotation>
    <annotation>
      <qualifier>gene_ontology</qualifier>
      <anntext>GO:0005215</anntext>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:1345279">MGD|MGI:1345279</datasrc>
      <evidence>IEA|BLASTN/MGD</evidence>
    </annotation>
    <annotation>
      <qualifier>gene_ontology</qualifier>
      <anntext>GO:0006810</anntext>
      <datasrc xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:href="http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:1345279">MGD|MGI:1345279</datasrc>
      <evidence>IEA|BLASTN/MGD</evidence>
    </annotation>
  </annotations>
................................................
[/quote]
My program is below. I want it to produce a formatted report, but the output contains many repeated values (in every column). Where is the mistake? I intended each sequence to produce a single row that holds all of its fields.
[code]
#!/usr/bin/perl
if(!@ARGV){print "Usage:$0 input_file.\n";exit;}
format STDOUT=
------------------------------------------------------------------------------
sequence id | cloneid | seqid | rearrayid | accession | estaccession
@<<<<<<<<<< @<<<<<<<<<< @<<<<<<<<<@<<<<<<<<<<< @<<<<<<<<<<<< @<<<<<<<<<<<<<<
$sequenceid, $cloneid, $seqid, $rearrayid $accession $estaccession
.
open(IN,"$ARGV[0]") or die "$!";
while($line=<IN>){
    if($line=~/<sequence/){
        ($sequenceid)=$line=~/"(\S+)"/;
        write;
    }elsif($line=~/type="cloneid"/){
        ($cloneid)=$line=~/>(\S+)</;
        write;
    }elsif($line=~/type="seqid"/){
        ($seqid)=$line=~/>(\S+)</;
        write;
    }elsif($line=~/type="rearrayid"/){
        ($rearrayid)=$line=~/>(\S+)</;
        write;
    }elsif($line=~/type="accession"/){
        ($accession)=$line=~/>(\S+)</;
        write;
    }elsif($line=~/type="estaccession"/){
        ($estaccession)=$line=~/>(\S+)</;
        write;
    }else{;
    }
}
close IN or die "$!";
[/code]
Thanks!
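One likely cause of the repeated values is that the script calls `write` after every matching line, so each field prints a full row that still carries the values left over from earlier lines. Below is a minimal sketch of one way to get a single row per sequence: collect the fields while reading a block and emit the row only when `</sequence>` is reached. It assumes each element sits on its own line in the real file, as the original script already does. The names `parse_records`, `format_row`, `@COLUMNS`, and `$ROW_FORMAT` are illustrative and not from the original post, and the sketch uses `sprintf` in place of `format`/`write` to keep the row logic in one spot.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Report columns, in output order (illustrative names, not from the original post).
my @COLUMNS    = qw(sequenceid cloneid seqid rearrayid accession estaccession);
my $ROW_FORMAT = "%-12s %-12s %-8s %-12s %-10s %s";

# Read the file line by line and return one hash per <sequence> block.
# Assumes every element is on its own line.
sub parse_records {
    my ($fh) = @_;
    my (@records, $rec);
    while (my $line = <$fh>) {
        if ($line =~ /<sequence\s+id="([^"]+)"/) {
            # New block: start a fresh record so no stale values carry over.
            $rec = { sequenceid => $1 };
        }
        elsif ($rec && $line =~ /<altid\s+type="(\w+)">([^<]*)</) {
            # [^<]* also accepts values with spaces, e.g. several EST accessions.
            $rec->{$1} = $2;
        }
        elsif ($rec && $line =~ m{</sequence>}) {
            # Block complete: keep exactly one record for it.
            push @records, $rec;
            undef $rec;
        }
    }
    # Keep a trailing block whose closing tag is missing (truncated input).
    push @records, $rec if $rec;
    return @records;
}

# Turn one record into a single fixed-width row; missing fields become empty.
sub format_row {
    my ($rec) = @_;
    return sprintf $ROW_FORMAT,
        map { defined $rec->{$_} ? $rec->{$_} : '' } @COLUMNS;
}

# Act as a script only when a file name is given: perl report.pl input_file
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    printf "$ROW_FORMAT\n", @COLUMNS;
    print format_row($_), "\n" for parse_records($in);
    close $in or die "$!";
}
```

Matching the value with `[^<]*` instead of `\S+` matters for the second sequence in the sample, whose `estaccession` holds three space-separated IDs that `\S+` cannot match.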