biopython Bio.PDB FAQ

Biopython
Bio.PDB module FAQ
1, Focuses on working with crystal structures of biological macromolecules.
2, Well tested. Nearly 5500 structures from the PDB were tried, and all seemed to be parsed correctly.
3, Really fast!
4, No direct support for molecular graphics, but there are quite a few Python-based solutions that can be used alongside Bio.PDB, for example PyMOL:
http://pymol.sourceforge.net
5, USAGE:
a, Importing Bio.PDB:
>>> from Bio.PDB import *
b, Input/output
Create a structure object from a PDB file:
  • Create a PDBParser object:
    >>> parser = PDBParser()
  • Create a structure object from the PDB file:
    >>> structure = parser.get_structure('DOG', '1LKE.pdb')

Create a structure object from an mmCIF file:
  • Create an MMCIFParser object:
    >>> parser = MMCIFParser()
  • Create a structure object from the mmCIF file:
    >>> structure = parser.get_structure('DOG', '1LKE.cif')
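The two recipes above differ only in the parser class, so they can be folded into one helper. This is a minimal sketch, assuming the file suffix reliably indicates the format; `load_structure` and its list of suffixes are illustrative choices, not part of Bio.PDB.

```python
from pathlib import Path

from Bio.PDB import MMCIFParser, PDBParser


def load_structure(path, structure_id=None):
    """Parse a PDB or mmCIF file, picking the parser from the file suffix."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix in (".cif", ".mmcif"):
        parser = MMCIFParser(QUIET=True)
    elif suffix in (".pdb", ".ent"):
        parser = PDBParser(QUIET=True)
    else:
        raise ValueError(f"Unrecognised structure file suffix: {suffix!r}")
    # Default the structure id to the file name without its suffix.
    return parser.get_structure(structure_id or path.stem, str(path))
```

With this in place, `load_structure('1LKE.pdb')` and `load_structure('1LKE.cif')` would both return a structure object whose id is '1LKE'.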
Lower-level access to an mmCIF file:
You can create a Python dict that maps all mmCIF tags to their values. If a tag has multiple values, it is mapped to a list of values.
>>> mmcif_dict = MMCIF2Dict('1LKE.cif')
E.g. get the solvent content from an mmCIF file:
>>> sc = mmcif_dict['_exptl_crystal.density_percent_sol']
E.g. get the list of the y coordinates of all atoms:
>>> y_list = mmcif_dict['_atom_site.Cartn_y']
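Because a tag may map to either a single value or a list, and the values are text, it can help to normalise them before doing arithmetic. The sketch below is one way to do that; `tag_values` and `tag_floats` are hypothetical helper names, and the plain dict only stands in for a real MMCIF2Dict object (its numbers are made up).

```python
def tag_values(mmcif_dict, tag):
    """Return the values stored under an mmCIF tag as a list of strings."""
    value = mmcif_dict[tag]
    # A single value may be stored as a bare string; wrap it so callers
    # can always iterate over the result.
    if isinstance(value, str):
        return [value]
    return list(value)


def tag_floats(mmcif_dict, tag):
    """Return the values under a tag as floats, skipping '?' and '.' placeholders."""
    return [float(v) for v in tag_values(mmcif_dict, tag) if v not in ("?", ".")]


# Stand-in for MMCIF2Dict('1LKE.cif'); the values are invented for illustration.
example = {
    "_exptl_crystal.density_percent_sol": "45.2",
    "_atom_site.Cartn_y": ["1.5", "2.5", "?"],
}
solvent_content = tag_floats(example, "_exptl_crystal.density_percent_sol")
y_list = tag_floats(example, "_atom_site.Cartn_y")
```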
Parsing the PDB header:
>>> resolution = structure.header['resolution']
>>> keywords = structure.header['keywords']
The available keys are: name, head, deposition_date, release_date, structure_method, resolution, structure_reference, journal_reference, author, compound.
The header dict can also be created without creating a Structure object:
>>> with open(filename, 'r') as handle:
...     header_dict = parse_pdb_header(handle)
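Not every file fills in every header field (an NMR entry, for instance, has no resolution), so reading the dict defensively is safer than indexing it directly. A small sketch, assuming the header is an ordinary dict with the keys listed above; `summarize_header` is a hypothetical helper, not a Bio.PDB function.

```python
def summarize_header(header):
    """Build a one-line summary from a PDB header dict, tolerating missing keys."""
    name = header.get("name") or "unnamed entry"
    method = header.get("structure_method") or "unknown method"
    resolution = header.get("resolution")
    # Resolution is only meaningful when the file actually reports one.
    if resolution is None:
        res_text = "resolution not reported"
    else:
        res_text = f"{float(resolution):.2f} A"
    return f"{name} ({method}, {res_text})"
```

It could be called as `summarize_header(structure.header)` or on the dict returned by `parse_pdb_header`.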
Download a structure from the PDB:
>>> pdb1 = PDBList()
>>> pdb1.retrieve_pdb_file('1LKE')
The PDBList class can also be used as a script from the shell (you must be in the directory that contains PDBList.py, of course):
python PDBList.py 1LKE
To download the entire PDB, if necessary:
python PDBList.py all /data/pdb
python PDBList.py all /data/pdb -d
Adding the -d option stores all files in the same directory, which is not a good choice. Otherwise, they are sorted into PDB-style subdirectories according to their PDB IDs.
Keep a local copy of the PDB up to date using a PDBList object:
>>> p1 = PDBList(pdb='/data/pdb')
>>> p1.update_pdb()
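To make the "PDB-style subdirectories" concrete, the sketch below computes where an entry would sit in such a mirror. It assumes the conventional divided layout, in which the subdirectory is named after the second and third characters of the ID and the file is called pdbXXXX.ent; `divided_pdb_path` is an illustrative helper, not something PDBList exposes.

```python
from pathlib import PurePosixPath


def divided_pdb_path(root, pdb_id):
    """Guess where a PDB entry sits in a 'divided' local mirror."""
    code = pdb_id.lower()
    if len(code) != 4:
        raise ValueError(f"Expected a 4-character PDB ID, got {pdb_id!r}")
    # Subdirectory = 2nd and 3rd characters of the ID; file = pdbXXXX.ent
    return PurePosixPath(root) / code[1:3] / f"pdb{code}.ent"


print(divided_pdb_path("/data/pdb", "1LKE"))  # /data/pdb/lk/pdb1lke.ent
```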
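Use the PDBIO class for writing PDB files.
E.g. saving a structure:
>>> io = PDBIO()
>>> io.set_structure(structure)
>>> io.save('out.pdb')
Writing mmCIF files was not supported when this FAQ was written; newer Biopython releases add an MMCIFIO class for that.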
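PDBIO can also write to an open text handle instead of a file name, which is handy when the PDB text is needed in memory. A minimal sketch under that assumption; `structure_to_pdb_text` is a hypothetical wrapper name.

```python
from io import StringIO

from Bio.PDB import PDBIO


def structure_to_pdb_text(structure):
    """Serialise a structure to PDB-format text without touching the disk."""
    io = PDBIO()
    io.set_structure(structure)
    buffer = StringIO()
    # PDBIO.save accepts an open text handle as well as a file name.
    io.save(buffer)
    return buffer.getvalue()
```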
The overall layout of a structure object follows the SMCRA scheme (Structure/Model/Chain/Residue/Atom):
A structure consists of models.
A model consists of chains.
A chain consists of residues.
A residue consists of atoms.
Navigate through a structure object:
>>> p = PDBParser()
>>> structure = p.get_structure('X', 'pdb1fat.ent')
>>> for model in structure:
...     for chain in model:
...         for residue in chain:
...             for atom in residue:
...                 print(atom)
Some other shortcuts:
>>> # iterate over all atoms in a structure
>>> for atom in structure.get_atoms():
...     print(atom)
>>> # iterate over all residues in a model
>>> for residue in model.get_residues():
...     print(residue)
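The nested iteration above can be turned into a quick sanity check on a freshly parsed file. The sketch below tallies how many entities sit at each SMCRA level; `count_entities` is a hypothetical helper, and it relies only on each level being iterable and on `len(residue)` giving the number of atoms in a residue. Note that it counts across all models, so NMR files will report more atoms than a single model holds.

```python
def count_entities(structure):
    """Count the entities found at each SMCRA level of a structure."""
    counts = {"models": 0, "chains": 0, "residues": 0, "atoms": 0}
    for model in structure:
        counts["models"] += 1
        for chain in model:
            counts["chains"] += 1
            for residue in chain:
                counts["residues"] += 1
                # A residue's length is the number of atoms it holds.
                counts["atoms"] += len(residue)
    return counts
```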
Structures, models, chains, residues and atoms are called Entities in Biopython.
You can always get the parent Entity from a child Entity, e.g.:
>>> residue = atom.get_parent()
>>> chain = residue.get_parent()
You can also test whether an Entity has a certain child with the has_id method.
Getting all children of a given level can be done a bit more conveniently:
>>> atoms = structure.get_atoms()
>>> residues = structure.get_residues()
>>> atoms = chain.get_atoms()
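Simply put, these levels (structure, model, chain, residue, atom) are a matter of scope: you can extract lower-level content from a higher level, and you can also go from the lower levels back up to their parents (the parent-child relationship).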
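As an illustration of the parent-child links, the sketch below walks from an atom up to its model and uses has_id to test for a child before indexing. Both function names are hypothetical; the methods they call (get_parent, get_id, get_resname, get_name, has_id) are the standard entity accessors.

```python
def locate_atom(atom):
    """Describe where an atom sits by walking up through its parents."""
    residue = atom.get_parent()
    chain = residue.get_parent()
    model = chain.get_parent()
    # A residue id is a (hetero-flag, sequence identifier, insertion code) tuple.
    _hetero_flag, resseq, icode = residue.get_id()
    return {
        "model": model.get_id(),
        "chain": chain.get_id(),
        "residue": f"{residue.get_resname()}{resseq}{icode.strip()}",
        "atom": atom.get_name(),
    }


def has_backbone_ca(chain, resseq):
    """True if the chain has residue `resseq` and that residue has a CA atom."""
    return chain.has_id(resseq) and chain[resseq].has_id("CA")
```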
>>> # get all residues from a structure
>>> res_list = Selection.unfold_entities(structure, 'R')
>>> # get all atoms from a chain
>>> atom_list = Selection.unfold_entities(chain, 'A')
The level codes are: A=atom, R=residue, C=chain, M=model, S=structure.
You can also move across levels, from children up to their parents:
>>> residue_list = Selection.unfold_entities(atom_list, 'R')
>>> chain_list = Selection.unfold_entities(atom_list, 'C')
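Going down and then back up can be combined into a simple filter. The sketch below unfolds an entity into atoms, keeps those with a given name, and maps them back to their unique parent residues; `residues_with_atom` is a hypothetical helper built on Selection.unfold_entities.

```python
from Bio.PDB import Selection


def residues_with_atom(entity, atom_name):
    """Return the residues under `entity` that contain an atom called `atom_name`."""
    # Go down: unfold the entity (structure, model or chain) into its atoms.
    atoms = Selection.unfold_entities(entity, "A")
    wanted = [atom for atom in atoms if atom.get_name() == atom_name]
    if not wanted:
        return []
    # Go up: map the selected atoms back to their unique parent residues.
    return Selection.unfold_entities(wanted, "R")
```

For example, `residues_with_atom(structure, 'CA')` would return the residues that have a C-alpha atom (and any residue holding an atom that happens to share that name).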
Extract a specific Atom/Residue/Chain/Model from a structure:
Just index into the nested levels, using each child's id as the key:
>>> model = structure[0]
>>> chain = model['A']
>>> residue = chain[100]
>>> atom = residue['CA']
Or in a single step:
>>> atom = structure[0]['A'][100]['CA']
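Indexing with an id that does not exist raises a KeyError, so a chained lookup like the one above can fail at any level. A small sketch that turns this into a None result; `find_atom` is a hypothetical convenience wrapper.

```python
def find_atom(structure, model_id, chain_id, resseq, atom_name):
    """Return the requested atom, or None if any level of the path is missing."""
    try:
        return structure[model_id][chain_id][resseq][atom_name]
    except KeyError:
        # Raised by whichever level does not contain the requested child id.
        return None
```

Usage would look like `atom = find_atom(structure, 0, 'A', 100, 'CA')`, followed by a check for None before using the atom.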
Model id: an integer which denotes the rank of the model in the PDB/mmCIF file. Model ids start at 0. Crystal structures generally have only one model (with id 0), while NMR files usually have more.
Chain id: specified in the file, a single character (typically a letter).
Residue id: complicated, due to the clumsy PDB format. A residue id is a tuple with three elements:
        1, the hetero-flag: 'H_' plus the name of the hetero-residue, e.g. 'H_GLC', or 'W' in the case of a water molecule.
        2, the sequence identifier in the chain, e.g. 100.
        3, the insertion code, e.g. 'A'. The insertion code is sometimes used to preserve a certain desirable residue numbering scheme.
The hetero-flag and insertion code can be blank:
>>> # full id
>>> residue = chain[(' ', 100, ' ')]
>>> # shortcut id
>>> residue = chain[100]
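The three-element tuple is easy to get wrong by hand, so a tiny constructor can help. This is only a sketch of the rules listed above; `make_residue_id` and its parameter names are hypothetical.

```python
def make_residue_id(resseq, icode=" ", hetero_name=None, water=False):
    """Build the (hetero-flag, sequence identifier, insertion code) tuple."""
    if water:
        hetero_flag = "W"
    elif hetero_name:
        hetero_flag = "H_" + hetero_name
    else:
        hetero_flag = " "  # blank flag for an ordinary residue
    return (hetero_flag, resseq, icode)


print(make_residue_id(100))                    # (' ', 100, ' ')
print(make_residue_id(10, hetero_name="GLC"))  # ('H_GLC', 10, ' ')
print(make_residue_id(301, water=True))        # ('W', 301, ' ')
```

The result can be used directly as a key, e.g. `chain[make_residue_id(10, hetero_name='GLC')]`.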
Atom id: the atom name, e.g. 'CA'.
In PDB files, a space can be part of an atom name. For example, a calcium atom is named 'CA..', which distinguishes it from a C-alpha atom, named '.CA.' (the dots stand for spaces).
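The difference is easiest to see in the raw record, where the atom name occupies a fixed four-character field (columns 13 to 16 of an ATOM/HETATM line). The sketch below builds the start of two such records and reads the field back; `record_prefix` and `atom_name_field` are hypothetical helpers, and the serial numbers are made up.

```python
def record_prefix(record, serial, fullname):
    """Format the first 16 columns of an ATOM/HETATM record."""
    # Columns 1-6 record name, 7-11 serial number, 12 blank, 13-16 atom name.
    return f"{record:<6s}{serial:5d} {fullname:4s}"


def atom_name_field(pdb_line):
    """Return the raw 4-character atom name field (columns 13-16)."""
    return pdb_line[12:16]


c_alpha = record_prefix("ATOM", 2, " CA ")      # C-alpha: name starts in column 14
calcium = record_prefix("HETATM", 900, "CA  ")  # calcium: name starts in column 13

print(repr(atom_name_field(c_alpha)))  # ' CA '
print(repr(atom_name_field(calcium)))  # 'CA  '
# Stripping the spaces would make the two names indistinguishable.
print(atom_name_field(c_alpha).strip() == atom_name_field(calcium).strip())  # True
```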
Disorder handling
There are two views: the atom point of view and the residue point of view.
Disordered atoms and residues are stored in special objects that behave as if there is no disorder.