Package csb

Source Code for Package csb

"""
CSB is a high-level, object-oriented library used to solve problems in the
field of Computational Structural Biology.


Introduction
============

The library is composed of a set of highly branched python packages
(namespaces). Some of the packages are meant to be directly used by
the clients (core library), while others are utility modules and take
part in the development of the library:

    1. Core class library -- object-oriented, granular, with an emphasis
       on design and clean interfaces. A Sequence is not a string, and a
       Structure is not a dict or list. Naming conventions matter.

    2. Application framework -- executable console applications
       ("protocols"), which consume objects from the core library.
       The framework ensures that each CSB application is also reusable
       and can be instantiated as a regular python object without any
       ugly side effects (sys.exit() and friends). See L{csb.apps} for more
       details.

    3. Test framework -- built on top of the standard unittest as a thin
       wrapping layer. Provides some sugar like transparent management of
       test data files, and modular test execution. L{csb.test} will give
       you all the details.
 
The core library is roughly composed of:

    - bioinformatics API: L{csb.bio}, which includes
      L{csb.bio.io}, L{csb.bio.structure}, L{csb.bio.sequence},
      L{csb.bio.hmm}

    - statistics API: L{csb.statistics}, L{csb.numeric}

    - utilities: L{csb.io}, L{csb.core}

 
Getting started
===============

Perhaps one of the most frequently used parts of the library is the
L{csb.bio.structure} module, which provides the L{Structure}, L{Chain},
L{Residue} and L{Atom} objects. You could easily build a L{Structure}
from scratch, but a far more common scenario is parsing a structure from
a PDB file using one of the L{AbstractStructureParser}s. All bio IO
objects, including the StructureParser factory, are defined in
L{csb.bio.io} and sub-packages:

    >>> from csb.bio.io.wwpdb import StructureParser
    >>> p = StructureParser("/some/file/pdb1x80.ent")
    >>> s = p.parse_structure()
    >>> print(s)
    <Structure: 1x80, 2 chains>
 
The C{parse_structure()} call returns a L{csb.bio.structure.Structure}
instance, which is a composite hierarchical object:

    >>> for chain_id in s.chains:
    ...     chain = s.chains[chain_id]
    ...     for residue in chain.residues:
    ...         for atom_id in residue.atoms:
    ...             atom = residue.atoms[atom_id]
    ...             print(atom.vector)
 
Some of the inner objects in this hierarchy behave just like dictionaries
(but are not):

    >>> s.chains['A']        # access chain A by ID
    <Chain A: Protein>
    >>> s['A']               # the same
    <Chain A: Protein>

Others behave like collections:

    >>> chain.residues[10]               # 1-based access to the residues in the chain
    <ProteinResidue [10]: PRO 10>
    >>> chain[10]                        # 0-based, list-like access
    <ProteinResidue [11]: GLY 11>
 
But all entities are iterable because they inherit the C{items} iterator
from L{AbstractEntity}. The above loop can be shortened:

    >>> for chain in s.items:
    ...     for residue in chain.items:
    ...         for atom in residue.items:
    ...             print(atom.vector)

or even more:

    >>> from csb.bio.structure import Atom
    >>> for atom in s.components(klass=Atom):
    ...     print(atom.vector)
 
You may also be interested in extracting a sub-chain from this structure:

    >>> s.chains['B'].subregion(3, 20)    # from positions 3 to 20, inclusive
    <Chain B: Protein>

or in modifying it in some way. For example, to append a new residue,
try:

    >>> from csb.bio.structure import ProteinResidue
    >>> from csb.bio.sequence import ProteinAlphabet
    >>> residue = ProteinResidue(401, ProteinAlphabet.ALA)
    >>> s.chains['A'].residues.append(residue)

Finally, you would probably want to save your structure back to a PDB file:

    >>> s.to_pdb('/some/file/name.pdb')

 
Where to go from here
=====================

If you want to dive into statistics, you could peek inside L{csb.statistics}
and its sub-packages. For example, L{csb.statistics.pdf} contains a collection
of L{probability density objects<csb.statistics.pdf.AbstractDensity>},
like L{Gaussian<csb.statistics.pdf.Normal>} or L{Gamma<csb.statistics.pdf.Gamma>}.

But chances are you would first like to try reading some files, so you could
start exploring L{csb.bio.io} right now. As we have already seen,
L{csb.bio.io.wwpdb} provides PDB L{Structure<csb.bio.structure.Structure>}
parsers, for example L{csb.bio.io.wwpdb.RegularStructureParser} and
L{csb.bio.io.wwpdb.LegacyStructureParser}.

L{csb.bio.io.fasta} is all about reading FASTA
L{Sequence<csb.bio.sequence.AbstractSequence>}s and
L{SequenceAlignment<csb.bio.sequence.AbstractAlignment>}s. Be sure to check out
L{csb.bio.io.fasta.SequenceParser}, L{csb.bio.io.fasta.SequenceAlignmentReader}
and L{csb.bio.io.fasta.StructureAlignmentFactory}.

If you are working with HHpred (L{ProfileHMM<csb.bio.hmm.ProfileHMM>}s,
L{HHpredHit<csb.bio.hmm.HHpredHit>}s), then L{csb.bio.io.hhpred} is for you.
This package provides L{csb.bio.io.hhpred.HHProfileParser} and
L{csb.bio.io.hhpred.HHOutputParser}, which are used to read *.hhm and *.hhr
files.

Finally, if you want to make some nice plots with matplotlib, you may like the
clean object-oriented interface of our L{Chart<csb.io.plots.Chart>}. See
L{csb.io.plots} and maybe also L{csb.io.tsv} to get started.

 
Development
===========

When contributing code to CSB, please take into account the following:

    1. New features or bug fixes should always be accompanied by test cases.
       Also, always run the complete test suite before committing. For more
       details on this topic, see L{csb.test}.

    2. The source code of CSB must be cross-platform and cross-interpreter
       compatible. L{csb.core} and L{csb.io} will give you all necessary
       details on how to use the CSB compatibility layer.

 
License
=======

CSB is open source and distributed under the OSI-approved MIT license::

    Copyright (c) 2012 Michael Habeck

    Permission is hereby granted, free of charge, to any person obtaining
    a copy of this software and associated documentation files (the
    "Software"), to deal in the Software without restriction, including
    without limitation the rights to use, copy, modify, merge, publish,
    distribute, sublicense, and/or sell copies of the Software, and to
    permit persons to whom the Software is furnished to do so, subject to
    the following conditions:

    The above copyright notice and this permission notice shall be
    included in all copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
    EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
    MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
    IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
    CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
    TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
    SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

"""
 
__version__ = '1.2.2.614'


class Version(object):
    """
    CSB version number.
    """

    def __init__(self):

        # Expect "major.minor.micro" with an optional fourth revision part
        version = __version__.split('.')

        if len(version) not in (3, 4):
            raise ValueError(version)

        self._package = __name__

        self._major = version[0]
        self._minor = version[1]
        self._micro = version[2]
        self._revision = None

        if len(version) == 4:
            self._revision = version[3]

    def __str__(self):
        return self.short

    def __repr__(self):
        return '{0.package} {0.full}'.format(self)

    @property
    def major(self):
        """
        Major version (huge, incompatible changes)
        @rtype: int
        """
        return int(self._major)

    @property
    def minor(self):
        """
        Minor version (significant, but compatible changes)
        @rtype: int
        """
        return int(self._minor)

    @property
    def micro(self):
        """
        Micro version (bug fixes and small enhancements)
        @rtype: int
        """
        return int(self._micro)

    @property
    def revision(self):
        """
        Build number (exact repository revision number)
        @rtype: int
        """
        # Fall back to the raw value when the revision is missing (None)
        # or is not a plain integer string
        try:
            return int(self._revision)
        except (TypeError, ValueError):
            return self._revision

    @property
    def short(self):
        """
        Canonical three-part version number.
        """
        return '{0.major}.{0.minor}.{0.micro}'.format(self)

    @property
    def full(self):
        """
        Full version, including the repository revision number.
        """
        return '{0.major}.{0.minor}.{0.micro}.{0.revision}'.format(self)

    @property
    def package(self):
        return self._package
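
The version handling above can be tried out without installing CSB. The following is a minimal standalone sketch of the same splitting logic; the function name `split_version` is hypothetical and not part of CSB, and in a real session you would read the `short` and `full` properties of a `Version` instance instead.

```python
def split_version(version_string):
    """
    Hypothetical helper (not part of CSB) that mirrors how Version
    derives its 'short' and 'full' strings from a dotted version string.
    """
    parts = version_string.split('.')

    # Same rule as Version.__init__: three or four dot-separated parts
    if len(parts) not in (3, 4):
        raise ValueError(parts)

    major, minor, micro = (int(part) for part in parts[:3])

    # The revision is optional; keep the raw text if it is not an integer
    revision = None
    if len(parts) == 4:
        try:
            revision = int(parts[3])
        except ValueError:
            revision = parts[3]

    short = '{0}.{1}.{2}'.format(major, minor, micro)
    full = '{0}.{1}'.format(short, revision)
    return short, full


short, full = split_version('1.2.2.614')
print(short)    # 1.2.2
print(full)     # 1.2.2.614
```

Note that, as in the class above, a three-part version string leaves the revision as `None`, so the full form ends in `.None`.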