13 8 / 2012
Coordinate Mapping update
Following extensive discussion on the dev list of the pros and cons of configuration classes/modules, I have refactored my coordinate mapper to keep configuration as isolated as possible.
All mapping functions use 0-based coordinates internally. Conversion to and from 1-based coordinates is handled by custom MapPosition objects. (They are currently separate from the Seq* positions but could probably subclass ExactPosition.) The MapPosition objects have to_dialect and from_dialect methods (for example, to_hgvs and from_hgvs) that automatically handle conversion between bases and other formatting details.
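To make the base conversion concrete, here is a minimal sketch of how such a position class might round-trip an HGVS-style CDS coordinate. SketchCDSPosition and its attributes are hypothetical names for illustration, not the actual classes in my branch, and the sketch assumes the usual HGVS convention that there is no position 0 (so 1 maps to 0 internally, while negative positions keep their value).

```python
import re


class SketchCDSPosition(object):
    """Hypothetical stand-in for a MapPosition subclass (0-based internally)."""

    # An integer position with an optional signed intronic offset, e.g. "6+1"
    _HGVS_RE = re.compile(r"^(-?\d+)([+-]\d+)?$")

    def __init__(self, pos, offset=0):
        self.pos = pos        # 0-based position
        self.offset = offset  # signed offset into the intron, 0 if none

    @classmethod
    def from_hgvs(cls, text):
        match = cls._HGVS_RE.match(text)
        if match is None:
            raise ValueError("not a recognised position: %r" % text)
        number = int(match.group(1))
        if number == 0:
            raise ValueError("HGVS numbering has no position 0")
        offset = int(match.group(2) or 0)
        # Positive positions shift down by one; negative ones are unchanged
        pos = number - 1 if number > 0 else number
        return cls(pos, offset)

    def to_hgvs(self):
        # Reverse the shift applied in from_hgvs
        number = self.pos + 1 if self.pos >= 0 else self.pos
        if self.offset:
            return "%d%+d" % (number, self.offset)
        return str(number)
```

With this sketch, SketchCDSPosition.from_hgvs("6+1") stores pos 5 and offset 1, and calling to_hgvs() on the result gives back "6+1".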
There are two different ways a user can convert a coordinate from HGVS:
# ... assuming cm is an instance of CoordinateMapper
# Manually construct position from HGVS
CDS_coord = CDSPosition.from_hgvs("6+1")
genomic_coord = cm.c2g(CDS_coord)
print(genomic_coord.to_hgvs())
# Pass dialect argument to mapping function
genomic_coord = cm.c2g("6+1", dialect="HGVS")
print(genomic_coord.to_hgvs())
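One way a mapping function could accept either form is to normalise its argument before doing any work. The following is only a sketch of that idea: SketchPosition and coerce are hypothetical names, the parser handles plain positive integers only, and the real c2g may be organised quite differently.

```python
class SketchPosition(object):
    """Hypothetical position object holding a 0-based integer."""

    def __init__(self, pos):
        self.pos = pos

    @classmethod
    def from_hgvs(cls, text):
        # Simplified for the sketch: plain positive 1-based integers only
        return cls(int(text) - 1)

    def to_hgvs(self):
        return str(self.pos + 1)


def coerce(value, dialect=None):
    """Return a SketchPosition from an object, a 0-based value, or a dialect string."""
    if isinstance(value, SketchPosition):
        return value  # already a position object, nothing to convert
    if dialect is None:
        return SketchPosition(int(value))  # treat as a raw 0-based coordinate
    # Look up a from_<dialect> constructor by name, e.g. from_hgvs
    parser = getattr(SketchPosition, "from_" + dialect.lower(), None)
    if parser is None:
        raise ValueError("unknown dialect: %r" % dialect)
    return parser(value)
```

A mapping function would then call coerce(arg, dialect) first and work only with position objects from that point on, which keeps the dialect handling in one place.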
Furthermore, the inheritance hierarchy is designed to allow a user to set a default string representation:
# Set MapPositions to print as HGVS by default
def use_hgvs(self):
    return str(self.to_hgvs())

MapPosition.__str__ = use_hgvs
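The reason this works for every position type is that assigning __str__ on the base class changes what all subclasses inherit, unless a subclass defines its own __str__. A self-contained sketch with hypothetical stand-in classes (not the real MapPosition hierarchy) shows the effect:

```python
class SketchMapPosition(object):
    """Hypothetical base class; prints its 0-based value by default."""

    def __init__(self, pos):
        self.pos = pos

    def to_hgvs(self):
        return str(self.pos + 1)

    def __str__(self):
        return str(self.pos)


class SketchCDSPosition(SketchMapPosition):
    """Hypothetical subclass that inherits __str__ from the base class."""


def use_hgvs(self):
    return str(self.to_hgvs())


position = SketchCDSPosition(5)
before = str(position)                 # uses the default 0-based __str__
SketchMapPosition.__str__ = use_hgvs   # rebind on the base class only
after = str(position)                  # the subclass instance now prints as HGVS
```

Here before is "5" and after is "6", even though only the base class was modified and the instance was created before the change.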
The revision as of this writing passes its tests using 0-based coordinates. I have not yet implemented tests for from_hgvs or to_hgvs, but that’s next on my list. I’m also hoping to have time for strand and mixed-strand handling.
Update:
The latest revision now tests with default settings and HGVS settings.