13 8 / 2012

Following extensive discussion on the dev list of the pros and cons of configuration classes/modules, I have refactored my coordinate mapper to keep configuration as isolated as possible.

All mapping functions use 0-based coordinates internally. Conversion to and from 1-based coordinates is handled by custom MapPosition objects (they are currently separate from the Seq* positions, but could probably subclass ExactPosition). The MapPosition objects have to_dialect and from_dialect methods that automatically handle conversion between bases and other formatting details.
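
To make the base conversion concrete, here is a minimal sketch of the idea. SketchPosition is a hypothetical stand-in, not the actual MapPosition class, and the method signatures and the offset table are my assumptions for illustration. It only covers plain positive positions and ignores HGVS details such as intron offsets.

```python
class SketchPosition(object):
    """Hypothetical stand-in for a MapPosition; stores a 0-based integer."""

    # Assumed offsets added when writing each dialect:
    # the default dialect is 0-based, HGVS counts from 1.
    _offsets = {None: 0, "HGVS": 1}

    def __init__(self, pos):
        self.pos = int(pos)

    @classmethod
    def from_dialect(cls, text, dialect=None):
        # Parse the dialect's string and shift down to base 0.
        return cls(int(text) - cls._offsets[dialect])

    def to_dialect(self, dialect=None):
        # Shift back up to the dialect's base and format as a string.
        return str(self.pos + self._offsets[dialect])


p = SketchPosition.from_dialect("7", dialect="HGVS")
print(p.pos)                 # 6 (internal, 0-based)
print(p.to_dialect("HGVS"))  # 7 (round trip back to 1-based)
```

Keeping the offset inside the position object means the mapping functions themselves never need to know which base the caller prefers.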

There are two different ways a user can convert a coordinate from HGVS:

# ... assuming cm is an instance of CoordinateMapper
# Manually construct position from HGVS
CDS_coord = CDSPosition.from_hgvs("6+1")
genomic_coord = cm.c2g(CDS_coord)
print(genomic_coord.to_hgvs())

# Pass dialect argument to mapping function
genomic_coord = cm.c2g("6+1", dialect="HGVS")
print(genomic_coord.to_hgvs())
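
Both routes should end up at the same internal position. The sketch below shows one way a mapping function could accept either form. Everything in it is hypothetical: SketchPosition is a toy class, and shift_by is a toy mapping that just adds a fixed offset, not the real c2g logic.

```python
class SketchPosition(object):
    """Hypothetical toy position holding a 0-based integer."""

    def __init__(self, pos):
        self.pos = int(pos)

    @classmethod
    def from_hgvs(cls, text):
        return cls(int(text) - 1)   # HGVS is 1-based

    def to_hgvs(self):
        return str(self.pos + 1)


def shift_by(offset, coord, dialect=None):
    """Toy mapping function: accepts a position object or a dialect string."""
    if dialect == "HGVS":
        # String input: parse it with the dialect's own constructor.
        coord = SketchPosition.from_hgvs(coord)
    elif not isinstance(coord, SketchPosition):
        # No dialect given: treat the input as a plain 0-based value.
        coord = SketchPosition(coord)
    return SketchPosition(coord.pos + offset)


# Manually constructed position
print(shift_by(100, SketchPosition.from_hgvs("7")).to_hgvs())  # 107
# Dialect argument passed to the mapping function
print(shift_by(100, "7", dialect="HGVS").to_hgvs())            # 107
```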

Furthermore, the inheritance hierarchy is designed to allow a user to set a default string representation:

# Set MapPositions to print as HGVS by default
def use_hgvs(self):
    return str(self.to_hgvs())
MapPosition.__str__ = use_hgvs
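
The reason this works for every position type is ordinary Python attribute lookup: a subclass that does not define its own __str__ picks up whatever the base class currently has. Here is a self-contained sketch with hypothetical class names (MapPositionSketch and GenomePositionSketch are not the real classes):

```python
class MapPositionSketch(object):
    """Hypothetical base class; prints as a 0-based number by default."""

    def __init__(self, pos):
        self.pos = pos

    def to_hgvs(self):
        return str(self.pos + 1)   # assumed 1-based HGVS output

    def __str__(self):
        return str(self.pos)


class GenomePositionSketch(MapPositionSketch):
    """Hypothetical subclass that does not override __str__."""


g = GenomePositionSketch(6)
before = str(g)                    # "6", the base-class default

# Set positions to print as HGVS by default
def use_hgvs(self):
    return str(self.to_hgvs())
MapPositionSketch.__str__ = use_hgvs

after = str(g)                     # "7", subclass instances follow the base
print("%s -> %s" % (before, after))
```

Note that the assignment changes the class itself, so it affects every instance in the running program, including ones created earlier.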

The current revision passes its tests using 0-based coordinates. I have not yet implemented tests for from_hgvs or to_hgvs, but that’s next on my list. I’m hoping to have time to handle strand and mixed-strand cases, too.

Update:

The latest revision now tests with default settings and HGVS settings.