Local Alignment Algorithm for mtDNA

The vast majority of my work on mtDNA has focused on global alignments, because they are extremely efficient, even on consumer devices. However, I’ve largely exhausted the topic of global alignments, so I’ve started to focus on local alignments to further support my research. That is, I’m going to second-guess my work using local alignments, to see if I get the same results. So far, that is exactly the case, using the attached local alignment algorithm, which is very straightforward.

Specifically, it takes an input genome and a comparison genome, and processes the input genome 500 bases at a time. For each 500-base segment, it tests every starting index in the comparison genome, one by one, and keeps the index at which the number of matching bases between the segment and the comparison genome is maximized. Doing this for all 500-base segments of the input genome produces a starting index in the comparison genome for each segment (i.e., an alignment), together with a total match count under that alignment.
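To make the procedure concrete, here is a minimal Python sketch of the search described above. It is not the attached code, just an illustration under a few assumptions of mine: genomes are plain strings of bases, segments do not overlap, the final segment may be shorter than 500 bases, ties go to the earliest index, indices are 0-based, and no gaps are inserted. The names best_offset and local_alignment are hypothetical.

```python
def best_offset(segment, comparison):
    """Return (start_index, match_count) for the offset in `comparison`
    where `segment` has the most position-by-position matches.
    Assumption: ties are broken in favor of the earliest index."""
    best_start, best_count = 0, -1
    # Try every offset at which the whole segment fits (at least offset 0).
    last_start = max(1, len(comparison) - len(segment) + 1)
    for start in range(last_start):
        window = comparison[start:start + len(segment)]
        count = sum(a == b for a, b in zip(segment, window))
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count


def local_alignment(input_genome, comparison_genome, segment_length=500):
    """Align each non-overlapping segment of `input_genome` independently.
    Returns the list of best starting indices (one per segment) and the
    total match count summed over all segments."""
    starts = []
    total_matches = 0
    for i in range(0, len(input_genome), segment_length):
        segment = input_genome[i:i + segment_length]
        start, count = best_offset(segment, comparison_genome)
        starts.append(start)
        total_matches += count
    return starts, total_matches


if __name__ == "__main__":
    # Toy example with 4-base segments instead of 500.
    print(local_alignment("ACGTTTGA", "GGACGTAATTGACC", segment_length=4))
```

In the toy example, the two segments ACGT and TTGA are each located independently in the comparison string, which is exactly what makes the method local: each segment is free to land wherever it matches best.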

It is dramatically slower than my global alignment algorithm, which takes just 0.02 seconds to find the nearest neighbor of an input genome over a dataset of approximately 650 whole mtDNA genomes. In contrast, the local alignment algorithm takes 1 hour to find the nearest neighbor of an input genome over the same dataset, roughly 180,000 times longer. This makes it much less useful for discovery purposes, but my plan is to use it as further evidence for the histories I uncovered using mostly autonomous, extremely fast global alignment methods. The gist is that global alignment methods are fast enough to allow high-volume, autonomous discovery, the results of which can then be examined more carefully using local alignments.
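The speed gap is easier to see with a rough sketch of the nearest-neighbor search and a count of the base comparisons each method needs. The code below is only an illustration, not my actual implementation: it assumes a global alignment simply compares two genomes position by position, and the names global_match_count, nearest_neighbor, and comparison_counts are hypothetical.

```python
def global_match_count(a, b):
    """Count matching bases at identical positions (assumed global alignment)."""
    return sum(x == y for x, y in zip(a, b))


def nearest_neighbor(input_genome, dataset, score=global_match_count):
    """Return (index, score) of the genome in `dataset` with the highest score.
    Any scoring function with the same signature can be plugged in."""
    best_index, best_score = -1, -1
    for i, genome in enumerate(dataset):
        s = score(input_genome, genome)
        if s > best_score:
            best_index, best_score = i, s
    return best_index, best_score


def comparison_counts(genome_length, segment_length=500):
    """Rough number of base comparisons per genome pair for each method.
    The local figure slightly overcounts, since the last segment may be short."""
    global_ops = genome_length
    segments = -(-genome_length // segment_length)  # ceiling division
    offsets = genome_length - segment_length + 1    # candidate start indices
    local_ops = segments * offsets * segment_length
    return global_ops, local_ops


if __name__ == "__main__":
    print(nearest_neighbor("ACGT", ["AAAA", "ACGA", "TTTT"]))
    print(comparison_counts(16500))
```

For a genome of N bases, the global comparison touches about N bases per pair, while the segment-by-segment search touches on the order of N squared, since each of the roughly N/500 segments is tested at roughly N offsets, 500 bases at a time. With N around 16,500 (an illustrative figure), that is a factor of several thousand before any implementation details are considered, so a large gap in runtime is expected.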

Here’s the code, more to come soon!

