It turns out the Japanese do have unique mtDNA, but the alignment data provided by the NIH hides this: each record presents the first base of the genome as the first index, without any qualification, even when there’s an obvious deletion in the opening sequence of bases. Maybe this is standard, but it’s certainly confusing, and it completely wrecks small datasets, where you might not have another sequence with the same deletion. The NIH of course does, which is why BLAST returns perfect matches for genomes that contain deletions and my software didn’t: I only have 185 genomes.
The paper associated with these genomes is here:
https://pubmed.ncbi.nlm.nih.gov/34121089/
Again, there’s a blatant deletion in many Japanese mtDNA genomes, right in the opening sequence. This opening sequence is identical across all other populations I sampled, meaning that the Japanese really do have a unique mtDNA genome.
Here’s the opening sequence that’s common globally (the first 15 bases of the genome):
GATCACAGGTCTATC
Here’s a Japanese genome with an obvious deletion in the first 15 bases, together with an English genome for comparison:
https://www.ncbi.nlm.nih.gov/nuccore/LC597333.1?report=fasta
https://www.ncbi.nlm.nih.gov/nuccore/MK049278.1?report=fasta
Once you account for this by simply shifting the genome, you get perfectly reasonable match counts, around the total size of the mtDNA genome, just like every other population. That said, the deletion is unique to the Japanese, as far as I know, and that’s quite interesting, especially because they have great health outcomes as far as I’m aware. This suggests that the deleted bases don’t matter, even though the intact opening sequence is common to literally everyone else (as far as I can tell). Again, across the 185 complete genomes I used, every other population has a perfectly identical 15-base opening sequence, which is far too long to be the product of chance.
Here’s the updated software that finds the correct alignment accounting for the deletion:
https://www.dropbox.com/s/2lwgtjbzdariiik/Japanese_Delim_CMDNLINE.m?dl=0
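The linked script is MATLAB. As a rough illustration of the shifting idea only, and not a port of that script, here is a minimal Python sketch. The function names `match_count` and `best_shift`, the `max_shift` limit, and the toy sequences are all assumptions made for the example. It pads the front of a genome with gap characters and keeps the shift that gives the most position-by-position matches against a reference.

```python
OPENING = "GATCACAGGTCTATC"  # the 15-base opening sequence quoted above


def match_count(a, b):
    """Count positions at which two sequences carry the same base."""
    return sum(x == y for x, y in zip(a, b))


def best_shift(reference, query, max_shift=20):
    """Return (shift, matches) for the front padding that best aligns query.

    Hypothetical helper: tries padding `query` with 0..max_shift gap
    characters ('-') and keeps the shift with the highest match count.
    """
    best = (0, match_count(reference, query))
    for shift in range(1, max_shift + 1):
        shifted = "-" * shift + query
        count = match_count(reference, shifted)
        if count > best[1]:
            best = (shift, count)
    return best


# Toy data, not real genomes: a made-up reference that starts with the
# common opening, and a query missing its first three bases.
reference = OPENING + "ACGTTGCA" * 5
query = reference[3:]

unshifted = match_count(reference, query)
shift, shifted_matches = best_shift(reference, query)
print(unshifted, shift, shifted_matches)
```

On the toy data, the unshifted comparison matches poorly, while the best shift recovers a match at every base the query still has. A real deletion may sit inside the opening sequence rather than at the very first base, in which case padding the front only realigns the bases downstream of the deletion.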