A reference sequence can be used to define a global site numbering scheme for a multiple sequence alignment. Gaps in the reference sequence are skipped in the numbering. In addition, a site that is a gap or an ambiguous amino acid/nucleotide in too many tips is ignored in other analyses, but this does not affect the numbering.
setSiteNumbering(x, reference, gapChar, ...)
# S3 method for phyMSAmatched
setSiteNumbering(x, reference = NULL, gapChar = "-", minSkipSize = NULL, ...)
The object to set site numbering for. It can be a phyMSAmatched or a lineagePath object.
Name of the reference sequence for site numbering. The name has to match one of the sequence names in the alignment. By default, the intrinsic alignment numbering is used.
The character that indicates a gap. The numbering skips every gapChar in the reference sequence.
Further arguments passed to or from other methods.
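The minimum number of tips that must have a gap or an ambiguous amino acid/nucleotide at a site for that site to be ignored in other analyses. This does not affect the numbering. The default is 0.8 (a value below 1 is presumably read as a proportion of all tips rather than a count).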
The input x with its site numbering mapped to reference.
data(zikv_tree)
msaPath <- system.file('extdata', 'ZIKV.fasta', package = 'sitePath')
tree <- addMSA(zikv_tree, msaPath = msaPath, msaFormat = 'fasta')
setSiteNumbering(tree)
#> This is a 'lineagePath' object.
#>
#> 7 lineage paths using 8 as "major SNP" threshold
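
The gap-skipping rule described above can be illustrated with a small base R sketch. It is not the sitePath implementation: mapSiteNumbering is a hypothetical helper, the toy reference sequence is made up, and marking gap columns as NA is an assumption chosen for illustration only.

```r
# Hypothetical helper (not part of sitePath): assign reference-based site
# numbers to alignment columns, skipping columns where the reference has a gap.
mapSiteNumbering <- function(refSeq, gapChar = "-") {
  chars <- strsplit(refSeq, "")[[1]]
  isGap <- chars == gapChar
  # Assumption for illustration: gap columns receive no number (NA)
  numbering <- rep(NA_integer_, length(chars))
  numbering[!isGap] <- seq_len(sum(!isGap))
  numbering
}

# Toy aligned reference sequence with gaps in columns 3 and 6
refSeq <- "AC-GT-A"
siteNumbers <- mapSiteNumbering(refSeq)

# Alignment columns 1, 2, 4, 5 and 7 map to sites 1 to 5; columns 3 and 6 are NA
print(siteNumbers)
```

With this mapping, alignment column 4 is reported as site 3, because the gap in column 3 of the reference does not consume a number.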