A reference sequence can be used to define a global site numbering scheme for a multiple sequence alignment. Gaps in the reference sequence are skipped when numbering. In addition, a site that is a gap or an ambiguous amino acid/nucleotide in too many tips is ignored in other analyses, but this does not affect the numbering.

setSiteNumbering(x, reference, gapChar, ...)

# S3 method for phyMSAmatched
setSiteNumbering(x, reference = NULL, gapChar = "-", minSkipSize = NULL, ...)

Arguments

x

The object to set site numbering for. It can be a phyMSAmatched or a lineagePath object.

reference

Name of the reference sequence for site numbering. The name has to be one of the sequence names in the alignment. The default (NULL) uses the intrinsic alignment numbering.

gapChar

The character that indicates a gap. The numbering skips every position where the reference sequence has gapChar.

...

Further arguments passed to or from other methods.

minSkipSize

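The minimum number of tips that must have a gap or an ambiguous amino acid/nucleotide at a site for that site to be ignored in other analyses. This does not affect the numbering. The default is 0.8; since the usage lists NULL as the default value, this presumably refers to a proportion of the tips applied when the argument is left unset, rather than an absolute count.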
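
The idea behind minSkipSize can be illustrated with a minimal base R sketch. It is not the sitePath implementation: the helper `sitesToSkip`, its `ambiguousChar` argument, and the toy sequences are hypothetical, and the threshold is treated here as an absolute count of tips, which is an assumption.

```r
# Hypothetical helper, not part of sitePath: flag alignment columns where at
# least `minSkipSize` sequences carry a gap or an ambiguous character.
# Assumption: `minSkipSize` is an absolute count of tips.
sitesToSkip <- function(seqs, minSkipSize, gapChar = "-", ambiguousChar = "X") {
  # One row per sequence, one column per alignment site
  aln <- do.call(rbind, strsplit(seqs, ""))
  # Number of sequences with a gap or ambiguous character at each site
  uninformative <- colSums(aln == gapChar | aln == ambiguousChar)
  which(uninformative >= minSkipSize)
}

# Toy alignment of four tips, five columns each
seqs <- c(tip1 = "AC-GT", tip2 = "AX-GT", tip3 = "AC-G-", tip4 = "ACTGT")
skipped <- sitesToSkip(seqs, minSkipSize = 3)
print(skipped)
```

In this toy alignment only the third column reaches the threshold, so it is the only site flagged. The flagged column still occupies its place in the alignment, so flagging does not shift the numbering of the remaining sites.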

Value

The input x with site numbering mapped to the reference.
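
To make the mapping concrete, the following base R sketch shows one way alignment columns could be translated into reference-based site numbers. The helper `referenceNumbering` and the toy reference sequence are hypothetical and only illustrate the principle; they are not how sitePath stores the numbering internally.

```r
# Hypothetical helper, not part of sitePath: map each alignment column to a
# reference-based site number, skipping columns where the reference has a gap.
referenceNumbering <- function(refSeq, gapChar = "-") {
  refChars <- strsplit(refSeq, "")[[1]]
  isGap <- refChars == gapChar
  # Gap columns receive no number (NA); the others are counted consecutively
  numbering <- rep(NA_integer_, length(refChars))
  numbering[!isGap] <- seq_len(sum(!isGap))
  numbering
}

# Toy reference sequence with gaps at alignment columns 3 and 6
refSeq <- "AC-GT-A"
siteNumbers <- referenceNumbering(refSeq)
print(siteNumbers)
```

Here alignment column 4 becomes site 3 and column 7 becomes site 5, because the two gap columns of the reference are not counted.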

Examples

data(zikv_tree)
msaPath <- system.file('extdata', 'ZIKV.fasta', package = 'sitePath')
tree <- addMSA(zikv_tree, msaPath = msaPath, msaFormat = 'fasta')
setSiteNumbering(tree)
#> This is a 'lineagePath' object.
#> 
#> 7 lineage paths using 8 as "major SNP" threshold