Explanation of Parameters:
1. Threshold, t:
For Identifying TPRs: In the first module, the query sequence is compared with itself using a sliding window of size 10. Threshold t is the minimum number of exact matches required for two windows to be considered similar. The default, t = 5, allows up to 50% mismatch between windows.
For Identifying POSAAs and SAARs: Here, threshold t corresponds to (1 - percentage mismatch) allowed in the final output. The default, t = 8, means that up to 20% mismatch is tolerated across the complete repeat region reported.
2. Tuple size, k:
For Identifying TPRs: In the second module, to find the period (i.e., length) of the repeat pattern, a sliding window of size k is used (default k = 3). For finding larger repeat patterns, the value of k may be increased.
For Identifying POSAAs and SAARs: Here, k is an internal parameter and cannot be changed by the user.
3. Copy Number, n:
For Identifying TPRs: For a given n, the program reports a repeat only if the pattern occurs at least n times at a given locus. The default, n = 4, reports all tandem repeats that occur 4 or more times contiguously at a given locus.
For Identifying POSAAs and SAARs: Here, n is the minimum number of occurrences of an amino acid at a given period for it to be reported in the output. The default is n = 5.
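
Illustration: the short Python sketch below shows how the threshold t and the copy number n behave for TPRs under simplifying assumptions. It is not the program's own code. The function names (count_matches, windows_similar, tandem_copies, passes_copy_number) are hypothetical, and the copy counter accepts only exact copies, whereas the actual program tolerates mismatches.

```python
WINDOW_SIZE = 10  # sliding-window size stated for the first TPR module


def count_matches(a: str, b: str) -> int:
    """Count positions at which two equal-length windows carry the same residue."""
    if len(a) != len(b):
        raise ValueError("windows must have the same length")
    return sum(x == y for x, y in zip(a, b))


def windows_similar(a: str, b: str, t: int = 5) -> bool:
    """Two windows are treated as similar if they share at least t exact matches."""
    return count_matches(a, b) >= t


def tandem_copies(seq: str, start: int, period: int) -> int:
    """Count contiguous copies of seq[start:start + period] beginning at start.

    Assumption for illustration only: copies must be exact.
    """
    unit = seq[start:start + period]
    if period <= 0 or len(unit) < period:
        return 0
    copies = 0
    pos = start
    # A slice that runs past the end of seq is shorter than unit, so the loop stops.
    while seq[pos:pos + period] == unit:
        copies += 1
        pos += period
    return copies


def passes_copy_number(seq: str, start: int, period: int, n: int = 4) -> bool:
    """A locus is reported only if the pattern occurs at least n times."""
    return tandem_copies(seq, start, period) >= n


if __name__ == "__main__":
    # 5 of 10 positions match, so the windows are similar at the default t = 5.
    print(windows_similar("ACDEFGHIKL", "ACDEFXXXXX"))
    # "PGQ" occurs 4 times contiguously from position 2, so default n = 4 passes.
    print(passes_copy_number("AAPGQPGQPGQPGQKK", start=2, period=3))
```

With the default t = 5, raising t makes the window comparison stricter; with n = 5 instead of the default 4, the example locus above would no longer be reported.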