Hi! First, thank you so much for such a useful program!
I encountered an error when running avp detect; the full traceback is pasted below. I ran diamond as described in the wiki, and that step worked well. Any pointers to what is going on would be extremely helpful.
Thanks!!
Linnie
~/software/AvP/avp detect -i C2_output/mafftgroups/ -o C2_output/ -g C2_output/groups.tsv -t C2_output/tmp/taxonomy_nexus.txt -c config.yaml
[+] Setting up
[!] Found 2617 groups and 4370 genes
[+] Reconstructing phylogenies with FastTree
[x] 100%
[+] Analyzing fasttree results
Traceback (most recent call last):
File "/scistor/guest/zrs382/software/AvP/avp", line 7, in <module>
main()
File "/net/sys/pscst000/export/bazis/guest/zrs382/software/AvP/depot/interface.py", line 32, in main
detect.main()
File "/net/sys/pscst000/export/bazis/guest/zrs382/software/AvP/depot/detect.py", line 226, in main
make_nexus_file(gene, group, lineage, gene_nexus_path, group_file, phylogeny_file, colors)
File "/net/sys/pscst000/export/bazis/guest/zrs382/software/AvP/depot/PetNexus.py", line 60, in make_nexus_file
t = Tree(phylogeny_file)
^^^^^^^^^^^^^^^^^^^^
File "/scistor/guest/zrs382/.local/lib/python3.11/site-packages/ete3/coretype/tree.py", line 212, in __init__
read_newick(newick, root_node = self, format=format,
File "/scistor/guest/zrs382/.local/lib/python3.11/site-packages/ete3/parser/newick.py", line 264, in read_newick
raise NewickError('Unexisting tree file or Malformed newick tree structure.')
ete3.parser.newick.NewickError: Unexisting tree file or Malformed newick tree structure.
You may want to check other newick loading flags like 'format' or 'quoted_node_names'.
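One possible way to narrow this down is to check which tree files are missing, empty, or structurally incomplete before ete3 tries to load them. The sketch below uses only the standard library. The directory `C2_output/fasttree` and the `*.fasttree` pattern are assumptions for illustration, not paths confirmed by AvP, and the check is a rough heuristic rather than a full Newick parser.

```python
from pathlib import Path


def newick_problem(path):
    """Describe what looks wrong with a tree file, or return None if it looks plausible."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    text = p.read_text().strip()
    if not text:
        return "empty"
    if not text.endswith(";"):
        return "no terminating semicolon"
    if text.count("(") != text.count(")"):
        return "unbalanced parentheses"
    return None


def report_bad_trees(tree_dir, pattern="*.fasttree"):
    """Map each suspicious tree file under tree_dir to a short problem description."""
    problems = {}
    for tree_file in sorted(Path(tree_dir).glob(pattern)):
        problem = newick_problem(tree_file)
        if problem is not None:
            problems[str(tree_file)] = problem
    return problems


if __name__ == "__main__":
    # Hypothetical location of the FastTree output; adjust to the real one.
    for name, problem in report_bad_trees("C2_output/fasttree").items():
        print(f"{name}: {problem}")
```

A file that passes this check can still be rejected by ete3, so it only helps to find the obvious cases (for example, a FastTree run that wrote nothing for one group).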
The output of avp detect was:
[+] Setting up
[!] Selected 4370 HGT candidates
[+] Parsing hits file and grouping similar queries
[!] Formed 2617 groups
[+] Extracting hits from DB
[+] Writing fasta files
[!] Skipped 432098 hits and 0 taxids.
[+] Aligning fasta files
[x] 100%
[!] Finished with 4370 HGT candidates in 2617 groups
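It may also be worth ruling out alignments that contain very few sequences, since a tree cannot be built from them. The following is a minimal sketch that counts FASTA records per file in the `C2_output/mafftgroups/` directory passed to `-i` above. The threshold of 3 sequences is an assumption chosen for illustration, not a value taken from AvP or FastTree.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count header lines (those starting with '>') in a FASTA file."""
    with open(path, errors="replace") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def small_alignments(align_dir, min_seqs=3):
    """Return {file name: record count} for files with fewer than min_seqs records."""
    flagged = {}
    for fasta in sorted(Path(align_dir).iterdir()):
        if fasta.is_file():
            n = count_fasta_records(fasta)
            if n < min_seqs:
                flagged[fasta.name] = n
    return flagged


if __name__ == "__main__":
    align_dir = Path("C2_output/mafftgroups")
    if align_dir.is_dir():
        for name, n in small_alignments(align_dir).items():
            print(f"{name}: {n} sequence(s)")
```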
This is my config file:
max_threads: 4
# DB path
blast_db_path: /scistor/guest/zrs382/databases/ # blast: change to the local blast_db path
fasta_path: /scistor/guest/zrs382/databases/ # diamond: change to the local fasta path for sp, ur90, or custom database
mode: blast # use blast for blast database, use sp for swissprot database, ur90 for uniref90 or custom database
data_type: AA # data type DNA, AA
# Algorithm options
# prepare
ai_cutoff: 0
ahs_cutoff: 0
outg_pct_cutoff: 80
selection: "ai or ahs" # select sequences based on which metrics, another example "(ai or ahs) and outg_pct"
percent_identity: 100 # select hits equal or below this number
cutoffextend: 20 # when ingroup hit is found, we take this hit + n hits
number_hits_noingroup: 50 # when no ingroup hit is found, we take this number of hits
trimal: false
min_num_hits: 4 # select queries with at least that many blast hits
percentage_similar_hits: 0.7 # group queries based on this
# detect, classify, evaluate
fastml: true # Use fasttree instead of IQTree
node_support: 0 # nodes below that number will collapse
complex_per_ingroup: 20 # if D/(D+I) smaller than this then node is considered Ingroup
complex_per_donor: 80 # if D/(D+I) greater than this then node is considered Donor
complex_per_node: 90 # if node contains percent number of this category, it is assigned
# Program specific options
mafft_options: '--anysymbol --auto'
trimal_options: '-automated1'
#IQ-Tree
iqmodel: '-mset WAG,LG,JTT -AICc -mrate E,I,G,R'
ufbootstrap: 1000
iq_threads: 4