GenBank accession: WAV88278.1 [GenBank]
Protein name: colanic acid biosynthesis protein
RBP type Evidence Probability
TSP DepoScope 1.00
TSP RBPdetect 0.91
TSP RBPdetect2 0.95
Protein sequence
MEIQMARDVYADIGMPSTPFLRLDRFAALVDGKIYIGVKDTDPLNPANQTQVFVEDEDGTLTPVPQPIRTNVSGYPVWNGQVVKLITKIESSMKVLDRNDVQQFYFGNLFKYDPVQMWNLLTSPTGWEYVGTTYGTVKESIQGWLTPFQFVGKAPFTSVAAAIQTMFDTAQAQKLAVNAVGWTGTLDGNVTATDILIMGGTWKGTADVFLDNAVLKGATVNNLRVRFWGGDVRIRDCLFDGKPTASKVGSIVLQANPKTGTIEVTECEFKNGLYGILQQGTGEAVTRGVYRNLSFYKMDGDGIELNVVQKHYDEGCLIDGIQLDTIGSANPSWGIGIGIAGGGPYGWDIPDSQYAKNVTITNVSAVRCRQCIHLEVARDCTVTNVDVNPDMGYGVGSGLTVGGVVCYGSKRITIDGVSGEPVATGTTDVHSLRMVMLEWGVTAGAPSNPCFDMTVRNVHTKQGRVYAGVAAGNGFENRMVFENIDCYTLSLFGVASLLEMSNISCRAFDAVGDDSSGGTTSDGFVTRGLSRLRMINVNAIDVNGYGDQAWSKCSYSDIESIGSNVIATPYPPQGIAGGIGAIMTNSNRTYINKGSSGAWDGNAFPTGKEFMAGDLIVREDGKIFTVTASGAYIPATDNFKIAATAVGDKKLICNVTPIANETSRPWLFGNPLSPGTRILIPGAGAGGATLSTRITRGPYQTPPSNSTAPVTIDIADAIVTATPAGTQLAAAKPIQFRTPA
Physico-chemical properties
protein length: 740 AA
molecular weight: 78845.34230 Da
isoelectric point: 5.12542
aromaticity: 0.08514
hydropathy: -0.04892
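The page does not say how these values were computed. As a minimal sketch, assuming aromaticity is the fraction of Phe, Trp and Tyr residues and hydropathy is a GRAVY score (mean Kyte-Doolittle value per residue), both can be recomputed from the protein sequence with the standard library alone. The function names `aromaticity` and `gravy` are illustrative, not taken from the source. The reported aromaticity is at least consistent with a fraction over 740 residues, since 0.08514 × 740 is approximately 63.

```python
# Kyte-Doolittle hydropathy scale. ASSUMPTION: the page does not name the
# scale it uses; this is the one conventionally used for GRAVY scores.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# ASSUMPTION: aromatic residues are Phe, Trp and Tyr.
AROMATIC = frozenset("FWY")


def aromaticity(seq: str) -> float:
    """Return the fraction of residues that are F, W or Y."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(res in AROMATIC for res in seq) / len(seq)


def gravy(seq: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy per residue."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    # A KeyError here means the sequence holds a non-standard residue code.
    return sum(KYTE_DOOLITTLE[res] for res in seq) / len(seq)


if __name__ == "__main__":
    # First 20 residues of the protein sequence listed on this page.
    demo = "MEIQMARDVYADIGMPSTPF"
    print(f"aromaticity: {aromaticity(demo):.5f}")
    print(f"hydropathy:  {gravy(demo):.5f}")
```

Passing the full 740-residue sequence instead of `demo` gives values that can be compared with the ones listed above; a mismatch would suggest the page uses a different definition or scale.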

Domains

Domains [InterPro]
Accession Type Range
IPR036730 ATT 12–112
IPR009093 ATT 13–115
Architecture
ATT 8-115 | ATT 128-186 | STR 187-484 | RBD 485-740
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Layout
Domain Start End Length (AA) Confidence
N-terminal 1 156 156 0.9917
Central domain 157 564 408 0.9945
C-terminal 565 740 176 0.9367
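Assuming the Start and End columns are 1-based and inclusive (the page does not state its convention), each segment length is End - Start + 1, and the three segments should tile residues 1 to 740 with no gaps or overlaps. The sketch below checks both properties; `Segment` and `tiles_protein` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Sequence


class Segment(NamedTuple):
    name: str
    start: int  # 1-based, inclusive (ASSUMPTION)
    end: int    # 1-based, inclusive (ASSUMPTION)

    @property
    def length(self) -> int:
        """Number of residues covered by the segment."""
        return self.end - self.start + 1


def tiles_protein(segments: Sequence[Segment], protein_length: int) -> bool:
    """True if the segments cover 1..protein_length without gaps or overlaps."""
    expected_start = 1
    for seg in sorted(segments, key=lambda s: s.start):
        if seg.start != expected_start or seg.end < seg.start:
            return False
        expected_start = seg.end + 1
    return expected_start == protein_length + 1


# Coordinates copied from the table above.
SEGMENTS = [
    Segment("N-terminal", 1, 156),
    Segment("Central domain", 157, 564),
    Segment("C-terminal", 565, 740),
]

if __name__ == "__main__":
    for seg in SEGMENTS:
        print(f"{seg.name}: {seg.start}-{seg.end}, {seg.length} AA")
    print("tiles 1-740:", tiles_protein(SEGMENTS, 740))
```

If the page instead used half-open or 0-based coordinates, the `length` property would need to change accordingly.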
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Domain Coloring
N-terminal 1-156
Central 157-564
C-terminal 565-740

Taxonomy

Name Taxonomy ID Lineage
Phage: Phage ST231 [NCBI] 3003727 Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes
Host: Enterobacteriaceae bacterium [NCBI] 1849603 cellular organisms > Bacteria > Pseudomonadati > Pseudomonadota > Gammaproteobacteria > Enterobacterales

Coding sequence (CDS)

GenBank protein accession: WAV88278.1 [NCBI]
GenBank nucleotide accession: OP921041 [NCBI]
CDS location: range 21651 -> 23873, strand +
CDS
ATGGAGATCCAAATGGCCCGCGACGTATACGCTGATATTGGTATGCCTTCGACGCCTTTCCTCCGTCTGGACCGTTTCGCGGCCCTGGTGGATGGGAAAATCTATATCGGCGTGAAGGACACCGACCCATTGAACCCCGCCAACCAGACCCAGGTCTTTGTCGAGGACGAGGACGGAACGCTGACGCCAGTCCCGCAGCCGATCAGGACCAACGTCTCCGGCTATCCGGTCTGGAACGGTCAAGTAGTCAAACTGATCACCAAGATCGAATCGTCCATGAAGGTTTTAGACCGGAACGACGTCCAGCAATTCTACTTCGGGAACCTGTTCAAATATGATCCGGTCCAGATGTGGAACCTTCTCACCTCGCCGACTGGGTGGGAATATGTCGGGACGACCTACGGGACCGTGAAGGAGTCGATCCAGGGCTGGCTGACGCCTTTCCAGTTCGTGGGTAAAGCGCCATTCACCAGCGTAGCCGCAGCCATTCAAACGATGTTTGACACGGCCCAGGCGCAGAAACTCGCGGTTAATGCCGTGGGCTGGACCGGGACCCTGGACGGCAACGTGACCGCGACCGATATCCTGATCATGGGCGGGACCTGGAAGGGGACGGCAGACGTTTTCCTGGATAATGCAGTCCTGAAAGGCGCGACCGTGAATAACTTACGCGTTCGCTTCTGGGGCGGCGACGTTCGAATCCGTGATTGCCTGTTTGACGGCAAGCCTACGGCCTCCAAAGTGGGATCGATCGTTCTCCAGGCCAACCCGAAAACCGGGACGATCGAGGTCACGGAATGCGAGTTCAAAAATGGCCTTTACGGCATCTTGCAGCAGGGGACCGGGGAAGCCGTTACGCGTGGCGTTTATCGTAACCTTTCGTTCTACAAAATGGACGGCGACGGGATCGAGCTTAACGTGGTCCAGAAGCACTACGATGAAGGCTGTTTGATCGATGGGATCCAACTGGACACTATCGGATCTGCTAACCCGTCGTGGGGCATCGGGATCGGTATCGCTGGCGGTGGCCCGTATGGCTGGGACATCCCGGACAGCCAATACGCGAAGAACGTCACGATCACCAACGTGTCCGCCGTTCGATGCCGCCAGTGTATTCACCTGGAGGTCGCCCGCGACTGTACTGTGACCAACGTCGATGTAAACCCGGATATGGGCTACGGCGTCGGCTCCGGCCTGACCGTGGGGGGAGTGGTCTGTTACGGCTCCAAACGCATCACGATCGACGGCGTGAGCGGCGAGCCAGTGGCTACCGGGACGACCGACGTCCACTCTCTCCGCATGGTTATGCTGGAGTGGGGCGTGACCGCTGGCGCACCGTCTAACCCTTGCTTTGATATGACCGTCCGCAACGTTCACACCAAACAGGGCCGCGTCTACGCTGGCGTGGCGGCGGGGAATGGCTTCGAAAACCGCATGGTGTTCGAAAATATCGACTGTTACACGCTCTCTCTATTCGGCGTGGCGTCCCTGCTGGAAATGTCGAATATCTCTTGTCGCGCATTCGACGCGGTGGGCGACGACTCAAGCGGCGGGACCACTTCGGACGGCTTCGTGACGCGTGGCCTCTCTCGTCTGCGAATGATCAACGTGAACGCGATCGACGTGAACGGCTACGGCGATCAGGCGTGGAGCAAATGTTCTTATTCCGACATCGAGAGCATCGGATCCAACGTGATCGCCACCCCTTATCCTCCCCAGGGCATCGCGGGCGGAATCGGGGCCATCATGACGAACTCCAACCGGACCTACATCAACAAGGGATCGAGCGGGGCGTGGGATGGTAACGCGTTCCCGACCGGAAAGGAGTTTATGGCGGGCGATCTGATTGTTCGCGAGGACGGGAAGATCTTCACCGTGACGGCCTCCGGGGCGTACATCCCGGCTACGGATAATTTCAAGATCGCGGCGACCGCAGTCGGCGACAAAAAGCTGATCTGTAACGTCACCCCGATCGCGAACGAAACCTCCCGCCCGTGGCT
GTTCGGGAATCCGCTCTCCCCTGGGACGCGCATCCTGATCCCTGGCGCTGGCGCTGGTGGCGCTACGCTCTCGACTCGCATCACTCGCGGGCCGTACCAGACGCCGCCGAGCAACTCCACGGCTCCGGTAACGATCGATATCGCGGACGCTATCGTGACGGCAACCCCGGCGGGGACGCAGTTAGCAGCAGCGAAGCCGATCCAATTCCGAACTCCGGCATAA
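The CDS coordinates can be cross-checked against the protein length. As a minimal sketch, assuming the range is 1-based and inclusive and that the CDS includes its stop codon (the sequence above ends in TAA), the span 21651 to 23873 covers 2223 nucleotides, i.e. 741 codons, which corresponds to 740 residues plus the stop. The helper names below are illustrative only.

```python
def cds_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive range (ASSUMPTION)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def encoded_residues(start: int, end: int, has_stop: bool = True) -> int:
    """Number of amino acids encoded by the range.

    ASSUMPTION: one uninterrupted reading frame, no introns or frameshifts.
    """
    n_nt = cds_length(start, end)
    if n_nt % 3 != 0:
        raise ValueError(f"CDS length {n_nt} is not a multiple of 3")
    codons = n_nt // 3
    return codons - 1 if has_stop else codons


if __name__ == "__main__":
    start, end = 21651, 23873  # CDS location from this record
    print("nucleotides:", cds_length(start, end))
    print("residues:   ", encoded_residues(start, end))
```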

Genome Context

Tertiary structure

PDB ID: a239ccc18c392be618e2861f2a829bb46f6bb5d8dd6a5d09b82c0aced0048685
Source: ESMFold
Method: ESMFold
Resolution: 0.7168
Oligomeric State: monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
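The confidence bands above map directly onto a small lookup. The sketch below assigns a per-residue pLDDT score (0 to 100) to one of the four labels. The legend uses strict inequalities, so the handling of scores of exactly 90, 70 or 50 is an assumption: here they fall into the lower band. `plddt_class` is a hypothetical helper name.

```python
def plddt_class(score: float) -> str:
    """Map a pLDDT score (0-100) to the confidence label used in the legend.

    ASSUMPTION: boundary values (exactly 90, 70, 50) go to the lower band,
    because the legend does not say which band they belong to.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"pLDDT must be within 0-100, got {score}")
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"


if __name__ == "__main__":
    for value in (95.2, 81.0, 63.5, 41.7):
        print(f"{value:5.1f} -> {plddt_class(value)}")
```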