Genbank accession
WAQ79264.1 [GenBank]
Protein name
tail spike protein
RBP type  Evidence    Probability
TF        Phold       1.00
TSP       DepoScope   1.00
TSP       RBPdetect   0.91
TSP       RBPdetect2  0.95
Protein sequence
MNEMFSQGGKGSTGILTNKQAVARHFGVKQSEVVYFSTGAVLSGYKVIYDKASQRAYSLPADIGSGVTAVSLSPAGVLVHSAGNVDLGALAVERKEYINSPGSFTTGATLTVKNQRLIHNGEYYIWNGAFPKVVPASSTPENSGGVANDAWSKLSPLVKSEEVPYVRMPNAINTNVAANMDLMGRRVIYATDFGVTVDSADNADALWELGQYLSTQVTEPVKVIFPAGTSLVGSQYLTGSTGQGGSYKPSYEQRAWTDASAKGWFSIHMTDANIELEMSNWTLKINDGMRLGAFDPVTGSIAPDVVAETPDYSYMAYQGFLIKLYKAPNVVINGGTSDGNLASAVWGGKFGNTGYQIPCYNMWINQSAGARVYRHKYLNSPVDGLYHQSTGSFSFLDIVPRTVIEDCYWDSCGRNCYSLTGGANIDIINPVITRSGNKAGGIGTHYTGPEAGIDIEAEGGNPYNIRVINPKIVNTGKCAFQTVSVPGTVNDILVVGGVLHSMHSEGAVSNAGNARNIKFVGTTIIGSIIDTGWPAAMGFECYSFIDCELQNRYANDYADEYRLNFKVKEFKGNTITFGIPPTINTNHATINIEDQDATPFGLYAERFKENRLIVYGDASKVTFANGLGGIRNFKNAELYVSADGLTGGTLKITVDTSSAAMNGLSTNTANFNFDVAMDKDVGKNVWYARKVNRVAGVMTAVTDSLQDIGSKDGRFGTMYATKGIILRDVGDSTYKRLRSNNGVLEVVADNT
Physico-chemical properties
protein length: 751 AA
molecular weight: 80709.62170 Da
isoelectric point: 5.89030
aromaticity: 0.09854
hydropathy: -0.17004
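
The length, aromaticity, and hydropathy values listed above can be approximated from the protein sequence alone. The following is a minimal sketch, assuming that aromaticity is the fraction of F, W, and Y residues and that hydropathy is the mean Kyte-Doolittle score (GRAVY). The page does not state which tool produced its values, so small differences are possible. The function names are illustrative and do not come from the database.

```python
# Kyte-Doolittle hydropathy scale (assumed to be the scale behind "hydropathy").
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = frozenset("FWY")


def _clean(seq: str) -> str:
    """Upper-case the sequence, drop whitespace, and reject empty input."""
    seq = "".join(seq.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    return seq


def aromaticity(seq: str) -> float:
    """Fraction of aromatic residues (F, W, Y) in the sequence."""
    seq = _clean(seq)
    return sum(aa in AROMATIC for aa in seq) / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    seq = _clean(seq)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def summarize(seq: str) -> dict:
    """Collect the sequence-derived properties in one record."""
    seq = _clean(seq)
    return {
        "length": len(seq),
        "aromaticity": round(aromaticity(seq), 5),
        "hydropathy": round(gravy(seq), 5),
    }
```

Calling `summarize` on the protein sequence shown on this page should give a length of 751. Molecular weight and isoelectric point need residue masses and pKa tables and are left out of this sketch.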

Domains

Domains [InterPro]
Accession           Type  Range
G3DSA:3.30.2020.50  ATT   1–100
IPR040775           RBD   91–154
Architecture (WAQ79264.1, residues 1-751)
ATT 1-234 | STR 255-525 | RBD 546-751
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
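
The architecture line lists three mapped segments with gaps between them. The sketch below parses a string in that `LABEL start-end | ...` form and reports the uncovered residue ranges. It assumes coordinates are 1-based and inclusive, and that the uncovered ranges correspond to the "Unmapped" legend entry. The function names are hypothetical.

```python
import re

SEGMENT_RE = re.compile(r"\s*([A-Z]+)\s+(\d+)\s*-\s*(\d+)\s*")


def parse_architecture(text: str) -> list:
    """Parse 'ATT 1-234 | STR 255-525' into (label, start, end) tuples."""
    segments = []
    for part in text.split("|"):
        match = SEGMENT_RE.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognized segment: {part!r}")
        label, start, end = match.group(1), int(match.group(2)), int(match.group(3))
        segments.append((label, start, end))
    return segments


def unmapped_gaps(segments: list, length: int) -> list:
    """Return residue ranges (1-based, inclusive) not covered by any segment."""
    gaps = []
    position = 1  # next residue that has not been covered yet
    for _, start, end in sorted(segments, key=lambda seg: seg[1]):
        if start > position:
            gaps.append((position, start - 1))
        position = max(position, end + 1)
    if position <= length:
        gaps.append((position, length))
    return gaps


segments = parse_architecture("ATT 1-234 | STR 255-525 | RBD 546-751")
gaps = unmapped_gaps(segments, 751)
```

For this entry the two gaps are residues 235-254 and 526-545.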

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Layout (WAQ79264.1, residues 1-751)
Domain          Start  End  Length (AA)  Confidence
N-terminal      1      198  198          0.9925
Central domain  199    680  483          0.9830
C-terminal      681    751  70           0.9222
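
A quick consistency check on the table is to recompute each length from its start and end. The sketch below assumes 1-based, inclusive coordinates, so the expected length is `end - start + 1`. Under that assumption the reported lengths of the central and C-terminal domains each differ by one residue. This may mean that the shared boundary is stored under a different convention, and the page does not say which. The row values are copied from the table, and the function name is illustrative.

```python
# (name, start, end, reported_length) copied from the segmentation table.
DOMAINS = [
    ("N-terminal", 1, 198, 198),
    ("Central domain", 199, 680, 483),
    ("C-terminal", 681, 751, 70),
]


def length_mismatches(rows):
    """Return (name, expected, reported) for rows where end - start + 1 != reported."""
    mismatches = []
    for name, start, end, reported in rows:
        expected = end - start + 1  # inclusive coordinates assumed
        if expected != reported:
            mismatches.append((name, expected, reported))
    return mismatches


mismatches = length_mismatches(DOMAINS)
total_reported = sum(row[3] for row in DOMAINS)
```

The reported lengths still sum to 751, the full protein length, so the two discrepancies cancel out.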
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue, residues 1-198), Central (green, residues 199-680), C-terminal (pink, residues 681-751).

Taxonomy

Phage
Name: Escherichia phage E20 [NCBI]
Taxonomy ID: 3003230
Lineage: Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes

Host
Name: Escherichia sp. [NCBI]
Taxonomy ID: 1884818
Lineage: cellular organisms > Bacteria > Pseudomonadati > Pseudomonadota > Gammaproteobacteria > Enterobacterales

Coding sequence (CDS)

Genbank protein accession
WAQ79264.1 [NCBI]
Genbank nucleotide accession
OP745616.1 [NCBI]
CDS location
range 61169 -> 63424
strand -
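
The CDS coordinates can be checked against the protein length, and the minus strand means the coding sequence is the reverse complement of the genomic region. The sketch below assumes 1-based, inclusive coordinates on the nucleotide record and a CDS that includes its terminal stop codon. The helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def cds_length(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive 1-based range."""
    return end - start + 1


def expected_protein_length(start: int, end: int) -> int:
    """Codon count minus one, assuming the CDS ends with a stop codon."""
    n = cds_length(start, end)
    if n % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return n // 3 - 1


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_cds(genome: str, start: int, end: int, strand: str) -> str:
    """Slice the genomic region and reverse-complement it on the minus strand."""
    region = genome[start - 1:end]
    return reverse_complement(region) if strand == "-" else region


n_nt = cds_length(61169, 63424)
n_aa = expected_protein_length(61169, 63424)
```

For this entry the range spans 2256 nucleotides, which is 752 codons, or 751 amino acids plus a stop codon. This matches the listed protein length.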
CDS
ATGAACGAAATGTTTTCACAAGGTGGTAAGGGTTCAACTGGAATCTTAACCAACAAACAAGCAGTAGCCCGTCACTTTGGGGTTAAGCAATCTGAGGTTGTTTACTTCTCTACTGGTGCTGTACTAAGCGGTTATAAAGTCATCTATGACAAAGCGTCACAGCGTGCTTACTCCTTACCTGCTGATATTGGTTCAGGTGTTACTGCTGTAAGTCTTAGTCCTGCTGGGGTATTAGTTCACTCTGCTGGTAATGTTGACTTAGGTGCTCTGGCTGTAGAACGTAAGGAGTATATTAACTCTCCCGGTTCATTTACTACTGGGGCTACCCTTACCGTTAAAAACCAAAGACTGATTCATAATGGTGAGTATTATATTTGGAATGGTGCATTCCCTAAAGTTGTCCCGGCATCCTCTACACCAGAAAACTCAGGTGGTGTAGCTAATGATGCCTGGAGTAAGTTAAGCCCATTAGTTAAGTCTGAAGAAGTTCCCTATGTTCGTATGCCTAATGCAATTAATACCAACGTAGCTGCTAACATGGATTTAATGGGTAGACGTGTAATCTATGCTACGGATTTTGGTGTAACTGTTGACTCTGCTGATAACGCAGACGCTCTTTGGGAATTGGGTCAATACTTAAGCACACAGGTAACTGAACCGGTTAAGGTGATATTCCCTGCTGGCACTAGTTTAGTAGGTTCCCAGTATCTCACTGGTTCTACTGGTCAGGGTGGTTCTTATAAACCTTCCTATGAGCAGCGTGCCTGGACAGATGCTTCTGCTAAAGGCTGGTTCTCCATTCATATGACGGATGCTAATATCGAACTGGAGATGAGTAACTGGACACTTAAGATTAATGACGGTATGCGTTTAGGGGCTTTTGATCCTGTTACTGGTTCTATTGCACCGGATGTTGTAGCTGAAACCCCTGATTATAGCTATATGGCTTACCAAGGTTTTCTTATTAAACTGTACAAAGCACCTAACGTCGTAATTAATGGTGGGACCAGTGATGGTAACTTAGCCTCTGCTGTATGGGGGGGTAAGTTTGGTAACACTGGTTATCAGATTCCTTGTTATAACATGTGGATTAACCAGTCAGCAGGTGCTCGTGTTTACCGACATAAATACCTTAATTCGCCTGTAGATGGCTTATATCATCAATCTACCGGTAGCTTCAGTTTCTTGGATATTGTTCCACGCACTGTTATTGAAGACTGTTACTGGGATTCCTGTGGGCGCAACTGCTACTCCCTGACTGGTGGAGCTAACATTGATATCATTAATCCAGTAATTACCCGCTCAGGCAATAAAGCTGGTGGGATTGGTACTCATTATACCGGACCGGAAGCAGGTATTGATATTGAAGCTGAAGGGGGAAACCCTTACAACATTCGTGTTATTAACCCTAAAATTGTTAACACCGGTAAGTGTGCATTCCAGACTGTATCCGTACCGGGGACTGTTAACGATATTCTGGTTGTGGGTGGCGTGCTACATAGTATGCACTCTGAGGGTGCTGTATCCAATGCTGGTAATGCCCGTAATATTAAGTTTGTAGGCACAACCATCATAGGCAGTATTATTGATACTGGCTGGCCTGCTGCTATGGGTTTTGAGTGTTATAGCTTTATTGATTGTGAACTTCAGAACAGATATGCAAATGATTATGCTGATGAGTATCGTTTGAATTTCAAAGTTAAAGAGTTTAAAGGGAATACTATTACATTTGGTATCCCACCCACCATTAACACTAACCATGCAACAATTAATATTGAAGATCAGGACGCTACTCCTTTCGGTCTTTACGCAGAACGATTCAAAGAGAATCGGTTAATTGTTTATGGTGATGCATCTAAAGTTACCTTTGCCAATGGTCTAGGTGGCATTCGTAACTTTAAGAATGCAGAGTTGTACGTTTCTGCTGATGGACTTACTGGAGGTACTCTCAAGATTACTGTTGATACTTCTAGTGCAGCAATGAATGGTTTATCTACCAA
TACTGCAAACTTTAACTTCGATGTTGCTATGGATAAAGATGTTGGCAAAAATGTCTGGTATGCCCGTAAAGTTAATCGTGTTGCAGGGGTTATGACAGCAGTAACAGATTCCTTGCAGGATATTGGTTCTAAGGATGGACGTTTCGGTACTATGTATGCTACCAAAGGGATTATACTCCGAGATGTGGGGGATAGTACTTACAAACGCCTTCGTTCTAATAACGGAGTACTTGAGGTAGTTGCTGATAACACCTAA

Tertiary structure

PDB ID
059a1bd901732668e5f58ffa4a7a744efb4681acbe05d54c4614b7042be8d0e4
Source ESMFold
Method ESMFold
Resolution 0.6655
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
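
The confidence bands above map directly onto a small classifier for per-residue pLDDT scores. This is a minimal sketch. The legend uses strict inequalities and leaves the exact values 90, 70, and 50 unassigned, so the code assumes each boundary value falls into the lower band. The function name is illustrative.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to the confidence band used in the legend."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"  # boundary value 50 is placed here by assumption


def band_counts(scores) -> dict:
    """Count how many residues fall into each band."""
    counts = {"Very high": 0, "High": 0, "Low": 0, "Very low": 0}
    for score in scores:
        counts[plddt_band(score)] += 1
    return counts
```

For example, `band_counts([95.2, 88.0, 61.5, 42.0])` places one residue in each band.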