GenBank accession
YP_012026693.1 [GenBank]
Protein name
tail fiber
RBP type | Evidence | Probability
TF | GenBank | 1.00
TSP | DepoScope | 1.00
TSP | RBPdetect | 0.89
TF | RBPdetect2 | 0.88
Protein sequence
MVTKTVIPTDITKLKNESLNKNNNLSDLADRAAAWLNVRPIGSTPLAGDPVGDYDAATKRWVENKINTGTVGPTMNGVMNYGVGDFHLRDSRAYIQPYEVVSDGQLLNRADWPELWAYAQMVGAIDDSVWLADKFQRGRYSSGDGTTTFRVPDKNGVQEGSIRALYGRGDGGNSGANGQLFESAAPNITGFFTTYASQTYAQVLGHTGGCFNANNNAFPDGRGDGIAPGSTALSGRYNTCEFNASISSPIYGASTDEILTRNFVGVWVIRASGGFVAANTSWSVINADKTTPNSGVVVFGGSVKSEYKSPSANVVAELKSSKVVNGAWGVSLTISDDRGIPVSIKYDSNGQFSTSHRPAGVGTGNSYSHCAYRAPDWWNNTAYSAFVPILGGGSGNSLKNYRSFACFGQISYPDSSNSYPRAAIAQVRDWALPDGEQGNGKSITVFTFIDNSFDIQYGYNDGSLNYIFSKSPICDERLKKNIQSIDTNIAISNIKQMPYKSYIYKNDENEQVRRGFIAQDLQKIDPQYVREYGDPGREKTLAIDENVMLLDSVAAVKWLINKVEELQEEINILKAK
Physico-chemical properties
protein length: 576 AA
molecular weight: 62634.74360 Da
isoelectric point: 5.63044
aromaticity: 0.10417
hydropathy: -0.40347
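
The record does not state how aromaticity and hydropathy were computed. A minimal sketch follows, assuming aromaticity is the fraction of F, W and Y residues and hydropathy is the mean Kyte-Doolittle value (GRAVY). Both conventions are assumptions, as are the function names. The example peptide is the first ten residues of the protein sequence above.

```python
# Kyte-Doolittle hydropathy scale (assumed to be the scale behind "hydropathy").
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aromaticity(seq: str) -> float:
    """Fraction of aromatic residues (F, W, Y) in the sequence."""
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First ten residues of YP_012026693.1, used only as a short illustration.
peptide = "MVTKTVIPTD"
print(len(peptide), aromaticity(peptide), round(gravy(peptide), 5))
```

Passing the full 576-residue sequence to the same functions gives values that can be compared with the listed ones, provided the database uses these conventions.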

Domains [InterPro]
Coil | Unmapped | 11–31
DC_1514 | STR | 18–185
YP_012026693.1 | residues 1–576
Architecture
STR 18-385 | RBD 430-576
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Layout
Domain | Start | End | Length (AA) | Confidence
N-terminal | 1 | 287 | 287 | 0.3046
Central domain | 288 | 486 | 200 | 0.1512
C-terminal | 487 | 576 | 89 | 0.9959
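
The segment coordinates above can be sanity-checked with a short sketch. It recomputes each length as an inclusive count (end - start + 1), compares it with the listed Length (AA) value, and checks that the segments tile residues 1 to 576 without gaps. The inclusive convention is an assumption, since the record does not say how Length was derived. Under it, the Central domain and C-terminal rows come out as 199 and 90 rather than the listed 200 and 89. The function names are illustrative.

```python
# (name, start, end, listed length) copied from the segmentation table.
segments = [
    ("N-terminal", 1, 287, 287),
    ("Central domain", 288, 486, 200),
    ("C-terminal", 487, 576, 89),
]


def inclusive_length(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive range."""
    return end - start + 1


def check_segments(rows, protein_length):
    """Return per-row (name, listed, computed, match) and a tiling flag."""
    report = []
    for name, start, end, listed in rows:
        computed = inclusive_length(start, end)
        report.append((name, listed, computed, listed == computed))
    # Each segment should start right after the previous one ends.
    contiguous = all(
        rows[i][2] + 1 == rows[i + 1][1] for i in range(len(rows) - 1)
    )
    covers = rows[0][1] == 1 and rows[-1][2] == protein_length
    return report, contiguous and covers


report, tiled = check_segments(segments, 576)
for row in report:
    print(row)
print("segments tile the protein:", tiled)
```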
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Domain Coloring
N-terminal: 1-287
Central: 288-486
C-terminal: 487-576

Taxonomy

Role | Name | Taxonomy ID | Lineage
Phage | Escherichia phage EC100 [NCBI] | 2894397 | Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes
Host | Escherichia coli [NCBI] | 562 | cellular organisms > Bacteria > Pseudomonadati > Pseudomonadota > Gammaproteobacteria > Enterobacterales

Coding sequence (CDS)
GenBank protein accession
YP_012026693.1 [NCBI]
GenBank nucleotide accession
NC_105587.1 [NCBI]
CDS location
range 79471 -> 81201
strand -
CDS
ATGGTAACTAAAACAGTAATTCCTACGGATATTACAAAACTTAAAAATGAGTCACTTAATAAAAATAATAACCTGAGTGACTTAGCAGACCGTGCAGCAGCGTGGCTAAATGTAAGACCTATTGGTTCAACCCCTCTTGCTGGTGATCCTGTTGGGGATTATGATGCCGCAACTAAACGTTGGGTAGAGAATAAGATTAACACTGGTACAGTTGGACCCACTATGAATGGTGTTATGAACTACGGTGTTGGTGATTTTCACCTCCGAGATAGTCGTGCGTATATTCAACCATACGAAGTCGTTTCGGATGGACAGCTTCTTAATAGGGCTGACTGGCCTGAACTTTGGGCGTATGCACAAATGGTTGGGGCTATAGATGATTCAGTTTGGTTAGCAGATAAATTTCAGCGTGGCAGATACTCATCTGGGGATGGGACAACAACCTTTCGTGTTCCAGATAAGAATGGGGTGCAGGAGGGATCAATACGTGCTCTTTATGGCCGTGGAGATGGTGGGAATAGTGGTGCAAATGGGCAACTGTTCGAATCCGCTGCGCCTAATATTACAGGTTTTTTCACCACATATGCCTCTCAAACCTATGCCCAGGTATTAGGACATACAGGTGGTTGCTTTAATGCAAACAATAATGCGTTTCCTGACGGAAGAGGTGACGGAATCGCTCCAGGGTCTACAGCTTTATCTGGGCGTTATAATACCTGCGAATTCAATGCCTCAATATCAAGTCCGATTTACGGGGCATCAACAGATGAGATTCTTACACGAAACTTTGTCGGTGTGTGGGTAATCCGTGCTTCTGGTGGGTTCGTGGCTGCTAATACCTCGTGGAGTGTTATTAATGCGGATAAAACAACCCCCAATTCAGGTGTTGTCGTTTTTGGAGGTAGTGTTAAATCTGAGTATAAATCACCCAGTGCAAATGTAGTAGCGGAACTAAAATCAAGTAAAGTGGTTAATGGTGCATGGGGTGTTAGTTTAACTATATCAGATGATAGAGGAATCCCTGTAAGCATTAAGTATGACAGTAATGGTCAGTTTTCAACATCTCATAGGCCGGCCGGAGTCGGTACAGGGAATTCCTATAGTCACTGCGCTTATCGCGCGCCAGACTGGTGGAATAACACAGCGTATAGTGCCTTCGTTCCAATTCTTGGGGGTGGTAGCGGTAATTCACTAAAAAACTATAGATCTTTTGCTTGTTTTGGACAAATATCTTATCCTGATTCTAGCAATAGTTATCCTAGAGCTGCAATTGCTCAAGTAAGGGATTGGGCACTGCCAGATGGTGAACAAGGTAATGGCAAGAGTATTACCGTGTTTACATTTATTGATAATTCATTTGATATTCAATATGGGTACAACGATGGTTCGTTAAATTATATTTTTTCAAAATCTCCGATTTGTGATGAAAGACTAAAGAAAAATATTCAGTCCATCGATACTAATATTGCAATAAGTAATATTAAGCAGATGCCCTATAAATCATACATCTATAAAAATGATGAGAATGAACAAGTGAGGCGCGGTTTTATAGCCCAGGATTTACAGAAAATAGATCCCCAATATGTACGGGAATACGGAGATCCAGGAAGAGAGAAAACGTTAGCCATCGATGAAAATGTTATGTTGCTTGATTCCGTTGCTGCGGTTAAGTGGCTTATTAATAAAGTTGAAGAATTGCAAGAAGAAATAAATATATTGAAAGCTAAGTAA
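
The CDS coordinates and sequence can be cross-checked against the protein. The listed CDS starts with ATG and ends with TAA, so it appears to be given in coding orientation already, even though it lies on the minus strand. The sketch below assumes the standard codon assignments (the bacterial and phage table differs mainly in permitted start codons) and uses illustrative helper names. It translates the first 30 nucleotides of the CDS and checks that the coordinate span equals three nucleotides per residue plus a stop codon.

```python
from itertools import product

# Standard codon assignments in TCAG order (assumed applicable here).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding-orientation DNA string, stopping at the first stop."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def cds_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive coordinate range."""
    return end - start + 1


# First 30 nt of the CDS listed above.
cds_prefix = "ATGGTAACTAAAACAGTAATTCCTACGGAT"
print(translate(cds_prefix))
print(cds_length(79471, 81201), 3 * (576 + 1))
```

The translated prefix matches the first ten residues of the protein sequence (MVTKTVIPTD), and the 1731-nucleotide span matches 576 codons plus one stop codon.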

Tertiary structure

PDB ID
8f3d39317944000ec8f76c41047f1678d9157b9d1f7909c248c34e4a83007511
Source ESMFold
Method ESMFold
Resolution 0.6564
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
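
The confidence bands above map directly onto a small classifier. The legend uses strict inequalities and does not say where scores of exactly 90, 70 or 50 belong. This sketch assumes they fall into the lower band, and the function names are illustrative.

```python
from collections import Counter


def plddt_band(score: float) -> str:
    """Map a per-residue pLDDT score (0-100) to its confidence band."""
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"


def band_fractions(scores):
    """Fraction of residues in each band for a list of pLDDT scores."""
    counts = Counter(plddt_band(s) for s in scores)
    return {band: n / len(scores) for band, n in counts.items()}


# Hypothetical scores, for illustration only (not taken from this model).
example_scores = [95.2, 88.0, 71.5, 64.3, 42.0]
print(band_fractions(example_scores))
```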

Literature

Title | Authors | Date | PMID | Source
Complete genome sequences of 17 Escherichia coli bacteriophages isolated from wastewater, pond water, cow manure and bird feces | Vitt,A.R., Ahern,S.J., Gambino,M., Holst Sorensen,M.C. and Brondsted,L. | 2022-10-20 | | GenBank