GenBank accession
UCR81485.1 [GenBank]
Protein name
long tail fiber protein
RBP type
Type Evidence Probability
TF GenBank 1.00
TSP DepoScope 1.00
TSP RBPdetect 0.84
TF RBPdetect2 0.95
Protein sequence
MATLKQIQFKRSKTAGQRPAASVLAEGELAINLKDRVLFTKDDQGNIIDLGFAKGGSIDGNVIHKGNYNQTGDYTLNGTFTQTGNFNLTGIARVTRDIIAAGQIMTEGGELITKSSGTAHVRFHDSADRERGIIFSPANDGLTTQVVNIRVQDYKAGSESTFVFNGNGLFSSPEVFGWKSVSTPVIYTNKVITNKKVKDDYDIYSMADNVPLSESTTAINHLRVMRNAVGSGIFHEVKDNDGITWYSGDGLDAYLWSFTWSGGIKSSHSISIGLTPGPKDYSILGPSSIALGDNDTGFKWHQDGYYFSVNNGTKTFLFSPSETTSLRKFIAGYSTNGTDLTTPPTENYALATVVTYHDNNAFGDGQTLLGYYQGGNYHHYFRGKGTTNINTHGGLLVTPGNIDVIGGSVNIDGRNNNSTLMFKGYTMGQSSVDNMYIAVWGNTFTNPSEGTRKNVMEISDGIGWMHYIQRNKDNTVEAVLNGQQTINENIIAKKDIWVDRAVHTIGEITTNAVNGLRIWNNDYGVIFRRSEESLHIIPTAFGEGETGDIGPLRPLSVALNSGKVTIPDLQSSYNTFAANGYIKFAGHGAGAGGYDIQYSQAAPIFQEIDDAAVSKYYPIVKQKFLNGKAVWSLGTEINSGTFVLHHLKEDGSQGHTSRFNADGTVNFPDNVQVGGGEATIARNGNIFSDIWKSFTSAGETTNIRDAIATRVSKEGDTMTGKLTLSAGNDALILTAGEGASSHIRSDVGGTGNWYIGKGGGDNGLGFYSYITQGGVYITNNGEISLSPQGQGTFNFNRDRLHINGTQWVAHQAGDWGNQWRQEAPIFVDFGNVGNDSYYPIIKGKSGITNEGYISGVDFGMRRTTNQWAQAIIRVGNQENGSDPQAIYEFHHNGVLYAPNMVQAGARLSAGGGDPVWTGPCLVIGDNDTGLVHGGDGRINMVANGAHIASWSSSYHSHPGLWDSNGAFWTEVGKAIISHGHLVQANDSYSTYVRDVYVRSDIRVKKDLVKFENASQTLSKINGYTYMQKRGLDEEGNQKWEPNAGLIAQEVQAILPELVEGDPDGEALLRLNYNGVIGLNTAAINEHTAEIAELKSEIEELKKLVKSLLK
Physico-chemical properties
protein length: 1109 AA
molecular weight: 120217.76820 Da
isoelectric point: 5.72365
aromaticity: 0.09648
hydropathy: -0.37980
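
The page does not say which tool produced these values. As a minimal sketch, assuming aromaticity is the fraction of F, W and Y residues and hydropathy is the GRAVY score (mean Kyte-Doolittle hydropathy), both can be recomputed from a sequence with the standard library alone. The function names are illustrative, and the example runs on the first 20 residues of the protein sequence rather than the full 1109.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")


def aromaticity(seq: str) -> float:
    """Fraction of aromatic residues (F, W, Y) in the sequence."""
    return sum(aa in AROMATIC for aa in seq) / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues (GRAVY)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First 20 residues of UCR81485.1, used here only to keep the example short.
prefix = "MATLKQIQFKRSKTAGQRPA"
print(len(prefix), round(aromaticity(prefix), 5), round(gravy(prefix), 5))
```

Running the same functions on the full sequence should give values comparable to those listed, provided the page uses the same definitions.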

Domains

Domains [InterPro]
Accession Type Range
DC_0538 STR 1–694
IPR048390 ATT 449–556
DC_0594 RBD 1006–1109
Architecture
STR 1-448 | ATT 449-556 | STR 557-1035 | RBD 1036-1109
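
The architecture string above is compact enough to parse directly. The following sketch (helper names are hypothetical, and coordinates are assumed to be 1-based and inclusive) splits it into segments and checks that they tile the 1109-residue protein without gaps or overlaps.

```python
import re


def parse_architecture(text):
    """Parse 'STR 1-448 | ATT 449-556' into (label, start, end) tuples."""
    segments = []
    for part in text.split("|"):
        match = re.fullmatch(r"\s*([A-Z]+)\s+(\d+)\s*[-–]\s*(\d+)\s*", part)
        if match is None:
            raise ValueError(f"unparseable segment: {part!r}")
        segments.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return segments


def tiles_exactly(segments, length):
    """True if the segments cover 1..length with no gaps or overlaps."""
    expected_start = 1
    for _, start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


arch = parse_architecture("STR 1-448 | ATT 449-556 | STR 557-1035 | RBD 1036-1109")
print(arch)
print(tiles_exactly(arch, 1109))
```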

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: an N-terminal domain, a central domain, and a C-terminal domain.

Domain Layout
Domain Start End Length (AA) Confidence
N-terminal 1 218 218 0.2824
Central domain 219 417 200 0.4295
C-terminal 418 1109 691 0.7251
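
The Start, End and Length columns can be cross-checked with simple arithmetic. The sketch below assumes 1-based inclusive coordinates, so a domain spanning start..end has end - start + 1 residues. The page does not state its convention, so a one-residue difference may reflect a different counting rule rather than an error.

```python
# (name, start, end, reported length) as listed in the segmentation table.
domains = [
    ("N-terminal", 1, 218, 218),
    ("Central domain", 219, 417, 200),
    ("C-terminal", 418, 1109, 691),
]


def inclusive_length(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive range."""
    return end - start + 1


for name, start, end, reported in domains:
    computed = inclusive_length(start, end)
    status = "ok" if computed == reported else f"differs by {computed - reported:+d}"
    print(f"{name}: computed {computed}, reported {reported} ({status})")

# Both the computed and the reported lengths should add up to the protein length.
print(sum(inclusive_length(s, e) for _, s, e, _ in domains))
print(sum(reported for *_, reported in domains))
```

Under this assumption the N-terminal length agrees with the table, while the central and C-terminal lengths each differ by one residue in opposite directions. Both sets of lengths still sum to 1109.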
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Domain Coloring
N-terminal: 1-218
Central: 219-417
C-terminal: 418-1109

Taxonomy

  Name Taxonomy ID Lineage
Phage Escherichia phage vB_EcoM-G157lw [NCBI] 2880936 Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes
Host Escherichia coli [NCBI] 562 cellular organisms > Bacteria > Pseudomonadati > Pseudomonadota > Gammaproteobacteria > Enterobacterales

Coding sequence (CDS)

GenBank protein accession
UCR81485.1 [NCBI]
GenBank nucleotide accession
OK331996.1 [NCBI]
CDS location
range 23844 -> 27173
strand -
CDS
ATGGCTACTTTAAAACAAATACAATTTAAAAGAAGCAAAACGGCAGGTCAACGTCCAGCTGCTTCAGTATTAGCCGAAGGTGAATTGGCTATAAACTTAAAAGACCGTGTACTTTTTACTAAAGATGACCAGGGAAATATTATTGATTTAGGTTTTGCTAAAGGCGGTAGTATTGACGGGAATGTTATTCATAAAGGCAATTATAACCAAACTGGCGATTATACTTTAAACGGTACATTCACCCAGACTGGTAATTTTAATTTAACCGGTATTGCTCGAGTAACTCGTGATATTATTGCTGCTGGGCAGATTATGACTGAAGGCGGAGAACTTATTACAAAAAGTTCAGGAACGGCACACGTTCGTTTTCATGATTCAGCTGACCGTGAACGCGGTATTATTTTTTCTCCTGCTAATGACGGTTTAACTACACAGGTAGTTAACATCAGAGTTCAAGATTACAAAGCTGGTTCAGAAAGCACTTTCGTTTTTAATGGAAATGGTTTGTTTTCTTCACCAGAAGTTTTTGGGTGGAAATCTGTATCAACTCCGGTAATTTATACCAATAAAGTTATCACCAATAAAAAAGTTAAAGATGATTATGACATCTATTCGATGGCAGATAATGTTCCATTATCTGAAAGCACTACTGCTATTAATCATCTTCGTGTTATGCGTAATGCAGTTGGTTCTGGTATTTTCCATGAAGTTAAAGATAATGATGGAATAACATGGTATAGTGGAGATGGATTAGACGCTTATCTTTGGTCATTTACTTGGAGCGGCGGAATTAAATCAAGTCACTCAATTTCCATCGGTTTAACACCCGGACCTAAGGATTACTCAATATTAGGACCGTCTAGTATCGCTTTAGGAGATAATGATACTGGATTTAAATGGCATCAAGACGGATATTATTTCAGTGTTAACAATGGCACAAAAACGTTTTTATTTAGTCCAAGCGAAACAACTAGCCTAAGAAAATTTATAGCTGGATATTCTACTAACGGAACCGATTTAACTACTCCTCCAACTGAAAATTATGCTCTTGCTACTGTAGTGACATACCATGATAATAACGCGTTTGGGGATGGTCAGACTCTTTTAGGATATTATCAAGGCGGTAACTATCATCATTATTTCCGCGGCAAGGGCACTACAAATATTAATACTCATGGCGGTTTGTTAGTTACTCCAGGCAATATTGACGTTATTGGTGGTTCTGTTAATATCGATGGTAGAAATAATAATTCAACTTTAATGTTTAAAGGCTATACCATGGGTCAAAGCTCCGTTGATAACATGTATATAGCTGTTTGGGGAAATACTTTTACTAATCCAAGTGAAGGCACCCGTAAAAATGTCATGGAAATTTCTGATGGTATTGGATGGATGCATTATATTCAACGTAATAAAGATAATACGGTTGAAGCCGTGTTAAATGGTCAACAAACAATTAATGAAAATATTATTGCGAAAAAGGATATTTGGGTTGACCGAGCAGTTCATACCATTGGCGAAATCACTACAAATGCTGTTAATGGTCTTCGTATTTGGAACAATGATTACGGAGTTATTTTTAGACGCTCAGAAGAAAGTCTTCATATTATTCCTACAGCATTTGGCGAAGGAGAAACCGGTGATATTGGGCCTTTACGTCCTCTCAGCGTAGCTTTAAATTCCGGTAAAGTTACTATTCCAGATTTACAGTCAAGTTATAATACGTTCGCTGCAAATGGTTATATTAAATTTGCTGGTCATGGGGCTGGTGCCGGTGGTTATGATATTCAGTATTCACAAGCTGCTCCTATTTTCCAAGAAATCGATGATGCTGCTGTAAGCAAATATTATCCTATTGTTAAACAGAAGTTTTTAAACGGTAAAGCCGTTTGGTCTTTAGGTACTGAAATTAATTCTGGTACATTTGTTTTACATCATTTGAAAGAAGATGGTTCACAAGGCCATACATCAAGATTTAATGCTGATGGTACAGTTAATTT
CCCCGATAACGTTCAAGTTGGCGGCGGTGAAGCTACTATTGCTCGTAATGGTAATATTTTCTCAGATATTTGGAAATCGTTTACTTCTGCGGGAGAAACCACAAATATTCGCGATGCAATAGCTACTCGTGTTTCTAAAGAAGGCGACACGATGACTGGTAAATTGACTTTATCGGCAGGCAATGATGCTCTCATTTTAACTGCGGGCGAAGGTGCTTCATCACATATCCGTAGTGATGTAGGTGGTACAGGTAACTGGTATATAGGCAAAGGCGGCGGCGACAATGGTCTAGGTTTTTATAGTTACATTACACAAGGCGGTGTATACATAACAAATAACGGCGAAATATCGCTTTCTCCTCAAGGTCAAGGAACATTTAATTTTAATAGAGACCGCCTTCATATAAACGGTACACAATGGGTTGCGCACCAAGCTGGTGATTGGGGAAACCAATGGCGACAAGAAGCGCCAATATTTGTAGATTTTGGCAATGTCGGTAATGATAGTTATTACCCGATTATCAAAGGAAAATCTGGTATTACTAATGAAGGATACATATCGGGTGTTGATTTTGGTATGCGACGCACTACTAACCAATGGGCTCAGGCTATTATCCGTGTTGGTAACCAGGAAAATGGTTCTGACCCACAAGCTATCTATGAATTTCACCACAATGGAGTTCTGTATGCTCCTAATATGGTTCAAGCTGGAGCAAGATTATCAGCTGGCGGTGGTGACCCTGTATGGACCGGCCCGTGTCTTGTTATTGGTGATAATGATACTGGATTAGTTCATGGTGGTGACGGCCGAATCAATATGGTTGCAAATGGAGCGCATATTGCTTCGTGGTCTTCATCTTATCATTCTCATCCTGGCCTTTGGGATTCAAATGGAGCTTTTTGGACAGAAGTTGGCAAAGCAATTATTTCTCACGGCCATCTTGTCCAGGCGAATGACAGTTATTCCACATATGTCCGCGATGTTTATGTCCGTTCTGATATTCGTGTTAAAAAAGACCTTGTTAAATTTGAAAATGCTTCACAAACACTTTCAAAAATTAACGGTTACACTTATATGCAGAAGCGAGGCCTAGATGAAGAAGGCAATCAGAAATGGGAACCTAACGCCGGTTTGATAGCTCAAGAAGTTCAAGCTATTTTGCCTGAATTAGTTGAAGGTGACCCTGATGGCGAAGCTTTACTTCGTTTGAACTATAACGGTGTAATTGGTTTAAATACAGCTGCAATCAATGAGCATACTGCAGAAATTGCAGAACTTAAGTCAGAGATTGAAGAGCTTAAAAAATTAGTTAAATCATTGTTAAAATAA
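
The CDS coordinates and the protein length can be reconciled with a short calculation. The sketch below assumes 1-based inclusive genome coordinates and the standard codon assignments; the helper names are illustrative. The sequence shown starts with ATG, so it appears to be given in coding orientation already. `reverse_complement` would only be needed when extracting the minus-strand range from the genome record.

```python
from itertools import product

# Standard codon assignments, enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def span_length(start: int, end: int) -> int:
    """Nucleotides in a 1-based, inclusive coordinate range."""
    return end - start + 1


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; stop codons become '*'."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


nt = span_length(23844, 27173)
print(nt, nt // 3 - 1)  # nucleotides, residues excluding the stop codon

# First nine and last five codons of the CDS shown above.
print(translate("ATGGCTACTTTAAAACAAATACAATTT"))
print(translate("TCATTGTTAAAATAA"))
```

The range 23844 -> 27173 spans 3330 nt, which is 1110 codons, or 1109 residues plus a stop codon, in agreement with the listed protein length.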

Tertiary structure

PDB ID
93c64abe961963c74bd329088f09c6a48abdead91d4d2fe8ee1dfa2a994f7427
Source ESMFold
Method ESMFold
Resolution 0.5419
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
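
The confidence bands above map directly onto a small lookup function. This is a sketch with a hypothetical function name, assuming pLDDT is reported on a 0-100 scale. The legend uses strict inequalities and does not say where exactly 90, 70 or 50 belong, so assigning those boundary values to the upper band is an assumption made here.

```python
def plddt_band(score: float) -> str:
    """Map a per-residue pLDDT score (0-100) to its confidence band.

    Boundary values (exactly 90, 70, 50) go to the upper band; this is an
    assumption, since the legend leaves them unspecified.
    """
    if not 0 <= score <= 100:
        raise ValueError("pLDDT is expected in the range 0-100")
    if score >= 90:
        return "Very high"
    if score >= 70:
        return "High"
    if score >= 50:
        return "Low"
    return "Very low"


for value in (95.0, 80.0, 60.0, 10.0):
    print(value, plddt_band(value))
```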