GenBank accession
XQU42846.1 [GenBank]
Protein name
long tail fiber protein
RBP type
Type Evidence Probability
TF GenBank 1.00
TSP DepoScope 1.00
TSP RBPdetect 0.84
TF RBPdetect2 0.94
Protein sequence
MATLKQIQFKRSKTAGARPAASVLAEGELAINLKDRVLFTKDDQGNIIDLGFAKGGSIDGNVIHIGNYNQTGDYTLNGTFTQTGDFNLTGIARVTRDIIAAGQIMTEGGELITKSSGTSHVRFFDGNSRERGIIYAPANDGLTTQVVNIRVQDYAAGDESTYAFSGSGLFTSPEVSAWKSISSPQILTDKVITNGKKTGDYDIYSLANNTPLAESETAINHLRVMRNAVGSGIFHEVKDNDGITWYAGDGLDAYLWSFTWSGGLKAGHSISIGLPGGSKGYSELGTASIALGDNDTGFKWHQDGYFHTVNNGTRTFIYGPAETQSLRKMVMGYSPDGILMTTPPTENYALATVVTYHDNNAYGDGQTLLGYYQGGNYHHYFRGKGTTNINTHGGLLVTPGNIDVIGGSVNIDGRNNASTLMFRGNTTGSSSVDNMTISVWGNTFTNPSVGNRKNVMEISDATSWMSYIQRLTTGEVEMNVNGSFESSGVTAGNRGVHTTGEISSGAVNALRIWNADYGAIFRRSEGSLHIIPTAYGEGKHGDIGPLRPFSMALDTGKVTIPDLQSSYNTFAANGYIKFTGHGAGAGGYDIQYVQAAPIFQEIDDDDAINKYYPIVKQKFLNSKAVWSLGTEINSGTFVLHHIKEDGSQGHTSRFNQDGTVNFPDNVQVGGGEATIARNGNIWSDIWKPFTSAGDTTNIRDAIATRVSKEGDTMTGKLTLSAGNDALVLTAGVGASSHIRSDVGGTGNWYIGKGGGDNGLGFYSYITQGGVYITNNGEISLSPQGQGTFNFNRDRLHINGTQWTAHQAGDWANQWRQEAPIFVDFGYVGNDCYYPIIKGKSVITNEGYVSGVDFGMRRITNQWAQAIIRVGNQENASDPQAIFEFHHNGNMYVPDMVKAGVRISAGGGDPAWTGACVVIGDNDTGLVHGGDGRINMVANGMHIASWSSAYHLHEGLWDTTGALWTEQGRAIISFGHLVQQSDAYSTFVRDVYVRSDIRVKKDLVKFENASEKLSKINGYTYMQKRGLDEEGNQKWEPNAGLIAQEVQAILPELVEGDPDGEALLRLNYNGVIGLNTAAINEHTAEIAELKSEIEELKALVKSLLK
Physico-chemical properties
protein length: 1104 AA
molecular weight: 119051.53080 Da
isoelectric point: 5.49630
aromaticity: 0.09420
hydropathy: -0.34158
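
The aromaticity and hydropathy values can be approximated directly from the protein sequence. The sketch below assumes that aromaticity is the fraction of F, W, and Y residues and that hydropathy is the mean Kyte-Doolittle score (GRAVY). The page does not state which definitions it uses, so both are assumptions, and the function names are illustrative.

```python
# Kyte-Doolittle hydropathy scale (assumed; the page does not name its scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aromaticity(seq: str) -> float:
    """Fraction of aromatic residues (F, W, Y) in the sequence."""
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First 20 residues of the protein sequence above, used only as a small demo input.
fragment = "MATLKQIQFKRSKTAGARPA"
print(len(fragment), round(aromaticity(fragment), 5), round(gravy(fragment), 5))
```

Passing the full 1104-residue sequence instead of the fragment gives values that can be compared with the listed aromaticity and hydropathy, keeping in mind that the definitions are assumed.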

Domains

Domains [InterPro]
ID Class Range
DC_0538 STR 1–685
IPR048390 ATT 450–549
IPR030392 CHP 994–1092
IPR030392 CHP 994–1053
Architecture
STR 1-449 | ATT 450-549 | STR 550-1008 | RBD 1009-1104
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
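
The architecture line can be read as a list of labelled residue ranges. The sketch below parses that string and checks that the segments tile the protein without gaps or overlaps. It assumes the ranges are 1-based and inclusive; `Segment`, `parse_architecture`, and `is_contiguous` are hypothetical helper names, not part of any tool named on this page.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    label: str
    start: int
    end: int


def parse_architecture(line: str) -> List[Segment]:
    """Parse 'STR 1-449 | ATT 450-549' into labelled, 1-based inclusive segments."""
    segments = []
    for part in line.split("|"):
        label, span = part.split()
        start, end = span.split("-")
        segments.append(Segment(label, int(start), int(end)))
    return segments


def is_contiguous(segments: List[Segment], length: int) -> bool:
    """True if the segments cover positions 1..length with no gaps or overlaps."""
    expected = 1
    for seg in segments:
        if seg.start != expected or seg.end < seg.start:
            return False
        expected = seg.end + 1
    return expected == length + 1


arch = parse_architecture("STR 1-449 | ATT 450-549 | STR 550-1008 | RBD 1009-1104")
print(is_contiguous(arch, 1104))
```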

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central, and C-terminal.

Domain Layout
Domain Start End Length (AA) Confidence
N-terminal 1 216 216 0.2292
Central domain 217 415 200 0.3341
C-terminal 416 1104 688 0.7828
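
A quick consistency check on the segmentation table is to recompute each length from its start and end. The sketch below assumes 1-based coordinates that include both endpoints, which the page does not confirm. Under that assumption the central and C-terminal rows come out one residue off from the listed Length column (199 and 689 against 200 and 688), while both sets of lengths still sum to 1104. Which boundary convention the page uses is not stated.

```python
# (name, start, end, listed_length) copied from the segmentation table above.
domains = [
    ("N-terminal", 1, 216, 216),
    ("Central domain", 217, 415, 200),
    ("C-terminal", 416, 1104, 688),
]


def inclusive_length(start: int, end: int) -> int:
    """Residue count for a 1-based span that includes both endpoints (assumed)."""
    return end - start + 1


for name, start, end, listed in domains:
    computed = inclusive_length(start, end)
    print(f"{name}: computed {computed}, listed {listed}, diff {computed - listed}")

# Both totals should equal the protein length of 1104 AA.
total_computed = sum(inclusive_length(s, e) for _, s, e, _ in domains)
total_listed = sum(listed for *_, listed in domains)
print(total_computed, total_listed)
```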
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Domain Coloring
N-terminal 1-216
Central 217-415
C-terminal 416-1104

Taxonomy

  Name Taxonomy ID Lineage
Phage Escherichia phage vB_EcoM_P8 [NCBI] 3412731 Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes
Host Escherichia coli [NCBI] 562 cellular organisms > Bacteria > Pseudomonadati > Pseudomonadota > Gammaproteobacteria > Enterobacterales

Coding sequence (CDS)

GenBank protein accession
XQU42846.1 [NCBI]
GenBank nucleotide accession
PV390659.2 [NCBI]
CDS location
range 60465 -> 63779
strand -
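
The CDS coordinates can be checked against the reported protein length. The sketch below assumes the range is 1-based and inclusive and that the CDS ends with a single stop codon. The `reverse_complement` helper illustrates how a minus-strand feature would be read out of the genome record; all three function names are hypothetical.

```python
def cds_length(start: int, end: int) -> int:
    """Nucleotide count for a 1-based inclusive genomic range."""
    return end - start + 1


def protein_length(nt_length: int) -> int:
    """Residues encoded, assuming the CDS ends with one stop codon."""
    if nt_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return nt_length // 3 - 1


def reverse_complement(dna: str) -> str:
    """Reverse complement, as needed to read a minus-strand feature."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


n = cds_length(60465, 63779)
print(n, protein_length(n))  # prints: 3315 1104
```

The result of 1104 residues agrees with the protein length listed under the physico-chemical properties.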
CDS
ATGGCTACTTTAAAACAAATACAATTTAAAAGAAGCAAAACTGCAGGAGCACGTCCTGCCGCTTCAGTATTAGCCGAAGGTGAATTGGCTATAAACTTAAAAGACCGCGTACTTTTTACTAAAGATGACCAAGGAAATATCATTGATCTTGGTTTTGCTAAAGGCGGTAGCATTGACGGGAATGTTATTCATATAGGAAACTATAATCAAACTGGTGATTATACTTTAAATGGTACCTTCACTCAAACCGGTGATTTCAATTTAACTGGTATTGCTCGAGTAACTCGCGATATTATTGCCGCCGGGCAGATTATGACTGAGGGTGGAGAACTTATTACAAAAAGTTCAGGTACATCGCATGTTCGTTTTTTCGATGGCAATAGCCGCGAACGTGGAATCATTTATGCCCCGGCTAATGATGGATTAACTACGCAAGTAGTTAATATCCGCGTTCAAGACTACGCCGCTGGTGATGAAAGCACCTATGCATTTTCAGGCAGTGGCCTATTTACTTCACCTGAAGTATCAGCATGGAAATCTATTTCGTCTCCACAAATTCTGACCGATAAAGTTATTACAAATGGGAAGAAGACAGGCGATTATGATATCTATTCATTAGCAAATAACACTCCGTTGGCAGAAAGCGAAACGGCTATTAACCACCTCCGTGTTATGCGAAATGCCGTAGGATCTGGTATATTCCATGAAGTTAAAGATAACGACGGGATAACCTGGTACGCCGGTGACGGGTTAGATGCCTATCTTTGGTCGTTTACCTGGTCCGGTGGATTGAAAGCAGGCCATTCTATTTCTATAGGTCTTCCGGGCGGCTCTAAAGGATATTCTGAATTAGGGACTGCTTCAATTGCTCTTGGTGATAATGATACCGGATTTAAATGGCATCAGGACGGATATTTTCATACAGTAAACAACGGAACAAGAACTTTCATTTACGGTCCTGCGGAAACACAAAGCCTTAGAAAAATGGTTATGGGTTATTCTCCGGATGGGATTCTTATGACAACGCCACCGACAGAAAACTATGCATTAGCCACTGTTGTTACTTATCATGATAATAACGCGTATGGCGACGGTCAGACTCTTTTAGGTTATTATCAAGGTGGTAATTATCATCACTATTTCCGTGGTAAAGGTACTACAAACATTAATACTCACGGCGGTTTGTTAGTTACTCCCGGTAATATTGACGTTATTGGTGGTTCTGTTAATATTGATGGTCGTAATAATGCTTCTACGCTGATGTTTAGAGGTAACACAACTGGTAGCAGTTCAGTTGATAATATGACAATTTCTGTATGGGGTAATACGTTTACTAATCCTAGTGTAGGTAATCGTAAAAATGTCATGGAAATTTCTGACGCAACTAGCTGGATGAGTTATATTCAAAGACTTACTACCGGTGAAGTAGAAATGAACGTCAATGGTTCATTTGAATCATCCGGTGTTACTGCTGGAAATAGAGGAGTTCACACAACAGGCGAAATTTCATCTGGAGCAGTGAATGCGCTTCGCATTTGGAATGCAGATTATGGAGCCATTTTTAGACGTTCAGAAGGCAGTCTTCATATTATTCCAACTGCTTACGGTGAAGGTAAACATGGCGATATCGGTCCACTTCGCCCGTTTAGTATGGCTTTAGATACTGGTAAAGTTACTATTCCAGATTTACAATCAAGTTACAATACGTTCGCAGCAAACGGCTATATTAAATTTACTGGTCACGGTGCAGGCGCTGGTGGTTATGATATTCAGTATGTTCAAGCAGCTCCTATTTTCCAGGAAATTGATGATGATGATGCTATAAACAAATATTATCCTATTGTTAAACAGAAGTTTTTAAACAGTAAAGCGGTTTGGTCTTTAGGTACTGAAATTAATTCGGGTACATTTGTTTTACACCATATCAAAGAAGATGGATCACAAGGCCATACGTCTCGTTTTAATCAAGACGGTACTGTTAACTTCCCGGATAACGTTCA
GGTCGGTGGTGGTGAAGCTACTATTGCTCGAAATGGTAATATCTGGTCTGATATTTGGAAACCGTTTACTTCTGCCGGTGATACTACAAATATTCGCGATGCTATTGCGACTCGTGTTTCGAAAGAAGGCGACACGATGACCGGTAAATTGACTTTATCGGCAGGCAATGATGCTCTTGTTTTAACTGCAGGCGTGGGTGCTTCATCGCACATCCGTAGTGATGTAGGTGGTACAGGTAACTGGTATATAGGCAAAGGCGGCGGCGATAATGGTCTAGGTTTTTACAGTTACATTACACAAGGCGGTGTATACATAACAAATAACGGCGAAATATCGCTTTCTCCTCAAGGTCAAGGAACATTTAATTTTAATAGAGACCGCCTTCATATAAACGGTACACAGTGGACTGCGCACCAAGCTGGCGATTGGGCAAACCAATGGCGACAAGAAGCGCCGATATTTGTAGATTTTGGCTACGTCGGCAATGACTGCTATTATCCTATTATTAAAGGAAAATCTGTTATTACGAATGAAGGATACGTATCTGGTGTTGATTTCGGTATGCGCCGTATCACTAACCAATGGGCTCAGGCTATTATCCGTGTTGGTAACCAGGAAAATGCTAGCGATCCGCAAGCTATCTTCGAATTCCACCATAATGGAAACATGTACGTTCCTGACATGGTTAAAGCTGGAGTAAGAATATCAGCTGGTGGAGGTGACCCTGCATGGACAGGCGCATGTGTTGTTATTGGTGATAATGACACTGGGTTAGTTCATGGCGGTGACGGTCGTATTAACATGGTTGCAAATGGAATGCATATTGCTTCATGGTCGTCCGCTTACCATCTCCATGAAGGTCTTTGGGATACCACTGGTGCTTTGTGGACTGAACAAGGAAGAGCTATTATTTCTTTTGGTCATTTAGTCCAACAAAGCGATGCCTATTCCACATTTGTCCGCGATGTTTATGTCCGTTCTGATATTCGTGTTAAAAAAGACCTTGTTAAATTTGAAAATGCTTCTGAGAAGCTTTCTAAAATTAACGGTTACACTTATATGCAGAAGCGAGGCCTAGATGAAGAAGGAAATCAGAAATGGGAACCTAACGCCGGTTTGATAGCTCAAGAAGTTCAAGCTATTTTACCTGAATTAGTTGAAGGTGACCCTGATGGTGAAGCTTTACTTCGTTTGAACTATAACGGTGTAATTGGTTTAAATACAGCTGCAATCAATGAGCATACTGCAGAAATTGCAGAACTTAAATCAGAAATCGAAGAACTTAAAGCATTAGTTAAATCATTGTTAAAATAA

Genome Context

Tertiary structure

PDB ID
97d95669c95f3bf1289aeb0f670c3ef286eb246f48dc315eaa8d29b1ef9cc68d
Source ESMFold
Method ESMFold
Resolution 0.5276
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
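
The confidence bands can be expressed as a small lookup function. The legend uses strict inequalities and does not say which band the exact values 90, 70, and 50 belong to. The sketch below assigns each boundary value to the lower band, which is an assumption, and `plddt_category` is a hypothetical name.

```python
def plddt_category(plddt: float) -> str:
    """Map a pLDDT score (0-100) to the confidence labels in the legend above."""
    if not 0 <= plddt <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if plddt > 90:
        return "Very high"
    if plddt > 70:
        return "High"   # includes exactly 90 (assumed)
    if plddt > 50:
        return "Low"    # includes exactly 70 (assumed)
    return "Very low"   # includes exactly 50 (assumed)


for score in (95.2, 81.0, 63.5, 42.0):
    print(score, plddt_category(score))
```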