GenBank accession
XHH63141.1 [GenBank]
Protein name
tail fiber protein
RBP type Evidence Probability
TF GenBank 1.00
TSP DepoScope 1.00
TSP RBPdetect 0.55
TF RBPdetect2 0.94
Protein sequence
MAITKIILQQMVTMDQNSITASKYPKYTVVLSNSISSITAGELTTAIESSKASAAAAKQSEINAKQSELNAKDSENEAEISAASSQQSATQSASSATASANSAKAAKTSETNAKASETAAKTSETNAKASETAAKTSETNANSSKAAAAASASAAKTSETNAAASASAAKTSETNANSSKTAAANSASAAKTSETNAKTSETNAAASATKAENAASGMRDSIGLGNAPRNCPDISGNPSAYIGFMRIMSTAVGFPSIASGESSLTGFISQVDGSPAYTGVFQGWASRSLYTYRWASNIGPQWTRHARKNEVDKLVQLSSETHLLNPGDNAKIIITSNKLWGAYDIENRTYIPLAVGQGGTGGRSAAEARANLQLNRFQRSSDTRTIVCSTDVQADGCYLQVDASGQWGAFNPTTGKWQPLAIAQGGTGGNNTSDARRNLEVMYRRFSTLTGQNLNDLNGDYAGFYYQSLSANATTARNYPIQEAGNLMVLQNSANGTPGCCQIYITFSSNRIYERSYNPGTSTWSPWGSILNSYDPSYCRQLIELGSQHAPLFAGLALTGYSDSTVAAGGIINSYLRATDGTQRVRMRLYPEKLAEGVAAATLQIMGEDTGPSYKTFQFKNNGQLLVPNELNADTIAVRNLNTTQQNLGIPTTGFMGAYQTINAPAGAVDGKYYPVIFYTGGTNGNGVLPVPISIRTPGRSAGHEMNNNVFSGYVTCGGWSDSPNIAYGVFTAYDPRELGILCIKGSNKDYAQHIAVYVHYKAFPVHVMTDPKVVINVPTEDYVLGTNGVKFKFGVTDAGDGNAEGNVSNLLNFTGGGSGFYSNHPFRSGLSPNFALTNNLSTGDAFSATAPSFTFNGSVVGANSFSARGDAVTKNTYTSQLVNSADAIVGQSEFRATEEAGQIIVRDMSSSASHKFFNFNKDGTFSAPSGILSSTGVDWNTQHNTVNKFYGIAGQVNSPENNVVFGGVHIGFSGNYATQLAGRGNRYYLRSIESGTIGAWNRIITDQYADFKVPIFIYKNGEALTLKSSVNDTSESGYLAGRTANNTRMWLFGKTDSSKSVTLSNEMVGAYFTVSDNIVLRTPSYNGGIFADGSGVSVRRSNGREFKYENNMTAAKNGSILLWGNTTGRPTVVECKLDNGYLWYAQENSDGSRVFNINGNAEARAFNQTSDRDLKKNIQEIPNATQSIRKISGYTYNFKDDDMPYAGVIAQEVMEVLPEAISGFTRYTELAGATVDGEPLMGEERFYSVDYGAVTGLLVQVSKESDSRITALESEVSDLKKQIADLTLVVNSLLANR
Physico-chemical properties
protein length: 1298 AA
molecular weight: 138088.88060 Da
isoelectric point: 6.35553
aromaticity: 0.08783
hydropathy: -0.35285
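
Two of the properties above can be recomputed directly from the protein sequence. The sketch below assumes that aromaticity is the fraction of Phe, Trp and Tyr residues and that hydropathy is the mean Kyte-Doolittle score (GRAVY); the page does not state which definitions it used, so both are assumptions. The function names are hypothetical, and the demo input is only the first 20 residues of the sequence, not the full protein.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aromaticity(seq: str) -> float:
    """Fraction of residues that are Phe (F), Trp (W) or Tyr (Y)."""
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues (assumed definition)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Demo input: the first 20 residues of the protein sequence only.
demo = "MAITKIILQQMVTMDQNSIT"
print(len(demo), round(aromaticity(demo), 5), round(gravy(demo), 5))
```

Passing the full 1298-residue sequence instead of `demo` would allow a comparison with the values listed above, provided the page used the same definitions.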

Domains

Domains [InterPro]
IPR030392 CHP 1171–1221
Architecture
ATT 2-325 | STR 326-612 | STR 838-1298
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
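
The architecture line lists labeled residue ranges, and any residues it does not cover fall under "Unmapped". As a minimal sketch, assuming the ranges are 1-based and inclusive and that the `LABEL start-end | ...` layout is stable, the uncovered stretches can be derived as follows. The function names are hypothetical.

```python
def parse_architecture(text):
    """Parse 'ATT 2-325 | STR 326-612' into (label, start, end) tuples."""
    segments = []
    for part in text.split("|"):
        label, span = part.split()
        start, end = (int(x) for x in span.split("-"))
        segments.append((label, start, end))
    return segments


def unmapped_ranges(segments, protein_length):
    """Return inclusive (start, end) ranges not covered by any segment."""
    gaps = []
    position = 1  # next residue that still needs to be covered
    for _, start, end in sorted(segments, key=lambda s: s[1]):
        if start > position:
            gaps.append((position, start - 1))
        position = max(position, end + 1)
    if position <= protein_length:
        gaps.append((position, protein_length))
    return gaps


architecture = "ATT 2-325 | STR 326-612 | STR 838-1298"
segments = parse_architecture(architecture)
gaps = unmapped_ranges(segments, 1298)
print(gaps)
```

With the ranges given for this protein, the uncovered residues are position 1 and the stretch 613-837.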

Tail Spike Domain Segmentation


This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Layout
Domain Start End Length (AA) Confidence
N-terminal 1 322 322 0.5929
Central domain 323 521 199 0.2993
C-terminal 522 1298 777 0.7060
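
Assuming the Start and End columns are 1-based and inclusive, each domain length is End - Start + 1, and the three domains should tile all 1298 residues without gaps or overlaps. The sketch below checks both points using the coordinates from the table. The helper names are hypothetical.

```python
# (name, start, end) taken from the Start and End columns of the table.
domains = [
    ("N-terminal", 1, 322),
    ("Central domain", 323, 521),
    ("C-terminal", 522, 1298),
]


def inclusive_length(start, end):
    """Number of residues in a 1-based, inclusive range."""
    return end - start + 1


def tiles_protein(domains, protein_length):
    """True if consecutive domains cover 1..protein_length exactly once."""
    expected_start = 1
    for _, start, end in domains:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == protein_length + 1


lengths = {name: inclusive_length(s, e) for name, s, e in domains}
print(lengths, tiles_protein(domains, 1298))
```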
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue, 1-322), Central (green, 323-521), C-terminal (pink, 522-1298).

Taxonomy

Role Name Taxonomy ID Lineage
Phage Salmonella phage vB_Si_CECAV_FGS029 [NCBI] 3237254 Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes
Host Salmonella enterica subsp. enterica serovar Infantis [NCBI] 595 Pseudomonadota > Gammaproteobacteria > Enterobacterales > Enterobacteriaceae > Salmonella > Salmonella enterica

Coding sequence (CDS)

GenBank protein accession
XHH63141.1 [NCBI]
GenBank nucleotide accession
PP429240.1 [NCBI]
CDS location
range 81379 -> 85275
strand -
CDS
ATGGCTATAACTAAAATAATTCTACAGCAAATGGTCACTATGGACCAGAATAGTATAACTGCAAGTAAATATCCTAAGTATACAGTTGTACTTTCTAATTCTATTAGTTCTATCACTGCTGGTGAGCTAACTACTGCAATAGAATCCTCTAAAGCTTCTGCAGCAGCAGCTAAACAATCTGAGATTAATGCTAAACAGTCGGAGTTAAATGCTAAAGATTCTGAAAATGAAGCAGAAATTTCTGCGGCATCTTCTCAGCAGTCTGCAACTCAGTCTGCTTCTTCTGCTACTGCTTCTGCTAATAGTGCTAAAGCTGCAAAAACTTCAGAAACCAATGCAAAAGCTAGTGAGACAGCTGCAAAAACTTCAGAAACCAATGCAAAAGCTAGTGAGACGGCTGCAAAAACTTCTGAGACTAATGCTAATAGTAGTAAAGCTGCTGCGGCTGCATCTGCTTCAGCAGCAAAAACCTCAGAAACTAATGCTGCTGCGTCTGCATCTGCTGCTAAAACTTCAGAAACTAACGCTAATAGTAGTAAAACTGCTGCTGCTAACAGTGCTAGTGCTGCTAAAACCTCAGAAACTAATGCTAAAACCTCAGAAACTAATGCGGCTGCTTCTGCTACTAAAGCTGAAAACGCAGCTTCCGGTATGCGTGATTCTATAGGTTTAGGTAATGCCCCTCGTAATTGTCCTGATATTTCTGGTAATCCTTCTGCGTATATTGGCTTTATGCGTATCATGAGTACTGCGGTAGGCTTTCCATCAATAGCCTCTGGAGAAAGTAGTCTTACAGGATTTATTAGTCAAGTGGATGGTAGTCCAGCGTACACTGGTGTTTTCCAAGGATGGGCGTCCCGCTCATTATATACATATCGTTGGGCAAGCAACATAGGCCCACAGTGGACACGTCATGCTCGTAAGAATGAGGTTGATAAGCTCGTCCAACTAAGTTCTGAGACCCACCTATTAAACCCGGGTGATAATGCTAAAATAATTATTACTTCCAATAAACTCTGGGGAGCTTATGATATAGAGAATAGGACATACATACCTCTTGCAGTAGGACAGGGAGGTACCGGGGGTAGATCAGCTGCTGAAGCAAGAGCTAATCTACAGCTAAACCGCTTCCAACGTTCCAGTGATACAAGAACCATCGTTTGTTCAACAGATGTTCAGGCGGATGGCTGTTATTTACAGGTTGATGCTAGTGGTCAGTGGGGCGCATTTAATCCTACAACTGGTAAATGGCAACCTCTCGCAATAGCTCAGGGTGGTACTGGCGGTAATAATACTTCTGATGCGCGTAGAAATCTAGAAGTAATGTACCGTAGATTTTCTACCTTAACAGGACAGAACTTGAATGACCTTAATGGTGATTACGCAGGTTTCTACTACCAGAGCCTATCAGCTAATGCAACTACAGCCCGTAATTACCCTATTCAAGAAGCTGGAAACTTAATGGTACTACAAAATAGTGCTAATGGAACCCCAGGATGTTGTCAGATATACATTACCTTTAGTTCTAATAGAATATATGAACGTAGCTATAACCCAGGTACTTCAACGTGGTCTCCTTGGGGGTCTATTCTTAATAGCTACGATCCTAGTTATTGTAGACAGCTTATAGAACTAGGTTCTCAACATGCTCCTTTATTTGCTGGTTTGGCTTTAACTGGATATAGTGATAGTACTGTAGCTGCCGGCGGTATTATTAATAGCTATCTAAGAGCTACAGATGGTACTCAAAGGGTACGTATGCGCTTATACCCAGAGAAACTTGCTGAGGGGGTTGCGGCTGCAACTCTACAGATTATGGGGGAGGATACTGGCCCCTCGTATAAGACCTTCCAGTTTAAGAATAATGGTCAACTATTAGTACCTAATGAACTGAATGCAGATACTATAGCTGTTAGAAATCTTAACACAACTCAACAAAATCTAGGAATACCTACTACAGGATTTATGGGTGCTTACCAGACTATCAATGCACCTGCTGG
TGCTGTAGATGGAAAATATTATCCAGTTATATTTTACACCGGAGGTACTAATGGTAATGGTGTTCTGCCAGTACCTATATCTATTCGTACTCCTGGTAGATCGGCTGGCCATGAGATGAATAATAACGTTTTTTCTGGCTACGTAACTTGTGGTGGCTGGAGCGATAGCCCTAATATAGCATATGGGGTATTTACTGCTTATGATCCTAGAGAATTAGGTATTCTATGTATAAAAGGTAGTAATAAAGACTATGCTCAGCATATAGCAGTTTATGTACACTACAAAGCATTCCCTGTACATGTTATGACAGATCCTAAGGTTGTTATAAATGTTCCTACTGAAGACTATGTATTAGGGACTAACGGGGTTAAATTTAAATTTGGAGTAACAGATGCGGGTGATGGAAATGCTGAGGGTAATGTTAGCAACCTCTTGAACTTTACTGGTGGCGGTTCTGGCTTTTACTCTAATCATCCATTCCGTTCTGGATTATCTCCTAATTTTGCTCTAACTAATAACCTTAGTACTGGAGATGCTTTTTCTGCTACTGCACCTTCTTTTACTTTTAATGGTAGTGTTGTTGGTGCTAATAGTTTTTCTGCTAGAGGTGATGCTGTAACAAAAAATACGTACACATCTCAACTGGTAAATAGTGCCGATGCCATAGTAGGACAAAGTGAGTTTAGGGCAACAGAGGAAGCAGGACAAATTATTGTTAGAGATATGAGTAGTTCTGCTAGTCATAAATTCTTTAACTTCAATAAGGATGGAACCTTTTCAGCTCCTTCTGGTATTTTATCTTCTACTGGTGTAGACTGGAATACACAACATAACACTGTCAATAAGTTTTATGGTATTGCGGGTCAAGTTAATAGTCCTGAAAACAATGTTGTTTTTGGTGGTGTACATATAGGTTTTAGTGGTAACTATGCTACTCAACTAGCCGGTAGGGGTAATAGATATTATCTGAGAAGTATTGAATCTGGCACTATAGGTGCATGGAATCGTATAATTACAGATCAATATGCTGATTTTAAAGTCCCTATTTTTATATATAAAAATGGAGAGGCATTAACCCTAAAATCTAGTGTTAATGATACTTCTGAAAGTGGTTATTTAGCAGGTAGAACTGCTAATAATACTAGAATGTGGCTTTTTGGAAAGACTGATTCTTCTAAGAGTGTTACTCTTAGCAACGAAATGGTCGGAGCATACTTTACTGTTTCTGATAATATTGTACTAAGAACACCCAGCTATAACGGTGGAATATTTGCTGATGGTAGTGGTGTTTCTGTTAGAAGAAGTAACGGTCGTGAATTCAAATATGAGAATAACATGACTGCTGCTAAAAATGGATCTATTCTTCTTTGGGGTAACACTACCGGAAGACCAACTGTTGTTGAGTGTAAGCTTGACAATGGATATTTATGGTATGCTCAAGAAAACTCTGACGGAAGTAGAGTATTCAATATAAACGGAAATGCTGAGGCAAGGGCGTTTAATCAAACTTCTGATAGGGATCTAAAGAAAAATATACAGGAGATACCAAACGCAACCCAATCTATCCGAAAAATTAGCGGATATACCTATAATTTTAAGGATGATGATATGCCATACGCCGGGGTGATTGCCCAAGAAGTAATGGAAGTTCTTCCAGAAGCAATAAGTGGCTTCACTAGATATACAGAATTGGCAGGCGCTACTGTAGATGGGGAACCACTTATGGGTGAAGAGCGTTTCTATTCTGTAGATTATGGAGCTGTTACAGGGTTGCTAGTTCAGGTAAGCAAGGAATCTGATAGCAGGATCACAGCTCTAGAATCAGAAGTATCTGATCTTAAAAAGCAAATTGCAGATCTAACGTTAGTAGTTAACTCTCTACTAGCAAATAGATAA
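
The CDS coordinates can be checked against the protein length. The sketch below assumes that the range 81379 -> 85275 is 1-based and inclusive, and that the CDS shown above is already in coding orientation (it starts with ATG and ends with the stop codon TAA), so on the "-" strand the genomic forward sequence would be its reverse complement. The function names are hypothetical.

```python
def cds_length(start, end):
    """Length in nucleotides of a 1-based, inclusive genomic range."""
    return end - start + 1


def expected_cds_length(protein_length, include_stop=True):
    """Nucleotides needed to encode the protein, optionally plus a stop codon."""
    return 3 * (protein_length + (1 if include_stop else 0))


COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


observed = cds_length(81379, 85275)
expected = expected_cds_length(1298)
print(observed, expected, observed == expected)
```

Under these assumptions the range spans 3897 nucleotides, which matches 1298 codons plus one stop codon.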


Tertiary structure

PDB ID 24538a0648f858d578978c7fe01ff0b275e95557f5f9b19981b62e16042f27eb
Source ESMFold
Method ESMFold
Resolution 0.5548
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
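
The confidence bands above can be expressed as a small lookup. This is a minimal sketch with a hypothetical function name. The legend uses strict inequalities, so the handling of values exactly equal to 90, 70 or 50 is an assumption here: they are assigned to the lower band.

```python
def plddt_band(plddt):
    """Map a per-residue pLDDT score (0-100) to the legend's confidence band.

    Assumption: boundary values (exactly 90, 70, 50) fall into the lower band,
    because the legend does not say where they belong.
    """
    if plddt > 90:
        return "Very high"
    if plddt > 70:
        return "High"
    if plddt > 50:
        return "Low"
    return "Very low"


for score in (95.0, 80.0, 60.0, 40.0):
    print(score, plddt_band(score))
```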