GenBank accession
XYH44901.1 [GenBank]
Protein name
tail fiber protein
RBP type Evidence Probability
TF GenBank 1.00
TSP RBPdetect2 0.90
Protein sequence
MAKAFYEIWKCDSTYTQELQEIFFKEVNIKKEMMKVPTFYFNIDRNAETSLSAPSIGNHIKIIRNNEEVLRGKIQDMEVRDKSIHFEGFSVAAEWKDVTATQNIWRNTTSKTAIETYAGTIGWTGGVIESNTLKKHDYTYVPLIEELELFASLFGKEFGFDEENRKINFKTQIGNDLSTVVRLIRGVNMESFGIRRQQADTYTRVIALGAGEGSKQLKKVVGTYTGIGSTRVFTDKQIKDMDDLTEYANARLAEGQGQDVVTYESRMLSPIFGVEIGDKIWVEDSLNRIDEAMRVMGIEMTFNETEQFDLVLANRNKTLIDLFQRMEKGQRTLANVHHSSAVQPNSMIVVDEDTKNPILMQVENGNFVNITTGNDLGTIENNVSTAQDTASSAQGDASNALITAGSANSKADQALTDASNAQDAASNALSQAGQAQEDASNALTEAEKRVLIETYNTERDQLASDIADKAGLTYVDGQLASKAEKSNTYTKTEVDSAVNSKVSKTTYDVDKNGIVTRLDSAESRITQTESEISTKVSSTTYETDREVSGQLGDANESASIASTNNGKWFRIAKNSGNRAWARFIVRDTTSGQHGTVEFIAGANYNNRSGLEFSLNSFSRYTLFPFSKARILTKGTYDEQYLELLFSASRDSQVSFWIKDNIQSSGWDAVPFVESTTMPEGYVADEFEISKDKATSSRLTSAESSITQHATEIASKVSNTTYQTDKDGITSRLDSAESSITQQAGQIASKVEQTTFNALESKVNTQGTTLTQKADIASLNAFKGEVASWERAYSVKNGYGHYIEDLNGNKLDFDFTYEVQARTVGTGTNTLAISLFKGDGAAFTIEKIYERGTGSNHPVFYLSGGYPAVTTAHSTYYTVELTINKYKGDRTSFNVLSDNINTRVEKNGVVSAINQSAEQITIDASKVNLNGYVTFSSQSKTSPNLIKDYDSFEGFPDGYNPPKKVYSNLTVYNVSSEFAFHGSKSLKTVNNNANSYIYPNGSAGFIPVKQGQKYVASAYAYTTSADPVQVTFAPIFRDDSGNHVSTGLSYPTTVITKVDGWVRISQSFYPPEGATSWTYYLRTVTAGTYTVYWDAISFEEVDASVDYPAPFRSGGFTTIDGDLIETGTLKWDKGFGGSINLGGADNQNGVLNILDDEGGIIASLGESNAGFGNLSADVISDVRNIVYKTSPSHPNYSDGYLNIFVDGLRGSDATGTGHQDNPYATIQYAIDNIPRYLDHSVTVFIYPMKYDENITVSGFKGEGTLTLKTYAWGVQRIRDWSSGSTANTGNHWVEIEALDNGWYSVARGKTVTTNGGNNATYPLTRITDGNKDTGQYADAGSGTGRYVEVDLAGTHDLRNINVWKYYSDGRSYNGIRTEVYQEGYGWREIWDNDGFAGAFRENSAGHRRVAYINGSIVFQSCDKVSLDSLCFDARSTKGIPVYAYNTQYADWRRLYAFADSSASYCYYCYASYVRIHDSEGNGSGTAVICGAYGARVDLFDGVTGGDSIRGLFCYSSATIAGSGDIPYGNSTATLTGTGGTITVSAWTASGYRGKKGIYTKYEDPTPPPPPPPKVVTKTWTSSSAKSWRPNFSGQWYESSVVQGIWSGYGLYRGYWFFGDSIRAAVAGKTITKVRIYLTRNNSGGYSSAQTCYIRGHNYTTQPSSTSTPSYDSSSPATASFAWGEGKWVDITSQWKADLQAGSIRGFMLYTTSTSATQYMKFSPTAKVEVTYYE
Physico-chemical properties
protein length: 1732 AA
molecular weight: 190211.28530 Da
isoelectric point: 5.20608
aromaticity: 0.11201
hydropathy: -0.46299
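The page does not state how the aromaticity and hydropathy values were derived. The sketch below assumes the common definitions: aromaticity as the fraction of F, W and Y residues, and hydropathy as the mean Kyte-Doolittle value (GRAVY). The function names are illustrative and not part of the database.

```python
# Kyte-Doolittle hydropathy scale (assumed to be the scale behind "hydropathy").
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aromaticity(seq: str) -> float:
    """Fraction of aromatic residues (F, W, Y) in the sequence."""
    seq = seq.upper()
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def summarize(seq: str) -> dict:
    """Collect the simple per-sequence properties in one dictionary."""
    return {
        "length": len(seq),
        "aromaticity": aromaticity(seq),
        "hydropathy": gravy(seq),
    }


if __name__ == "__main__":
    # First ten residues of the protein sequence listed above.
    print(summarize("MAKAFYEIWK"))
```

Passing the full protein sequence to `summarize` gives values that can be compared with the listed ones, if these assumed definitions match the ones the database uses.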

Domains

Domains [InterPro]
DC_1919 ATT 296–609 (XYH44901.1, residues 1–1732)
Architecture
ENZ 100-295 | ATT 296-609 | STR 687-768 | STR 942-1100 | RBD 1199-1256 | STR 1299-1419 | RBD 1551-1732
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
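To work with the architecture string programmatically, it can be split into labelled segments and the unannotated stretches listed. This is a minimal sketch that assumes 1-based inclusive coordinates and the `LABEL start-end | ...` layout shown above. The helper names are hypothetical.

```python
ARCHITECTURE = (
    "ENZ 100-295 | ATT 296-609 | STR 687-768 | STR 942-1100 | "
    "RBD 1199-1256 | STR 1299-1419 | RBD 1551-1732"
)
PROTEIN_LENGTH = 1732


def parse_architecture(text: str) -> list[tuple[str, int, int]]:
    """Split 'LABEL start-end | ...' into (label, start, end) tuples."""
    segments = []
    for part in text.split("|"):
        label, span = part.split()
        start, end = (int(x) for x in span.split("-"))
        segments.append((label, start, end))
    return segments


def unannotated_regions(segments, protein_length: int) -> list[tuple[int, int]]:
    """Return the inclusive residue ranges not covered by any segment."""
    gaps = []
    position = 1  # next residue that still needs coverage
    for _, start, end in sorted(segments, key=lambda s: s[1]):
        if start > position:
            gaps.append((position, start - 1))
        position = max(position, end + 1)
    if position <= protein_length:
        gaps.append((position, protein_length))
    return gaps


if __name__ == "__main__":
    segs = parse_architecture(ARCHITECTURE)
    for label, start, end in segs:
        print(label, start, end, end - start + 1)
    print(unannotated_regions(segs, PROTEIN_LENGTH))
```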

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Layout
Domain Start End Length (AA) Confidence
N-terminal 1 557 557 0.9714
Central domain 558 972 416 0.5716
C-terminal 973 1732 759 0.1798
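The segmentation table can be sanity-checked by recomputing each length from its boundaries. The sketch below assumes 1-based inclusive coordinates, which is the convention the N-terminal row follows (1 to 557 gives 557). Under that assumption the listed lengths of the central and C-terminal rows each differ by one from the recomputed values. The page gives no way to tell whether the boundaries or the lengths are the ones that are off. The names `Segment`, `length_mismatches` and `tiles_protein` are illustrative.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    name: str
    start: int
    end: int
    listed_length: int


SEGMENTS = [
    Segment("N-terminal", 1, 557, 557),
    Segment("Central domain", 558, 972, 416),
    Segment("C-terminal", 973, 1732, 759),
]


def inclusive_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def length_mismatches(segments):
    """Return (name, listed, computed) for rows whose lengths disagree."""
    return [
        (s.name, s.listed_length, inclusive_length(s.start, s.end))
        for s in segments
        if s.listed_length != inclusive_length(s.start, s.end)
    ]


def tiles_protein(segments, protein_length: int) -> bool:
    """True if the segments cover 1..protein_length with no gap or overlap."""
    expected_start = 1
    for s in segments:
        if s.start != expected_start:
            return False
        expected_start = s.end + 1
    return expected_start == protein_length + 1


if __name__ == "__main__":
    print(length_mismatches(SEGMENTS))
    print(tiles_protein(SEGMENTS, 1732))
```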
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Domain Coloring
N-terminal 1-557
Central 558-972
C-terminal 973-1732

Taxonomy

  Name Taxonomy ID Lineage
Phage Rossellomorea phage phiT1 [NCBI] 3440538 Viruses >
Host Rossellomorea aquimaris [NCBI] 189382 cellular organisms > Bacteria > Bacillati > Bacillota > Bacilli > Bacillales

Coding sequence (CDS)

GenBank protein accession
XYH44901.1 [NCBI]
GenBank nucleotide accession
PV767480.1 [NCBI]
CDS location
range 30150 -> 35348
strand -
CDS
ATGGCAAAGGCTTTTTATGAAATTTGGAAATGTGATTCTACTTATACACAAGAACTTCAAGAAATTTTCTTTAAAGAAGTCAATATAAAGAAAGAAATGATGAAAGTACCTACCTTCTATTTCAATATAGATAGGAATGCTGAAACAAGCTTATCTGCCCCTTCTATTGGAAATCATATCAAGATTATCAGAAACAATGAAGAGGTCTTGAGAGGTAAGATTCAGGATATGGAAGTAAGGGATAAGTCCATTCATTTTGAAGGATTTTCGGTAGCTGCAGAATGGAAAGATGTAACTGCCACTCAAAATATTTGGAGAAATACTACTTCCAAAACTGCCATTGAAACTTATGCAGGAACAATAGGGTGGACAGGTGGAGTTATTGAATCAAATACCCTAAAGAAGCATGATTATACCTATGTACCTCTAATTGAAGAGCTTGAGCTTTTTGCTAGTCTGTTTGGTAAAGAATTTGGGTTTGATGAAGAGAACAGAAAGATAAACTTCAAGACTCAAATTGGAAATGATTTATCTACAGTGGTAAGGCTTATCAGAGGGGTAAACATGGAGTCTTTTGGAATCAGAAGGCAGCAAGCTGATACTTACACTAGAGTAATTGCACTAGGGGCAGGTGAAGGTAGCAAGCAGCTAAAGAAGGTTGTGGGAACATATACCGGGATCGGTTCTACAAGGGTATTTACTGACAAGCAAATTAAAGATATGGATGATTTGACTGAATATGCAAATGCAAGGTTAGCTGAAGGTCAAGGTCAAGATGTTGTTACTTATGAGTCAAGAATGCTCTCCCCTATTTTTGGAGTGGAAATTGGAGATAAGATTTGGGTAGAAGATTCCCTAAACAGAATTGATGAAGCAATGAGGGTTATGGGAATTGAGATGACCTTCAATGAGACAGAACAATTTGATTTAGTCCTGGCTAACAGGAACAAAACACTTATTGACCTGTTTCAAAGAATGGAGAAAGGGCAAAGAACTTTAGCTAATGTTCATCACTCTTCTGCAGTACAACCTAACTCTATGATTGTGGTAGATGAAGATACAAAGAATCCGATTCTTATGCAAGTTGAGAATGGTAACTTTGTGAATATCACAACTGGAAATGATTTAGGGACAATTGAGAATAATGTTTCAACTGCTCAAGATACTGCTTCTTCTGCTCAAGGTGATGCTTCAAATGCTTTAATAACTGCAGGCTCTGCAAATTCTAAGGCAGATCAAGCACTAACAGATGCTTCAAATGCTCAAGATGCTGCAAGCAATGCTCTATCTCAAGCAGGTCAAGCTCAAGAAGATGCAAGCAATGCCCTCACTGAAGCAGAAAAGAGGGTACTTATTGAAACTTACAATACTGAAAGAGACCAACTAGCTTCTGACATTGCTGATAAGGCAGGCCTAACTTATGTAGATGGTCAATTGGCTTCAAAAGCTGAAAAGAGCAATACCTATACAAAGACAGAAGTTGACAGTGCAGTAAATTCAAAAGTCTCTAAAACAACCTATGATGTAGATAAGAATGGGATTGTGACAAGGCTAGACAGTGCTGAAAGCAGAATAACACAGACTGAAAGTGAGATTTCCACTAAGGTTTCTAGTACAACTTATGAAACTGACAGGGAAGTATCCGGTCAATTAGGTGATGCAAATGAGTCTGCAAGCATTGCTTCTACAAATAATGGCAAGTGGTTCAGGATTGCTAAAAATAGTGGAAACAGAGCTTGGGCTAGATTTATTGTAAGAGATACTACTTCCGGGCAGCATGGAACAGTGGAGTTTATTGCAGGGGCAAATTATAACAATAGAAGTGGGCTTGAGTTTTCATTAAACTCATTTTCAAGATATACCCTCTTCCCTTTCTCAAAAGCAAGGATACTAACAAAAGGAACTTATGATGAACAGTACCTTGAACTACTCTTCAGTGCAAGTAGGGATTCTCAAGTATCTTTTTGGATCAAGGACAACATTCAGTCAAGTGGATGGGA
TGCAGTACCATTTGTAGAATCTACCACAATGCCTGAAGGATATGTTGCTGATGAGTTTGAGATTTCAAAAGACAAGGCAACTAGCTCAAGGCTTACAAGTGCTGAAAGTAGTATTACCCAACATGCAACAGAGATTGCTTCTAAGGTGTCAAATACCACTTACCAAACTGATAAGGATGGAATCACTAGCAGGCTAGATAGTGCTGAATCAAGCATTACCCAACAAGCAGGGCAAATTGCTTCTAAGGTTGAGCAGACAACCTTCAATGCCCTTGAGAGCAAAGTAAACACTCAAGGAACTACTCTGACCCAAAAGGCTGACATTGCTTCACTAAATGCCTTTAAGGGAGAAGTGGCAAGTTGGGAAAGAGCTTACAGTGTCAAAAATGGTTATGGGCATTACATTGAGGACTTGAATGGAAACAAGCTAGACTTTGACTTCACTTATGAGGTTCAAGCAAGAACTGTTGGAACTGGAACAAATACCCTTGCAATATCTTTATTTAAAGGGGATGGAGCTGCCTTTACAATTGAGAAGATTTATGAGAGGGGAACAGGCTCTAATCACCCGGTATTTTACCTATCCGGTGGCTACCCTGCAGTCACTACTGCTCATAGCACCTACTACACCGTTGAGCTTACTATCAACAAGTACAAAGGTGACAGAACATCTTTCAATGTGCTTTCTGACAACATAAATACAAGAGTTGAGAAAAATGGAGTAGTGTCTGCAATCAATCAGTCTGCTGAACAAATAACCATTGATGCAAGCAAAGTAAACTTGAATGGGTATGTAACTTTCTCAAGTCAGAGCAAAACTTCTCCTAACTTGATTAAGGACTATGACTCTTTTGAGGGCTTCCCTGATGGGTATAACCCTCCTAAAAAGGTTTATTCAAATCTGACAGTCTACAATGTTTCAAGTGAATTTGCCTTTCATGGCTCTAAGTCATTAAAGACAGTGAATAATAATGCTAATTCATATATTTACCCTAATGGCTCTGCAGGATTTATCCCTGTTAAGCAAGGACAGAAGTATGTGGCTTCTGCATATGCTTACACTACTTCTGCTGATCCAGTGCAAGTAACATTTGCCCCTATCTTTAGGGATGATTCAGGAAATCATGTCTCAACAGGTTTAAGCTATCCTACAACTGTTATTACAAAGGTTGATGGATGGGTAAGGATTTCTCAATCATTTTATCCTCCTGAAGGAGCAACTTCTTGGACTTACTATTTAAGAACTGTTACTGCAGGCACTTATACAGTTTACTGGGATGCTATTTCTTTTGAAGAAGTGGATGCTAGTGTTGACTACCCTGCCCCTTTCAGAAGTGGAGGTTTTACAACCATTGATGGGGATTTAATTGAGACAGGTACTTTAAAGTGGGATAAAGGATTTGGTGGCTCTATTAATCTAGGAGGGGCAGACAATCAGAATGGTGTCCTAAATATACTAGATGATGAGGGTGGAATCATTGCTTCACTTGGGGAGAGCAATGCAGGTTTTGGTAACTTATCTGCAGATGTAATTTCTGATGTAAGGAACATTGTTTATAAAACTTCTCCTTCTCATCCTAACTATAGTGATGGATACCTTAACATCTTTGTTGATGGATTGAGAGGTTCAGATGCAACAGGAACAGGACATCAAGATAATCCATATGCAACTATCCAGTATGCAATTGATAATATCCCTAGATATTTGGATCATTCAGTAACAGTCTTTATTTATCCTATGAAATATGATGAGAACATTACAGTGAGTGGGTTTAAGGGTGAAGGAACATTGACCCTAAAAACATATGCCTGGGGTGTTCAAAGGATTAGGGATTGGTCAAGTGGCTCTACTGCAAATACCGGGAATCACTGGGTAGAGATAGAAGCCCTTGATAATGGTTGGTACAGTGTTGCTAGAGGGAAAACAGTAACTACAAATGGTGGAAACAATGCAACCTACCCTTTAACAAGGATAACTGATGGAAACAAAGACACCGGGC
AATATGCTGATGCAGGTTCTGGAACTGGCAGATATGTTGAAGTTGACTTAGCCGGAACTCATGACCTAAGAAACATTAATGTTTGGAAGTATTATTCTGATGGCAGAAGCTACAATGGGATCAGGACAGAAGTTTATCAGGAAGGATATGGATGGAGAGAAATTTGGGATAATGATGGATTTGCCGGGGCATTCAGGGAGAACTCTGCAGGACATAGAAGGGTTGCTTATATCAATGGAAGCATTGTCTTTCAATCATGTGATAAGGTATCACTTGATTCTCTATGTTTTGATGCAAGGTCTACTAAAGGTATTCCGGTTTATGCCTACAATACTCAATATGCTGATTGGAGAAGGCTCTATGCCTTTGCTGATAGCTCTGCTTCATATTGTTATTACTGCTATGCTTCTTATGTAAGAATTCATGACTCTGAAGGGAATGGTTCAGGAACTGCAGTTATTTGTGGAGCTTATGGTGCAAGGGTTGACCTTTTTGATGGAGTGACAGGGGGAGATTCTATTAGAGGTCTTTTCTGCTACTCTTCTGCAACAATAGCAGGGAGTGGAGATATACCTTATGGGAACTCAACTGCAACCTTAACAGGAACAGGTGGAACTATTACAGTATCTGCTTGGACTGCTTCAGGGTACAGGGGAAAGAAAGGAATCTATACTAAGTATGAAGATCCTACTCCACCACCTCCACCACCACCAAAAGTTGTAACTAAAACTTGGACTTCTTCCAGTGCTAAGTCTTGGAGACCTAACTTCTCCGGGCAATGGTATGAATCTTCAGTAGTACAAGGGATTTGGTCAGGATATGGTCTGTATAGGGGTTATTGGTTCTTTGGTGATTCTATTAGGGCTGCAGTAGCAGGCAAGACAATCACTAAGGTAAGAATCTATTTAACAAGGAATAACAGTGGAGGATACTCTTCTGCACAAACTTGTTATATCAGAGGACATAACTACACTACACAACCTTCCAGTACCTCTACTCCTTCTTATGACTCTTCTTCTCCGGCAACTGCTTCATTTGCTTGGGGTGAAGGAAAATGGGTAGACATTACTTCCCAATGGAAGGCAGACCTGCAGGCAGGAAGTATCAGAGGGTTTATGCTTTATACTACTTCTACTTCTGCTACACAATATATGAAGTTTTCCCCTACTGCAAAAGTAGAGGTTACTTATTATGAATAA
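The CDS coordinates and the protein length can be cross-checked with a few lines of code. The sketch below assumes the range is 1-based and inclusive and uses the standard codon assignments (shared by NCBI translation tables 1 and 11). The listed CDS starts with ATG and ends with TAA, so it appears to be given in coding orientation already. `reverse_complement` would only be needed when cutting the minus-strand region out of the genome record. All function names are illustrative.

```python
_BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def cds_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive range."""
    return end - start + 1


def reverse_complement(dna: str) -> str:
    """Reverse complement, for features annotated on the minus strand."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(cds: str) -> str:
    """Translate a coding-orientation sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


if __name__ == "__main__":
    n = cds_length(30150, 35348)
    # Codons excluding the terminal stop should equal the protein length.
    print(n, n // 3 - 1)
    # First six codons of the listed CDS.
    print(translate("ATGGCAAAGGCTTTTTAT"))
```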

Genome Context

Tertiary structure

PDB ID 5b514caa829a388dbf7086886f952900fb626567a184b395194f82f40dd42106
Source ColabFold
Method ColabFold
Resolution 0.6540
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
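The confidence bands can be applied to per-residue pLDDT values as in the sketch below. The legend leaves the exact boundary values (90, 70, 50) unassigned. Here they are assumed to fall into the lower band, which is a choice made for this example and not something the page specifies. Function names are hypothetical.

```python
from collections import Counter


def plddt_band(plddt: float) -> str:
    """Map a pLDDT score (0-100) to the confidence band from the legend.

    Assumption: a score exactly on a threshold goes to the lower band.
    """
    if not 0 <= plddt <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if plddt > 90:
        return "Very high"
    if plddt > 70:
        return "High"
    if plddt > 50:
        return "Low"
    return "Very low"


def band_fractions(scores) -> dict:
    """Fraction of residues falling into each confidence band."""
    scores = list(scores)
    counts = Counter(plddt_band(s) for s in scores)
    return {band: counts[band] / len(scores) for band in counts}


if __name__ == "__main__":
    # Made-up scores for illustration only, not values from this model.
    example_scores = [95.2, 88.0, 71.5, 64.3, 42.0]
    print(band_fractions(example_scores))
```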