GenBank accession
AZU99740.1 [GenBank]
Protein name
tail fiber protein
RBP type
Type Evidence Probability
TF GenBank 1,00
TF Phold 1,00
TSP DepoScope 1,00
TSP RBPdetect 0,91
TSP RBPdetect2 0,95
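
The per-tool predictions above can be grouped by predicted type to see how the evidence splits between TF and TSP. This is a minimal sketch and not the database's own scoring method: the `PREDICTIONS` list is transcribed from the rows above, and `parse_probability` and `summarise` are hypothetical helper names. Probabilities are accepted with either a decimal comma or a decimal point.

```python
from collections import defaultdict

# (type, evidence source, probability) transcribed from the record above.
PREDICTIONS = [
    ("TF", "GenBank", "1,00"),
    ("TF", "Phold", "1,00"),
    ("TSP", "DepoScope", "1,00"),
    ("TSP", "RBPdetect", "0,91"),
    ("TSP", "RBPdetect2", "0,95"),
]


def parse_probability(text):
    """Parse a probability written with a decimal comma or a decimal point."""
    return float(text.replace(",", "."))


def summarise(predictions):
    """Group predictions by type: (number of sources, mean probability)."""
    grouped = defaultdict(list)
    for rbp_type, _source, probability in predictions:
        grouped[rbp_type].append(parse_probability(probability))
    return {t: (len(p), sum(p) / len(p)) for t, p in grouped.items()}


summary = summarise(PREDICTIONS)
for rbp_type, (count, mean_probability) in summary.items():
    print(f"{rbp_type}: {count} source(s), mean probability {mean_probability:.2f}")
```

A plain mean per type is only one way to summarise the rows. It does not weigh annotation-based evidence against predictor-based evidence.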
Protein sequence
MLKKVGMLPTDPYRRYLPSAFDESMNIYEQLITCIEYVNNLGISFNELVDWLDKVVLQQNEKLKEQDQKIDMLRDEWHIFEDYIVNILLEKKVVEILKKWLEDGTLADIINKDVFDMKADTEWVKSEFTKRGVHYKDFGAKLDGVTDDSDAIIAAHNYANEHNYPVIVKNEKFVLNKNVTVKTSTDLTGSILTTTYVDPEPIEYNRTFNLFNIEGAELIDLSTRAINPEFIKGATRIPSLKNQPSGALIIKTEQTDIIRDNGGVQSNIMKAECNIMMKNQYGDLAYPLTKNYVDALKFQVFLRPFEHQLELKFPKVEVKGRIYGIAKVHRNNTSFSGLLMEEINPSPTISSIYTLFEYEDCADMEATNISCPIIGRETKTGENGLGYFLLMTRAAKFRGSNLQQISGWSGINGNWMRDISVVDSNMLVVGGHANVYDLTVDRSVLQKNIIAHGGGVIQLLNSQVIGVANPPNNLSGTGAVQTRWDYDGEFEGEIIVENVVLHNATYVVEYSPSTYNCGRTIVLPKTTIRNVHMRNLLKKKGAGVWFRGYRGEYAGNYPQVTIDSLSWDFVGTYTTRFVEFESDISNSLATNKDFKFYFRNIHPPRLSYGDVFNPITAFIHVPKVTNNDTVVYYDIQNCTVNMGLASTSNLDVTIDNSDFYAVNLLAPTSVTTNGQPAFINVKNSTVHRGVTNFNTGTDTYNRVRLTIASSIFKRLRKTDGSYDPQIGFPIEDFVSYTADNIADARAEIRGDNSARLFGYIDETIWKIKKDQLRIFV
Physico-chemical properties
protein length: 776 AA
molecular weight: 87840,49940 Da
isoelectric point: 5,56274
aromaticity: 0,10438
hydropathy: -0,28660
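
Two of the values above can be recomputed from the sequence with the standard library alone. The record does not say how they were derived, so this sketch assumes that "aromaticity" is the fraction of F, W and Y residues (0,10438 × 776 ≈ 81 residues, which fits a residue fraction) and that "hydropathy" is the mean Kyte-Doolittle score (GRAVY). The function names are illustrative, and the demo uses only the first 20 residues of the protein sequence.

```python
# Kyte-Doolittle hydropathy scale (published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = "FWY"


def aromaticity(sequence):
    """Fraction of residues that are Phe, Trp or Tyr (assumed definition)."""
    return sum(sequence.count(aa) for aa in AROMATIC) / len(sequence)


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy over all residues (assumed definition)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# First 20 residues of AZU99740.1, used here only as a short demo input.
fragment = "MLKKVGMLPTDPYRRYLPSA"
print(f"length: {len(fragment)} AA")
print(f"aromaticity: {aromaticity(fragment):.5f}")
print(f"hydropathy: {gravy(fragment):.5f}")
```

Running the same functions on the full 776-residue sequence would show whether these assumed definitions reproduce the listed values.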

Domains [InterPro]
Signature Class Range
DC_0396 ATT 1–225
Coil Unmapped 56–76
IPR011050 STR 133–215
IPR011050 STR 136–547
Sequence AZU99740.1, residues 1–776
Architecture
ATT 1-225 | STR 226-547 | RBD 550-776
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
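
The architecture string above can be parsed into labelled segments and checked for residues that no segment covers. This is a sketch under stated assumptions: coordinates are taken as 1-based and inclusive, the protein length of 776 comes from this record, and `parse_architecture` and `find_gaps` are hypothetical helper names.

```python
def parse_architecture(text):
    """Split 'ATT 1-225 | STR 226-547' into (label, start, end) tuples."""
    segments = []
    for part in text.split("|"):
        label, span = part.split()
        start, end = (int(value) for value in span.split("-"))
        segments.append((label, start, end))
    return segments


def find_gaps(segments, length):
    """Return inclusive (start, end) ranges not covered by any segment."""
    gaps = []
    position = 1  # next residue that still needs to be covered
    for _label, start, end in sorted(segments, key=lambda s: s[1]):
        if start > position:
            gaps.append((position, start - 1))
        position = max(position, end + 1)
    if position <= length:
        gaps.append((position, length))
    return gaps


ARCHITECTURE = "ATT 1-225 | STR 226-547 | RBD 550-776"
segments = parse_architecture(ARCHITECTURE)
gaps = find_gaps(segments, 776)
print(segments)
print(gaps)  # [(548, 549)]
```

For this record the only uncovered stretch is residues 548 to 549, between the STR and RBD segments.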

Tail Spike Domain Segmentation

This protein is segmented into three structural domains: an N-terminal domain, a central domain, and a C-terminal domain.

Domain Layout (AZU99740.1, residues 1–776): N-terminal | Central | C-terminal
Domain Start End Length (AA) Confidence
N-terminal 1 148 148 0,9927
Central domain 149 765 618 0,9855
C-terminal 766 776 10 0,5603
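
The Length (AA) column can be compared with the Start and End columns. The sketch below assumes 1-based inclusive coordinates, so length = end - start + 1. The record does not state its convention. Under this assumption the central and C-terminal rows come out as 617 and 11 rather than the listed 618 and 10, although both sets of lengths sum to 776. The code only surfaces that difference and does not decide which value is intended.

```python
# (domain, start, end, listed length) transcribed from the table above.
DOMAINS = [
    ("N-terminal", 1, 148, 148),
    ("Central domain", 149, 765, 618),
    ("C-terminal", 766, 776, 10),
]


def inclusive_length(start, end):
    """Length of a 1-based inclusive range (assumed coordinate convention)."""
    return end - start + 1


computed = [inclusive_length(start, end) for _name, start, end, _listed in DOMAINS]
listed = [row[3] for row in DOMAINS]

for (name, start, end, _), calc, given in zip(DOMAINS, computed, listed):
    note = "" if calc == given else "  <- differs from listed length"
    print(f"{name}: {start}-{end} computed {calc}, listed {given}{note}")

print("computed total:", sum(computed), "listed total:", sum(listed))
```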
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Domain Coloring: N-terminal 1-148 | Central 149-765 | C-terminal 766-776

Taxonomy

Role | Name | Taxonomy ID | Lineage
Phage | Bacillus phage DK1 | 2500808 | Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes
Host | Bacillus cereus | 1396 | cellular organisms > Bacteria > Bacillati > Bacillota > Bacilli > Bacillales

Coding sequence (CDS)

GenBank protein accession
AZU99740.1 [NCBI]
GenBank nucleotide accession
MK284526.1 [NCBI]
CDS location
range 19048 -> 21378
strand +
CDS
ATGTTAAAAAAAGTTGGTATGTTGCCAACAGACCCATATCGTAGATATTTACCAAGTGCATTCGATGAATCAATGAATATTTACGAACAGCTTATTACATGTATTGAATATGTAAACAATCTAGGTATTTCTTTCAATGAACTTGTAGACTGGCTAGATAAAGTTGTTTTACAACAAAATGAAAAATTAAAAGAACAAGATCAAAAGATTGACATGTTACGTGATGAGTGGCATATCTTTGAAGATTACATTGTGAATATTCTTTTAGAGAAAAAAGTTGTTGAAATTTTAAAGAAATGGTTGGAAGATGGTACGCTAGCCGACATTATCAATAAAGATGTTTTTGATATGAAAGCCGATACAGAATGGGTAAAATCTGAATTCACTAAACGTGGTGTTCACTATAAAGATTTTGGTGCTAAATTGGATGGTGTTACAGATGATTCTGACGCTATCATAGCAGCCCATAATTACGCGAATGAACATAACTATCCTGTTATTGTAAAGAATGAGAAGTTTGTTTTAAATAAAAACGTAACTGTAAAAACTTCTACTGATTTAACAGGAAGTATTTTAACAACAACTTATGTTGACCCAGAACCAATTGAATATAATCGTACTTTTAATTTATTCAATATTGAAGGTGCTGAATTGATTGATTTATCAACTAGAGCTATCAATCCTGAATTTATAAAAGGAGCAACTAGAATCCCTAGTTTAAAAAATCAACCTTCTGGTGCTTTAATTATTAAAACGGAACAAACTGACATTATTCGTGATAATGGTGGGGTTCAATCTAATATTATGAAAGCGGAATGTAATATTATGATGAAAAACCAATATGGAGACCTTGCATATCCATTAACGAAAAACTATGTTGATGCTTTGAAATTCCAAGTATTCTTAAGACCTTTCGAACATCAATTAGAATTAAAGTTCCCTAAAGTTGAAGTAAAAGGACGTATTTATGGTATTGCCAAAGTACATCGAAATAATACATCATTTAGTGGTTTATTGATGGAAGAAATTAACCCTAGCCCTACAATTTCGAGTATTTATACACTGTTTGAATATGAAGATTGTGCAGATATGGAAGCAACAAATATTTCATGTCCGATCATCGGTAGAGAAACCAAAACAGGAGAAAACGGTTTAGGATACTTCTTACTTATGACGAGAGCCGCTAAGTTTAGAGGTTCAAACCTTCAACAAATTTCTGGCTGGTCTGGTATCAATGGTAACTGGATGAGAGATATTAGTGTTGTGGATTCTAATATGCTTGTTGTTGGCGGACATGCTAACGTATATGATTTAACGGTTGATCGTTCGGTGTTACAGAAAAACATTATTGCTCATGGTGGCGGTGTTATACAACTTCTAAATTCACAAGTTATCGGTGTTGCAAACCCTCCTAACAACTTATCTGGCACAGGTGCTGTGCAAACACGATGGGACTATGATGGAGAATTTGAGGGTGAAATAATTGTTGAAAACGTGGTACTTCATAATGCTACTTATGTTGTTGAATATAGTCCTTCTACTTATAACTGTGGTAGAACAATAGTATTGCCGAAAACAACAATTAGAAATGTTCATATGAGAAACCTTCTTAAAAAGAAAGGTGCTGGTGTTTGGTTCCGTGGATACCGTGGAGAATATGCCGGTAACTATCCACAAGTAACGATTGATTCTCTATCTTGGGATTTCGTGGGAACATATACAACAAGATTTGTTGAATTTGAAAGTGATATTTCTAATAGTTTAGCAACAAATAAAGATTTCAAGTTTTACTTTAGAAATATTCACCCACCACGCTTGTCATATGGTGATGTATTCAATCCAATCACAGCTTTCATTCATGTTCCGAAAGTAACAAATAATGATACAGTTGTGTATTATGATATTCAAAACTGTACTGTTAATATGGGGCTTGCTTCAACTTCAAACTTAGATGTTACGATTGATAATTCTGATTTCTATGCTGTTAACTTACTTGCGCC
AACTAGTGTGACAACAAATGGTCAACCGGCATTCATTAATGTTAAAAATTCCACTGTTCATCGTGGGGTAACTAACTTTAATACTGGTACAGATACTTATAATCGTGTTCGCTTAACAATTGCAAGTTCTATCTTTAAAAGATTGAGAAAAACTGATGGTTCTTATGATCCACAAATTGGTTTCCCTATTGAAGATTTCGTTTCTTATACCGCAGATAATATTGCTGATGCTAGAGCTGAAATACGTGGGGATAATTCAGCAAGATTGTTCGGATACATTGATGAAACAATCTGGAAAATTAAGAAAGACCAATTGAGAATATTCGTATAG
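
The CDS coordinates and the protein length can be cross-checked with a few lines of code. This sketch assumes the range is 1-based and inclusive and that the standard codon assignments apply. `cds_length` and `translate` are illustrative names. The two demo inputs are the first 18 and the last 15 nucleotides of the CDS above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def cds_length(start, end):
    """Nucleotide count of a 1-based inclusive range."""
    return end - start + 1


def translate(cds):
    """Translate a CDS up to (not including) the first stop codon."""
    cds = "".join(cds.split()).upper()  # tolerate line breaks in the sequence
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


nucleotides = cds_length(19048, 21378)
residues = nucleotides // 3 - 1  # subtract the stop codon
print(nucleotides, residues)             # 2331 776
print(translate("ATGTTAAAAAAAGTTGGT"))   # MLKKVG
print(translate("AGAATATTCGTATAG"))      # RIFV
```

The 2331 nucleotides give 777 codons, or 776 residues plus a stop codon, which matches the listed protein length. The translated ends match the start (MLKKVG) and end (RIFV) of the protein sequence.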

Tertiary structure

PDB ID
89f5b2f16f41381f297d0da760c7856bf47b421d77672630b4fd525eba0d03ed
Source ESMFold
Method ESMFold
Resolution 0,6567
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
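
The confidence bands listed above map directly onto a small classifier. This is a sketch that assumes pLDDT is on a 0 to 100 scale. The legend does not say where scores of exactly 90, 70 or 50 belong, so placing them in the lower band is an assumption. The example scores are hypothetical and are not taken from this model.

```python
def plddt_band(score):
    """Map a per-residue pLDDT score (0-100) to its confidence band."""
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"


# Hypothetical per-residue scores, for illustration only.
example_scores = [95.2, 83.0, 61.5, 42.0]
for score in example_scores:
    print(f"{score:5.1f} -> {plddt_band(score)}")
```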