GenBank accession
AZV01373.1 [GenBank]
Protein name
long tail fiber distal subunit
RBP type
TSP (Evidence DepoScope, Probability 1.00)
TF (Evidence RBPdetect, Probability 0.82)
Protein sequence
MAKIPRIQFKRTKTPGTKPSKDILAEGELAINLADRTLFTKSGDDIIDLGFAKGGTVNGDINQEAGNFTTNGSMTAKTYIDIKNSLNDKKIFRLSYEGDRGALLSVNEGDNKWKTILVPLDEDGTKTIATREHVGKMVSTNDGETKLRAPNQTNLLIAKDNKELVWYDGSTNNRMVNFGAGGLDLAHKGSDIVGVSLYKKDGNQVRIETHAHSSALMLAFAYKDSAGNNSYVINMPKENGILATQGWTDAKYYKRNDSPSFKEYVNVFHNTTGSYTQLGYGNAGATWSVNTGGNWYILKHPLANGTVASQEWSQATHYNKSESDNRYVKIGIGGGYPLHNYNAGGHGYSTWNEKGVRQAYIGFANDGSTEFTISNEKGSNSTINLKAGNVLVNGVTIASQNWVISNFYKKTETYNKSEIDGKVNGRLTQAQGDARYYTKTDSDGRYLRMNITNKTSMPVFYRTANYGDDANQRDLNYAGFYRTNGLNSLPSLIIHVPHSSGLAHGRGIGFDYGSAGYGIYTYAYDANGKYQGQKSIALSEYHYSKTVSDDRYYQKSQTYSRAEVDSRVNGRLTQATADGRYAYKGGANAQNFGANIVDAADVNIRSDLTVKSNLVKINNAIQTVKTLTGYTYDLKLNDNTYKQSAGIIAQDVQKVLPALVTEDSLGLLSVNYNGLTAVLVNAINELSERLERLERGV
Physico-chemical properties
protein length: 697 AA
molecular weight: 76540.96270 Da
isoelectric point: 8.97356
aromaticity: 0.09756
hydropathy: -0.59211
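
The record does not say how these values were computed. As a minimal sketch, assuming aromaticity is the fraction of F, W and Y residues and hydropathy is the mean Kyte-Doolittle score (GRAVY), the two sequence-derived ratios could be reproduced as follows. The function names and the 20-residue fragment are illustrative only; the fragment is the start of the protein sequence above, not the full 697 AA chain.

```python
# Kyte-Doolittle hydropathy scale (assumed to be the scale behind "hydropathy").
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aromaticity(seq: str) -> float:
    """Fraction of residues that are F, W or Y (assumed definition)."""
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues (assumed definition)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Illustrative input: only the first 20 residues of the sequence in this record.
fragment = "MAKIPRIQFKRTKTPGTKPS"
print(len(fragment), round(aromaticity(fragment), 5), round(gravy(fragment), 5))

# Number of aromatic residues implied by the listed values, under the same assumption.
implied_aromatic_residues = round(0.09756 * 697)
print(implied_aromatic_residues)
```

Under this assumption the listed aromaticity corresponds to a whole number of aromatic residues over 697 positions, which is consistent with a simple count-based fraction. Passing the full protein sequence to the same functions would be the way to check the listed values directly.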

Domains

Domains [InterPro]
DC_0000 STR 1–697
IPR030392 CHP 606–697
IPR030392 CHP 606–660
Architecture
STR 1–697
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Layout
Domain          Start  End  Length (AA)  Confidence
N-terminal      1      76   76           0.4461
Central domain  77     275  200          0.3134
C-terminal      276    697  421          0.7980
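
The record does not state whether the Start and End coordinates are inclusive. As a small sketch, assuming 1-based inclusive coordinates, the length of each segment is end - start + 1, and the result can be compared with the listed Length (AA) column. The variable and function names are illustrative.

```python
# Segment boundaries and listed lengths copied from the table above.
# Assumption: Start and End are 1-based and inclusive.
segments = [
    ("N-terminal", 1, 76, 76),
    ("Central domain", 77, 275, 200),
    ("C-terminal", 276, 697, 421),
]


def inclusive_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


report = []
for name, start, end, listed in segments:
    computed = inclusive_length(start, end)
    report.append((name, listed, computed, computed - listed))
    print(f"{name}: listed={listed} computed={computed} diff={computed - listed:+d}")

# Both the listed and the computed lengths should cover the whole protein.
total_listed = sum(row[1] for row in report)
total_computed = sum(row[2] for row in report)
print(total_listed, total_computed)
```

Under the inclusive assumption, the N-terminal row agrees with the table, while the central and C-terminal rows each differ from the listed length by one residue in opposite directions. Both sets of lengths still sum to 697, so the central/C-terminal boundary may be defined with a different convention than the one assumed here.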
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Domain Coloring
N-terminal: 1-76
Central: 77-275
C-terminal: 276-697

Taxonomy

Phage
Name: Shigella phage vB_SdyM_006 [NCBI]
Taxonomy ID: 2500762
Lineage: Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes
Host
Name: Shigella dysenteriae [NCBI]
Taxonomy ID: 622
Lineage: cellular organisms > Bacteria > Pseudomonadati > Pseudomonadota > Gammaproteobacteria > Enterobacterales

Coding sequence (CDS)

GenBank protein accession
AZV01373.1 [NCBI]
GenBank nucleotide accession
MK295204 [NCBI]
CDS location
range 155168 -> 157261
strand +
CDS
ATGGCAAAAATTCCAAGAATACAATTTAAAAGAACTAAAACACCAGGAACAAAGCCTTCAAAAGATATTTTAGCTGAAGGTGAATTAGCAATTAACTTAGCTGATAGAACTCTTTTTACAAAATCCGGAGATGATATTATTGATCTTGGCTTTGCTAAAGGCGGCACAGTTAATGGAGATATTAATCAAGAAGCTGGAAATTTTACAACCAATGGGTCAATGACTGCTAAAACATACATTGATATAAAAAATAGCTTAAATGACAAAAAAATATTTAGGCTTTCTTATGAAGGAGATCGTGGCGCTTTATTGTCTGTTAATGAAGGCGACAATAAATGGAAAACTATACTTGTTCCATTAGACGAAGATGGGACTAAAACTATAGCTACTCGTGAACACGTAGGAAAAATGGTTTCTACTAATGATGGAGAAACCAAATTAAGGGCTCCAAACCAGACTAACTTATTGATTGCAAAAGACAATAAAGAACTTGTTTGGTACGATGGATCCACTAATAATAGAATGGTTAATTTTGGTGCAGGTGGTTTAGATTTAGCCCACAAAGGTAGTGATATTGTCGGAGTTAGTTTATACAAAAAAGATGGCAATCAGGTTCGAATAGAAACCCATGCCCATTCATCGGCCCTTATGTTAGCTTTTGCATATAAAGATTCTGCAGGAAATAATTCATATGTAATCAATATGCCAAAAGAAAATGGTATATTAGCTACACAAGGATGGACAGATGCCAAATACTATAAAAGGAATGACAGCCCATCGTTCAAAGAATACGTTAATGTTTTTCATAATACCACAGGGTCTTATACTCAATTAGGATATGGTAATGCAGGTGCTACATGGTCAGTTAACACAGGCGGCAATTGGTATATATTAAAACATCCCCTAGCTAATGGTACAGTTGCTTCTCAAGAATGGAGTCAAGCTACGCATTACAATAAATCAGAATCTGATAATAGATATGTTAAAATTGGTATAGGTGGTGGTTACCCACTACATAATTATAATGCAGGTGGCCATGGCTATTCTACTTGGAATGAAAAAGGTGTACGTCAAGCTTATATAGGATTCGCAAATGATGGTTCTACAGAATTTACCATTAGTAACGAAAAAGGATCTAATAGTACTATTAACCTTAAAGCGGGTAATGTTTTAGTTAATGGTGTAACTATCGCCTCTCAAAACTGGGTAATTTCTAATTTTTATAAAAAAACAGAAACATATAATAAATCTGAAATTGATGGCAAAGTTAATGGACGATTAACACAGGCCCAAGGTGATGCTAGATATTATACTAAAACCGATTCTGACGGTCGTTACTTACGAATGAATATCACCAACAAAACTAGTATGCCGGTATTCTATAGAACAGCAAACTATGGAGATGATGCAAACCAACGTGATTTAAATTACGCAGGATTTTATAGAACTAACGGATTAAACAGCTTACCGAGTCTCATAATTCATGTTCCTCATAGTAGTGGGCTAGCTCATGGAAGAGGTATAGGATTTGACTACGGTTCAGCCGGATATGGGATTTATACTTATGCATATGATGCGAATGGAAAATATCAAGGTCAAAAATCTATCGCATTGAGCGAGTATCACTATTCAAAAACTGTATCAGATGATAGATATTACCAGAAGTCTCAAACATATTCTAGAGCAGAAGTTGATTCTAGAGTAAACGGAAGATTAACTCAAGCCACGGCGGATGGAAGATACGCGTATAAAGGTGGTGCTAATGCTCAAAACTTTGGTGCTAACATAGTTGATGCTGCTGACGTTAATATTCGTTCAGATTTAACTGTAAAATCAAATTTAGTTAAAATAAATAATGCTATACAAACTGTTAAAACATTAACAGGGTATACATATGATTTAAAATTAAACGATAATACATATAAACAATCTGCAGGTATAATAGCACAAGATGTACAAAAAGTTCTACCAGCATTAGTTACAGAGGACTCTTTAGGATT
ATTATCTGTTAATTATAATGGTTTAACTGCAGTTCTTGTTAATGCTATAAATGAATTATCTGAAAGATTAGAAAGATTAGAACGAGGTGTATAA
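
As a quick consistency sketch, the CDS range can be checked against the protein length, assuming the coordinates are 1-based and inclusive on the + strand and that the CDS ends with a single stop codon. The codon table below is deliberately partial (only the codons needed for the first 18 nucleotides of the CDS above), and all names are illustrative.

```python
# Values copied from this record.
PROTEIN_LENGTH = 697
CDS_START, CDS_END = 155168, 157261

# Assumption: 1-based inclusive coordinates, so length = end - start + 1.
cds_length = CDS_END - CDS_START + 1
codon_count, remainder = divmod(cds_length, 3)
print(cds_length, codon_count, remainder)

# Partial standard-code table: only the codons present in the first 18 nt.
PARTIAL_CODON_TABLE = {
    "ATG": "M", "GCA": "A", "AAA": "K",
    "ATT": "I", "CCA": "P", "AGA": "R",
}

# First 18 nucleotides of the CDS listed above.
cds_prefix = "ATGGCAAAAATTCCAAGA"
peptide_prefix = "".join(
    PARTIAL_CODON_TABLE[cds_prefix[i:i + 3]]
    for i in range(0, len(cds_prefix), 3)
)
print(peptide_prefix)
```

With these assumptions, the codon count is the protein length plus one stop codon, and the translated prefix matches the first six residues of the protein sequence. A full translation would need a complete codon table, for example from a sequence-analysis library.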

Genome Context

Tertiary structure

PDB ID
b133169ded063ddbe16ad1d595cad03be81698035a096d27cec345c93f17d81b
Source ESMFold
Method ESMFold
Resolution 0.6690
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
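
The confidence bands above can be expressed as a small lookup. This is only a sketch: the legend uses strict inequalities and does not say where values of exactly 90, 70 or 50 belong, so the function below assumes they fall into the lower band. The function name is illustrative.

```python
def plddt_band(plddt: float) -> str:
    """Map a per-residue pLDDT score (0-100) to the legend's confidence band.

    Assumption: boundary values (exactly 90, 70, 50) go to the lower band,
    because the legend does not assign them.
    """
    if plddt > 90:
        return "Very high"
    if plddt > 70:
        return "High"
    if plddt > 50:
        return "Low"
    return "Very low"


# Example scores chosen for illustration; they are not taken from this model.
for score in (95.2, 81.0, 63.5, 42.0):
    print(score, plddt_band(score))
```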

Literature

Title: Characterization of novel bacteriophages infecting Shigella spp. and E. coli O157:H7
Authors: Shahin, K., Bao, H. and Wang, R.
Date: 2020-06-17
PMID: -
Source: GenBank