GenBank accession: YP_007675386.1 [GenBank]
Protein name: central tail fiber J
RBP type Evidence Probability
TF GenBank 1.00
TF Phold 1.00
TSP RBPdetect 0.78
TF RBPdetect2 0.96
Protein sequence
MKDGDIGSRQFVTLVDVIGEGQIAGFPSAIDAGLTHGSISYRIASLKDLFLNGTQVLRDGASNTDPNINDFNFGTSEANAPSFFSRLGTSNQTKIQGLVETERDRTVGVTVTQSQSQTVTITDTSTEGVRVTIGFPRLQEIEDDGNIKGTTVEYDIEVRKQDNTLIKKINPETLLTGLDRSIHTSGGRVTGKSTSPYFKDHIIVFPEDLADSDFPVTVKVSRQTADSSDAKLANAFEFTTLTELVFDSPTYLNTAYAAIRFDAEIFRAVPQRMYRVRGRLVKIPHNSTVRSDGSLSFSGDFNGTLKATKEYCNDPAWVLYDILTESVAGFGDFVAETEVDKYSFYNASVYNSELINDGEGGTAPRFSCNIVIQRSTNAYTLLDRIASIMRGSLYIDDGVITLCQDRPTTSTYFFSYANITEDGFVYTGASQRTKDTVINVKYFSNETRSFEYETVEDTAANQSKYGVVVKNIEAVGCNNQAQARRAGLWHLFTQNNETETVAFTTTADAGSLVRPNQIITVQDPVRSGLRRSGRIKTATTTQITVDNTKDLPTSHTTGDQLSVILTDGTMETKTVSDITGNVITVSSAYTSAPQAFSVWLLLRATTETEDFRVLSVTEEDNTFTINAMFHNSSKYAFVEDGASITVPQITTLLLPKSAPSNLSAEELIIALGNRAVSKLVLSWQPVSGVTEYSVKYQFNNGNVITERVTAPTFEIFDSELGEYKFEVFSYNAVGEPSTIPTTLTFNAQGKTALPADVQNLTIEPYNDDFVKLRFSKSTDVDVIHGGNVVVRHSNLTDGTGTFTNSVDLINALSGNISETLIPAIAGEVILKFRDDGGRLSSGETSVIISPPNQQPKLTAFTDREDTDATPFGGTKTNTFFDSTLGGLTLASTTTIDDVTELIDTLSQIDFLGDVASTGSYEFANPLDLGSTMDTKLTRHFVTESFYAGSFIDQRTELIDTWNDIDQLTAFETNAALFVATTTQDPALSTSGTYTINNGSGGAGTIITITKASHGYSVGSFVVVDFTTGTGVDENYQIISKTTDTFTLTSATSLNTSGNCTYGAEFSDFNIFTNGVLRGRGFKFKVEMSSNDKAQTILLKELGYTATLNRRVETVNSLIASGTSTKAVVFQDKFFTGFSGTSVAAGAALPTIGIVIENAQSGDFFSLSSISSTGFSIDIKNGSSFVDRNFKYTAVGFGRGS
Physico-chemical properties
protein length: 1200 AA
molecular weight: 130041.57780 Da
isoelectric point: 4.69589
aromaticity: 0.09333
hydropathy: -0.18617
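
The record does not say how aromaticity and hydropathy were computed. A minimal sketch, assuming aromaticity is the fraction of F, W and Y residues and hydropathy is the mean Kyte-Doolittle value (GRAVY); the function names are illustrative, and the example uses only the first 20 residues of the sequence above.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")


def aromaticity(seq: str) -> float:
    """Fraction of aromatic residues (F, W, Y) in the sequence."""
    return sum(aa in AROMATIC for aa in seq) / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First 20 residues of the protein sequence listed in this record.
prefix = "MKDGDIGSRQFVTLVDVIGE"
print(len(prefix), aromaticity(prefix), round(gravy(prefix), 3))
```

Applied to the full 1200-residue sequence, the same functions should give values comparable to those listed, provided the assumed definitions match the ones the database uses.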

Domains

Domains [InterPro]
Accession Type Range
DC_0129 STR 4–828
IPR053171 Unmapped 82–764
YP_007675386.1 (full length) 1–1200
Architecture
STR 4-101 | ATT 102-245 | STR 246-373 | ATT 374-535 | STR 536-828 | RBD 829-986 | STR 987-1064 | RBD 1065-1196 |
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
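
The architecture string can be parsed into labelled segments for downstream checks. A minimal sketch, assuming the `LABEL start-end |` layout shown above and 1-based inclusive coordinates; `parse_architecture` and `is_contiguous` are illustrative names, not part of any database API.

```python
import re

ARCHITECTURE = (
    "STR 4-101 | ATT 102-245 | STR 246-373 | ATT 374-535 | "
    "STR 536-828 | RBD 829-986 | STR 987-1064 | RBD 1065-1196 |"
)


def parse_architecture(text):
    """Return a list of (label, start, end) tuples from the architecture string."""
    segments = []
    for match in re.finditer(r"([A-Z]+)\s+(\d+)-(\d+)", text):
        segments.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return segments


def is_contiguous(segments):
    """True if every segment starts one residue after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(segments, segments[1:]))


segments = parse_architecture(ARCHITECTURE)
for label, start, end in segments:
    # Inclusive length of each segment.
    print(f"{label}\t{start}\t{end}\t{end - start + 1}")
print("contiguous:", is_contiguous(segments))
```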

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Layout (YP_007675386.1, residues 1–1200): N-terminal, Central, C-terminal
Domain Start End Length (AA) Confidence
N-terminal 1 867 867 0.9532
Central domain 868 1072 206 0.3199
C-terminal 1073 1200 127 0.5687
Legend: N-terminal Central domain C-terminal
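
The listed lengths can be checked against the start and end coordinates. A minimal sketch, assuming 1-based inclusive coordinates (length = end - start + 1); under that assumption the Central and C-terminal rows come out as 205 and 128 rather than the listed 206 and 127, which may reflect a different boundary convention in the segmentation tool. The helper names are illustrative.

```python
# (name, start, end, listed_length) as given in the table above.
DOMAINS = [
    ("N-terminal", 1, 867, 867),
    ("Central domain", 868, 1072, 206),
    ("C-terminal", 1073, 1200, 127),
]


def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def length_report(domains):
    """Return (name, listed_length, computed_length) for each domain."""
    return [
        (name, listed, inclusive_length(start, end))
        for name, start, end, listed in domains
    ]


for name, listed, computed in length_report(DOMAINS):
    flag = "" if listed == computed else "  <- differs"
    print(f"{name}: listed {listed}, computed {computed}{flag}")
```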
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Taxonomy

Role Name Taxonomy ID Lineage
Phage Cyanophage MED4-117 [NCBI] 889954 Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes
Host Prochlorococcus [NCBI] 1218 Bacteria > Cyanobacteria > Prochlorales > Prochlorococcaceae >

Coding sequence (CDS)

GenBank protein accession: YP_007675386.1 [NCBI]
GenBank nucleotide accession: NC_020857.1 [NCBI]
CDS location: range 29741 -> 33343, strand -
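
The CDS range can be checked against the protein length. A minimal sketch, assuming the coordinates are 1-based and inclusive and that the CDS includes the stop codon; because the strand is `-`, the coding sequence corresponds to the reverse complement of that genomic interval. The function names are illustrative.

```python
def cds_length(start: int, end: int) -> int:
    """Nucleotide length of a 1-based inclusive range."""
    return end - start + 1


def expected_protein_length(nt_length: int, has_stop: bool = True) -> int:
    """Number of amino acids encoded by a CDS of the given length."""
    if nt_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = nt_length // 3
    return codons - 1 if has_stop else codons


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (needed for minus-strand features)."""
    return seq.translate(_COMPLEMENT)[::-1]


nt = cds_length(29741, 33343)
print(nt, expected_protein_length(nt))
```

With the values in this record, the range spans 3603 nucleotides, which corresponds to 1200 codons plus a stop codon and agrees with the stated protein length of 1200 AA.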
CDS
ATGAAAGATGGAGATATAGGGTCAAGGCAGTTTGTTACATTAGTGGATGTAATTGGGGAAGGTCAAATAGCTGGTTTTCCCTCTGCTATAGATGCTGGATTAACTCATGGGTCTATTTCTTATCGAATAGCAAGCCTTAAAGATTTATTTCTTAATGGAACTCAAGTTTTAAGAGATGGTGCTAGTAATACAGATCCAAATATTAATGATTTCAATTTTGGTACAAGTGAAGCTAATGCCCCATCTTTTTTTTCAAGGCTTGGCACTTCAAACCAGACAAAAATACAAGGACTTGTAGAAACTGAAAGAGACAGAACTGTAGGAGTTACAGTAACACAATCTCAATCTCAAACTGTAACTATTACTGACACTTCAACTGAGGGTGTAAGGGTAACTATTGGTTTTCCTAGATTACAAGAAATTGAAGATGATGGAAATATAAAAGGAACTACTGTTGAATACGATATAGAGGTAAGAAAACAAGATAATACTTTAATAAAAAAAATTAATCCCGAGACTTTGTTAACTGGTTTAGATCGTAGTATTCACACCTCTGGGGGTCGAGTTACTGGTAAAAGCACTTCTCCATATTTTAAAGATCATATAATTGTTTTTCCTGAAGATTTAGCTGATTCTGATTTCCCAGTTACTGTTAAAGTTTCTAGACAAACTGCTGACAGTTCAGATGCAAAACTTGCGAATGCTTTTGAATTTACGACTTTAACTGAATTAGTCTTTGATAGTCCAACTTATCTAAACACAGCTTATGCAGCAATAAGGTTTGATGCTGAAATTTTTAGAGCAGTCCCACAGCGAATGTATAGGGTAAGAGGTCGTCTTGTAAAAATTCCACATAACTCAACTGTTAGATCAGATGGTTCTTTGTCATTTAGTGGTGATTTTAATGGAACTTTAAAAGCAACTAAAGAATATTGCAATGATCCCGCATGGGTTTTGTATGACATTCTTACAGAATCCGTTGCTGGTTTTGGAGACTTTGTTGCTGAAACAGAGGTCGATAAATATTCTTTTTATAATGCCTCTGTCTATAATTCAGAATTGATCAATGATGGCGAGGGTGGTACAGCCCCAAGATTTAGTTGCAATATTGTAATTCAAAGAAGTACAAACGCATATACTTTGTTAGATAGGATTGCTTCTATTATGAGAGGTAGCTTATACATTGATGATGGTGTAATAACTCTTTGCCAAGATAGACCAACAACAAGCACTTATTTTTTTTCTTATGCCAATATTACTGAAGATGGTTTTGTATATACAGGTGCAAGTCAAAGAACAAAAGATACAGTAATTAATGTCAAATATTTTAGTAATGAAACAAGATCTTTTGAATATGAAACTGTTGAGGATACAGCAGCCAATCAATCAAAATATGGTGTAGTAGTTAAAAATATAGAGGCAGTAGGTTGTAATAATCAAGCACAAGCTAGAAGGGCTGGTTTATGGCATCTGTTCACACAAAATAATGAAACTGAGACTGTAGCTTTTACAACTACTGCTGATGCTGGTTCTTTAGTAAGACCAAATCAAATTATTACTGTCCAAGACCCTGTGCGTAGTGGTTTAAGAAGATCAGGAAGAATAAAAACTGCTACAACCACACAAATAACAGTGGATAACACAAAAGATTTACCAACCTCACATACAACAGGAGATCAATTATCTGTAATTTTGACAGATGGAACAATGGAGACAAAGACAGTTTCAGATATTACAGGGAATGTTATTACTGTATCTAGTGCTTATACTTCCGCCCCTCAAGCTTTTAGTGTTTGGTTACTACTTAGAGCAACCACAGAAACTGAGGACTTTAGGGTTTTGTCTGTTACTGAAGAAGATAATACGTTTACTATCAATGCAATGTTTCATAATTCCAGTAAATATGCTTTTGTTGAAGATGGTGCATCTATTACAGTTCCACAAATAACAACTTTATTACTACCTAAATCAGCGCCCAGTAACTTATCGGCAGAGGAATT
AATTATTGCATTAGGAAACAGGGCTGTAAGTAAATTAGTTTTAAGCTGGCAGCCAGTTTCAGGTGTTACAGAATATTCAGTTAAATATCAATTCAACAATGGTAATGTCATTACAGAAAGAGTTACCGCACCTACTTTTGAAATATTTGATTCTGAATTAGGAGAATACAAATTTGAAGTTTTTAGTTATAACGCAGTTGGAGAACCAAGTACAATTCCAACAACCCTTACTTTCAATGCTCAAGGTAAAACTGCTTTACCAGCAGATGTACAAAACTTAACGATAGAACCCTATAACGATGATTTTGTAAAACTTAGATTTTCGAAATCTACAGATGTTGACGTTATTCATGGTGGAAACGTGGTAGTCAGGCACAGCAATCTTACAGATGGCACAGGAACCTTTACTAACTCTGTTGATCTAATAAACGCTTTATCTGGCAATATATCTGAAACTCTAATTCCAGCGATAGCGGGTGAGGTAATATTAAAATTCCGTGATGATGGAGGCCGCTTAAGTTCTGGAGAAACGTCTGTTATTATATCGCCACCAAATCAACAGCCAAAATTAACAGCTTTTACAGACAGAGAAGATACAGATGCTACACCTTTCGGAGGGACAAAAACTAATACATTTTTTGATTCAACACTTGGGGGTTTAACTTTAGCTTCAACAACGACTATTGATGACGTAACTGAATTGATAGATACTTTATCTCAGATAGATTTTTTGGGTGATGTTGCCTCTACTGGTTCTTATGAGTTTGCCAACCCTTTGGATTTAGGTTCGACAATGGACACTAAATTGACTAGGCATTTTGTTACAGAATCTTTTTATGCTGGTTCATTTATAGACCAGAGGACAGAATTAATAGATACATGGAATGATATTGACCAGTTAACAGCTTTTGAAACTAATGCTGCTCTATTTGTGGCAACTACAACGCAAGATCCAGCTTTATCTACTTCTGGAACTTACACAATAAATAATGGCTCGGGTGGTGCTGGGACAATAATTACAATTACAAAGGCTTCTCATGGTTATTCTGTTGGTAGTTTTGTTGTTGTTGATTTTACTACGGGTACTGGTGTTGATGAAAATTATCAAATAATCTCAAAAACTACTGATACTTTCACTCTTACTTCTGCAACATCTCTGAATACAAGTGGAAATTGTACTTATGGAGCAGAATTTAGTGATTTCAACATTTTTACAAATGGTGTATTAAGAGGCAGAGGGTTTAAATTTAAAGTTGAAATGTCTTCTAATGACAAAGCACAAACAATTCTTCTAAAAGAACTTGGTTATACTGCCACACTTAACAGAAGAGTTGAAACTGTAAATTCTCTAATTGCCTCTGGTACTTCAACTAAAGCAGTAGTTTTTCAAGATAAGTTTTTTACGGGCTTCAGCGGTACAAGTGTAGCTGCTGGTGCAGCTTTACCTACTATTGGAATAGTAATAGAAAACGCACAGTCAGGTGATTTCTTTTCTTTGTCTTCTATTAGTTCAACTGGATTTTCAATAGATATAAAAAATGGATCTAGTTTTGTTGATAGGAATTTTAAATATACTGCTGTTGGTTTTGGTCGTGGCTCTTAA

Genome Context

Tertiary structure

PDB ID: 07be1474016db05ba8fc309104009b75ec1bd4e0bfd767a4fd73f8284585a325
Source: ESMFold
Method: ESMFold
Resolution: 0.7826
Oligomeric State: monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
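
The confidence bands above can be applied to per-residue pLDDT scores. A minimal sketch, assuming scores on a 0 to 100 scale; the legend uses strict inequalities and does not say where exact values of 90, 70 or 50 fall, so this sketch assigns them to the lower band as an assumption. `plddt_band` is an illustrative name.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to the confidence band used in the legend."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"


# Hypothetical example scores, not taken from this model.
for value in (95.0, 80.0, 60.0, 30.0):
    print(value, plddt_band(value))
```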

Literature

Title: The Genome Sequence of Cyanophage MED4-117
Authors: Henn,M.R., Sullivan,M.S., Osburne,M.S., Levin,J., Malboeuf,C., Casali,M., Russ,C., Lennon,N., Chapman,S.B., Erlich,R., Young,S.K., Yandava,C., Zeng,Q., Alvarado,L., Anderson,S., Berlin,A., Chen,Z., Freedman,E., Gellesch,M., Goldberg,J., Green,L., Griggs,A., Gujja,S., Heilman,E.R., Heiman,D., Hollinger,A., Howarth,C., Larson,L., Mehta,T., Pearson,M., Roberts,A., Ryan,E., Saif,S., Shea,T., Shenoy,N., Sisk,P., Stolte,C., Sykes,S., White,J., Yu,Q., Coleman,M.L., Huang,K.H., Weigele,P.R., DeFrancesco,A.S., Kern,S.E., Thompson,L.R., Fu,R., Hombeck,B., Chisholm,S.W., Haas,B., Nusbaum,C. and Birren,B.
Date: 2011-09-23
Source: GenBank