Protein
- Genbank accession: CAH1093988.1 [GenBank]
- Protein name: hemagglutinin/invasin
- RBP type: TSPTFTF
- Protein sequence:
MNKIYRLIWNDVLGIWVTASELSKARGKRSSGNNIKNKNIGNSSSKVKYNQYINIWVGGIFLSALISTGVFAAGSADIGTPIVTAAIGPSNNCVSQAATNLTSVVPWTCLGQTATGGFTLINGVQANAQNVGQANLDARASGQEAIAIGFIDTTASGTNSVAIGETANALGKNSVAIGKNIIADNSNINEDLSNNVAIGSNSIASTKGSQYTVGSGGAVAIGYGEHANGVGAVAIGSSNTAEGDGAVAVGRINNAIGDGTIALGDSCIANGDRAVAVGSVAHAIGTAAVAVGDSSTADGWSAISIGRVANADADFSIAMGDFAQALSYSALALGRNSYIDANSNESIAQGYGSSVTTARNAVAIGSDATVSNRNSGIAIGDNAQVIVGSHINDVNGLPGSIAIGLNARSVGGSEVIIGSYAAQNAQRYDDGAKNTYGSDSVMLGNQAGRNSQYLYATFIGDQAGQNSIGQHNSYVGQNAARYRTGSYNTALGSNALTGISNTTNSTGNSNTAIGTASGVSLEGDGNTLTGVYTGQRIIGNNNSAFGNNAGANIKGNENIAIGYLSGVSSQGNNNVLLGSNSNIGVNTNNVVSIGTNTRATKENSIALGANANTVTDATLESEANLNGLTYGNFAGQVTNTGMQLSVGSAGAERQIKNVGSGSISETSTDAINGSQLYATNNILGNVANSTIDILGGDASLNSNGTLSMSNIGGTGQNTIDSAIAASRTKVAAGTNVADVVKTTGSNGQDIYTVNAKGSTASAGSSAVTVTAGTADANNVTDYAVDLSQSTKDSLVKADTALQSVVTQIDGVDVKTVNKDDNKVNFVTGDNVELTANADGSITVGTAADVTFNTVNTTNLTATGETKLGDSFTVNNGGSYYTGPITEGNHITNKTYVDQATAASRTEVAAGTNVADVVKTTGSNGQNIYTVNAKGSTASAGSSAITVTAGSPDANNVTDYAVDLSQSTKDSLVKADTALQSVVTQIDGVDVKTVNKDDNKVNFVTGDNVELTANADGSITVGTAADVTFNTVNTTNLTATGETKLGDSFTVNNGGSYYTGPITEGNHITNKTYVDQATAASRTEVEQGKNITVTSSTGVDGQNIYTVATADEVDFNKVTVGDTTITTDGIVIANGPSITKDGISAGDKKVTDVADGLISADSKDAINGSQLFGLGNNLTQLFGGNALYTNNQITWSNIGGTGQNTIDDAIKHVNDQAANANQGWNVSTDSGSNATSTVKPGQTVNINGDSDNGVLVTNSGNDIKVGLADQIKIGAGDNAVSIDGNSGTIQAGDVLIDGSKGNISAGKVTVNGEAGTVNGLTNTTWNPNNIFSGQAATEDQLQQVAQNATAAATAAKTTVSAGENITVSSSKNADGSTNYQVATSKDVKFDTVTSGSITTDKVSVGNITIDQTGINAGASKVTNVADGTINSTSKDAINGSQLHASNTNIYNYLGGGANYETNTGPTYNVGGGSYNNVGDALNSLDQQVTNVSNQLEQAFYTTNKRIDDLEDHANGGIAQAMATAGLPQAYIPGKSMMAISGGTYRGESGYAIGMSSISDNGKWVFKMSGSGNSRGDFGGTVGAGIQW
- Physico-chemical properties:
  - protein length: 1586 AA
  - molecular weight: 159783.24250 Da
  - isoelectric point: 4.49292
  - aromaticity: 0.04729
  - hydropathy: -0.20845
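The aromaticity and hydropathy values can be recomputed from the protein sequence. The sketch below assumes the common definitions: aromaticity as the fraction of F, W, and Y residues, and hydropathy as the mean Kyte-Doolittle value (GRAVY). The page does not state which definitions it uses, so the function names and the choice of scale are illustrative assumptions.

```python
# Kyte-Doolittle hydropathy scale (assumed; the page does not name its scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aromaticity(seq: str) -> float:
    """Fraction of residues that are aromatic (F, W, or Y)."""
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Tiny hand-checkable inputs; paste the full protein sequence to compare
# against the values listed on the page.
print(aromaticity("FWYA"))  # 3 aromatic residues out of 4
print(gravy("AI"))          # mean of 1.8 and 4.5
```

Under this definition, an aromaticity of 0.04729 over 1586 residues would correspond to about 75 aromatic residues.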
Domains
Domains [InterPro]
| ID | Type | Range (AA) |
|---|---|---|
| IPR024973 | STR | 1–34 |
| IPR011049 | STR | 125–214 |
| cd12820 | STR | 137–258 |
| IPR008640 | Unmapped | 155–180 |
| IPR008640 | Unmapped | 227–250 |
| IPR008640 | Unmapped | 255–279 |
Architecture
STR 1-34 | STR 114-695 | STR 1005-1174 | RBD 1175-1190 | STR 1287-1447 | RBD 1448-1461 | STR 1463-1511 | RBD 1512-1525 | ATT 1526-1586
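The architecture line is a pipe-separated list of `LABEL start-end` entries, which makes it easy to parse into typed segments. The following is a minimal sketch; `Segment` and `parse_architecture` are hypothetical names, and coordinates are assumed to be 1-based and inclusive.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    label: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumes 1-based, inclusive coordinates.
        return self.end - self.start + 1


def parse_architecture(text: str) -> List[Segment]:
    """Parse 'LABEL start-end | LABEL start-end | ...' into segments."""
    segments = []
    for part in text.split("|"):
        label, span = part.split()
        start, end = span.split("-")
        segments.append(Segment(label, int(start), int(end)))
    return segments


ARCHITECTURE = (
    "STR 1-34 | STR 114-695 | STR 1005-1174 | RBD 1175-1190 | "
    "STR 1287-1447 | RBD 1448-1461 | STR 1463-1511 | RBD 1512-1525 | "
    "ATT 1526-1586"
)

segments = parse_architecture(ARCHITECTURE)
for seg in segments:
    print(f"{seg.label}\t{seg.start}-{seg.end}\t{seg.length} AA")
```

Filtering the result by label, for example `[s for s in segments if s.label == "RBD"]`, gives the three RBD segments listed in the architecture line.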
Legend: ATT, STR, RBD, CBM, LEC, ENZ, CHP, LNK, TAS, TTP, UNK, Unmapped
Tail Spike Domain Segmentation
This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.
Domain Layout (residues 1–1586)
| Domain | Start | End | Length (AA) | Confidence |
|---|---|---|---|---|
| N-terminal | 1 | 113 | 113 | 0.4015 |
| Central domain | 114 | 1504 | 1392 | 0.4060 |
| C-terminal | 1505 | 1586 | 81 | 0.3786 |
Note: Constraints were applied during segmentation: fixed 60 C-terminal predictions appearing before the central domain.
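The lengths in the table above can be compared with the inclusive span `end - start + 1`. The sketch below does that for the three rows. `ROWS`, `inclusive_length`, and `length_mismatches` are illustrative names, and the inclusive-coordinate convention is an assumption, since the page does not say how its boundaries are defined.

```python
# (domain, start, end, reported length) copied from the segmentation table.
ROWS = [
    ("N-terminal", 1, 113, 113),
    ("Central domain", 114, 1504, 1392),
    ("C-terminal", 1505, 1586, 81),
]


def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based span that includes both endpoints."""
    return end - start + 1


def length_mismatches(rows):
    """Return (domain, computed, reported) for rows where the two differ."""
    return [
        (name, inclusive_length(start, end), reported)
        for name, start, end, reported in rows
        if inclusive_length(start, end) != reported
    ]


for name, computed, reported in length_mismatches(ROWS):
    print(f"{name}: span gives {computed} AA, table reports {reported} AA")

print(sum(reported for *_, reported in ROWS))  # total of the reported lengths
```

With inclusive coordinates, the Central domain and C-terminal rows each differ from the reported length by one residue, while the reported lengths still sum to the full 1586 AA. This would be consistent with a boundary that is shifted by one position, but the page does not confirm which value is intended.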
3D Structure with Domain Coloring
The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).
Domain Coloring
- N-terminal: 1-113
- Central: 114-1504
- C-terminal: 1505-1586
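The coloring scheme amounts to a lookup from residue number to domain and color. A minimal sketch of that lookup follows. `DOMAIN_COLORS` and `domain_for_residue` are hypothetical names, and the ranges are taken as 1-based and inclusive.

```python
# (domain, start, end, color) as listed for the structure coloring.
DOMAIN_COLORS = [
    ("N-terminal", 1, 113, "blue"),
    ("Central", 114, 1504, "green"),
    ("C-terminal", 1505, 1586, "pink"),
]


def domain_for_residue(position: int):
    """Return (domain, color) for a 1-based residue position."""
    for name, start, end, color in DOMAIN_COLORS:
        if start <= position <= end:
            return name, color
    raise ValueError(f"residue {position} is outside 1-1586")


print(domain_for_residue(113))   # last N-terminal residue
print(domain_for_residue(114))   # first Central residue
print(domain_for_residue(1586))  # last residue of the protein
```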
Taxonomy
| | Name | Taxonomy ID | Lineage |
|---|---|---|---|
| Phage | Acinetobacter phage MD-2021a [NCBI] | 2899278 | Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes |
| Host | No host information | | |
Coding sequence (CDS)
- Genbank protein accession: CAH1093988.1 [NCBI]
- Genbank nucleotide accession: CAKLQH020000029 [NCBI]
- CDS location: range 3211 -> 7971, strand -
CDS
ATGAATAAGATATATCGGTTAATTTGGAATGATGTTCTAGGTATTTGGGTTACTGCTTCAGAACTCTCTAAGGCCAGAGGGAAGCGTAGTTCAGGTAATAATATTAAAAATAAAAATATTGGTAATTCATCTAGTAAAGTTAAATATAATCAATATATTAACATTTGGGTGGGTGGTATCTTTTTAAGTGCTTTAATATCAACTGGTGTGTTTGCAGCAGGTAGCGCAGATATAGGAACACCAATAGTAACTGCGGCTATTGGACCGAGCAATAATTGTGTGAGTCAAGCGGCAACCAATCTTACTAGTGTCGTTCCATGGACATGTCTTGGCCAAACAGCAACTGGCGGTTTTACTCTCATTAACGGCGTGCAAGCCAACGCACAAAATGTAGGTCAGGCTAATTTAGATGCAAGAGCTTCTGGACAAGAAGCAATTGCAATTGGTTTTATTGATACAACTGCTTCGGGTACTAATAGTGTGGCAATTGGAGAAACTGCTAATGCGTTAGGTAAAAACTCAGTTGCTATTGGTAAAAATATTATCGCAGATAACTCTAATATAAATGAAGACCTATCTAATAATGTAGCGATTGGATCTAACAGTATAGCTAGTACAAAAGGTAGTCAATATACTGTAGGTTCTGGAGGTGCTGTAGCTATTGGTTATGGTGAGCATGCTAATGGTGTCGGGGCTGTAGCGATAGGTTCATCAAATACAGCTGAAGGTGATGGGGCAGTTGCAGTAGGTCGAATAAATAATGCAATTGGAGATGGAACTATTGCGCTAGGAGACAGTTGTATTGCAAATGGAGATCGGGCAGTCGCAGTAGGTTCTGTTGCCCATGCTATTGGGACTGCTGCTGTGGCTGTTGGTGATAGCTCTACGGCAGATGGATGGAGTGCAATCAGTATTGGTCGTGTTGCAAATGCCGATGCCGATTTCTCAATAGCTATGGGTGATTTTGCTCAAGCATTATCTTATAGTGCGTTAGCTTTGGGGCGTAACAGTTATATTGATGCTAATTCTAATGAATCAATTGCTCAAGGTTATGGAAGTTCAGTGACAACTGCACGAAATGCTGTAGCAATTGGAAGTGATGCAACCGTAAGTAATCGTAATAGTGGTATTGCTATTGGTGATAATGCACAAGTAATTGTGGGAAGTCACATTAATGATGTGAATGGCCTACCTGGTTCTATAGCAATAGGTCTAAATGCTCGTTCAGTAGGTGGTAGCGAAGTTATTATTGGATCTTATGCCGCTCAGAATGCGCAACGCTATGATGACGGTGCTAAAAATACCTATGGTTCAGATAGTGTTATGTTAGGCAATCAAGCTGGGCGTAATTCTCAATATCTATATGCGACATTTATTGGTGATCAAGCAGGTCAAAATAGCATTGGTCAACACAATAGTTATGTAGGGCAAAATGCAGCAAGATACCGAACTGGGAGCTATAATACCGCTCTAGGTAGTAATGCATTAACAGGAATTAGTAATACTACAAACTCTACTGGTAATTCTAATACGGCTATCGGTACGGCTTCAGGTGTCTCTCTAGAAGGAGATGGCAATACATTGACAGGTGTATATACAGGTCAACGCATTATTGGAAATAATAATTCGGCATTCGGAAATAATGCTGGTGCAAATATAAAAGGAAATGAGAATATTGCTATAGGGTACCTTTCTGGGGTTTCTTCTCAAGGCAACAACAATGTTCTTTTAGGCTCAAATTCTAATATAGGTGTAAATACTAATAATGTTGTAAGTATTGGAACGAATACTCGAGCTACTAAAGAGAATTCAATTGCTTTAGGTGCAAATGCAAATACTGTTACTGATGCAACACTAGAGTCAGAGGCAAATTTAAATGGTTTGACCTATGGTAATTTTGCGGGACAAGTTACAAATACAGGGATGCAATTGTCAGTTGGTTCAGCGGGAGCAGAACGTCAAATTAAAAATGTAGGATCTGGTTCTATTTCCGAGACTAG
TACAGATGCTATCAATGGTAGTCAGTTATATGCAACGAATAATATATTAGGTAATGTTGCTAATAGCACAATAGATATTTTAGGAGGTGATGCTAGTTTAAATAGTAATGGCACACTTTCAATGAGTAACATTGGTGGTACTGGACAAAATACAATTGATTCAGCAATAGCAGCATCAAGAACGAAGGTTGCAGCGGGCACCAATGTTGCCGATGTAGTTAAAACTACAGGCAGCAATGGACAGGATATTTATACGGTAAATGCCAAAGGTAGCACTGCGTCAGCAGGATCAAGTGCAGTTACGGTAACGGCAGGTACAGCTGATGCCAACAATGTTACGGACTATGCAGTTGATTTGAGTCAAAGTACCAAAGACAGTCTGGTTAAGGCGGATACAGCCTTGCAAAGTGTGGTGACACAGATTGACGGAGTTGATGTTAAGACAGTCAATAAAGATGACAACAAGGTTAACTTTGTGACAGGAGACAATGTTGAGTTAACAGCGAATGCGGATGGCAGTATCACGGTGGGTACGGCAGCAGATGTGACATTCAATACAGTGAACACGACGAACCTGACTGCGACAGGAGAAACCAAGCTAGGCGATAGCTTCACGGTGAATAATGGAGGCAGTTACTATACAGGTCCAATTACCGAAGGCAATCACATTACCAATAAAACCTATGTAGACCAGGCAACAGCAGCATCAAGAACGGAGGTTGCAGCGGGAACGAATGTTGCCGATGTAGTTAAAACTACAGGTAGCAATGGACAGAATATTTATACGGTAAATGCCAAAGGTAGCACTGCGTCAGCAGGATCAAGTGCAATTACGGTAACGGCAGGTAGTCCTGATGCCAACAATGTTACGGACTATGCAGTTGATTTGAGTCAAAGTACCAAAGACAGTCTGGTTAAGGCGGATACAGCCTTGCAAAGTGTGGTGACGCAGATTGACGGAGTTGATGTTAAGACAGTCAATAAAGATGACAACAAGGTTAACTTTGTGACAGGAGACAATGTTGAGTTAACAGCGAATGCGGATGGCAGTATCACGGTGGGTACGGCAGCAGATGTGACATTCAATACAGTGAACACGACGAACCTGACTGCGACAGGAGAAACCAAGCTAGGCGATAGCTTCACGGTGAATAATGGAGGCAGCTACTACACAGGCCCAATTACCGAAGGAAACCATATTACCAATAAAACCTATGTAGACCAGGCAACAGCAGCATCAAGAACGGAGGTTGAACAAGGGAAAAATATTACTGTTACTTCAAGCACTGGTGTGGATGGGCAAAATATATACACTGTTGCTACAGCAGATGAGGTTGATTTTAATAAGGTCACAGTTGGCGATACCACAATTACAACAGACGGTATTGTTATCGCTAATGGCCCAAGCATAACAAAAGATGGTATTAGTGCAGGTGATAAAAAAGTAACTGATGTTGCAGATGGTCTTATTAGTGCAGATTCTAAAGATGCCATCAATGGTAGTCAGCTTTTTGGCTTAGGTAATAACTTGACTCAACTGTTTGGTGGTAATGCTCTTTATACTAATAATCAGATTACTTGGAGCAATATTGGTGGCACTGGGCAAAACACCATCGATGATGCAATTAAACATGTAAATGATCAGGCTGCAAATGCAAACCAAGGTTGGAATGTAAGTACTGATTCAGGTTCGAATGCGACAAGCACAGTAAAACCAGGTCAAACTGTAAATATTAATGGTGACTCTGATAATGGTGTACTAGTAACGAATTCTGGTAATGACATTAAAGTGGGCTTAGCTGATCAAATTAAAATTGGCGCAGGAGATAATGCAGTATCAATTGATGGTAATTCTGGAACAATTCAAGCTGGAGATGTTTTGATTGATGGATCAAAAGGCAATATCTCTGCTGGAAAAGTTACTGTTAATGGTGAGGCTGGTACAGTTAATGGATTAACAAATACGACTTGGAATCCTAACAATATTTTTTCAGGACAAG
CTGCTACTGAAGATCAACTACAACAGGTTGCTCAAAATGCAACTGCGGCTGCGACAGCTGCTAAAACAACAGTATCGGCTGGTGAGAATATTACGGTTTCAAGTAGTAAAAATGCCGATGGTAGTACGAACTATCAAGTGGCAACCAGTAAAGATGTGAAGTTTGATACAGTTACTTCTGGTTCGATTACTACAGATAAAGTTTCTGTAGGAAATATCACAATTGATCAAACAGGTATAAATGCTGGTGCAAGTAAAGTTACAAATGTAGCGGACGGCACAATTAACTCTACATCGAAAGATGCGATTAATGGATCTCAGTTACATGCGAGCAACACCAATATTTATAACTACTTAGGTGGTGGTGCCAATTATGAAACCAATACAGGCCCTACTTATAATGTGGGTGGAGGCTCTTACAATAATGTTGGAGATGCACTGAATTCCTTAGATCAACAAGTCACTAATGTAAGTAATCAATTAGAACAAGCATTCTATACAACGAATAAACGTATCGATGATTTAGAAGATCATGCCAATGGAGGTATTGCTCAAGCAATGGCAACCGCGGGATTACCACAAGCTTACATTCCAGGTAAAAGTATGATGGCAATCAGCGGTGGTACGTACCGAGGTGAATCTGGTTATGCAATAGGTATGTCATCGATTTCAGATAATGGAAAATGGGTCTTTAAAATGTCTGGAAGTGGTAATTCCCGTGGTGATTTTGGAGGAACTGTAGGTGCCGGTATCCAATGGTAA
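The CDS coordinates can be cross-checked against the reported protein length with simple arithmetic. The sketch below assumes the range is 1-based and inclusive and that the CDS ends with a stop codon (the sequence shown starts with ATG and ends with TAA). The helper names are illustrative. `reverse_complement` is included because a minus-strand feature has to be reverse-complemented when it is extracted from the forward contig sequence; the sequence shown above already appears to be in coding orientation.

```python
def cds_length(start: int, end: int) -> int:
    """Nucleotide length of a 1-based, inclusive range."""
    return end - start + 1


def expected_protein_length(nt_length: int, has_stop: bool = True) -> int:
    """Number of amino acids encoded by a CDS of the given length."""
    if nt_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = nt_length // 3
    return codons - 1 if has_stop else codons


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


nt = cds_length(3211, 7971)
print(nt)                           # 4761
print(expected_protein_length(nt))  # 1586
```

The result of 1586 residues matches the protein length listed in the physico-chemical properties.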
Genome Context
Tertiary structure
PDB ID
bb7bcbc0dfb9693e721716fdd7660f607ff1e98b9469bc1cf3e11c523661fe76
Model Confidence
- Very high: pLDDT > 90
- High: 90 > pLDDT > 70
- Low: 70 > pLDDT > 50
- Very low: pLDDT < 50
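The confidence bands can be expressed as a small classifier over per-residue pLDDT scores. This is a sketch only: `plddt_category` is a hypothetical helper, and because the legend uses strict inequalities, the handling of scores exactly equal to 90, 70, or 50 is an assumption (they fall into the lower band here).

```python
def plddt_category(score: float) -> str:
    """Map a pLDDT score (0-100) to the confidence band from the legend."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    # Assumption: exact boundary values (90, 70, 50) go to the lower band.
    return "Very low"


for value in (95.2, 81.0, 63.5, 42.0):
    print(value, plddt_category(value))
```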