UniProt accession
Q2LIE5 [UniProt]
Protein name
Collagen triple helix repeat protein
RBP type
TF | Evidence: Phold | Probability: 1.00
TF | Evidence: RBPdetect2 | Probability: 0.90
Protein sequence
MDCFKKGKFIPFPCALPIPEAGPTGPTGPPGSAGGSTGPTGPTGPQGLQGIQGVQGNPGTTGPQGIQGIQGIPGVSGPIGPIGPTGIQGVQGIQGFPGIPGPMGPIGLTGPTGIQGIQGIQGVQGIQGIQGDVGPTGPQGIPGIPGLTGPTGSQGVTGVTGPSGGPPGPTGATGPTGPAGGPPGPTGPTGPAGGPTGLTGPTGPTGPTGIQGIQGVQGTQGIPGPTGPQGIQGVQGLQGIPGIPGSMGPTGLTGPTGLQGIQGIQGNPGPTGPFGPTGPTGLQGIQGLQGIQGIPGPTGPQGIQGPTGPASTLSTKAILFGGTNSGFQRIAGSPGADSQDIPYVLGGAGSVVGLSASISINNLPIGVYTIRVCKNVPINLAAPGPGQVISTIILTTTAVISGTIILTINPSDIGAQPVRVFNPNLVIAPATVAWSSTIPGDIVARGDAMSLFITPGITQNAVYTVFLHTGN
Physico-chemical properties
protein length: 471 AA
molecular weight: 44647.08150 Da
isoelectric point: 7.70512
aromaticity: 0.02972
hydropathy: 0.06051
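The aromaticity and hydropathy values above can be recomputed from the protein sequence. The sketch below assumes, without confirmation from this record, that aromaticity is the fraction of Phe, Trp and Tyr residues and that hydropathy is the GRAVY score (mean Kyte-Doolittle value per residue). The function names are illustrative only.

```python
# Kyte-Doolittle hydropathy scale, one value per standard amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aromaticity(seq: str) -> float:
    """Fraction of residues that are Phe (F), Trp (W) or Tyr (Y)."""
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues (GRAVY)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First 32 residues of the record's sequence, used only as a short demo input.
prefix = "MDCFKKGKFIPFPCALPIPEAGPTGPTGPPGS"
print(len(prefix), round(aromaticity(prefix), 5), round(gravy(prefix), 5))
```

Passing the full 471-residue sequence instead of the prefix gives values that can be compared against the listed ones. Any difference would suggest the database uses another definition or scale.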

Domains

Domains [InterPro]
DC_1341 | STR | 6–86
DC_0620 | STR | 71–146
DC_0620 | STR | 241–467
Architecture (Q2LIE5, residues 1-471)
STR 6-146 | STR 181-467
Legend: ATT, STR, RBD, CBM, LEC, ENZ, CHP, LNK, TAS, TTP, UNK, Unmapped

Taxonomy

Name | Taxonomy ID | Lineage
Phage: Bacillus phage WBeta [NCBI] | 2885908 | Uroviricota > Caudoviricetes > Wbetavirus >
Host: Bacillus anthracis [NCBI] | 1392 | cellular organisms > Bacteria > Bacillati > Bacillota > Bacilli > Bacillales

Coding sequence (CDS)

Genbank protein accession
ABC40439.1 [NCBI]
Genbank nucleotide accession
DQ289555 [NCBI]
CDS location
range 33523 -> 34938
strand +
CDS
ATGGATTGTTTTAAAAAAGGTAAATTTATACCATTTCCATGTGCTTTACCAATTCCTGAAGCTGGTCCTACTGGCCCAACTGGTCCACCTGGATCAGCTGGAGGCTCGACCGGTCCAACTGGTCCAACCGGCCCGCAGGGTTTACAAGGGATTCAAGGGGTTCAAGGGAATCCAGGAACTACTGGACCTCAAGGAATTCAAGGAATTCAAGGAATTCCAGGGGTTTCAGGTCCTATTGGTCCTATTGGTCCTACTGGAATCCAAGGAGTTCAAGGCATTCAAGGATTTCCTGGCATTCCAGGTCCTATGGGCCCGATAGGACTAACCGGTCCGACTGGTATCCAAGGTATTCAAGGGATTCAGGGAGTTCAAGGTATCCAAGGTATTCAAGGGGATGTAGGCCCAACTGGCCCTCAGGGAATTCCGGGTATTCCAGGATTAACTGGCCCAACTGGCTCTCAAGGTGTTACTGGAGTTACTGGCCCATCCGGAGGCCCACCAGGTCCAACTGGTGCAACAGGTCCAACCGGTCCAGCTGGAGGCCCACCAGGTCCAACAGGTCCAACCGGTCCAGCTGGAGGTCCAACAGGATTAACTGGCCCGACTGGCCCGACTGGTCCAACAGGAATTCAAGGTATTCAAGGGGTACAGGGTACTCAAGGTATTCCGGGTCCAACTGGTCCACAAGGGATCCAAGGAGTTCAAGGACTTCAAGGAATACCAGGCATTCCAGGTTCTATGGGCCCAACAGGACTAACTGGTCCGACTGGGCTTCAAGGTATTCAAGGGATTCAGGGGAATCCAGGTCCGACTGGTCCCTTTGGCCCGACTGGCCCGACCGGGCTTCAAGGTATTCAAGGCTTACAGGGTATTCAAGGTATTCCAGGTCCAACAGGACCTCAAGGAATCCAAGGTCCAACAGGACCTGCTAGCACACTTTCCACAAAAGCTATTCTTTTTGGGGGTACTAATTCAGGGTTTCAACGTATAGCTGGATCACCGGGTGCAGATTCACAAGACATTCCTTATGTACTTGGCGGAGCTGGTAGTGTTGTAGGTCTTTCTGCTTCTATAAGTATTAATAATTTACCAATAGGAGTATATACAATACGAGTATGTAAAAATGTTCCTATTAATCTTGCTGCTCCGGGGCCTGGCCAAGTAATATCTACAATTATTCTTACAACTACAGCAGTGATTAGTGGCACTATTATATTGACTATTAATCCTTCTGATATTGGTGCACAACCTGTAAGAGTATTTAACCCTAATTTAGTTATAGCACCTGCTACAGTTGCTTGGAGCAGTACAATACCTGGTGACATAGTTGCAAGAGGTGATGCAATGTCACTTTTTATAACTCCAGGTATTACGCAAAATGCTGTGTATACAGTATTCTTGCATACAGGAAATTAA
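As a consistency check on the fields above, the CDS coordinates should cover the 471-residue protein plus one stop codon, and translating the CDS should reproduce the protein sequence. The sketch below is a minimal illustration, assuming 1-based inclusive coordinates and the standard codon assignments. The helper names `cds_length` and `translate` are hypothetical, not part of any database API.

```python
# Standard genetic code in NCBI order (bases T, C, A, G at each position).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def cds_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive coordinate range."""
    return end - start + 1


def translate(cds: str) -> str:
    """Translate a plus-strand CDS codon by codon, stopping at the first stop."""
    protein = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[pos:pos + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


n_nt = cds_length(33523, 34938)
n_codons = n_nt // 3
print(n_nt, n_codons - 1)  # 1416 471 (472 codons, one of them the stop)

# First 27 nucleotides of the CDS listed above.
print(translate("ATGGATTGTTTTAAAAAAGGTAAATTT"))  # MDCFKKGKF
```

The 1416 nucleotides correspond to 472 codons, that is 471 residues plus the terminal TAA, which agrees with the listed protein length.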

Genome Context

Gene Ontology

GO ID | Description | Category | Evidence (source)
GO:0031012 | extracellular matrix | Cellular Component | IEA:TreeGrafter (UniProt)
GO:0005615 | extracellular space | Cellular Component | IEA:TreeGrafter (UniProt)
GO:0030020 | extracellular matrix structural constituent conferring tensile strength | Molecular Function | IEA:TreeGrafter (UniProt)
GO:0030198 | extracellular matrix organization | Biological Process | IEA:TreeGrafter (UniProt)

Tertiary structure

PDB ID
214a0fd865977e5b7cfb47bcc9065cfe581071dd4dfe7fdb0666a5d98bdcfc1c
Source ESMFold
Method ESMFold
Resolution 0.6091
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
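The confidence bands above can be applied to per-residue pLDDT scores with a small lookup. The sketch below is illustrative only. The legend uses strict inequalities and does not say where scores of exactly 90, 70 or 50 belong, so placing them in the lower band is an assumption, and `plddt_category` is a hypothetical helper name.

```python
def plddt_category(score: float) -> str:
    """Map a pLDDT score (0-100) to the confidence band from the legend.

    Assumption: boundary values (exactly 90, 70, 50) fall into the lower band.
    """
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"


# Example scores chosen for illustration, not taken from the model.
for s in (95.2, 81.0, 63.5, 42.1):
    print(s, plddt_category(s))
```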

Literature

PMID | Source
16585764 | PubMed