TY - JOUR
T1 - The MULTICOM toolbox for protein structure prediction
AU - Cheng, Jianlin
AU - Li, Jilong
AU - Wang, Zheng
AU - Eickholt, Jesse
AU - Deng, Xin
N1 - Funding Information:
The work is partially supported by an NIH grant (5R01GM093123) to JC, an NLM fellowship to JE, and a Shumaker fellowship to ZW.
PY - 2012/4/30
Y1 - 2012/4/30
N2 - Background: As genome sequencing is becoming routine in biomedical research, the total number of protein sequences is increasing exponentially, recently reaching over 108 million. However, only a tiny portion of these proteins (i.e., ~75,000 or < 0.07%) have tertiary structures solved by experimental techniques. The gap between protein sequence and structure continues to widen rapidly, as the throughput of genome sequencing techniques is much higher than that of protein structure determination techniques. Computational software tools for predicting protein structure and structural features from protein sequences are crucial for making use of this vast repository of protein resources. Results: To meet the need, we have developed a comprehensive MULTICOM toolbox consisting of a set of protein structure and structural feature prediction tools. These tools include secondary structure prediction, solvent accessibility prediction, disorder region prediction, domain boundary prediction, contact map prediction, disulfide bond prediction, beta-sheet topology prediction, fold recognition, multiple template combination and alignment, template-based tertiary structure modeling, protein model quality assessment, and mutation stability prediction. Conclusions: These tools have been rigorously tested by many users in the last several years and/or during the last three rounds of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7-9) from 2006 to 2010, achieving state-of-the-art or near state-of-the-art performance. In order to facilitate bioinformatics research and technological development in the field, we have made the MULTICOM toolbox freely available as web services and/or software packages for academic use and scientific research. It is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.
AB - Background: As genome sequencing is becoming routine in biomedical research, the total number of protein sequences is increasing exponentially, recently reaching over 108 million. However, only a tiny portion of these proteins (i.e., ~75,000 or < 0.07%) have tertiary structures solved by experimental techniques. The gap between protein sequence and structure continues to widen rapidly, as the throughput of genome sequencing techniques is much higher than that of protein structure determination techniques. Computational software tools for predicting protein structure and structural features from protein sequences are crucial for making use of this vast repository of protein resources. Results: To meet the need, we have developed a comprehensive MULTICOM toolbox consisting of a set of protein structure and structural feature prediction tools. These tools include secondary structure prediction, solvent accessibility prediction, disorder region prediction, domain boundary prediction, contact map prediction, disulfide bond prediction, beta-sheet topology prediction, fold recognition, multiple template combination and alignment, template-based tertiary structure modeling, protein model quality assessment, and mutation stability prediction. Conclusions: These tools have been rigorously tested by many users in the last several years and/or during the last three rounds of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7-9) from 2006 to 2010, achieving state-of-the-art or near state-of-the-art performance. In order to facilitate bioinformatics research and technological development in the field, we have made the MULTICOM toolbox freely available as web services and/or software packages for academic use and scientific research. It is available at http://sysbio.rnet.missouri.edu/multicom_toolbox/.
KW - Bioinformatics tool
KW - Contact map
KW - Domain
KW - Fold recognition
KW - Protein disorder
KW - Protein model quality assessment
KW - Protein structure prediction
KW - Secondary structure
KW - Solvent accessibility
KW - Tertiary structure
UR - http://www.scopus.com/inward/record.url?scp=84862182129&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-13-65
DO - 10.1186/1471-2105-13-65
M3 - Article
C2 - 22545707
AN - SCOPUS:84862182129
SN - 1471-2105
VL - 13
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
M1 - 65
ER -