Journal Article Synopsis
IBC 2014, vol. 6, article no. 3 | doi: 10.4051/ibc.2014.6.4.0003
Full Report (Bioinformatics/Computational biology/Molecular modeling)
An approach for a substitution matrix based on protein blocks and physiochemical properties of amino acids through PCA
Youngki You1, In Hwan Jang1, Kyungro Lee2, Heon Joo Kim1 and Kwan Hee Lee1,*
1 School of Life Science, Handong Global University, Pohang, 791-708, Republic of Korea
2 Department of Biotechnology, Yonsei University, Seoul, 120-749, Republic of Korea
*Corresponding author
Received: February 25, 2014; accepted: August 29, 2014; published: November 05, 2014
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Synopsis

Amino acid substitution matrices are essential tools for protein sequence analysis, homologous sequence searches in protein databases and multiple sequence alignment. The PAM matrix was the first widely used amino acid substitution matrix, and the BLOSUM series later succeeded it. Most substitution matrices were developed from the statistical frequency of substitution between amino acids in blocks representing protein families or groups of related proteins. However, amino acid substitution depends on the similarity of the physiochemical properties of the amino acids involved. In this study, a new approach was used to identify the major physiochemical properties underlying multiple sequence alignment. The frequencies of amino acid substitution in a multiple sequence alignment database were merged with selected amino acid attributes from a physiochemical properties database. Principal component analysis (PCA) of the merged data yielded four principal components that capture the major physiochemical properties. Using factor analysis, these four components were interpreted as flexibility of electronic movement, polarity, negative charge and structural flexibility. Applying these four components, a new substitution matrix, BAPS, was constructed and validated for accuracy. In a comparison of receiver operating characteristic (ROC50) values, BAPS scored slightly lower than BLOSUM and PAM. However, when accuracy was evaluated by comparing multiple sequence alignment results with the structural alignments of two test data sets of known three-dimensional structure from the homologous structure alignment database, BAPS performed comparably to or better than prior matrices including PAM, Gonnet, and the Identity and Genetic code matrices.
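
The general idea described above (merging substitution frequencies with physiochemical attributes, then reducing the merged table to four principal components) can be illustrated with a minimal sketch. This is not the authors' procedure for building BAPS: the input table here is random placeholder data, the function names `pca_scores` and `similarity_from_scores` are hypothetical, and the step that turns component scores into a similarity matrix (mean distance minus pairwise Euclidean distance) is an assumption made only for illustration.

```python
import numpy as np

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = list("ARNDCQEGHILKMFPSTWYV")


def pca_scores(features, n_components=4):
    """Project each row (amino acid) onto the top principal components.

    `features` is an (amino acids x attributes) table. Columns are
    standardized first so that no single attribute dominates.
    """
    centered = features - features.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    z = centered / std
    # PCA via singular value decomposition of the standardized table.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


def similarity_from_scores(scores):
    """Illustrative (assumed) scoring: closer in component space = higher score."""
    diff = scores[:, None, :] - scores[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist.mean() - dist


# Placeholder input: 20 substitution-frequency columns merged with
# 10 physiochemical-attribute columns. Random values, NOT real data.
rng = np.random.default_rng(0)
substitution_freq = rng.normal(size=(20, 20))
properties = rng.normal(size=(20, 10))
merged = np.hstack([substitution_freq, properties])

scores, explained = pca_scores(merged, n_components=4)
matrix = similarity_from_scores(scores)
print("variance explained by 4 components:", explained.round(3))
print("score for A vs. R:", matrix[0, 1].round(3))
```

With real inputs, the rows of `merged` would hold observed substitution frequencies and database attributes for each amino acid, and the resulting matrix would still need to be validated against reference alignments, as the study does for BAPS.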

Keywords: BAPS, factor analysis, principal component analysis, scoring matrix, sequence alignment
Reviewed by
- Kwang-Hwi Cho
- Sukjoon Yoon
Edited by
- Keun Woo Lee
Interdisciplinary Bio Central (IBC) | ISSN: 2005-8543