Automated clustering of ensembles of alternative models in protein structure databases.

Protein engineering, design & selection : PEDS (2004-08-21)
Francisco S Domingues, Jörg Rahnenführer, Thomas Lengauer
Abstract

Experimentally determined protein structures have been classified in different public databases according to their structural and evolutionary relationships. Frequently, several alternative structural models, determined by X-ray crystallography or NMR spectroscopy, are available for the same protein. These models can differ substantially in structure, yet no classification of such alternative structures is currently available. To classify them, we developed STRuster, an automated method for clustering ensembles of structural models according to their backbone structure. The method is based on the calculation of alpha-carbon (Cα) distance matrices. Two filters are applied when calculating the dissimilarity measure so that both large and small (but significant) backbone conformational changes are identified. The resulting dissimilarity values are used for hierarchical clustering and for partitioning around medoids (PAM). Hierarchical clustering reflects the hierarchy of similarities between all pairs of models, while PAM groups the models into the 'optimal' number of clusters. The method has been applied to cluster the structures within each SCOP species level and can easily be applied to any other set of conformers. The results are available at: http://bioinf.mpi-sb.mpg.de/projects/struster/.
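
The pipeline described in the abstract (Calpha distance matrices, a filtered dissimilarity measure, then PAM) can be illustrated with a minimal sketch. This is not the STRuster implementation: the abstract does not specify the two filters, so the `min_change` and `max_dist` thresholds below are hypothetical stand-ins, and all function names are invented for illustration. The number of clusters `k` is supplied by the caller rather than chosen automatically, and hierarchical clustering is omitted for brevity.

```python
import math
from itertools import combinations


def ca_distance_matrix(coords):
    """Pairwise Calpha-Calpha distances for one model (list of (x, y, z))."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def dissimilarity(dm_a, dm_b, min_change=1.0, max_dist=20.0):
    """Mean absolute difference between two Calpha distance matrices.

    Hypothetical filters (assumptions, not the published ones):
      - residue pairs farther apart than max_dist in both models are ignored;
      - distance changes smaller than min_change are treated as noise.
    """
    n = len(dm_a)
    total, pairs = 0.0, 0
    for i, j in combinations(range(n), 2):
        pairs += 1
        da, db = dm_a[i][j], dm_b[i][j]
        if min(da, db) > max_dist:
            continue
        diff = abs(da - db)
        if diff >= min_change:
            total += diff
    return total / pairs if pairs else 0.0


def pam(d, k):
    """Simple partitioning around medoids on a dissimilarity matrix d."""
    n = len(d)

    def cost(medoids):
        # Sum of each model's dissimilarity to its nearest medoid.
        return sum(min(d[i][m] for m in medoids) for i in range(n))

    # BUILD step: greedily add the medoid that lowers the cost most.
    medoids = []
    for _ in range(k):
        candidates = [c for c in range(n) if c not in medoids]
        medoids.append(min(candidates, key=lambda c: cost(medoids + [c])))

    # SWAP step: exchange a medoid for a non-medoid while the cost decreases.
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                if cost(trial) < cost(medoids):
                    medoids, improved = trial, True

    # Label each model with the index of its nearest medoid.
    labels = [min(medoids, key=lambda m: d[i][m]) for i in range(n)]
    return sorted(medoids), labels


def cluster_models(models, k):
    """Cluster a list of models (each a list of Calpha coordinates)."""
    dms = [ca_distance_matrix(m) for m in models]
    d = [[dissimilarity(a, b) for b in dms] for a in dms]
    return pam(d, k)
```

Calling `cluster_models(models, k)` on an ensemble of equal-length conformers returns the medoid indices and, for each model, the index of the medoid it is assigned to. Because the measure compares internal distances, no structural superposition of the models is needed.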