Model-based clustering with certainty estimation: Implication for clade assignment of influenza viruses

Journal of Biometrics & Biostatistics

ISSN: 2155-6180

Open Access

5th International Conference on Biometrics & Biostatistics

October 20-21, 2016 Houston, USA

Shunpu Zhang

University of Central Florida, USA

Posters & Accepted Abstracts: J Biom Biostat

Abstract:

Clustering is a common technique used by molecular biologists to group homologous sequences and identify co-expressed genes. Open issues remain, such as how to cluster molecular sequences accurately and, in particular, how to evaluate the certainty of a cluster. We presented a model-based clustering method to analyze molecular sequences, described a subset bootstrap scheme to evaluate the certainty of the clusters, and showed an effective way to examine clusters using 3D visualization. These methods were applied to the clade assignment of influenza viral hemagglutinin (HA) sequences. For the highly pathogenic avian influenza (HPAI) A (H5N1) HA sequences, nine clusters were obtained using the model-based method, in agreement with previous findings. The certainties for sequences assigned to a cluster were all 1.0, and the certainties for clusters were also very high (0.92-1.0), with an overall clustering certainty of 0.95. For influenza A (H7) HA sequences, 10 HA clusters were assigned, and the vast majority of sequences could be assigned to a cluster with a certainty of more than 0.99; the certainties for clusters, however, varied from 0.40 to 0.98. We suspect that this variation in certainty is attributable to differing degrees of homogeneity of the sequence data within each cluster. In both cases, the certainty values estimated using the subset bootstrap method were all higher than those calculated using the standard bootstrap method, suggesting that our bootstrap scheme is more robust for the estimation of clustering certainty.
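The abstract does not spell out the subset bootstrap or the clustering model, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that "subset bootstrap" means refitting the model on random subsets drawn without replacement, that a sequence's certainty is the fraction of refits in which it keeps its reference cluster, and that a cluster's certainty is the mean certainty of its members. A two-component one-dimensional Gaussian mixture stands in for the model, and plain numbers stand in for sequence-derived features; all function names are hypothetical.

```python
import math
import random


def fit_two_gaussians(xs, iters=50):
    """Fit a two-component 1D Gaussian mixture by EM (illustrative model only)."""
    lo, hi = min(xs), max(xs)
    mu = [lo, hi]                        # deterministic start at the extremes
    sd = [((hi - lo) or 1.0) / 4.0] * 2
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w[k] / sd[k] * math.exp(-0.5 * ((x - mu[k]) / sd[k]) ** 2)
                 for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update means, standard deviations, and weights.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-3)
            w[k] = nk / len(xs)
    # Order components by mean so label 0 is always the lower cluster.
    order = sorted(range(2), key=lambda k: mu[k])
    return [mu[k] for k in order], [sd[k] for k in order], [w[k] for k in order]


def assign(x, model):
    """Return the label (0 or 1) of the component with the higher log score."""
    mu, sd, w = model
    scores = [math.log(w[k]) - math.log(sd[k]) - 0.5 * ((x - mu[k]) / sd[k]) ** 2
              for k in range(2)]
    return 0 if scores[0] >= scores[1] else 1


def subset_certainty(xs, n_rounds=100, frac=0.8, seed=0):
    """Assumed scheme: refit on random subsets and count label agreement."""
    rng = random.Random(seed)
    ref_model = fit_two_gaussians(xs)
    reference = [assign(x, ref_model) for x in xs]
    agree = [0] * len(xs)
    drawn = [0] * len(xs)
    size = max(2, int(frac * len(xs)))
    for _ in range(n_rounds):
        idx = rng.sample(range(len(xs)), size)   # subset without replacement
        model = fit_two_gaussians([xs[i] for i in idx])
        for i in idx:
            drawn[i] += 1
            agree[i] += assign(xs[i], model) == reference[i]
    seq_cert = [a / d if d else float("nan") for a, d in zip(agree, drawn)]
    return reference, seq_cert


def cluster_certainty(reference, seq_cert):
    """Assumed definition: mean per-item certainty within each cluster."""
    out = {}
    for label in sorted(set(reference)):
        members = [c for r, c in zip(reference, seq_cert) if r == label]
        out[label] = sum(members) / len(members)
    return out


# Toy data: two well-separated groups of 30 values each (not real sequences).
xs = [0.1 * i for i in range(-15, 15)] + [10 + 0.1 * i for i in range(-15, 15)]
reference, seq_cert = subset_certainty(xs)
cluster_cert = cluster_certainty(reference, seq_cert)
print(cluster_cert)
```

Under these assumptions, tightly grouped data yields certainties at or near 1, while overlapping or heterogeneous groups would pull both per-item and per-cluster values down, which is the kind of variation the abstract reports for the H7 clusters.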
