Biosynthetic Gene Clusters in Organism: The Sole Source of New Drug Discovery

Molecular and Genetic Medicine

ISSN: 1747-0862

Open Access

Review Article - (2021) Volume 15, Issue 6

Chakraborty P1* and Chakraborty A2
*Correspondence: Chakraborty P, Kalpana Chawla Center for Space and Nanoscience, Indian Institute of Chemical Biology (Retd.), Kolkata, India, Email:
1Kalpana Chawla Center for Space and Nanoscience, Indian Institute of Chemical Biology (Retd.), Kolkata, India
2St. Xavier’s College, Kolkata, India

Abstract

Microorganisms such as bacteria and fungi, and higher organisms such as plants, produce both primary and secondary metabolites during metabolism; these are needed for their survival and defense. The secondary metabolites are often used in drug discovery. The genes encoding the enzymes of these secondary metabolic pathways are generally grouped together as biosynthetic gene clusters (BGCs) in the organism’s genome. Genome mining of unexplored microbes, medicinal plants, and the underexplored human microbiota, using next-generation sequencing together with bioinformatics tools such as antiSMASH (antibiotics and Secondary Metabolite Analysis Shell) and plantiSMASH, may help to find many BGCs and subsequently to discover many new drugs.

Keywords

Organism • Genome • Secondary metabolites • Biosynthetic gene clusters • Drugs

Introduction

The genes encoding the enzymes of secondary metabolic pathways are generally organized in clusters within an organism’s genome [1]. These clusters are known as biosynthetic gene clusters (BGCs) and have the potential to produce drugs or drug-like molecules. Gene clusters for secondary metabolites have various roles. They are involved in the synthesis of pharmaceutically important natural products, such as the antitumour alkaloids noscapine from the medicinal plant Papaver somniferum (opium poppy) and vinblastine/vincristine from Catharanthus roseus, as well as tetracycline, protegenins A-D, penicillin, and lanosterol 14α-demethylase (CYP51) inhibitors from microbial gene clusters (Figures 1 and 2) [1-6]. Gene clusters for plant secondary metabolites also produce plant defense compounds, e.g., avenacins (triterpene glycosides) in oat and momilactone and oryzalide diterpenes in rice, as well as anti-nutritional compounds, e.g., the steroidal alkaloids α-solanine and α-tomatine in potato and tomato, among many others [7-9].


Figure 1. Biosynthetic gene clusters for secondary metabolites from various medicinal plants.


Figure 2. Biosynthetic gene-cluster for secondary metabolites from microbes.

Literature Review

Emerging genomics research involving draft genomes, next-generation sequencing technology, and bioinformatics tools such as antiSMASH and plantiSMASH has revolutionized plant and microbial genome sequencing and the identification and characterization of biosynthetic gene clusters [2,10-15]. The antiSMASH, plantiSMASH, and fungiSMASH tools have so far identified widely varying numbers of candidate BGCs across plant and microbial taxa. The underlying principle of these tools is that they first identify biosynthetic genes located in close proximity to each other and then look for the co-occurrence of at least three biosynthetic enzyme-coding genes comprising at least two different enzyme types. The identified clusters are then extended to incorporate any flanking genes, and each cluster is classified based on the presence of core enzymes [15]. Such cluster identification has yielded many remarkable drugs and antibiotics from microbial and plant sources (Figures 1 and 2) [16].
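The cluster-finding rule described above can be illustrated with a minimal sketch. This is not the code of antiSMASH or plantiSMASH; the `Gene` and `Cluster` records, the `max_gap` and `flank` distances, and the `CORE_CLASSES` mapping are hypothetical choices made only to show the logic: group neighbouring biosynthetic genes, require at least three genes of at least two enzyme types, extend to flanking genes, and label the cluster by its core enzymes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Gene:
    name: str
    start: int
    end: int
    enzyme_type: Optional[str] = None  # None = not a biosynthetic enzyme


@dataclass
class Cluster:
    start: int
    end: int
    genes: List[Gene]
    label: str


# Hypothetical mapping from core enzyme type to cluster class (illustration only).
CORE_CLASSES: Dict[str, str] = {
    "terpene_synthase": "terpene",
    "PKS": "polyketide",
    "NRPS": "nonribosomal peptide",
}


def classify(genes: List[Gene]) -> str:
    """Label a cluster by the core enzymes it contains, else 'putative'."""
    classes = sorted(
        {CORE_CLASSES[g.enzyme_type] for g in genes if g.enzyme_type in CORE_CLASSES}
    )
    return "/".join(classes) if classes else "putative"


def find_candidate_clusters(
    genes: List[Gene],
    max_gap: int = 20_000,   # assumed maximum distance between neighbouring genes
    flank: int = 10_000,     # assumed extension on each side of the cluster
    min_genes: int = 3,      # at least three biosynthetic genes (from the text)
    min_types: int = 2,      # at least two different enzyme types (from the text)
) -> List[Cluster]:
    ordered = sorted(genes, key=lambda g: g.start)
    biosynthetic = [g for g in ordered if g.enzyme_type is not None]

    # Step 1: group biosynthetic genes that lie close to each other.
    groups: List[List[Gene]] = []
    group_end = 0
    for gene in biosynthetic:
        if groups and gene.start - group_end <= max_gap:
            groups[-1].append(gene)
            group_end = max(group_end, gene.end)
        else:
            groups.append([gene])
            group_end = gene.end

    clusters: List[Cluster] = []
    for group in groups:
        # Step 2: require enough genes and enough distinct enzyme types.
        if len(group) < min_genes:
            continue
        if len({g.enzyme_type for g in group}) < min_types:
            continue
        # Step 3: extend the region so that flanking genes are included.
        start = max(0, group[0].start - flank)
        end = max(g.end for g in group) + flank
        members = [g for g in ordered if g.end >= start and g.start <= end]
        # Step 4: classify by the core enzymes present.
        clusters.append(Cluster(start, end, members, classify(members)))
    return clusters


if __name__ == "__main__":
    toy_genes = [
        Gene("g1", 1000, 2000, "terpene_synthase"),
        Gene("g2", 3000, 4000),
        Gene("g3", 5000, 6000, "CYP450"),
        Gene("g4", 7000, 8000, "methyltransferase"),
    ]
    for cluster in find_candidate_clusters(toy_genes, max_gap=5000, flank=2000):
        print(cluster.label, [g.name for g in cluster.genes])
```

The thresholds of three genes and two enzyme types come from the description above; the distance parameters are placeholders, since the text does not state what counts as close proximity.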

Biosynthetic gene clusters for secondary metabolites/drugs from medicinal plants

There are huge numbers of medicinal plants around the globe, yet most of them remain unexplored. Medicinal plants are a tremendous source of secondary metabolites, and identifying and characterizing all of them is a challenging problem. Genome sequencing of an organism can reveal many BGCs for secondary metabolites, and this approach is therefore believed to have far greater potential than century-old classic bioactivity screens. Recently, analysis of a draft genome sequence of the medicinal plant Catharanthus roseus showed partial clustering of genes encoding the monoterpenoid indole alkaloid (MIA) biosynthetic pathway that leads to the anticancer drugs vinblastine and vincristine (Figure 1) [2]. Seven small clusters, each containing two to three genes, were found for enzymes responsible for the biosynthesis of vinblastine/vincristine. For another antitumour alkaloid, noscapine, from the medicinal plant Papaver somniferum (opium poppy), a cluster of 10 genes encoding five distinct classes of enzymes, including the enzyme for the first committed step of the noscapine biosynthetic pathway, was found (Figure 1) [1,17]. Besides these, biosynthetic gene clusters for many nutraceuticals, anti-nutritional compounds, and plant defense compounds have been discovered in the last 3-4 years.

Biosynthetic gene clusters for secondary metabolites/drugs from microbes

Microbes, especially bacteria and fungi, are the source of a huge number of secondary metabolites and have yielded many drugs. Targeted genome mining and genome sequencing of microorganisms have recently produced many drugs and drug-like molecules, e.g., protegenins A-D, discovered through an orphan biosynthetic gene cluster in Pseudomonas protegens, and fungal-derived restricticin, an inhibitor of lanosterol demethylase (CYP51) [4,6]. These compounds are significant for the development of antifungal drugs and for their anti-oomycete and plant-protective effects. Genome sequencing of strains of Streptomyces rimosus, known as a producer of the antibiotic tetracycline, revealed 35-71 BGCs per genome, including PKS (polyketide synthase), NRPS (nonribosomal peptide synthetase), and hybrid clusters. This indicates that sequencing multiple strains of the same species may improve the chances of natural product drug discovery; if the front-line antibiotic tetracycline loses effectiveness through the emergence of resistance, these genomic analyses of S. rimosus suggest that this resource may be explored for many novel antibiotics [3]. Genomic analysis of the fungi Aspergillus nidulans and Penicillium chrysogenum, producers of the remarkable antibiotic penicillin, indicates that the penicillin biosynthesis genes pcbAB, pcbC, and penDE are clustered in a single 18-kb region in wild-type strains of these filamentous fungi (Figure 2) [5]. However, the authors argued that penicillin production does not always depend on the copy number of the penicillin biosynthesis gene cluster.

Finally, it should be noted that some important bacterial and fungal sources may be lost to drug discovery research because their biosynthetic gene clusters for secondary metabolites remain cryptic (silent) under normal laboratory culture conditions. Because natural products have been a major source of therapeutic molecules, researchers all over the world are now applying different strategies to activate these silent clusters, which may have a significant impact on drug discovery.

Conclusion

Medicinal plants and microbes are rich sources for natural product-mediated drug discovery. These organisms produce huge numbers of secondary metabolites derived from biosynthetic gene clusters grouped together in their genomes. Genomic research and gene cluster finders have now shown far greater potential than classic bioactivity screening, since they can find many more BGCs and their corresponding secondary metabolites. As diverse plant and microbial genomes around the globe remain largely unexplored, genomic research must continue apace to explore these organisms for new bioactive chemical entities for drug discovery.


Unfortunately, only a few hundred plant genome sequences are available so far; however, the recently announced 10,000 Plant Genomes Project (10KP) might fill this gap. Next-generation sequencing technology has revolutionized genomic research and may be helpful even for recalcitrant plant genomes. In the future, this technology, together with bioinformatics tools such as antiSMASH, will hopefully be able to find many more BGCs and their secondary metabolites from diverse plants and microorganisms found in the environment and in the microbiota.

