Controllability in Multiplex Biological Networks with Insights into Virus-Related Diseases

Alireza Khanteymoori1; Soudeh Behrouzinia1; Fariba Dehghanian2

doi:10.37421/0974-7230.2025.18.561

Research Article - (2025) Volume 18, Issue 1

Alireza Khanteymoori¹^*, Soudeh Behrouzinia¹ and Fariba Dehghanian²

^*Correspondence: Alireza Khanteymoori, Department of Computer Engineering, University of Zanjan, Zanjan, Iran, Email:

Author information

¹Department of Computer Engineering, University of Zanjan, Zanjan, Iran
²Department of Cell and Molecular Biology and Microbiology, University of Isfahan, Iran

Received: 15-Mar-2024, Manuscript No. JCSB-24-130208; Editor assigned: 18-Mar-2024, Pre QC No. JCSB-24-130208 (PQ); Reviewed: 03-Apr-2024, QC No. JCSB-24-130208; Revised: 15-Jan-2025, Manuscript No. JCSB-24-130208 (R); Published: 22-Jan-2025, DOI: 10.37421/0974-7230.2025.18.561
Copyright: © 2025 Khanteymoori A, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

Abstract

Inspired by biological systems and grounded in mathematical and computational models, complex networks have been extensively employed to represent diverse biological phenomena. The significance of network controllability in comprehending intricate biological systems is widely acknowledged, leading to the development of several algorithms aimed at analyzing network controllability. These algorithms manipulate input signals to guide the dynamics of a biological system toward desired states. Recent studies of biological systems have shown that nodes are linked by complicated relationships that are hard to represent in simple networks. In response to these complexities, multiplex networks have emerged as robust constructs capable of capturing multiple relationships simultaneously within high-dimensional spaces. This research introduces a framework designed to regulate the behavior of biological multiplex networks by identifying a minimum set of driver nodes, and its efficacy is evaluated through application to real biological multiplexes. Applied to three virus multiplex networks, the framework underscores the potential for identified driver nodes to serve as targets for drug enrichment or as subjects for investigating intricate diseases. The implications of this research extend to identifying potential driver genes for various virus-related diseases within the landscape of biological multiplex networks.

Keywords

Biological networks • Complex networks • Multiplex networks • Controllability • Virus-related diseases

Introduction

Biological networks provide a conceptual and intuitive framework for understanding biological systems and have been increasingly employed in diagnosing and controlling diseases. These networks have become one of the most important fields of biomedical research Yu, et al., and offer a meaningful approach to understanding various diseases. Controllability is a fundamental idea in modern control theory [1]. It has proven to be a useful tool for guiding the behavior of biological systems from an initial state to a desired final state within a finite time. Studies have demonstrated that manipulating and controlling intercellular networks can provide novel drug targets and new ways to treat diseases Liu, et al., Iglesias and Ingalls [2]. In recent years, exploring network controllability through linear dynamic systems has provided novel insights into modern molecular biology. This knowledge aids in identifying key host elements that regulate cell progression during infection, comprehending disease dynamics, spotlighting proteins as potential drug targets and facilitating further biological investigation. One of the earliest proposed methods for complete control over gene regulatory networks was the computer-based experimental tests by Liu, et al., which required direct control over up to 80% of the nodes and was deemed impractical for medical applications. In subsequent studies investigating control features under different network topologies, Minimum Dominating Sets (MDS) have been proposed to control interaction network dynamics Wu, et al., Khuri and Wuchty [3].

Recent studies on network controllability have demonstrated that the ability to manipulate and control intercellular networks can provide insights for discovering novel drug targets Asgari, et al. One of the most significant studies proposed a novel algorithm for identifying the number of driver nodes needed to achieve target control while minimizing the number of mediator nodes Ebrahimi, et al. Until now, research on controllability methods has emphasized single-layer networks, where a single type of interaction links nodes. However, in reality, complex networks are usually composed of multiple, interconnected layers, known as multiplex networks. These layers interact in various ways, making them a more sophisticated and realistic representation of biological systems [4]. Despite their significance, most controllability methods developed for biomedical systems in recent years have mainly focused on single-layer networks. Therefore, there is a pressing need to understand and control multilayer networks, which poses a fundamental challenge Wang, et al.

Multiplex networks have emerged as promising tools to address the complex interdependencies that are not fully captured by single-layer networks Bianconi [5]. Consequently, controlling multiplex networks has become a critical and formidable issue for various applications, including drug design Menichetti, et al.

While recent advances have been made in understanding the controllability of multiplex networks, controlling such networks remains a fundamental challenge. Yuan et al. developed a comprehensive framework that enables controllability analysis in multiplex networks, covering multiple-relation and multiple-layer networks [6]. A theoretical approach utilizing disjoint path covers has also been proposed to calculate the minimum number of inputs required to fully control multiplex, multi-timescale networks. The correlation strength of interconnections plays a pivotal role in determining the controllability of multilayer networks, as demonstrated by the findings of Wang, et al. [7]. Finally, Zheng, et al., proposed a general framework for identifying minimal driver nodes and steering multilayered nonlinear dynamical systems toward desired states. Their framework is based on the assumption that each node of the multiplex network is either a driver node in every layer or not a driver node in any layer Zhao. However, in real multiplex networks, nodes may exhibit different properties across different layers. Therefore, this assumption may not always hold true. Different layers within a multiplex network could have different sets of driver nodes, which can vary based on the specific attributes or interactions within each layer [8].

We investigated this assumption on the HIV multiplex network, which is also used by Zhao and Zhou, as well as on other datasets. Subsequently, we introduced a framework to identify the driver nodes in multiplex networks.

Multiplex networks refer to multilayer networks where each layer corresponds to a specific type of interaction between nodes and the nodes in each layer are the same and have a one-to-one mapping. In biological contexts, controlling multiplex networks is crucial for various applications and understanding how to control such systems is still a challenging and fundamental issue. Specifically, we analyze multiplex networks characterized by a predetermined set of nodes connected through diverse interaction types. In our study, we introduce a framework to ascertain the minimum number of driver nodes and investigate the influence of the multiplex structure on network controllability within biological systems [9,10]. The effectiveness of this framework is demonstrated by its successful application to different multiplex networks involving interactions between viruses and hosts. Our results indicate that certain nodes we identified might have potential as targets for drugs in biological experiments or play significant roles in important biological processes, as indicated in the literature [11].

Materials and Methods

Controllability of biological networks

Although real-world systems, such as biological networks, are influenced by complex nonlinear processes, their behavior is often described using linear models. This preference for linear models in network controllability research is because of the availability of effective tools and methods for understanding systems with linear dynamics. On the other hand, the controllability of nonlinear systems at their equilibrium shares several traits with that of linear systems. As a result, the concept of structural controllability can be a suitable criterion for assessing the controllability of nonlinear systems [12].

Structural controllability offers a solution to address the challenges posed by incomplete knowledge about the state of real networks and the high computational costs associated with network controllability. The idea of structural controllability was first introduced by Lin, et al. in 1974 Hemminger and Beineke, offering a way to assess the controllability of networked systems. A key advantage of structural controllability is its ability to enable control across virtually all conceivable parameter values [13]. This implies that if a system demonstrates controllability for a specific set of nonzero system parameters, it will also exhibit controllability for all other parameters except those within a set of measure zero Dion et al.; Zhang et al.

Considering the factors outlined earlier, the primary aim of this study is to explore the structural controllability of biological networks using linear dynamic models. In its most basic representation, the behavior of a Linear Time Invariant (LTI) system with N nodes can be captured by equation (1).

dx(t)/dt=Ax(t)+Bu(t) (1)

where x(t)=(x₁(t), . . . , x_N(t))^T corresponds to the state of the N nodes, indicating the network’s status at time t. A is the network’s (N × N) adjacency matrix, capturing how nodes are connected and the strength of their interactions. B is an (N × M) input matrix (M ≤ N) that identifies the nodes controlled by external inputs, while u(t)=(u₁(t), . . . , u_M(t))^T signifies the input signals at time t that guide the system’s behavior. A node directly receiving a control signal is known as a driver node [14]. From now on, the driver nodes of a network will be referred to as N_D. To control a complex network, we need a set of nodes with different input signals. Controllability refers to the capability of directing a linear networked system, described by equation (1), towards a desired outcome using a suitable control signal within a specific time frame. This ability exists if and only if the (N × NM) controllability matrix C=(B, AB, A²B, . . . , A^(N−1)B) has full rank, that is, rank(C)=N. The Kalman rank condition Sontag offers a controllability test for a linear networked system, evaluating the system’s controllability through a designated set of inputs.
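The Kalman rank condition can be checked directly on small examples. The following is a minimal sketch in plain Python, not the authors' implementation. It assumes a convention that the text does not state, namely that a nonzero A[i][j] means node j influences node i, and all function names are illustrative.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def rank(M):
    """Matrix rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    n_cols = len(rows[0]) if rows else 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def controllability_matrix(A, B):
    """Return C = (B, AB, ..., A^(N-1) B) as an N x NM list of rows."""
    n = len(A)
    blocks, block = [], B
    for _ in range(n):
        blocks.append(block)
        block = mat_mul(A, block)
    # Concatenate the blocks side by side, row by row.
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]


def is_controllable(A, B):
    """Kalman rank condition: rank(C) == N."""
    return rank(controllability_matrix(A, B)) == len(A)


# Directed chain 1 -> 2 -> 3 (assumed convention: A[i][j] != 0 means j influences i).
A = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]
B_head = [[1], [0], [0]]  # single input attached to node 1
B_tail = [[0], [0], [1]]  # single input attached to node 3

print(is_controllable(A, B_head))
print(is_controllable(A, B_tail))
```

In this toy chain, driving the head node reaches every state, while driving the tail node does not. Exact rational arithmetic is used only to avoid floating-point rank errors on small examples; for networks of realistic size the structural (matching-based) approach described in the text is the practical route.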

Over the past few decades, substantial research has been directed towards examining the controllability of isolated networks to uncover the underlying mechanisms. Controllability involves the capacity to guide a system towards a particular state by applying a control signal, where the driver node is the node that directly receives this signal. Control theory asserts that a networked system is controllable if each node can be controlled independently, yet this approach is often unfeasible and costly [15]. As a result, researchers have been investigating the problem of identifying the Minimum Set of Driver Nodes (MDNS), aiming to achieve network control with the least possible input signal while ensuring controllability Liu, et al.

The problem of identifying the MDNS is challenging for large networks due to its computational complexity. Nonetheless, certain studies have demonstrated that this issue can be converted into a graph-theoretical problem called maximum matching. This approach has been previously used in various studies Commault, et al., and matching has been widely researched in graph theory with various real-world applications Nepusz and Vicsek. Liu, et al., proved that the maximum matching can be used to calculate the minimum number of driver nodes required within a network. This breakthrough paved the way for progress in network control through matching techniques. Despite significant advancements in single-layer network controllability, it has become increasingly evident that many intricate biological systems possess a multiplex network structure and involve highly complex nonlinear processes. Multiplex networks are discussed in the next section [16].

Multiplex networks

Multiplex networks are commonly employed to represent various interactions among entities in the real world. A multiplex network Bianconi consists of multiple interacting networks organized into layers. These networks can be visualized as a set of individual single-layer networks, sharing identical nodes but featuring distinct edges. A common strategy for modeling multiplex networks involves illustrating various types of interactions among entities, with each type associated with a unique layer. Each layer represents a separate network capturing a specific interaction type [17].

Multiplex networks are represented as G=(V, E, L), where V is the set of nodes, E is the set of undirected edges among the nodes, including intra-layer and inter-layer edges, and L is the set of discrete layers that share the same nodes [18]. Every layer depicts a different interaction type and provides distinct insights into the network’s attributes. Two fundamental attributes of multiplex networks are the one-to-one correspondence of replica nodes across layers and the linking of interactions to their corresponding replica nodes. In our context, the nodes correspond to proteins and the edges represent relationships between them.

In this case, one way to interpret a multiplex network is to make no distinction between the identities of corresponding replica nodes and to omit interlinks between layers. Given |L|=K and |V|=N, the multiplex network can be represented as G=(G₁, G₂, ..., G_α, ..., G_K). Each network G_α=(V_α, E_α) is formed by the same set of nodes, i.e. V_α=V={i|i ∈ {1, 2, ..., N}}, and by the set of edges E_α. The complete information regarding the multiplex network is then represented by K different adjacency matrices, denoted by a^[α], one for each network layer α. The adjacency matrices a^[α] of unweighted, undirected multiplex networks are N × N matrices with elements [19]:

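$$a_{ij}^{[\alpha]} = \begin{cases} 1 & \text{if node } i \text{ is linked to node } j \text{ in layer } \alpha \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

For weighted, undirected multiplex networks, the elements are given by:

$$a_{ij}^{[\alpha]} = \begin{cases} w_{ij}^{[\alpha]} & \text{if node } i \text{ is linked to node } j \text{ in layer } \alpha \text{ with weight } w_{ij}^{[\alpha]} \\ 0 & \text{otherwise} \end{cases} \tag{3}$$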
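As an illustration of equations (2) and (3), the per-layer adjacency matrices can be built from edge lists. This is a minimal sketch under stated assumptions: nodes are indexed from 0 rather than 1, the layer names and edges are invented for the example, and the function name is hypothetical.

```python
def layer_adjacency(n, edges, weighted=False):
    """Symmetric N x N adjacency matrix of one undirected layer.

    edges holds (i, j) pairs, or (i, j, w) triples when weighted=True.
    """
    A = [[0] * n for _ in range(n)]
    for edge in edges:
        i, j = edge[0], edge[1]
        w = edge[2] if weighted else 1
        A[i][j] = w
        A[j][i] = w  # undirected: mirror the entry
    return A


# Toy multiplex with N=3 nodes and K=2 layers (illustrative data only).
unweighted_layers = {
    "alpha": [(0, 1), (1, 2)],
    "beta": [(0, 2)],
}
multiplex = {name: layer_adjacency(3, e) for name, e in unweighted_layers.items()}

weighted_beta = layer_adjacency(3, [(0, 2, 0.5)], weighted=True)

print(multiplex["alpha"])
print(weighted_beta)
```

Keeping one matrix per layer mirrors the representation of the multiplex as K separate adjacency matrices over a shared node set.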

Real-world biological systems often exhibit a multiplex network structure, where nodes are interconnected across various layers through multiple relationships. Managing such systems poses a practical challenge, where the goal is to steer the system dynamics toward the desired outputs by determining the minimum set of driver nodes. To this end, we introduce a Framework for Controllability in Multiplex Biological Networks (FCMBN) based on the approach proposed by Criado, et al. The primary goal of this framework is to investigate the controllability characteristics of complex systems by determining the minimum set of driver nodes needed for control. This approach is a valuable tool for assessing the controllability of real-world multiplex systems, offering significant insights into their dynamics and potential manipulations. Figure 1 shows the step-by-step process of identifying the essential set of driver nodes [20].

The architecture of FCMBN is divided into two main sections. The first section describes two distinct multiplex network modeling approaches: A(I) and A(II). Each approach transforms a multiplex network into a flattened structure tailored to the network’s inherent characteristics. The result of this phase is two distinct networks, which serve as inputs for the subsequent section. The second section focuses on identifying the essential driver node set using the Switchboard Dynamic Control (SBD) method. SBD operates as a simplified representation of the underlying dynamic processes found in various real-world networks Nepusz and Vicsek. The framework process is detailed below.

Network modeling

A(I) 1. Identifying a bipartite network of the multiplex network: In this section, we use bipartite networks to flatten the multiplex network. The initial step of this approach uses the concept of bipartite networks to illustrate the connections between nodes and layers within a multiplex network.

To begin, a bipartite graph is constructed from the multiplex network. In this graph, one side represents the network layers and the other represents the nodes. This bipartite graph portrays the behavior of nodes within the multiplex network, with edges linking the layers and nodes. Any two edges with a common node in the bipartite graph indicate interactions of nodes within a specific layer in the multiplex network (Figure 1).

Figure 1. The introduced framework applied to a multiplex network, in two main sections: multiplex network modeling and driver node identification. Note: A(I)-1, the bipartite graph corresponding to a multiplex network, where α, β and γ are network layers and a, b, c, d, e and f are nodes of the network. A(I)-2, the node-mode projection of the bipartite graph, used as a flattened network. A(II), the supra-adjacency network of the multiplex network, used as another flattened network. Step 1, extracting the line graph L(G) of the A(I) and A(II) flattened networks; the edges of the flattened networks are the nodes of L(G). Step 2, applying the minimum input theorem to L(G) to identify control paths. Step 3, mapping back from L(G) and identifying driver nodes. The two nodes a and e are selected as driver nodes because the control paths start from either a or e.

A(I) 2. Identifying node-mode projections of the bipartite network: Following A(I)-1, we employ a general method called “one-mode projection” to condense the data of bipartite networks and analyze the interconnections among a specific group of nodes (Figure 2).

Figure 2. Algorithm 1 A (I)-1: Bipartite network of the multiplex network.

The method we apply to bipartite networks is referred to as node-mode projection. This process creates a network that includes nodes from just one of the two sets, where two nodes in the resulting network are linked only if they share a neighboring layer in the original graph. For instance, two nodes are connected in the output network if they have a connection in the same layer of the multiplex network. The outcome of this section is a flattened network that can be used as the input network for the subsequent step.
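The two A(I) steps can be sketched on a toy example. This is one possible reading of the description above, not the authors' Algorithms 1 and 2 (which appear only as figures): the bipartite graph records which layers each node participates in, and the projection keeps an undirected edge between two nodes whenever they are connected in at least one layer. Layer names, edges and function names are assumptions made for illustration.

```python
def bipartite_membership(layers):
    """Node-layer bipartite graph: node -> set of layers in which it has an edge."""
    membership = {}
    for name, edges in layers.items():
        for u, v in edges:
            membership.setdefault(u, set()).add(name)
            membership.setdefault(v, set()).add(name)
    return membership


def node_projection(layers):
    """Flattened network: undirected edge -> set of layers that contain it."""
    flat = {}
    for name, edges in layers.items():
        for u, v in edges:
            flat.setdefault(frozenset((u, v)), set()).add(name)
    return flat


# Toy multiplex with three layers (illustrative data only).
layers = {
    "alpha": [("a", "b"), ("b", "c")],
    "beta": [("a", "b"), ("d", "e")],
    "gamma": [("e", "f")],
}

membership = bipartite_membership(layers)
flat = node_projection(layers)

print(sorted(membership["a"]))
print(sorted(flat[frozenset(("a", "b"))]))
print(len(flat))
```

Recording the contributing layers on each flattened edge is a design choice of this sketch; it keeps the provenance of an edge available after flattening, which the text does not require.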

A(II). Identifying the supra-adjacency matrix of the multiplex network: In this section, we use the supra-adjacency matrix to flatten the multiplex network. While adjacency matrices are suitable for representing simple networks, they are inadequate for accurately describing multiplex networks, which demand higher-order matrices for proper depiction. A prevalent strategy is to use the supra-adjacency matrix, which flattens all the adjacency matrices of a multiplex network, interlayer and intralayer, into a single, large matrix.

Figure 3. Algorithm 2 A (I)-2: one-mode projection of bipartite network.

The supra-adjacency matrix portrays the entire multiplex network, encapsulating all interlayer and intralayer adjacency matrices within a single, expansive matrix. Each layer’s adjacency matrix is placed as a distinct block along the main diagonal of the supra-adjacency matrix, while inter-layer connections are encoded in the off-diagonal blocks as diagonal matrices. This segment employs the supra-adjacency matrix for modeling the multiplex network. The resulting matrix is the adjacency matrix of a new, flattened network that can be used as the input network for the subsequent step.
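The block layout described above can be made concrete with a short sketch. It is not the authors' Algorithm 3; it assumes that every pair of layers is coupled and that each replica node is linked to its counterparts with a uniform weight (the `coupling` parameter, a hypothetical name), which the text does not specify.

```python
def supra_adjacency(layer_matrices, coupling=1):
    """Assemble a (K*N) x (K*N) supra-adjacency matrix from K layer matrices.

    Diagonal blocks hold the intralayer adjacency matrices; off-diagonal
    blocks are diagonal matrices linking each node to its replicas.
    """
    k = len(layer_matrices)
    n = len(layer_matrices[0])
    supra = [[0] * (k * n) for _ in range(k * n)]
    for a, A in enumerate(layer_matrices):
        # Intralayer block on the main diagonal.
        for i in range(n):
            for j in range(n):
                supra[a * n + i][a * n + j] = A[i][j]
        # Interlayer blocks: node i in layer a <-> node i in layer b.
        for b in range(k):
            if b != a:
                for i in range(n):
                    supra[a * n + i][b * n + i] = coupling
    return supra


# Two layers over the same two nodes (illustrative data only).
A1 = [[0, 1],
      [1, 0]]
A2 = [[0, 0],
      [0, 0]]
S = supra_adjacency([A1, A2])

for row in S:
    print(row)
```

Setting `coupling=0` recovers a block-diagonal matrix with no interlinks between layers, which corresponds to the interpretation of the multiplex without interlayer edges mentioned earlier.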

Step 1. Identifying the line graph L(G) of the multiplex network: The concept of the line graph was originally presented by Hemminger and Beineke. In graph theory, the line graph of a directed graph G, denoted L(G), is a directed graph that conveys the relationships among the edges of G. In L(G), the nodes represent the edges of the original graph G. The label of each node in L(G) is constructed from the two nodes associated with the corresponding edge in the original graph. Every edge in L(G) represents a directed path of length two that spans three nodes in the original graph G (Figure 4).

Figure 4. Algorithm 3 A(II): Supra-adjacency matrix of multiplex network.

The nodes in L(G) correspond to the edges in G and each node in L(G) refers to an edge in G.
The adjacency between two nodes in L(G) is influenced by whether their corresponding edges in G share an endpoint.

Based on definition 4, each node in L(G) corresponds to an edge of the original graph G. Likewise, every edge in L(G) represents a directed path of length two in G. Therefore, according to the definition of switchboard dynamics, there is an equivalence between the switchboard dynamics in G and the linear time-invariant dynamics on the nodes of L(G). It has been shown that the minimum set of driver nodes necessary for exerting control over SBD within a network can be determined using the line graph and maximum matching. We use the line graph concept to analyze the A(I) and A(II) flattened networks (Figure 5).
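The line graph construction described in Step 1 can be sketched as follows. This is an illustrative implementation rather than the authors' Algorithm 4. It assumes that the edges of G are already directed; how the undirected flattened networks are oriented is not stated in the text, and representing an undirected edge as both (u, v) and (v, u) would be one possible choice.

```python
def line_graph(edges):
    """Directed line graph L(G).

    Each node of L(G) is an edge (u, v) of G; L(G) has an edge
    (u, v) -> (v, w) for every directed path u -> v -> w in G.
    """
    line_nodes = list(dict.fromkeys(edges))  # drop duplicates, keep order
    out_edges = {}
    for u, v in line_nodes:
        out_edges.setdefault(u, []).append((u, v))
    line_edges = [
        ((u, v), nxt)
        for (u, v) in line_nodes
        for nxt in out_edges.get(v, [])
    ]
    return line_nodes, line_edges


# Toy directed graph (illustrative data only).
G_edges = [("a", "b"), ("b", "c"), ("b", "d"), ("e", "a")]
L_nodes, L_edges = line_graph(G_edges)

print(L_nodes)
print(L_edges)
```

Labeling each L(G) node by its edge tuple follows the description that the label is built from the two endpoints of the original edge, and it makes the later mapping back to G a matter of reading the tuples.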

Figure 5. Algorithm 4 step 1: L(G) of network.

Step 2. Applying the maximum matching to L(G) and the minimum input theorem

If a vertex is matched, it serves as the terminal (ending) point of an edge belonging to the matching set. Conversely, an unmatched vertex is not the ending point of any edge within the matching. The Minimum Input Theorem states that when all nodes are matched, the matching is perfect and the number of driver nodes is 1. Conversely, when not all nodes are matched, the number of driver nodes can be calculated as n_D=N−M_N^∗, where M_N^∗ represents the number of nodes that have been successfully matched Zhao and Zhou (Figure 6).

Figure 6. Algorithm 5 step 2: Applying the maximum matching to L(N).

By employing the minimum input theorem, we can convert the controllability challenge of a linear networked system into a Maximum Matching (MM) problem. The objective of the maximum matching problem is to identify the largest set of edges in which no two edges share a common node. In this section, an equivalence is established between the minimum set of driver nodes needed to maintain switchboard dynamics on a network G=(V, E) and the linear time-invariant dynamics on the vertices of L(G). This process uses the maximum matching on L(G), where a matching denotes a group of directed edges without shared starting or ending nodes. During this step, the separate control paths are identified by applying the minimum input theorem to L(G).
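A maximum matching of a directed graph, and the resulting minimum number of inputs, can be sketched with a simple augmenting-path search. This is a generic textbook technique (Kuhn's algorithm on the bipartite out-copy/in-copy form), not the authors' Algorithm 5, and the function names are illustrative. The example graph is the line graph of a toy G with edges a→b, b→c and b→d.

```python
def maximum_matching(nodes, edges):
    """Maximum matching of a directed graph via augmenting paths.

    Returns a dict mapping each matched node (the head of a matched
    edge) to the tail of that edge.
    """
    succ = {v: [] for v in nodes}
    for u, v in edges:
        succ[u].append(v)
    matched_by = {}  # head -> tail

    def try_augment(u, seen):
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            # Take v if it is free, or if its current tail can be re-routed.
            if v not in matched_by or try_augment(matched_by[v], seen):
                matched_by[v] = u
                return True
        return False

    for u in nodes:
        try_augment(u, set())
    return matched_by


def minimum_inputs(nodes, edges):
    """Minimum input theorem: n_D = max(N - |M*|, 1), plus the unmatched nodes."""
    matched_by = maximum_matching(nodes, edges)
    unmatched = [v for v in nodes if v not in matched_by]
    return max(len(nodes) - len(matched_by), 1), unmatched


# L(G) of the toy graph a->b, b->c, b->d (illustrative data only).
L_nodes = [("a", "b"), ("b", "c"), ("b", "d")]
L_edges = [(("a", "b"), ("b", "c")), (("a", "b"), ("b", "d"))]
n_d, unmatched = minimum_inputs(L_nodes, L_edges)

print(n_d, unmatched)
```

The recursive search is adequate for small examples; for networks with thousands of nodes an iterative algorithm such as Hopcroft-Karp would be the more practical choice.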

Step 3. Identifying the driver nodes by mapping back from L(G): In this step, attention shifts to the control paths resulting from the maximum matching in L(G), which are used to identify driver nodes. The output of the previous step is a collection of matching paths. By mapping these paths from L(G) back onto G, we obtain the control paths in G. These control paths are walks that together cover all edges in G, so that each edge in G is traversed by at least one walk. The control paths determine the sequence in which the edges of G are traversed. Using this sequence, the driver nodes in G are identified as the nodes at which the walks originate: every node of G from which a walk starts is designated a driver node.
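The mapping-back step can be sketched as follows, assuming that a matching on L(G) is already available as a dictionary from each matched L(G) node to the L(G) node that precedes it. This is a reading of the description above rather than the authors' code; the function names and the toy input are hypothetical, and cycles in the matching are started at an arbitrary node.

```python
def control_walks(line_nodes, matched_by):
    """Stitch matched L(G) links into paths and map each path to a walk in G.

    line_nodes: nodes of L(G), each an edge (u, v) of G.
    matched_by: dict head -> tail over the matched links of L(G).
    """
    next_of = {tail: head for head, tail in matched_by.items()}
    # Unmatched L(G) nodes start the paths; any node left over lies on a cycle.
    starts = [e for e in line_nodes if e not in matched_by]
    walks, visited = [], set()
    for start in starts + list(line_nodes):
        if start in visited:
            continue
        path, e = [], start
        while e is not None and e not in visited:
            visited.add(e)
            path.append(e)
            e = next_of.get(e)
        # Edge sequence (u, v), (v, w), ... becomes the node walk u, v, w, ...
        walks.append([path[0][0]] + [v for _, v in path])
    return walks


def driver_nodes(walks):
    """Driver nodes are the nodes of G at which the walks start."""
    return sorted({walk[0] for walk in walks})


# Toy G with edges a->b, b->c, e->f; in L(G) only (a,b) -> (b,c) is matched.
line_nodes = [("a", "b"), ("b", "c"), ("e", "f")]
matched_by = {("b", "c"): ("a", "b")}

walks = control_walks(line_nodes, matched_by)
drivers = driver_nodes(walks)

print(walks)
print(drivers)
```

Because every L(G) node belongs to exactly one path or cycle of the matching, the resulting walks cover every edge of G once in this sketch.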

Results and Discussion

To explore the controllability of multiplex networks and examine the roles of proteins within a biological group across different interaction types, we analyzed different biological multiplex networks. Specifically, we focused on the Human-HIV1, Human-Herpes4 and Hepatitis C multiplex networks. Each dataset has an associated file containing genetic and protein interaction data in the different layers. These layers were obtained from diverse genetic interactions in the comprehensive biological repository.

Our results demonstrated the effectiveness of the introduced framework in determining the minimum set of driver nodes within multiplex networks, making it suitable for controlling multiplex-networked systems. The specification of these multiplex networks is shown in Table 1. The outcomes of the driver node identification process for these real networks are presented in Tables 2 to 4.

In network analysis, centrality measures play a crucial role in identifying the most important nodes within a network using real-valued metrics. These measures find application in highlighting key nodes in biological networks, where ranking nodes and layers is a vital aspect of many analyses. Among the commonly utilized centrality measures for analyzing multiplex networks is the MultiRank algorithm. In this paper, we demonstrate the applicability of the MultiRank algorithm in identifying important nodes within biological multiplex network datasets, serving as an informative metric for assigning rankings to both nodes and layers in multiplex networks. The algorithm operates on the assumption that the centrality of a node is influenced by the centrality of the layers it is connected to and, vice versa, that the centrality of a layer depends on the centrality of the nodes within it. The MultiRank algorithm incorporates three parameters Rahmede et al. that determine its behavior.

The parameter “a” can take values of 1 or 0. When a=1, the influence of layers is proportional to their total weight, whereas a=0 assigns an equal influence to all layers.
The parameter “s” can take values of 1 or -1. For s=1, layers with more central nodes have a greater influence, while for s=-1, layers exert a greater influence when they encompass a smaller number of highly influential nodes.
The parameter “γ” is a scaling factor that can either enhance (γ<1) or suppress (γ>1) the influence of the layers. The specific value of γ depends on the chosen value of s.

Here, we describe the use of MultiRank on three significant multiplex network examples: the Human-HIV1 multiplex, the Human-Herpes4 multiplex and the Hepatitis C multiplex. The MultiRank centrality criterion yields results that validate the findings of the framework and demonstrate the significance of the identified driver genes. Furthermore, we conducted analyses of biological processes and signaling pathways associated with the identified driver nodes using the Database for Annotation, Visualization and Integrated Discovery (DAVID; https://david.ncifcrf.gov/) Dennis, et al. and the Protein Analysis Through Evolutionary Relationships (PANTHER; http://www.pantherdb.org/) public database platforms. These tools are designed to provide functional interpretations and signaling pathways for comprehensive gene lists derived from genomic studies. These platforms enabled us to perform functional and pathway analyses of the protein set, validating our outcomes.

Dataset

The availability of protein and genetic interaction datasets is essential for analyzing biological network properties and studying the functions of genes and proteins. A widely utilized and openly accessible database containing gene/protein interactions is BioGRID, accessible at http://www.thebiogrid.org. These datasets hold significant importance in multiplex network research due to their high-quality, well-defined nature. They serve as valuable resources for evaluating new algorithms and measures aimed at extracting pertinent information Stark et al.; de Domenico, et al. BioGRID is an expansive and freely accessible repository, encompassing over 720,000 interactions sourced from more than 41,000 publications. The interactions within the database have been meticulously curated from high-throughput datasets and focused studies concerning genes and proteins.

Networks	Layers	Description
HHMG network	5	Physical association, direct interaction, colocalization, association, suppressive genetic interaction defined by inequality
HSVMG network	4	Direct interaction, Physical association, association, colocalization
HCVMG network	3	Physical association, association, colocalization

Table 1. Description of three multiplex networks: HHMG, HSVMG and HCVMG networks.

Analyzing multiplex network layers as independent networks

As stated in the introduction, the method proposed in Zhao and Zhou relies on the assumption that in multiplex networks, a node is either a driver node in all layers or not a driver node in any layer. However, real multiplex networks often exhibit varying characteristics across layers, making it inappropriate to assume uniformity in node definitions. Therefore, we cannot assume that each node in a multiplex network is consistently a driver node across all layers. To delve deeper into this issue, we conducted separate analyses of each layer within the multiplex network, identifying driver nodes for each layer individually. The results detailing the number of driver nodes identified for each layer are presented in Tables 2-4.

Layers	#nodes	#edges	#driver nodes
L1	1005	869	35+246=281
L2	380	434	7+625=632
L3	34	33	11+971=982
L4	21	18	14+984=998
L5	2	1	1+1003=1004

Table 2. Number of driver nodes per layer in the HHMG multiplex network (#Layers=5, #Nodes=1005, #Edges=1355).

Layers	#nodes	#edges	#driver nodes
L1	42	37	11+174=185
L2	202	209	15+14=29
L3	10	7	4+206=210
L4	9	6	4+207=211

Table 3. Number of driver nodes per layer in the HCVMG multiplex network (#Layers=4, #Nodes=216, #Edges=269).

Layers	#nodes	#edges	#driver nodes
L1	82	88	13+23=36
L2	44	47	5+61=66
L3	3	2	1+102=103

Table 4. Number of driver nodes per layer in the HSVMG multiplex network (#Layers=3, #Nodes=105, #Edges=137).

The workflow in this section analyzes each layer of the network separately and applies the proposed method to identify its driver nodes. Within each layer, some nodes are isolated: they have no connections with other nodes and cannot be reached from them. Each isolated node must therefore be a driver node if the entire network is to be controlled. In Tables 2-4, the driver node count of each layer is written as a sum whose second term is the number of isolated nodes. For instance, in the first layer of the HHMG network, 35 nodes are identified as driver nodes. Adding the 246 isolated nodes in this layer, which also function as driver nodes, brings the total for this layer to 281. Similar results are observed for the other multiplex networks.
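The per-layer counting described above can be sketched as follows. This is a minimal illustration that assumes the standard maximum-matching formulation of structural controllability from Liu et al. (listed in the references); the method proposed in this article may differ in detail. The function name `driver_nodes` and the toy layer are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def driver_nodes(nodes: Iterable[Node], edges: Iterable[Edge]) -> Set[Node]:
    """Return one minimum driver node set of a directed, unweighted layer.

    Assumption (not taken from this article): drivers are the nodes left
    unmatched by a maximum matching, as in Liu et al.
    """
    node_list = list(nodes)
    if not node_list:
        return set()
    successors: Dict[Node, List[Node]] = {n: [] for n in node_list}
    for source, target in edges:
        successors[source].append(target)

    matched_by: Dict[Node, Node] = {}  # target -> source of its matched edge

    def augment(source: Node, seen: Set[Node]) -> bool:
        # Augmenting-path step: find a free target for `source`,
        # re-routing earlier matches when necessary.
        for target in successors[source]:
            if target in seen:
                continue
            seen.add(target)
            if target not in matched_by or augment(matched_by[target], seen):
                matched_by[target] = source
                return True
        return False

    for source in node_list:
        augment(source, set())

    unmatched = {n for n in node_list if n not in matched_by}
    # A perfectly matched layer still needs one input signal.
    return unmatched or {node_list[0]}


# Hypothetical toy layer: a -> b -> c, a -> d, and an isolated node e.
layer_nodes = ["a", "b", "c", "d", "e"]
layer_edges = [("a", "b"), ("b", "c"), ("a", "d")]

drivers = driver_nodes(layer_nodes, layer_edges)
isolated = set(layer_nodes) - {n for edge in layer_edges for n in edge}
# Same "non-isolated + isolated = total" form as the tables above.
print(f"{len(drivers - isolated)}+{len(isolated)}={len(drivers)}")  # prints 2+1=3
```

An isolated node has no incoming edge, so it can never be matched and always ends up in the driver set, which is why the isolated counts are added to the totals. The recursive search is adequate for small layers; for layers with thousands of nodes an iterative matching routine would be a safer choice.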

If this layer-by-layer hypothesis were correct, the union of the driver nodes of all layers would have to be taken as the driver node set of the entire multiplex network. Almost all nodes in the multiplex network would then be driver nodes, which is unacceptable in practice.
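The size of this combined set can be bounded directly from Table 2. As a minimal sketch: the union of several sets is at least as large as the largest of them, so the per-layer totals give a lower bound without needing the node lists themselves. The variable names are illustrative.

```python
# Per-layer driver node totals for the HHMG network, copied from Table 2.
hhmg_total_nodes = 1005
hhmg_layer_drivers = {"L1": 281, "L2": 632, "L3": 982, "L4": 998, "L5": 1004}

# The union of the per-layer driver sets contains each per-layer set,
# so it is at least as large as the largest one.
union_lower_bound = max(hhmg_layer_drivers.values())
fraction = union_lower_bound / hhmg_total_nodes

print(f"at least {union_lower_bound} of {hhmg_total_nodes} nodes ({fraction:.1%})")
# prints: at least 1004 of 1005 nodes (99.9%)
```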

A common objective in controlling a real network is to minimize the number of driver nodes or control inputs. This reduction can enhance efficiency and cost-effectiveness in network management. Streamlining the system by minimizing driver nodes can lead to easier control and fewer failures. Furthermore, fewer control inputs simplify the design and implementation of control strategies, facilitating network maintenance and scalability. Ultimately, minimizing the number of driver nodes contributes to optimizing the performance and operation of large networks.

In a network, it is uncommon for all nodes to be designated as driver nodes. Driver nodes are typically identified as nodes that have a higher level of responsibility or control within the network and they help manage the overall operation of the network. Having all nodes identified as driver nodes may not be practical or efficient, as it may lead to confusion and potential conflicts in network management. It is important to designate driver nodes strategically based on the network architecture.

Real multiplex networks

Human-HIV1 Multiplex Gpi (HHMG) network: The Human Immunodeficiency Virus (HIV) was first introduced to the human population between 1920 and 1940. This virus is one of the most notorious pathogens encountered by humans, causing an infection approximately every 9½ minutes. The impact of HIV/AIDS has been profound, affecting society in both health and economic terms. The human-HIV1 multiplex is a network dataset that portrays various genetic interactions involving HIV-1. This dataset consists of 5 layers, each representing a distinct type of genetic interaction, and 1005 nodes representing proteins. Each layer is directed and unweighted, presenting a comprehensive portrayal of the intricate relationships within the system.

In this study, we applied FCMBN to the HHMG network to find the minimum set of driver nodes, using two distinct modeling approaches for flattening the network. Applying the proposed framework to the HHMG network yielded two driver node sets, one for each modeling approach (A(I) and A(II)). Table 5 presents our analysis results on the 5-layer Human-HIV1 multiplex network.
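Flattening can be pictured as merging the layers into one directed graph. The sketch below shows only the simplest conceivable variant, the union of all layer edge sets over a shared node set. It is an assumption made for illustration and is not claimed to be either A(I) or A(II); the function name `flatten_by_union` and the toy layers are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def flatten_by_union(layers: Dict[str, Set[Edge]]) -> Set[Edge]:
    """Merge directed, unweighted layers into a single edge set.

    An edge that occurs in several layers appears once in the result.
    """
    flat: Set[Edge] = set()
    for layer_edges in layers.values():
        flat |= layer_edges
    return flat


# Hypothetical two-layer multiplex over placeholder nodes u1..u3.
toy_layers = {
    "layer 1": {("u1", "u2"), ("u2", "u3")},
    "layer 2": {("u2", "u3"), ("u3", "u1")},
}

flat_edges = flatten_by_union(toy_layers)
print(len(flat_edges))  # prints 3: the shared edge u2 -> u3 is counted once
```

A union discards the information about which layer an edge came from, which is one reason different flattening choices can lead to different driver node sets.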

	A(I) approach	A(II) approach
nD	57	82
Driver nodes	Vpr, Tat, Env, Gag-pol, TAR, Gag, CD59, THY1, CD63, Nef, CD247, Vpu, Rev, EIF2C2, APOBEC3G, Siglec1, Vif, TCEB2, TCEB1, MDM2, COPB1, ANXA2, SIRT2, SIRT3, SIRT6, HSPA4, VPS4A, NMT2, MR1, PSMA6, PSMC1, CHMP4A, AP1G1, SMUG1, Nedd4, PTPN23, DDB2, HGS, UBE2F, Mapk1, MAPK1, Pparg, MDFIC, NCOA2, SUPT4H1, PPP1CA, CXCR4, VPS18, PCSK1, PCSK5, PCSK6, FURIN, CTSG, Slmb, MON2, BROX, SUPT5H	Vpr, Tat, Env, Gag-pol, Gag, Nef, Vpu, Rev, HCK, Vif, RNF216, MDM2, COPB1, ANXA2, SIRT2, SIRT6, HSPA4, VPS4A, PSMA6, PSMC1, CHMP4A, AP1G1, SMUG1, DDB2, HGS, PAFAH1B1, MDFIC, NCOA2, Slmb, Pparg, CD4, TAR, CD247, BST2, EIF2C2, JUN, APOBEC3G, Siglec1, SIRT1, SIRT3, NMT2, NMT1, PRKDC, Nedd4, BROX, PTPN23, CDK7, FBXW11, HUWE1, UBE2F, Mapk1, MAPK1, VPS18, MON2, CALM1, PCSK1, PCSK5, PCSK6, FURIN, CTSG, CCNT1, CD59, THY1, CD63, SOCS1, MR1, TRIM5, CREBBP, CUL5, PSIP1, TSG101, POLR2A, FYN, CUL2, RBX1, SUPT5H, TERT, SUPT4H1, BTF3, HMGN2, PPP1CA, CXCR4

Table 5. The minimum driver nodes of the HHMG multiplex network (#Layers=5, #Nodes=1005, #Edges=1355).

The driver node sets required to control the Human-HIV1 multiplex network differ with the modeling approach used. Specifically, the A(I) and A(II) approaches identified 57 and 82 driver nodes, respectively, representing roughly 0.06 and 0.08 of the total network nodes. These findings suggest that achieving complete control over the network requires independent control over approximately 6% and 8% of its nodes, respectively. The A(I) modeling approach yields the smaller of the two sets, and in both cases a small group of nodes could potentially control the Human-HIV1 multiplex network.

More importantly, some of the proteins identified as driver nodes have been experimentally confirmed to play crucial roles in vital biological processes, to serve as potential drug targets or to be essential host factors in treating HIV-1. Notably, ENV, GAG-POL, GAG and NEF are among the important proteins identified as driver nodes by both approaches, with ENV being a crucial drug target for treating HIV-1. Table 5 shows the primary difference between the two approaches. A(II) performed relatively better than A(I) on the HHMG network in that it identified more nodes with known relevance to HIV disease. However, it ranks second in terms of the minimum number of driver nodes.

Given the extensive interaction between viral and host proteins, several of the nodes identified as driver nodes are host receptors. Among them, CD4 is an indispensable host receptor on cells that assume a central role in protecting the body and mounting an effective defense against infections. However, HIV exploits CD4 to reproduce and disseminate throughout the body. Comparison with the research conducted by Zheng et al. reveals that most of our identified proteins coincide with their driver nodes and are crucial players in mediating the interactions between hosts and viruses, in important biological processes and as drug targets.

In addition, we compared the driver proteins identified by FCMBN with the outcomes of the MultiRank algorithm for protein ranking. Figure 7 presents the top 16 proteins per MultiRank, evaluated for s=1, -1 and a=0, 1. The analysis reveals that, regardless of the choice of s and a, TAT, REV, ENV and GAG-POL consistently secured the top positions in the ranking. The positions of these proteins remain stable across all cases. Furthermore, we observed that the ranking of nodes appears to be more consistent for the cases with s=1 than for those with s=-1. Notably, essential genes such as ENV, GAG-POL, GAG and NEF, which play a significant role in treating HIV-1, occupy top-ranking positions in the MultiRank ranking.

Figure 7. Ranking of the top 16 proteins in the Human-HIV1 multiplex network (listed from top to bottom in order of decreasing centrality) according to the MultiRank algorithm, shown for the parameter values s=1, -1 and a=0, 1 as a function of γ (0.3).

The biological processes and pathways related to the driver proteins identified through the two approaches were analyzed using the DAVID and Panther tools. This analysis aimed to uncover the biological functions and pathways associated with these driver proteins. The results are presented in Figures 8 and 9, which outline processes that regulate the frequency, rate or extent of viral transcription, the viral life cycle, viral processes, viral budding and immune responses. Notably, A(II) stood out among the two approaches by identifying the largest number of nodes associated with all the examined processes in the Panther and DAVID analyses, as depicted in the figures. For further insight, the outcomes of the signaling pathway analysis are presented in Figures 10 and 11.
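Analyses of this kind typically score a process by asking whether the driver proteins annotated to it are more numerous than chance would predict. As a rough sketch of that idea, the block below computes a plain hypergeometric tail probability. DAVID and Panther apply their own statistics and corrections, so this is not a reproduction of either tool, and the counts in the example call are invented placeholders rather than results from this study.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) for X ~ Hypergeometric(population, annotated, sample).

    population: proteins in the background set
    annotated:  background proteins annotated to the process
    sample:     proteins in the driver set
    hits:       driver proteins annotated to the process
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Placeholder counts, chosen only to show the call; not data from the article.
p = enrichment_p_value(population=1000, annotated=40, sample=80, hits=10)
print(f"p = {p:.3g}")
```

A small p-value indicates that the process is over-represented among the drivers relative to the background. When many processes are tested at once, a multiple-testing correction would normally be applied on top of this value.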

Figure 8. Panther analyses for biological process in the human-HIV1 multiplex network.

Figure 9. DAVID analyses for biological process in the human-HIV1 multiplex network.

Figure 10. Panther analyses for signaling pathway in the human-HIV1 multiplex network.

Figure 11. DAVID analyses for signaling pathway in the human-HIV1 multiplex network.

Human-Herpes4 Multiplex Gpi (HSVMG) network

HSV-1 and HSV-2, which belong to the human Herpesviridae family, are known culprits behind viral infections in most individuals. These viruses are notably contagious and often result in conditions such as cold sores and genital herpes, given their ability to spread through viral shedding by an infected individual. The HSVMG, a multiplex network available in BioGRID, encompasses 216 nodes and 259 regulatory interactions distributed across 4 layers. The network is unweighted and directed.

After testing the introduced framework on various real-world networks, the results demonstrate its capability to identify driver nodes that effectively control the system. With the A(I) and A(II) modeling approaches, the framework identified 14 (∼0.06) and 22 (∼0.10) driver nodes, respectively, where the values in parentheses give the fraction of network nodes that must be driven to control the multiplex network. We evaluated the two approaches by the count of identified driver proteins. By this measure A(I), with 14 driver proteins, outperformed A(II), with 22. However, for this specific network, A(II) performed better than A(I) in identifying valuable nodes related to herpes disease.

The two proposed approaches identified several driver nodes in the HSVMG network, including EBNA-LP, EBNA-3C, LMP-1, EBNA-1 and LMP-2A, which have been shown to play essential roles in various biological processes based on existing literature. Table 6 compares the results obtained from the two approaches and their effectiveness in the HSVMG network. Consistent with experimental evidence, some of the predicted driver proteins, such as EBNA-LP, have been implicated in host-virus interactions.

	A(I) approach	A(II) approach
nD	14	22
Driver nodes	BRLF1, CHEK2, EBNA-1, EBNA-LP, SPI1, EBNA1BP2, HMGB2, LMP-1, SUMO2, BBLF2/BBLF3, SUMO1, BPLF1, BRRF1, ITCH	EBNA-LP, EBNA-3B/EBNA-3C, EBNA1BP2, HMGB2, LMP-1, SUMO2, BBLF2/BBLF3, BPLF1, BRRF1, MYC, BRLF1, CHEK2, EBNA-1, SPI1, LMP-2A, SUMO1, ZNF350, TRIM28, NEDD8, UBC, CDKN2A, ITCH

Table 6. The minimum driver nodes of the HSVMG multiplex network (#Layers=4, #Nodes=216, #Edges=259).
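The relation between the two columns of Table 6 can be checked with plain set operations. The sketch below copies the protein names from the table; the variable names are illustrative.

```python
# Driver nodes of the HSVMG network, copied from Table 6.
a1_drivers = {
    "BRLF1", "CHEK2", "EBNA-1", "EBNA-LP", "SPI1", "EBNA1BP2", "HMGB2",
    "LMP-1", "SUMO2", "BBLF2/BBLF3", "SUMO1", "BPLF1", "BRRF1", "ITCH",
}
a2_drivers = {
    "EBNA-LP", "EBNA-3B/EBNA-3C", "EBNA1BP2", "HMGB2", "LMP-1", "SUMO2",
    "BBLF2/BBLF3", "BPLF1", "BRRF1", "MYC", "BRLF1", "CHEK2", "EBNA-1",
    "SPI1", "LMP-2A", "SUMO1", "ZNF350", "TRIM28", "NEDD8", "UBC",
    "CDKN2A", "ITCH",
}

shared = a1_drivers & a2_drivers
only_a2 = sorted(a2_drivers - a1_drivers)

print(f"A(I): {len(a1_drivers)}, A(II): {len(a2_drivers)}, shared: {len(shared)}")
print("A(I) is a subset of A(II):", a1_drivers <= a2_drivers)
print("Found only by A(II):", ", ".join(only_a2))
```

For this network every A(I) driver also appears under A(II), which adds eight further proteins, among them LMP-2A and EBNA-3B/EBNA-3C.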

To demonstrate the efficacy of the proposed framework in the HSVMG network, we applied the MultiRank algorithm to rank its nodes. Figure 12 presents the results for the top 23 proteins ranked by MultiRank. The MultiRank analysis reaffirms the significance of the identified protein drivers. We observed that the ranking and positions of the identified driver nodes were consistent for each case where s=1 and s=-1, as well as a=1. Furthermore, the EBNA-LP protein kept a stable position in every scenario, for s=1, -1 and a=1, 0.

Figure 12. Ranking of the top 23 proteins in the human-herpes4 multiplex network (listed from top to bottom in order of decreasing centrality) according to the MultiRank algorithm, shown for the parameter values s=1, -1 and a=0, 1 as a function of γ (0.3).

The crucial biological processes and pathways linked to the driver proteins within HSVMG, corresponding to the identified driver nodes, are illustrated in Figures 13-16. These outcomes are derived from the analysis conducted using DAVID and Panther tools. These visual representations offer valuable insights into how driver nodes are distributed across various biological processes and pathways for each approach.

Figure 13. Panther analyses for biological process in the human-herpes4 multiplex network.

Figure 14. DAVID analyses for biological process in the human-herpes4 multiplex network.

Figure 15. Panther analyses for signaling pathway in the human-herpes4 multiplex network.

Figure 16. DAVID analyses for signaling pathway in the human-herpes4 multiplex network.

Hepatitis multiplex Gpi (HCVMG) Network

Hepatitis C, caused by the Hepatitis C Virus (HCV), is a significant infectious disease with a particular affinity for the liver. Over an extended period, it can lead to liver disease, liver failure, liver cancer or the development of enlarged blood vessels in the esophagus and stomach. Investigated as a multiplex network, HCVMG represents distinct types of genetic interactions, comprising 105 nodes and 137 regulatory interactions distributed across 3 layers. Each layer delineates the interactions within a specific context, and the network is characterized by directed and unweighted connections.

We evaluated the introduced framework's performance with the two modeling approaches on the HCVMG network. We analyzed the number of drivers identified and the proportion of validated driver proteins among all the predicted drivers. The findings indicated that A(I) and A(II) required approximately 0.09 and 0.14 of the total number of nodes in the network, respectively, as the minimum set of driver nodes for network control. Table 7 presents the analysis results of the 3-layer hepatitis C virus multiplex network.

	A(I) approach	A(II) approach
nD	10	15
Driver nodes	HCVgp1, HIST2H2BE, PSMA3, CHMP4B, PSMA4, TNFRSF1A, FBXL2, OAS1, ALDH9A1, SMAD2	HCVgp1, YY1, EP300, UBQLN1, PSMA3, CHMP4B, PSMA4, TNFRSF1A, FBXL2, SMAD3, SMURF1, SMURF2, HIST2H2BE, OAS1, ALDH9A1

Table 7. The minimum driver proteins of the HCVMG multiplex network (#Layers=3, #Nodes=105, #Edges=137).
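The driver node fractions quoted for the three networks follow from the node counts and driver counts given in Tables 5-7. A minimal sketch of the arithmetic, with illustrative variable names:

```python
# Node counts and minimum driver node counts, copied from Tables 5-7.
networks = {
    "HHMG": {"nodes": 1005, "A(I)": 57, "A(II)": 82},
    "HSVMG": {"nodes": 216, "A(I)": 14, "A(II)": 22},
    "HCVMG": {"nodes": 105, "A(I)": 10, "A(II)": 15},
}

# Fraction of nodes that must be driven: n_D = N_D / N.
fractions = {
    name: {
        approach: row[approach] / row["nodes"]
        for approach in ("A(I)", "A(II)")
    }
    for name, row in networks.items()
}

for name, row in fractions.items():
    print(f"{name}: A(I) = {row['A(I)']:.3f}, A(II) = {row['A(II)']:.3f}")
```

In all three networks A(I) needs the smaller fraction of nodes, and neither approach requires much more than about one node in seven.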

The proposed framework was applied to the HCVMG network, identifying ten driver nodes in A(I) (HCVgp1, HIST2H2BE, PSMA3, CHMP4B, PSMA4, TNFRSF1A, FBXL2, OAS1, ALDH9A1 and SMAD2) and fifteen driver nodes in A(II) (HCVgp1, YY1, EP300, UBQLN1, PSMA3, CHMP4B, PSMA4, TNFRSF1A, FBXL2, SMAD3, SMURF1, SMURF2, HIST2H2BE, OAS1 and ALDH9A1). Our study also highlights the crucial role of HCVgp1 in Hepatitis C and its involvement in various biological processes. Notably, HCVgp1 was identified as a driver node in both modeling approaches.

The predicted driver nodes in the two modeling approaches include several significant proteins, such as HCVgp1, YY1, OAS1 and EP300 (E1A binding protein p300), which are enriched in multiple biological processes and viral life cycles. For instance, YY1 has been found to inhibit the replication of different viruses, such as hepatitis C and human immunodeficiency viruses. Furthermore, some of the identified driver nodes are known to act as host receptors. PSMA3, for example, interacts with the Hepatitis C Virus (HCV) F protein in host cells. Based on this information, it can be inferred that the A(II) approach outperforms the A(I) approach in identifying more significant driver nodes.

Additionally, Figure 17 illustrates the MultiRank centrality of the hepatitis C multiplex network for various MultiRank parameters. The MultiRank centrality results corroborate the significance of HCVgp1 and YY1 in the context of hepatitis C. Notably, the rankings and positions of the identified driver nodes remain consistent across all cases with s=1 or -1 and a=1 or 0. Figures 18 and 19 present the validated outcomes pertaining to biological processes, while Figures 20 and 21 show the pathway analysis using the Panther and DAVID tools in the HCVMG network. The analyses presented in these figures shed light on the involvement of these proteins in critical biological processes and pathways associated with viral diseases.

Figure 17. Ranking of the top 16 proteins in the hepatitis C multiplex network (listed from top to bottom in order of decreasing centrality) according to the MultiRank algorithm, shown for the parameter values s=1, -1 and a=0, 1 as a function of γ (0.3).

Figure 18. Panther analyses for biological process in the human-hepatitis C multiplex network.

Figure 19. DAVID analyses for biological process in the human-hepatitis C multiplex network.

Figure 20. Panther analyses for signaling pathway in the human-hepatitis C multiplex network.

Figure 21. DAVID analyses for signaling pathway in the human-hepatitis C multiplex network.

Conclusion

The controllability of a networked system is an important feature that plays an essential role in control problems. In the realm of complex systems, the prevalence of interconnected networks is undeniable, and modeling them as multiplex networks offers a more nuanced understanding of biological processes than isolated network models do. This underscores the critical importance of investigating multiplex network controllability.

With this context in mind, the central focus of this study has been to introduce a comprehensive framework for identifying the minimum number of driver nodes within multiplex networks. These driver nodes hold the key to orchestrating transitions within the network, steering it away from unfavorable states and towards more desirable configurations through strategic interventions. The introduced framework, validated against the results of related research, MultiRank centrality results and public database platforms such as DAVID and Panther, has proven to be efficacious. The confirmation of our identified driver nodes by these references is evidence of the framework's ability to uncover highly influential driver nodes.

Applying the introduced framework to three multiplex networks associated with viral diseases and analyzing the resulting driver nodes validates the framework's effectiveness with both modeling approaches. In particular, although the A(II) approach did not reach the minimum driver node count, it excelled in the significance of the nodes it identified compared to the A(I) approach. Furthermore, the observation that the minimum driver nodes identified by the FCMBN framework have potential as drug targets and are vital contributors to essential biological processes strengthens the real-world applicability of our framework. In effect, our work offers an efficient way to explore and analyze multiplex networks and to identify drug targets for steering the transitions of the underlying biological networks and studying complex diseases.

References

Newman, Mark EJ. "The Structure and Function of Complex Networks." SIAM Review 45 (2003): 167-256.
Commault, Christian, Jean-Michel Dion and Jacob W. van der Woude. "Characterization of Generic Properties of Linear Structured Systems for Efficient Computations." Kybernetika 38 (2002): 503-520.
Criado, Regino, Julio Flores, Alejandro Garcia del Amo and Miguel Romance, et al. "Line Graphs for a Multiplex Network." Chaos 26 (2016).
De Domenico, Manlio, Albert Solé-Ribalta, Emanuele Cozzo and Mikko Kivelä, et al. "Mathematical Formulation of Multilayer Networks." Phys Rev X 3 (2013): 041022.
De Domenico, Manlio, Vincenzo Nicosia, Alexandre Arenas and Vito Latora. "Structural Reducibility of Multilayer Networks." Nat Commun 6 (2015): 6864.
Dennis, Glynn, Brad T. Sherman, Douglas A. Hosack and Jun Yang, et al. "DAVID: Database for Annotation, Visualization and Integrated Discovery." Genome Biol 4 (2003): 1-11.
Dion, Jean-Michel, Christian Commault and Jacob van der Woude. "Generic Properties and Control of Linear Structured Systems: A Survey." Automatica 39 (2003): 1125-1144.
Ebrahimi, Ali, Abbas Nowzari-Dalini, Mahdi Jalili and Ali Masoudi-Nejad. "Target Controllability with Minimal Mediators in Complex Biological Networks." Genomics 112 (2020): 4938-4944.
Kanhaiya, Krishna, Eugen Czeizler, Cristian Gratie and Ion Petre. "Controlling Directed Protein Interaction Networks in Cancer." Sci Rep 7 (2017): 10327.
Khuri, Sawsan and Stefan Wuchty. "Essentiality and Centrality in Protein Interaction Networks Revisited." BMC Bioinformatics 16 (2015): 1-8.
Liu, Yang-Yu, Jean-Jacques Slotine and Albert-László Barabási. "Controllability of Complex Networks." Nature 473 (2011): 167-173.
Llabrés, Mercè, Gabriel Riera, Francesc Rosselló and Gabriel Valiente. "Alignment of Biological Networks by Integer Linear Programming: Virus-Host Protein-Protein Interaction Networks." BMC Bioinformatics 21 (2020): 434.
Menichetti, Giulia, Luca Dall'Asta and Ginestra Bianconi. "Control of Multilayer Networks." Sci Rep 6 (2016): 20706.
Mi, Huaiyu, Anushya Muruganujan, John T. Casagrande and Paul D. Thomas. "Large-Scale Gene Function Analysis with the Panther Classification System." Nat Protoc 8 (2013): 1551-1566.
Nedić, Angelia and Alex Olshevsky. "Distributed Optimization Over Time-Varying Directed Graphs." IEEE Trans Autom Control 60 (2014): 601-615.
Nepusz, Tamás and Tamás Vicsek. "Controlling Edge Dynamics in Complex Networks." Nat Phys 8 (2012): 568-573.
Pósfai, Márton, Jianxi Gao, Sean P. Cornelius and Albert-László Barabási, et al. "Controllability of Multiplex, Multi-Time-Scale Networks." Phys Rev E 94 (2016): 032316.
Rahmede, Christoph, Jacopo Iacovacci, Alex Arenas and Ginestra Bianconi. "Centralities of Nodes and Influences of Layers in Large Multiplex Networks." J Complex Netw 6 (2018): 733-752.
Rivailler, Pierre, Young-gyu Cho and Fred Wang. "Complete Genomic Sequence of an Epstein-Barr Virus-Related Herpesvirus Naturally Infecting a New World Primate: A Defining Point in the Evolution of Oncogenic Lymphocryptoviruses." J Virol 76 (2002): 12055-12068.
Stark, Chris, Bobby-Joe Breitkreutz, Teresa Reguly, Lorrie Boucher, Ashton Breitkreutz and Mike Tyers. "BioGRID: A General Repository for Interaction Datasets." Nucleic Acids Res 34 (2006): D535-D539.

Journal of Computer Science & Systems Biology received 2279 citations as per Google Scholar report

Journal of Computer Science & Systems Biology peer review process verified at publons
