
Modularity in complex multilayer networks with multiple aspects: a static perspective

Applied Informatics 2017, 4:7

DOI: 10.1186/s40535-017-0035-4

Received: 17 March 2017

Accepted: 28 April 2017

Published: 8 May 2017


Complex systems are usually illustrated by networks which capture the topology of the interactions between the entities. To better understand the roles played by the entities in the system, one needs to uncover the underlying community structure of the system. In recent years, systems with interactions that have various types or can change over time between the entities have attracted much research attention. However, algorithms aiming at solving the key problem—community detection—in multilayer networks are still limited. In this work, we first introduce the multilayer network model representation. Then based on this model, we naturally derive the multilayer modularity—a widely adopted objective function of community detection in networks—from a static perspective to evaluate the quality of the communities detected in multilayer networks. It enables us to better understand the essence of the modularity by pointing out the specific kind of communities that will lead to a high modularity score. We also propose a spectral method called mSpec for the optimization of the proposed modularity function based on the supra-adjacency representation of the multilayer networks. Experiments on the electroencephalograph network and the comparison results on several empirical multilayer networks demonstrate the feasibility and reliable performance of the proposed method.


Keywords: Modularity, Community detection, Multilayer, Multiple aspects, Hamiltonian, Spectral method


Complex systems are usually illustrated by networks which capture the topology of the interactions between the entities (Strogatz 2001; Newman 2010; Wasserman and Faust 1994; Girvan and Newman 2002; Lambiotte et al. 2014; Zhang et al. 2015, 2016). For systems with more complicated entity interconnections, edges with different attributes, e.g., directed graphs (Newman 2010; Bang-Jensen and Gutin 2008), weighted graphs (Newman 2004; Barrat et al. 2004), signed graphs (Doreian and Mrvar 2009; Yang et al. 2007) and so on, have been thoroughly studied. In recent years, systems with entity interactions that have various types or can change over time have attracted increasing research attention (Verbrugge 1979; Szell et al. 2010; Rocklin and Pinar 2013; Holme and Saramäki 2012). For example, a person who interacts with friends on Facebook and uses email for business will behave differently in the Facebook social network and in the email social network. Such networks are usually interpreted as a combination of different “layers” (or “views,” “edge colors,” “relations,” “slices,” etc., in the literature), and are regarded as multilayer networks. In different contexts, “multigraph,” “multiplex network,” “multirelational network,” “multislice network,” “multilevel network,” “network of networks,” and “temporal network” often refer to a similar network structure (Kivelä et al. 2014). Following the conventional terminology in network science, we refer to networks with such structure as multilayer networks.

Although there is no consensus on its definition, a community usually refers to a group of nodes that are densely connected with each other and sparsely connected with the nodes outside the group. By partitioning a network into communities, we obtain its community structure, which is a coarse-grained representation of the network that helps us analyze the roles played by each node (Fortunato 2010). Despite numerous studies on multilayer networks in recent years, there is still a lack of evaluation metrics for measuring the community structure of a multilayer network, which in turn limits the number of available algorithms for finding the optimal community structure in multilayer networks. Existing evaluation metrics in multilayer networks are mainly derived from “single-layer” cases, where the evaluation metrics are designed to detect modular structures in conventional networks that can be represented simply with nodes and edges, e.g., edge centrality, clustering coefficient, and metrics based on dynamic processes (Battiston et al. 2013; Bródka et al. 2010; De Domenico et al. 2013; Lambiotte and Rosvall 2012; Kivelä et al. 2014; De Domenico and Lancichinetti 2015). In such methods, detection is applied independently on each layer before the final assignment, or on a “collapsed network,” which is a single-layer network generated by aggregating the layers (Peixoto 2015). Such treatment is an intuitive way to find an “average” role played by a node in different layers, but it fails to treat the multiple layers fundamentally as a whole. Mucha et al. (2010) proposed a modularity-based metric for multilayer network community structure derived from a Laplacian dynamic. To the best of our knowledge, they were the first to introduce couplings to multilayer network models, which are links that appear between layers and connect a node with its copy in other layers, to combine the layers and form an interconnected-layer network model. Based on such an interconnected-layer structure, the generalized modularity is able to evaluate the community structure without any compression or loss of the information encoded in the multilayer networks.

In spite of these advances, the generalized modularity still has weaknesses that lead to confusion, especially for temporal networks, where the layers are usually time slices of a specific evolving single-layer network. The derivation of modularity is based on the stability of global communities (Lambiotte et al. 2008), which is measured by comparing the position of a random walker with the stable state. This assumes that the random walker keeps moving as time goes on. However, the layers are interdependent w.r.t. time and the couplings are introduced to describe the continuity of the interaction between nodes along the layers (time slices) (Mucha et al. 2010). This is confusing, since one layer may be the result of the evolution of another layer, yet the random walker is assumed to be able to travel between them. Another important weakness is that, although the generalized modularity is generally similar to its original version in the single-layer case, the definition of a community in multilayer networks becomes harder to understand: what does a community look like in multilayer networks if a random walker can hardly escape from it? The current derivation of multilayer modularity focuses on capturing the dynamic property of a community, namely stopping the random walker from leaving it. In cases where such a random process is defined on the network, the definition of the community is apparent. But in other cases, the definition of a community becomes vague.

The above two issues are inevitably brought by the derivation from a dynamic perspective. To address them, in this paper we derive the generalized modularity from a static perspective, i.e., without defining a dynamic process on the network. As will be shown in “Multilayer modularity from a static perspective,” from this perspective the generalized modularity is represented as the predominant part of a Hamiltonian, which measures the total energy of a system in a variety of cases including community structure in networks (Reichardt and Bornholdt 2006). Thus, the optimization of the proposed metric is equivalent to that of the Hamiltonian, which provides the generalized modularity with an energy explanation. We also demonstrate in “Multilayer modularity from a static perspective” that the generalized modularity simply finds communities with high cohesion, i.e., densely distributed internal effective edges (not the couplings), which is more intuitive to understand and returns to its original definition in the single-layer case (Newman 2006). With such a static derivation, we are able to generalize the modularity to cases with multiple aspects, where the layers belong to different groups (Kivelä et al. 2014) or the layer relation is flexible. We also propose a spectral algorithm called mSpec for optimizing the proposed modularity evaluation metric, which extends the spectral bisection algorithm in the single-layer case (Newman 2006).

We summarize our contribution in this paper briefly as follows:
  • We derive the multilayer modularity from a static perspective to address the confusion in temporal networks and point out which kind of topological structure leads to a high modularity value.

  • We generalize the multilayer modularity to networks with multiple aspects or with flexible constraints on the layer relation.

  • We propose a spectral bisection algorithm (mSpec) for multilayer modularity optimization based on the supra-adjacency representation of the multilayer structure.

  • We apply the proposed metric to electroencephalogram (EEG) networks as a first application.

The rest of this paper is organized as follows. We review the related work in “Background.” The proposed multilayer modularity and the mSpec optimization are described in “Multilayer modularity from a static perspective” and “mSpec: an iterative spectral optimization of multilayer modularity,” respectively. The experimental results are reported in “Experiments.” We conclude this paper in “Conclusion.”


Background

In this section, we briefly introduce the network models that have been explored in the literature and the strategies that have been adopted to detect communities in multilayer networks, including evaluation metrics and optimization.

Network model

During the exploration of multilayer networks, different network models have been proposed (Mucha et al. 2010; Boccaletti et al. 2014; Kivelä et al. 2014). Mucha et al. (2010) linked multiple single-layer networks with couplings, which refer to the edges that connect the nodes with their copies in different layers, to represent a multilayer network. This model allows the layers to communicate through the couplings and is widely adopted, especially by research involving dynamics defined on multilayer networks (De Domenico et al. 2013; Gomez et al. 2013). De Domenico et al. (2013) proposed a multilayer model based on tensor representation, which no longer restricts the between-layer connections to node-copy pairs. In the rest of this paper, we use between-layer edges to refer to connections that link a node with a different node in another layer. On the one hand, the presence of between-layer edges makes the network more flexible. On the other hand, a multilayer network with between-layer edges is structurally very similar to a single-layer network, since neither places limitations on the presence of edges (any node in any layer is allowed to link with any node in another layer). This sometimes blurs the boundary between single-layer and multilayer networks.

In more complex systems, the layers may be divided into several groups, which indicates that multilayer networks should also be distinguished when observed from different aspects (Kivelä et al. 2014). For example, a cell phone contact network can be characterized by different means such as calling and texting. Meanwhile, this network is also temporal, since calls and texts can occur at any time point. Thus, the layers can be grouped in two ways: according to time stamps or according to communication means. In order not to lose the information of the networks from either aspect, we have to construct a more complex multilayer network. There are actually two types of multilayer networks with multiple aspects, which have not been clearly distinguished in the literature. If a layer can belong to more than one aspect, which means the aspects may overlap, we can locate a single layer by indicating all aspects it belongs to. In the rest of this paper, we call this an aspect–aspect representation, as shown in Fig. 1. In such a representation, we need an F-dimensional vector (F being the number of aspects) to locate a layer. For example, the layer at the top right can be located by (2, 1) since the layer is in layer set 2 of aspect (I) and layer set 1 of aspect (II). When there are additional aspects, the dimension of the location vector grows. Therefore, it is challenging to represent this network by a matrix of predetermined size.
Fig. 1

Aspect–aspect representation of the multilayer network model (Kivelä et al. 2014). Aspect (I) has two layer sets and aspect (II) has three layer sets. The within-layer edges are denoted with solid lines and the couplings are denoted with dotted lines where different colors indicate the couplings of different aspects

Fig. 2

Aspect-layer representation of the multilayer network model. The within-layer edges are denoted with solid lines and the couplings are denoted with dotted lines where different colors indicate the couplings of different aspects. The position of a layer in a multilayer network can be specified by determining which aspect it belongs to and its serial number within the aspect. For example, the layer at the bottom right can be located as (2, 3) since it is in aspect (II) and layer 3. Such a representation avoids the problem caused by an increasing number of aspects

In other networks with multiple aspects, each layer belongs to only one aspect. For instance, conventional electroencephalogram (EEG) networks (single-layer networks) for different individuals form a multilayer network, where each layer corresponds to the EEG network of a person. Given that all testees receive the same treatment, the layers reflect the common reactions to the test, but still retain the individual differences between the testees. In addition, a person can take several EEG tests to obtain different EEG networks that all reflect the roles played by different regions of his or her brain in the test. Thus, we have two aspects for observing the EEG networks of the testees, enabling us to analyze individual differences and similarities as well as the role of different brain regions simultaneously. With respect to different individuals, we may have as many aspects as the number of persons that take the EEG test, and the layers within each aspect are the EEG networks obtained from that person’s tests. To locate a layer in such networks, we just need to point out to which aspect the layer belongs and its position within that aspect, as shown in Fig. 2. We refer to this representation as the aspect-layer representation to distinguish it from the aspect–aspect representation.

Actually, for a more convenient implementation, we can convert the aspect–aspect representation to the aspect-layer representation by absorbing aspects hierarchically into one aspect. We can interpret this process by considering how multidimensional arrays are stored on the disk. A 2-dimensional array is represented as an “array of arrays.” The multiple aspects are arranged in a similar way so that we can represent the network using matrices with predetermined size.
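The "array of arrays" analogy can be made concrete with a small sketch. It assumes row-major ordering (the last aspect varies fastest) and 1-based positions as in Fig. 1; the function names and the choice of ordering are illustrative assumptions, not something the paper specifies.

```python
from typing import Sequence, Tuple


def flatten_layer_index(position: Sequence[int], sizes: Sequence[int]) -> int:
    """Map a 1-based aspect-aspect position to a 1-based flat layer number.

    position[k] is the layer-set number within aspect k, and sizes[k] is the
    number of layer sets of aspect k. Row-major order is an assumption here.
    """
    if len(position) != len(sizes):
        raise ValueError("position and sizes must have the same length")
    flat = 0
    for p, n in zip(position, sizes):
        if not 1 <= p <= n:
            raise ValueError(f"index {p} is outside 1..{n}")
        # Same arithmetic used to address a multidimensional array in memory.
        flat = flat * n + (p - 1)
    return flat + 1


def unflatten_layer_index(flat: int, sizes: Sequence[int]) -> Tuple[int, ...]:
    """Inverse mapping: recover the aspect-aspect position from a flat number."""
    total = 1
    for n in sizes:
        total *= n
    if not 1 <= flat <= total:
        raise ValueError(f"flat index {flat} is outside 1..{total}")
    remainder = flat - 1
    position = []
    for n in reversed(sizes):
        position.append(remainder % n + 1)
        remainder //= n
    return tuple(reversed(position))


# Layout of Fig. 1: aspect (I) has two layer sets, aspect (II) has three.
fig1_sizes = (2, 3)
flat_number = flatten_layer_index((2, 1), fig1_sizes)
round_trip = unflatten_layer_index(flat_number, fig1_sizes)
```

With this ordering, every layer receives a single serial number, so block matrices indexed by that number have a size fixed by the product of the aspect sizes.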

Existing evaluation metrics for community detection in multilayer networks

As one of the central issues in network analysis, community detection aims at partitioning the network into groups of closely connected nodes (called communities) to obtain a coarse-grained representation, which helps us better understand the structure of the network. However, as far as we know, most existing evaluation metrics designed for community detection in multilayer networks assume that the layers are independent. The multilayer stochastic block models (SBM), which are generative models that make inferences on the role of nodes given the network structure as evidence (Valles-Catala et al. 2014; Peixoto 2015), usually adopt one of two strategies. They either learn an SBM on each layer, just as in single-layer networks, and then make global assignments based on the results of all layers, or they aggregate the layers to produce a “collapsed network” (Peixoto 2015), in which case the final community assignment of each node is based on the SBM result on the collapsed network.

De Domenico et al. extended the well-known infomap method (Rosvall and Bergstrom 2008) to the multilayer case (De Domenico and Lancichinetti 2015). The infomap method solves the community detection problem by considering its duality with a coding problem. It assumes that communities capture the flows on the network, so that by utilizing the community structure we can greatly compress the code length needed to describe a random process on the network (Rosvall and Bergstrom 2008). The goal is to minimize the “map equation,” which describes the code length based on a specific partition and the transition probability of the random process. De Domenico et al. defined the transition probability of a random walker in multilayer networks so that the map equation is able to describe the flow in multilayer scenarios. Such treatment is intuitively reasonable, although it assumes that a node can reach the neighbors of its copies in other layers in a single step. In fact, this implicitly erases the difference between layers: it is equivalent to considering a collapsed network.

Some other existing evaluation metrics also provide useful solutions to the community detection problem in multilayer networks, such as the multilayer clustering coefficient (the authors consider the overlapping of layers or networks with multiple types of connections) (Bródka et al. 2010; Battiston et al. 2013), multilayer centrality (the authors consider a random walker that jumps between layers through specific node pairs or edges) (De Domenico et al. 2013; Lambiotte and Rosvall 2012), etc. What these methods have in common is that they assume the layers are independent or can be aggregated, and they attempt to find global roles for the nodes. Such treatments can be effective, depending on the network structure, when we wish to find the similarity of the layers. But when we are interested in the different roles of nodes in the layers, these methods may generate poor results, as we will discuss in the experiments. Thus, we recommend adopting an interconnected-layer structure.

Modularity is a widely adopted metric for community detection in single-layer networks (Newman and Girvan 2004; Newman 2010; Clauset et al. 2004; Newman 2006). The original definition of modularity is the edge difference between the current network and a null model, which is a rewired random network with the same degree distribution as the original network. Modularity reflects the cohesion of nodes within a community, so by optimizing global modularity one can find a partition of the network into communities within which the edges are densely distributed (Newman 2006). Recently, Mucha et al. extended the single-layer modularity to the multilayer case using a Laplacian dynamic process defined on the multilayer network (without between-layer edges), which measures the stability of a community by comparing the probability that a random walker stays in the same community at time t with the stationary solution (i.e., \(t \rightarrow \infty\)) (Mucha et al. 2010; Lambiotte et al. 2008).
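To make the single-layer definition concrete, the following is a minimal sketch of the Newman–Girvan modularity for an undirected network stored as a dense NumPy adjacency matrix. The function name, the dense storage, and the optional resolution parameter `gamma` are assumptions made for illustration; the normalization by 2m follows the standard single-layer definition.

```python
import numpy as np


def single_layer_modularity(A, labels, gamma=1.0):
    """Modularity of a partition: observed edges minus null-model edges.

    A      : (N, N) symmetric adjacency matrix of an undirected network.
    labels : length-N sequence of community labels.
    gamma  : resolution parameter (gamma = 1 gives the classical definition).
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                       # node degrees
    two_m = k.sum()                         # twice the number of edges
    null = gamma * np.outer(k, k) / two_m   # expected edges, degree-preserving null model
    same = labels[:, None] == labels[None, :]
    return float(((A - null) * same).sum() / two_m)


# Toy network (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

q_split = single_layer_modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = single_layer_modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one community
```

Splitting the toy network along the bridge scores higher than keeping all nodes together, which is the cohesion behavior described above.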

This generalized modularity is an important contribution because it combines the layers (using the couplings) at the model level for the first time, and it has been adopted in a wide range of areas (Szell et al. 2010; Porter et al. 2011; Chiu and Westveld 2011). Nevertheless, this evaluation metric still has weaknesses. The multilayer modularity is derived from a dynamic process (a random walk), which means the random walker jumps between nodes as time goes on. So what if the network itself is evolving over time? For temporal networks, whose layers can be interpreted as different time slices of an evolving network (i.e., the edges vary over time), things get confusing, because the layers can be seen as different states in the network’s evolution. Moreover, although the within-layer representation is the same as the conventional form proposed by Newman et al., it is not clear what kind of community the multilayer modularity tends to find. It is of vital importance to know the bias of an evaluation metric toward particular communities, so that we can pick appropriate evaluation metrics for the corresponding network structures. Last but not least, the coupling strength strategy needs modification to adapt to more general cases, since the original one was introduced without much discussion.


Optimizing the single-layer modularity is an NP-hard problem (Brandes et al. 2008), so in practice we can only efficiently obtain a good approximation of the optimal solution. Since the single-layer modularity is a component of the multilayer modularity, the optimization of the multilayer modularity is also NP-hard. To the best of our knowledge, there are few algorithms for multilayer modularity optimization other than a generalized Louvain heuristic (Mucha et al. 2010). The Louvain method is a greedy iterative method that hierarchically merges nodes into groups by choosing the move with the best modularity gain in each iteration. The generated node group is then regarded as a new node and another iteration starts. The algorithm terminates when no merger increases the global modularity value. Some tricks, like adding a Kernighan-Lin node swapping step (Kernighan and Lin 1970) after each iteration, give better detection results. The Louvain method is a widely adopted general-purpose heuristic for optimizing quality functions of community structure, which implies that it does not exploit the properties of the evaluation metric. Meanwhile, the community assignments of nodes are not guaranteed to converge to a good approximation, so we may need to run the algorithm several times to obtain a more reasonable solution. As will be discussed in “Experiments,” we cannot control the community scale detected by the Louvain method. For EEG networks, the Louvain method provides a relatively fine-grained detection result, whereas we expect to find two communities: the regions that are active and those that are inactive.

To tackle the above issues, we adopt the aspect-layer representation for describing network structure, which is intuitive to implement, and derive the multilayer modularity from a static perspective (not involving a dynamic process). We also discuss extensions of the evaluation metric that make it applicable to different types of multilayer networks, such as unbalanced multilayer networks, temporal networks, or signed networks. We propose a spectral method for optimizing the multilayer modularity which provides a stable solution and is helpful when we care about the scale of the discovered communities.

Multilayer modularity from a static perspective

We start with several general requirements that a quality function should satisfy as introduced in (Reichardt and Bornholdt 2006): (1) rewarding existing edges within a community, (2) penalizing non-existing edges within a community, (3) penalizing existing edges between two communities, and (4) rewarding non-existing edges between two communities. Thus, a general quality function takes the form
$$\begin{aligned} \mathcal {H}(g)&= -\sum _{i\ne {j}}a_{ij}\underbrace{A_{ij}\delta (g_i, g_j)}_{\text {Internal existing edges}}+ \sum _{i\ne {j}}b_{ij}\underbrace{(1-A_{ij})\delta (g_i, g_j)}_{\text {Internal non-existing edges}}\nonumber \\&\quad +\sum _{i\ne {j}}c_{ij}\underbrace{A_{ij}\big [1-\delta (g_i, g_j)\big ]}_{\text {External existing edges}}-\sum _{i\ne {j}}d_{ij}\underbrace{(1-A_{ij})\big [1-\delta (g_i, g_j)\big ]}_{\text {External non-existing edges}}, \end{aligned}$$
where \(A_{ij}\) is the edge strength between nodes i and j, \(g_i\) is the label of the community that node i belongs to, and a, b, c, d are free parameters. The delta function \(\delta (x,y)\) equals 1 if \(x=y\), and 0 otherwise. Thus, the factor \(\delta (g_i, g_j)\) restricts a sum to pairs of nodes in the same community, while \(1-\delta (g_i, g_j)\) restricts it to pairs in different communities. In multilayer networks, since there are three kinds of edges (within-layer edges, couplings, and between-layer edges), we need to expand this function to cover the additional edge types. Between-layer edges are ignored in this paper, since they blur the boundary between the multilayer model and a single-layer network (i.e., neither places restrictions on where edges may appear), but similar techniques can be used to include them in this model. The expanded quality function can be written as
$$\begin{aligned} \mathcal {H}_{M}(g)&= -\sum _{i\ne {j}}\sum _{v=1}^{F}\sum _{s=1}^{V_v}\underbrace{a_{ijs}^{\{v\}}A_{ijs}^{\{v\}}\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)}_{\text {Within-layer internal existing edges}}\nonumber \\&\quad +\sum _{i\ne {j}}\sum _{v=1}^{F}\sum _{s=1}^{V_v}\underbrace{b_{ijs}^{\{v\}} \left(1-A_{ijs}^{\{v\}}\right)\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)}_{\text {Within-layer internal non-existing links}} \nonumber \\&\quad +\sum _{i\ne {j}}\sum _{v=1}^{F}\sum _{s=1}^{V_v}\underbrace{c_{ijs}^{\{v\}}A_{ijs}^{\{v\}} \left[1-\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)\right]}_{\text {Within-layer external existing links}} \nonumber \\&\quad -\sum _{i\ne {j}}\sum _{v=1}^{F}\sum _{s=1}^{V_v}\underbrace{d_{ijs}^{\{v\}} \left(1-A_{ijs}^{\{v\}} \right) \left[1-\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)\right]}_{\text {Within-layer external non-existing links}} \nonumber \\&\quad -\sum _{sv\ne {rw}}\sum _{i=1}^{N}\underbrace{e_{isr}^{\{vw\}}C_{isr}^{\{vw\}}\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)}_{\text {Between-layer internal existing couplings}} \nonumber \\&\quad +\sum _{sv\ne {rw}}\sum _{i=1}^{N}\underbrace{f_{isr}^{\{vw\}} \left(1-C_{isr}^{\{vw\}}\right) \delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)}_{\text {Between-layer internal non-existing couplings}} \nonumber \\&\quad +\sum _{sv\ne {rw}}\sum _{i=1}^{N}\underbrace{g_{isr}^{\{vw\}}C_{isr}^{\{vw\}} \left[1-\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)\right]}_{\text {Between-layer external existing couplings}}\nonumber \\&\quad -\sum _{sv\ne {rw}}\sum _{i=1}^{N}\underbrace{h_{isr}^{\{vw\}} \left(1-C_{isr}^{\{vw\}}\right) \left[1-\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)\right]}_{\text {Between-layer external non-existing couplings}}, \end{aligned}$$
where we use s and r to denote layers, and v and w to denote aspects. N and \(V_v\) represent the total number of nodes within a layer and the total number of layers of aspect v, respectively, F is the number of aspects, and \(\mathbf {A}\), \(\mathbf {C}\) and \(\mathbf {g}\) denote the within-layer adjacency, the between-layer coupling adjacency, and the community label matrix, respectively. Note that compared to Eq. (1), the number of parameters has doubled after taking between-layer couplings into account. Equation (1) gives the general form of an objective function for community detection, and can be used to derive the Hamiltonian of a Potts model in statistical mechanics as well as the modularity (Wu 1982; Reichardt and Bornholdt 2006), while Eq. (2) specifies the form that an objective function in the multilayer case should take. Since the parameters of Eq. (2) control the penalty (reward) and are free to choose, we can take \(a_{ijs}^{\{v\}}=c_{ijs}^{\{v\}} = 1-b_{ijs}^{\{v\}} = 1-d_{ijs}^{\{v\}} = 1-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\) and \(e_{isr}^{\{vw\}} = f_{isr}^{\{vw\}} = g_{isr}^{\{vw\}} = h_{isr}^{\{vw\}}\) to obtain a representation similar to the multilayer modularity (Mucha et al. 2010), where the penalty factor \(p_{ijs}^{\{v\}}\) is known as the null model, and the resolution parameter \(\gamma _s^{\{v\}}\) balances the contributions of penalty and reward. Thus, we obtain a Hamiltonian function
$$\begin{aligned} \mathcal {H}_M(g)&= -\sum _{ijsv}\left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\right) \left[2\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)-1\right] \nonumber \\&\quad -\sum _{isrvw}e_{isr}^{\{vw\}}\left[2\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)-1\right]\left(2C_{isr}^{\{vw\}}-1\right). \end{aligned}$$
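As a brief check of the step from Eq. (2) to Eq. (3), consider a single within-layer node pair and a single coupling, dropping all indices (so \(A\), \(C\), \(\delta\), \(p\), \(\gamma\) and \(e\) below are shorthand introduced here for the indexed quantities). Substituting the parameter choice stated above gives
$$\begin{aligned}&-(1-\gamma p)A\delta +\gamma p(1-A)\delta +(1-\gamma p)A(1-\delta )-\gamma p(1-A)(1-\delta ) \nonumber \\&\quad =(A-\gamma p)(1-\delta )-(A-\gamma p)\delta =-(A-\gamma p)(2\delta -1), \nonumber \\&-eC\delta +e(1-C)\delta +eC(1-\delta )-e(1-C)(1-\delta ) \nonumber \\&\quad =e(2C-1)(1-\delta )-e(2C-1)\delta =-e(2C-1)(2\delta -1), \nonumber \end{aligned}$$
where the first equality uses \((1-\gamma p)A-\gamma p(1-A)=A-\gamma p\). Summing these per-pair expressions over all node pairs, layers, and aspects reproduces the two sums in Eq. (3).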
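In Eq. (3), we notice that the terms that do not contain \(\delta\) are constant during the optimization (with \(\sum _{ij}A_{ijs}^{\{v\}}=\sum _{ij}p_{ijs}^{\{v\}}=2m_{s}^{\{v\}}\), where \(m_{s}^{\{v\}}\) is the number of edges in layer s of aspect v), so we can rewrite \(\mathcal {H}_M(g)\) as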
$$\begin{aligned} \mathcal {H}_M(g)&= -2\sum _{ijsv} \left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\right)\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right) \nonumber \\&\quad -2\sum _{isrvw}e_{isr}^{\{vw\}}\left(2C_{isr}^{\{vw\}}-1 \right)\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}} \right) \nonumber \\&\quad +\sum _{sv} \left(1-\gamma _{s}^{\{v\}} \right)\cdot {2m_{s}^{\{v\}}} + \sum _{isrvw} \left(2C_{isr}^{\{vw\}}-1 \right)e_{isr}^{\{vw\}}. \end{aligned}$$
By using \(\tilde{C}_{isr}^{\{vw\}}=e_{isr}^{\{vw\}}\left(2C_{isr}^{\{vw\}}-1\right)\), we can get the standard Hamiltonian form for system with many particles
$$\begin{aligned} -\frac{1}{2}\mathcal {H}_{M}(g)&= \sum _{isjrvw}\bigg [\left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\right)\delta _{sr}\delta _{vw}+\tilde{C}_{isr}^{\{vw\}}\delta _{ij}\bigg ]\delta \left(g_{is}^{\{v\}}, g_{jr}^{\{w\}}\right)\nonumber \\&\quad -\frac{1}{2}\bigg \{\sum _{sv}\left(1-\gamma _{s}^{\{v\}}\right)\cdot {2m_{s}^{\{v\}}}+ \sum _{isrvw}\tilde{C}_{isr}^{\{vw\}}\bigg \}, \end{aligned}$$
where the first term is proportional to Mucha’s modularity (Mucha et al. 2010), except that the value of \(\tilde{C}_{isr}^{\{vw\}}\) differs from that of \(C_{jsr}\) in Mucha et al. (2010), as discussed in “Selection of \(\tilde{C}_{isr}^{\{vw\}}\).” The last two terms can be interpreted as a bias that is linear in the network size (number of edges and couplings) and constant during the minimization. Therefore, minimizing the Hamiltonian is equivalent to maximizing the modularity. Finally, we obtain the modularity representation
$$\begin{aligned} Q_M(g)&= \sum _{isjrvw} \left[ \left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\right)\delta _{sr}\delta _{vw}+\tilde{C}_{isr}^{\{vw\}}\delta _{ij}\right]\delta \left(g_{is}^{\{v\}}, g_{jr}^{\{w\}}\right). \end{aligned}$$
Here \(p_{ijs}^{\{v\}}\) is the within-layer edge strength of the null model. We can take different null models for different network types such as directed networks, bipartite networks, etc. (Mucha et al. 2010; Bazzi et al. 2014). Traditionally, in an undirected network, we take the Newman–Girvan null model \(\frac{k_{is}^{\{v\}}k_{js}^{\{v\}}}{2m_{s}^{\{v\}}}\) (i.e., a random network that preserves the expected node degrees), where \(k_{is}^{\{v\}}\) is the degree of node i in layer s of aspect v and \(m_{s}^{\{v\}}\) is the number of edges in that layer, so
$$\begin{aligned} Q_M(g)&= \sum _{isjrvw} \left[ \left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}\frac{k_{is}^{\{v\}}k_{js}^{\{v\}}}{2m_{s}^{\{v\}}}\right)\delta _{sr}\delta _{vw} +\tilde{C}_{isr}^{\{vw\}}\delta _{ij}\right] \nonumber \\& \quad \times \delta \left(g_{is}^{\{v\}}, g_{jr}^{\{w\}}\right). \end{aligned}$$
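A minimal sketch of how Eq. (7) could be evaluated for a single aspect is given below. It assembles one block matrix whose diagonal blocks hold the within-layer term and whose off-diagonal blocks hold the coupling term, then sums the entries whose two node-layer copies share a label. The function names, the dense NumPy storage, a single scalar `gamma` for all layers, and a node-independent coupling value per layer pair are simplifying assumptions for illustration only.

```python
import numpy as np


def supra_modularity_matrix(layers, coupling, gamma=1.0):
    """Block matrix B whose entries are the bracketed terms of Eq. (7).

    layers   : list of S symmetric (N, N) adjacency matrices, one per layer,
               each assumed to contain at least one edge.
    coupling : (S, S) array; coupling[s, r] is the C-tilde value linking every
               node in layer s to its copy in layer r (diagonal is ignored).
    gamma    : resolution parameter, shared by all layers in this sketch.
    """
    S = len(layers)
    N = np.asarray(layers[0]).shape[0]
    coupling = np.asarray(coupling, dtype=float)
    B = np.zeros((S * N, S * N))
    for s, A in enumerate(layers):
        A = np.asarray(A, dtype=float)
        k = A.sum(axis=1)
        two_m = k.sum()
        # Within-layer block: adjacency minus the Newman-Girvan null model.
        B[s * N:(s + 1) * N, s * N:(s + 1) * N] = A - gamma * np.outer(k, k) / two_m
    for s in range(S):
        for r in range(S):
            if s != r:
                # Coupling block: only a node and its own copy are linked.
                B[s * N:(s + 1) * N, r * N:(r + 1) * N] = coupling[s, r] * np.eye(N)
    return B


def multilayer_modularity(B, labels):
    """Sum the entries of B over pairs of node-layer copies with equal labels."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    return float(B[same].sum())


# Toy example (an assumption): two identical layers, each two triangles plus a bridge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
omega = 1.0
coupling = omega * (np.ones((2, 2)) - np.eye(2))
B = supra_modularity_matrix([A, A], coupling)

# Labels are listed layer by layer: first the 6 nodes of layer 1, then layer 2.
q_consistent = multilayer_modularity(B, [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1])
q_separate = multilayer_modularity(B, [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3])
```

In this toy case, keeping each node in the same community across both layers collects the positive coupling terms, whereas giving each layer its own labels leaves only the within-layer contribution.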
Now we can take a closer look at the choice of the parameters in Eq. (2) and we take
$$\begin{aligned} {\left\{ \begin{array}{ll} a_{ijs}^{\{v\}} &{}=c_{ijs}^{\{v\}} = 1-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\\ b_{ijs}^{\{v\}} &{}= d_{ijs}^{\{v\}} = \gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\\ e_{isr}^{\{vw\}} &{}= f_{isr}^{\{vw\}} = g_{isr}^{\{vw\}} = h_{isr}^{\{vw\}}, \end{array}\right. } \end{aligned}$$
which groups the edges into two types and assigns them different penalties (rewards). The values of the parameters are actually the effective number of each type of edge (the edge difference between the current network structure and the null model). In other words, a positive modularity is obtained if the edges and couplings within the communities outnumber those between different communities. A higher modularity is reached when the edges are more densely distributed within the communities.
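The effect of this parameter choice can be illustrated numerically for one layer. The sketch below evaluates the four reward and penalty terms of Eq. (1) with a = c = 1 − γp and b = d = γp, using the Newman–Girvan null model for p. It assumes a binary (0/1) undirected adjacency matrix and a single layer without couplings; the function name and the toy network are hypothetical.

```python
import numpy as np


def hamiltonian_single_layer(A, labels, gamma=1.0):
    """Four-term quality function of Eq. (1) with the parameters of Eq. (8).

    A      : (N, N) symmetric 0/1 adjacency matrix (binary edges assumed).
    labels : length-N sequence of community labels.
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)
    p = np.outer(k, k) / k.sum()          # Newman-Girvan null model
    a = c = 1.0 - gamma * p               # weights on existing edges
    b = d = gamma * p                     # weights on non-existing edges
    same = (labels[:, None] == labels[None, :]).astype(float)
    pairs = 1.0 - np.eye(A.shape[0])      # keep only pairs with i != j
    terms = (
        -a * A * same                     # reward internal existing edges
        + b * (1.0 - A) * same            # penalize internal non-existing edges
        + c * A * (1.0 - same)            # penalize external existing edges
        - d * (1.0 - A) * (1.0 - same)    # reward external non-existing edges
    )
    return float((terms * pairs).sum())


# Toy network (an assumption): two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

h_split = hamiltonian_single_layer(A, [0, 0, 0, 1, 1, 1])
h_single = hamiltonian_single_layer(A, [0, 0, 0, 0, 0, 0])
```

On this toy network the partition that follows the two triangles has the lower energy, in line with the statement that densely connected communities are favored.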

Selection of \(\tilde{C}_{isr}^{\{vw\}}\)

Mucha et al. (2010) proposed a multilayer modularity based on a Laplacian dynamic defined on the multilayer network model
$$\begin{aligned} Q' = \frac{1}{2\mu '}\sum _{ijsr}\left[ \left( A_{ijs}-\gamma _s\frac{k_{is}k_{js}}{2m_s}\right) \delta _{sr}+C_{jsr}\delta _{ij}\right] \delta (g_{is}, g_{jr}). \end{aligned}$$
Although its structure is similar to the proposed form in Eq. (7) (taking \(v = w\) gives a single-aspect representation of the modularity in this paper), Mucha et al. did not discuss the coupling strength \(C_{jsr}\) in much detail. They let \(C_{jsr}\) take a binary value in \(\{0, \omega \}\) to represent the absence or presence of a coupling, where \(\omega\) controls the contribution of the couplings. In the proposed form, \(\tilde{C}_{isr}^{\{vw\}} = e_{isr}^{\{vw\}}\cdot (2C_{isr}^{\{vw\}}-1)\), so if we take \(e_{isr}^{\{vw\}} = \omega\), \(\tilde{C}_{isr}^{\{vw\}}\) takes a value in \(\{-\omega , \omega \}\), representing the absence or presence of a coupling in a specific community. Compared with Mucha’s modularity, the proposed form penalizes couplings that are absent, so absent couplings also provide information about the community structure. Additionally, since \(e_{isr}^{\{vw\}}\) is a free parameter, the proposed form of multilayer modularity can be adapted flexibly to various types of multilayer networks. The following typical network types serve as examples.

Unevenly distributed views

Consider a common type of multilayer network whose distribution of layers is uneven, i.e., the intervals between pairs of layers can be unequal. In this situation, simply letting \(\tilde{C}_{isr}^{\{vw\}}\) take the same value without considering the closeness of layers will cause large errors. For instance, suppose we have a multilayer electroencephalogram network in which each person is treated as a layer. Apparently, the age difference and gender of the patients will greatly influence the result (Sharma et al. 2015; Repovs et al. 2011). Therefore, we should enable the proposed model to handle such networks with unevenly distributed layers. Noticing \(e_{isr}^{\{vw\}}\) is a free parameter governing the amplitude of \(\tilde{C}_{isr}^{\{vw\}}\), we can adjust \(e_{isr}^{\{vw\}}\) according to the closeness of the layers as
$$\begin{aligned} e_{isr}^{\{vw\}} = \omega \cdot \frac{M_{sr}^{\{vw\}}}{\max _{s, r, v, w}{M_{sr}^{\{vw\}}}}, \end{aligned}$$
where \(M_{sr}^{\{vw\}}\) measures the closeness of layers \(s^{\{v\}}\) and \(r^{\{w\}}\). Here we still use \(\omega\) to control the coupling strength and hence the balance between within-layer edges and between-layer couplings.
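
The closeness-based scaling can be sketched in a few lines for a single aspect. The helper name `coupling_terms`, the closeness matrix `M`, and the coupling indicator `C` below are hypothetical illustration data; setting the diagonal of `M` to zero (so that a layer is not coupled to itself) is also an assumption of this sketch.

```python
def coupling_terms(M, C, omega):
    """Return C_tilde[i][s][r] = e[s][r] * (2 * C[i][s][r] - 1),
    where e[s][r] = omega * M[s][r] / max(M)."""
    S = len(M)
    m_max = max(max(row) for row in M)
    e = [[omega * M[s][r] / m_max for r in range(S)] for s in range(S)]
    return [[[e[s][r] * (2 * C_i[s][r] - 1) for r in range(S)]
             for s in range(S)] for C_i in C]


# Hypothetical closeness between three layers (zero diagonal: no self-coupling).
M = [[0, 2, 1], [2, 0, 4], [1, 4, 0]]
# One node whose copies are coupled between layers 0-1 and 1-2, but not 0-2.
C = [[[0, 1, 0], [1, 0, 1], [0, 1, 0]]]
C_tilde = coupling_terms(M, C, omega=0.5)
```

Here the present couplings receive positive terms scaled by the closeness of the two layers, while the absent coupling between layers 0 and 2 receives a negative term.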

Temporal networks

In some studies, a temporal network is defined as a sequence of networks corresponding to successive time points, with between-layer couplings indicating the continuity between adjacent layers (Holme and Saramäki 2012; Bazzi et al. 2014; Berlingerio et al. 2013). For example, suppose that in a phone-call temporal network two nodes are linked by an edge in two successive layers. If there is a coupling connecting the corresponding nodes in both layers, then we can tell that the call lasts through these two time points. Otherwise, we can tell that there are two separate calls, one at each time point. Therefore, between-layer couplings only appear between adjacent layers in such temporal networks. To satisfy this, we let \(e_{isr}^{\{vw\}} = 0\) when \(|s-r| \ne 1\) or when the link between the nodes does not persist across the two time points.

Notice that the intervals between time slices can also be unequal. For example, the Facebook social networks of a person at ages 15 and 16 will be similar, but they may differ considerably from the network at age 20. Such a time-interval problem can be addressed in the same way as the unevenly distributed layers discussed above.
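
A minimal sketch of the temporal case combines both rules: amplitudes vanish for non-adjacent layers and are scaled by closeness for adjacent ones. Using the inverse time gap as the closeness measure is an assumption made only for this illustration, as are the function name `temporal_amplitudes` and the sample time stamps; the per-node condition on whether a link persists is not modeled here.

```python
def temporal_amplitudes(times, omega):
    """Amplitudes e[s][r] for a temporal network with time stamps `times`.

    e[s][r] is zero unless |s - r| == 1; adjacent layers are weighted by an
    inverse-time-gap closeness (an illustrative assumption), normalized by
    the largest closeness. Assumes at least two layers with distinct times.
    """
    S = len(times)
    M = [[1.0 / abs(times[s] - times[r]) if abs(s - r) == 1 else 0.0
          for r in range(S)] for s in range(S)]
    m_max = max(max(row) for row in M)
    return [[omega * M[s][r] / m_max for r in range(S)] for s in range(S)]


# Snapshots taken at ages 15, 16 and 20, as in the example above.
e = temporal_amplitudes([15, 16, 20], omega=1.0)
```

With these time stamps the coupling between the first two snapshots is four times stronger than the coupling between the last two, and the first and last snapshots are not coupled at all.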

Signed networks

Connections in complex systems can reflect either positive or negative interactions between nodes, which can be modeled as signed networks that contain edges with positive or negative weights (Doreian and Mrvar 2009; Yang et al. 2007). The effects of the two kinds of edges on the structure of such networks should be distinguished: the contribution of positive edges should be rewarded, while the contribution of negative edges should be penalized. Mucha et al. (2010) derive the modularity by using a Laplacian dynamics operator that contains the sign information. We can bring signed edges into the proposed metric by representing the adjacency \(A_{ijs}^{\{v\}}\) as well as the null model \(p_{ijs}^{\{v\}}\) in Eq. (2) as a combination of both kinds of edges:
$$\begin{aligned} A_{ijs}^{\{v\}} = A_{ijs}^{\{v\}+}-A_{ijs}^{\{v\}-}, \end{aligned}$$
$$\begin{aligned} \gamma _s^{\{v\}}p_{ijs}^{\{v\}} = \gamma _s^{\{v\}+}p_{ijs}^{\{v\}+}-\gamma _s^{\{v\}-}p_{ijs}^{\{v\}-}. \end{aligned}$$
Thus, we obtain the signed version of the proposed metric
$$\begin{aligned} Q_M(g)&= \frac{1}{\mu }\sum _{isjrvw}\left\{ \left[\left(A_{ijs}^{\{v\}+}-\gamma _s^{\{v\}+}\frac{k_{is}^{\{v\}+}k_{js}^{\{v\}+}}{2m_s^{\{v\}+}}\right) \right. \right. \nonumber \\ &\quad \quad \left. \left. -\left(A_{ijs}^{\{v\}-}-\gamma _s^{\{v\}-}\frac{k_{is}^{\{v\}-}k_{js}^{\{v\}-}}{2m_s^{\{v\}-}}\right)\right]\delta _{sr}\delta _{vw} \right. \nonumber \\ &\quad \quad \left. +\tilde{C}_{isr}^{\{vw\}}\delta _{ij}\right \}\delta \left(g_{is}^{\{v\}}, g_{jr}^{\{w\}} \right). \end{aligned}$$
The positive and negative weighted terms are equivalent to considering the within-layer modularity as the combination of two “networks” with opposite contribution. We can now conclude that the proposed metric is able to deal with signed networks by considering the negative edges as additional networks of the within-layer modularity.
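
The within-layer part of the signed metric can be sketched for one layer as follows. The function name `signed_modularity_matrix`, its default resolution values, and the toy weight matrix are assumptions for illustration; a layer with no positive (or no negative) edges is given a zero null model here, which is a choice of this sketch.

```python
def signed_modularity_matrix(W, gamma_pos=1.0, gamma_neg=1.0):
    """Within-layer signed modularity matrix of one layer with weights W.

    Entry (i, j) is (A+ - gamma+ * k+_i k+_j / 2m+) - (A- - gamma- * k-_i k-_j / 2m-),
    where A+ and A- hold the magnitudes of the positive and negative edges.
    """
    n = len(W)
    pos = [[max(w, 0.0) for w in row] for row in W]
    neg = [[max(-w, 0.0) for w in row] for row in W]

    def null_model(P):
        k = [sum(row) for row in P]
        two_m = sum(k)
        if two_m == 0:  # no edges of this sign: empty null model
            return [[0.0] * n for _ in range(n)]
        return [[k[i] * k[j] / two_m for j in range(n)] for i in range(n)]

    p_pos, p_neg = null_model(pos), null_model(neg)
    return [[(pos[i][j] - gamma_pos * p_pos[i][j])
             - (neg[i][j] - gamma_neg * p_neg[i][j])
             for j in range(n)] for i in range(n)]


# Toy layer (hypothetical): positive edges 0-1 and 2-3, negative edge 1-2.
W = [[0, 1, 0, 0], [1, 0, -1, 0], [0, -1, 0, 1], [0, 0, 1, 0]]
B = signed_modularity_matrix(W)
```

In this toy layer the positively linked pair (0, 1) receives a positive entry, whereas the negatively linked pair (1, 2) receives a negative one, so placing nodes 1 and 2 in the same community is penalized.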

mSpec: an iterative spectral optimization of multilayer modularity

In order to find a good approximation of the optimal solution of the multilayer modularity maximization problem, Mucha et al. (2010) adopted a generalized Louvain method, which hierarchically merges pairs of communities to increase the modularity score. The result is then improved by a KL-swap step that moves nodes between communities to check whether a further increase in the modularity score is possible (Kernighan and Lin 1970). However, this optimization method is unstable, so it has to be run multiple times to avoid converging to a poor local maximum. It also sometimes fails to find the expected (typically small) number of communities, since the algorithm stops before the number of communities decreases to the desired value. Newman et al. (2006) proposed a spectral method for single-layer modularity optimization that hierarchically divides the network into two communities. Inspired by their work, we propose a spectral bisection method called mSpec based on the supra-adjacency representation of the multilayer network. This method provides more stable performance, as discussed in “Experiments”.

Supra-adjacency representation: an equivalent single-layer network

In multilayer network analysis, a supra-adjacency usually refers to a single-layer network obtained by flattening a multilayer network (Kivelä et al. 2014; Boccaletti et al. 2014; Sánchez-García et al. 2014; Bazzi et al. 2014; Cozzo et al. 2015). The basic idea is to combine two layers, each represented by an \(N\times {N}\) graph, into an expanded layer represented by a \(2\times {2}\) block graph, with the diagonal blocks representing the within-layer adjacency of each layer and the off-diagonal blocks representing the between-layer couplings. By repeating this flattening step until the number of layers reduces to one, we obtain an expanded equivalent single-layer network containing all nodes of the original multilayer network, where each node is distinguished from its copies in different layers and aspects (see Fig. 3).
Fig. 3

Supra-adjacency matrix of a multilayer network with three aspects. The first aspect consists of two layers and the others contain only one layer. The off-diagonal blocks of the supra-adjacency matrix represent the between-layer adjacency of the layers. Since we only consider between-layer couplings, these blocks are all diagonal. The diagonal blocks record the within-layer adjacency. Since we only discuss undirected networks, the supra-adjacency matrix is symmetric

Based on the supra-adjacency representation, we obtain a mapping from a multilayer network to an equivalent single-layer network where the mapped subscript for node i in layer \(s^{\{v\}}\) is
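$$\begin{aligned} x = i + (s-1)N + \sum _{v'=1}^{v-1}V_{v'}N \end{aligned}$$
with \(x \in \big [1, \sum _{v=1}^FV_vN\big ]\), where \(N\) is the number of nodes in a single layer, \(V_v\) is the number of layers within aspect \(v\), and \(F\) is the total number of aspects.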
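
The index mapping can be written as a small helper. The function names and the example sizes (three aspects with 2, 1 and 1 layers, as in Fig. 3, and a hypothetical \(N = 3\) nodes per layer) are assumptions for illustration; all indices are 1-based to match the formula.

```python
def supra_index(i, s, v, layers_per_aspect, N):
    """1-based supra index x of node i in layer s of aspect v (all 1-based).

    layers_per_aspect[v - 1] is V_v, the number of layers in aspect v.
    """
    offset = sum(layers_per_aspect[:v - 1]) * N  # layers of earlier aspects
    return i + (s - 1) * N + offset


def supra_size(layers_per_aspect, N):
    """Total number of node copies, i.e. the upper bound of x."""
    return sum(layers_per_aspect) * N


V = [2, 1, 1]  # three aspects, as in Fig. 3
N = 3          # hypothetical number of nodes per layer
all_x = [supra_index(i, s, v, V, N)
         for v in range(1, len(V) + 1)
         for s in range(1, V[v - 1] + 1)
         for i in range(1, N + 1)]
```

Enumerating every (node, layer, aspect) triple in this way visits each supra index from 1 to `supra_size(V, N)` exactly once, which is what makes the flattened network a faithful relabeling of the original one.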
We can therefore apply the same mapping to the modularity matrix, which records the modularity of each node pair (ij) in each layer pair \((s^{\{v\}},r^{\{w\}})\), to obtain a supra-modularity-matrix
$$\begin{aligned} B_{isjr}^{\{vw\}}&= \lambda _{s}^{\{v\}}\cdot \Big (A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}\frac{k_{is}^{\{v\}}k_{js}^{\{v\}}}{2m_{s}^{\{v\}}}\Big )\delta _{sr}\delta _{vw} + \delta _{ij}\tilde{C}_{jsr}^{\{vw\}} \nonumber \\&= D_{xy} \end{aligned}$$
We will illustrate that this supra-modularity-matrix maintains all the information of the original multilayer network and can be utilized for optimization.
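
A minimal sketch of how such a supra-modularity-matrix could be assembled for a single aspect is given below. The function name, the 0-based index \(x = i + sN\), the layer weights `lam`, and the toy data are assumptions made for this example.

```python
def supra_modularity_matrix(A, gamma, lam, C_tilde):
    """Assemble D (size S*N x S*N) from per-layer data, single aspect.

    A[s][i][j]       : adjacency of layer s
    gamma[s], lam[s] : resolution parameter and weight of layer s
    C_tilde[i][s][r] : coupling term of node i between layers s and r
    Node i in layer s is mapped to the 0-based index x = i + s * N.
    """
    S, N = len(A), len(A[0])
    D = [[0.0] * (S * N) for _ in range(S * N)]
    for s in range(S):
        k = [sum(row) for row in A[s]]
        two_m = sum(k)
        for i in range(N):
            for j in range(N):
                # diagonal block: weighted within-layer modularity
                D[s * N + i][s * N + j] = lam[s] * (
                    A[s][i][j] - gamma[s] * k[i] * k[j] / two_m)
    for i in range(N):
        for s in range(S):
            for r in range(S):
                if s != r:
                    # off-diagonal block: diagonal, couples copies of node i
                    D[s * N + i][r * N + i] += C_tilde[i][s][r]
    return D


# Toy data (hypothetical): two identical layers with edges 0-1 and 2-3.
layer = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
omega = 0.5
C_tilde = [[[0.0, omega], [omega, 0.0]] for _ in range(4)]
D = supra_modularity_matrix([layer, layer], [1.0, 1.0], [1.0, 1.0], C_tilde)
```

The diagonal blocks of `D` hold the within-layer modularity and the off-diagonal blocks are diagonal matrices of coupling terms, mirroring the block structure of the supra-adjacency matrix.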

Dividing networks into two communities

Let the index matrix \(\mathcal {L}\) identify the community label of each node in each layer
$$\begin{aligned} \mathcal {L}_{is}^{\{v\}}= {\left\{ \begin{array}{ll} +1 & {} \quad \text {if node i in layer}\,{s}^{\{v\}}\, \text {is in community 1}\\ -1 &{} \quad \text {otherwise.} \end{array}\right. } \end{aligned}$$
Then we can rewrite the modularity function as
$$\begin{aligned} Q = \frac{1}{\mu }\sum _{isjrvw}B_{isjr}^{\{vw\}}\left( \frac{\mathcal {L}_{is}^{\{v\}}\mathcal {L}_{jr}^{\{w\}}+1}{2}\right) . \end{aligned}$$
We notice that
$$\begin{aligned} \sum _{isjrvw}B_{isjr}^{\{vw\}}&= \sum _{ijsv}\lambda _{s}^{\{v\}} \left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}\frac{k_{is}^{\{v\}}k_{js}^{\{v\}}}{2m_{s}^{\{v\}}} \right)+\sum _{isrvw}\tilde{C}_{isr}^{\{vw\}}\nonumber \\&= \sum _{sv} \left(1-\gamma _{s}^{\{v\}} \right)\cdot {\lambda _{s}^{\{v\}}2m_{s}^{\{v\}}} + \sum _{isrvw}\tilde{C}_{isr}^{\{vw\}} \nonumber \\&= \chi \end{aligned}$$
which means that once the graph is given, \(\chi\) is a constant and does not influence the maximization of the modularity function. Likewise, the constant factors \(\frac{1}{\mu }\) and \(\frac{1}{2}\) do not affect which assignment maximizes it. So, our objective function can be rewritten as
$$\begin{aligned} Q = \sum _{isjrvw}B_{isjr}^{\{vw\}}\mathcal {L}_{is}^{\{v\}}\mathcal {L}_{jr}^{\{w\}}. \end{aligned}$$
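
The within-layer part of the constant \(\chi\) rests on the identity \(\sum _{ij}\big (A_{ij}-\gamma \frac{k_ik_j}{2m}\big ) = (1-\gamma )2m\) for each layer, which can be checked numerically. The small graph and the value of \(\gamma\) below are hypothetical, and the helper name `block_sum` is an assumption for this sketch.

```python
def block_sum(A, gamma):
    """Sum of all entries of A - gamma * k k^T / (2m) for one layer."""
    k = [sum(row) for row in A]
    two_m = sum(k)
    n = len(A)
    return sum(A[i][j] - gamma * k[i] * k[j] / two_m
               for i in range(n) for j in range(n))


# Hypothetical layer: a triangle 0-1-2 with a pendant node 3 attached to 2.
A = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
gamma = 0.5
two_m = sum(sum(row) for row in A)

lhs = block_sum(A, gamma)     # direct summation over all node pairs
rhs = (1 - gamma) * two_m     # closed form (1 - gamma) * 2m
```

Both quantities depend only on the graph and on \(\gamma\), not on the community labels, which is why the term can be dropped from the objective.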
Then we can map the multilayer network to the corresponding supra-adjacency as described in Eq. (13)
$$\begin{aligned} B_{isjr}^{\{vw\}} = D_{xy}, \end{aligned}$$
where the mapping is performed according to Eq. (12). We can also bring in a new label vector z with
$$\begin{aligned} z_x = \mathcal {L}_{is}^{\{v\}}. \end{aligned}$$
Therefore, we can represent the objective function Eq. (17) as
$$\begin{aligned} Q = \sum _{xy}D_{xy}z_xz_y. \end{aligned}$$
By applying this mapping, the problem is converted to be a relatively simple one, on which we can apply the same spectral method used in the single-layer case. We can solve it by utilizing the eigenvectors and eigenvalues of matrix \(\mathbf {D}\) as follows:
$$\begin{aligned} Q = \sum _{xy}D_{xy}z_xz_y = \mathbf {z}^T\mathbf {Dz}. \end{aligned}$$
We can then represent \(\mathbf {z}\) as a linear combination of the (orthonormal) eigenvectors of \(\mathbf {D}\), i.e., \(\mathbf {z} = \sum _xa_x\mathbf {u}_x\), where \(\mathbf {u}_x\) is the x-th eigenvector of \(\mathbf {D}\) and \(a_x = \mathbf {u}^T_{x}\mathbf {z}\) is the corresponding weight. Meanwhile, if \(\beta _x\) is the eigenvalue corresponding to \(\mathbf {u}_x\), we obtain \(\mathbf {u}_x^T\mathbf {D} = (\mathbf {D}\cdot {\mathbf {u}_x})^T = \beta _x\mathbf {u}_x^T\), since \(\mathbf {D}\) is symmetric: \(B_{jris}^{\{wv\}} = B_{isjr}^{\{vw\}}\) implies \(D_{xy} = D_{yx}\). Then Eq. (21) can be written as
$$\begin{aligned} Q = \sum _xa_x\mathbf {u}_x^T\mathbf {Dz} = \sum _xa_x\mathbf {u}_x^T\cdot \mathbf {z}\beta _x = \sum _xa_x^2\beta _x. \end{aligned}$$
To maximize \(Q\), we want \(\mathbf {z}\) to be as parallel as possible to \(\mathbf {u}_M\), the eigenvector corresponding to the largest eigenvalue. Since the entries of \(\mathbf {z}\) are restricted to \(\pm 1\), we assign \(\mathbf {z}\) according to the signs of the entries of \(\mathbf {u}_M\)
$$\begin{aligned} z_x = \mathcal {L}_{is}^{\{v\}} = {\left\{ \begin{array}{ll} 1 &{} \quad \text {if}\ [\mathbf {u}_M]_x \ge 0,\\ -1 &{} \quad \text {otherwise.} \end{array}\right. } \end{aligned}$$
Thus, we obtain an approximately optimal division using the supra-modularity-matrix.
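
The bisection step can be sketched without external libraries. Note that the eigenvector is computed here by plain power iteration on a shifted matrix, used as a simple stand-in for a Lanczos solver; the function names, the iteration count, and the toy matrix (two coupled copies of a layer with edges 0-1 and 2-3) are assumptions for illustration.

```python
def leading_eigenvector(D, iters=500):
    """Approximate the eigenvector of the largest eigenvalue of symmetric D.

    Power iteration on D + c*I, where c is a Gershgorin bound that makes all
    eigenvalues non-negative. Assumes D is not the zero matrix.
    """
    n = len(D)
    c = max(sum(abs(t) for t in row) for row in D)
    u = [float(x + 1) for x in range(n)]  # deterministic, generic start vector
    for _ in range(iters):
        w = [sum(D[x][y] * u[y] for y in range(n)) + c * u[x] for x in range(n)]
        norm = sum(t * t for t in w) ** 0.5
        u = [t / norm for t in w]
    return u


def bisect(D):
    """Label each supra node +1 or -1 by the sign of the leading eigenvector."""
    return [1 if t >= 0 else -1 for t in leading_eigenvector(D)]


# Toy supra-modularity-matrix: two layers with edges 0-1 and 2-3 (gamma = 1),
# every node coupled to its copy with strength omega.
B = [[-0.25, 0.75, -0.25, -0.25],
     [0.75, -0.25, -0.25, -0.25],
     [-0.25, -0.25, -0.25, 0.75],
     [-0.25, -0.25, 0.75, -0.25]]
omega, N, S = 0.5, 4, 2
D = [[0.0] * (S * N) for _ in range(S * N)]
for s in range(S):
    for r in range(S):
        for i in range(N):
            for j in range(N):
                if s == r:
                    D[s * N + i][r * N + j] = B[i][j]
                elif i == j:
                    D[s * N + i][r * N + j] = omega

z = bisect(D)
```

On this toy input the copies of nodes 0 and 1 in both layers end up with one label and the copies of nodes 2 and 3 with the other; which side is called \(+1\) depends on the sign of the computed eigenvector and carries no meaning.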

Dividing networks into more than two communities

To divide the network into more communities, we have to rewrite the additional modularity contribution of further division. Suppose the subcommunities after dividing community C are \(\mathcal {A}\) and \(\mathcal {B}\), we have
$$\begin{aligned} Q'_C&= Q_{\mathcal {A}} + Q_{\mathcal {B}} \nonumber \\&= \frac{1}{\mu }\left( \sum _{isv, jrw\in {\mathcal {A}}}B_{isjr}^{\{vw\}} + \sum _{isv, jrw\in {\mathcal {B}}}B_{isjr}^{\{vw\}}\right) \nonumber \\&= \frac{1}{\mu }\sum _{isv, jrw\in {C}}\frac{\mathcal {L'}_{is}^{\{v\}}\mathcal {L'}_{jr}^{\{w\}} + 1}{2} B_{isjr}^{\{vw\}}, \end{aligned}$$
where \(\mathcal {L'}_{jr}^{\{w\}} \in {\{-1, +1\}}\) is the community label indicating whether the node belongs to \(\mathcal {A}\) or \(\mathcal {B}\). Here we use the fact that the sum of the entries of the modularity matrix \(\mathbf {B}\) is constant once the network is determined, so that it does not influence the optimization. Then the multilayer modularity gain can be written as
$$\begin{aligned} \Delta Q&= Q'_{C} - Q_{C} \nonumber \\&=\frac{1}{2\mu }\left[ \sum _{isv, jrw\in {C}}B_{isjr}^{\{vw\}}\mathcal {L'}_{is}^{\{v\}}\mathcal {L'}_{jr}^{\{w\}}-\sum _{isv, jrw\in {C}}B_{isjr}^{\{vw\}}\right] \nonumber \\&=\frac{1}{2\mu }\sum _{isv, jrw\in {C}}\left[ B_{isjr}^{\{vw\}}-\delta _{ij}\delta _{sr}\delta _{vw}\sum _{j'r'w'\in {C}}B_{isj'r'}^{\{vw'\}}\right] \mathcal {L'}_{is}^{\{v\}}\mathcal {L'}_{jr}^{\{w\}}\nonumber \\&=\frac{1}{2\mu }\sum _{isv, jrw\in {C}}B_{isjr}^{\{vw\}(C)}\mathcal {L'}_{is}^{\{v\}}\mathcal {L'}_{jr}^{\{w\}}, \end{aligned}$$
where each entry of matrix \(\mathbf {B}^{(C)}\) is
$$\begin{aligned} B_{isjr}^{\{vw\}(C)} = B_{isjr}^{\{vw\}}-\delta _{ij}\delta _{sr}\delta _{vw}\sum _{j'r'w'\in {C}}B_{isj'r'}^{\{vw'\}}. \end{aligned}$$
Similarly, we bring in an auxiliary matrix \(\mathbf {D}\) with \(B_{isjr}^{\{vw\}(C)} = D_{xy}\) to maximize the modularity gain. Notice that, for a given community \(C\), \(\sum _{j'r'w'\in {C}}B_{isj'r'}^{\{vw'\}}\) does not depend on the labels, \(\mathbf {B}^{(C)}\) is also symmetric, and \(\sum _{isjrvw}B_{isjr}^{\{vw\}(C )} = 0\). We can therefore repeatedly apply the bisection method to the detected communities, using \(\mathbf {D}\) in place of the modularity matrix \(\mathbf {B}\), until no further division yields a positive modularity gain \(\Delta Q\).
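
The repeated bisection can be sketched as follows. This is an illustrative outline rather than the reference implementation: the eigenvector is obtained by shifted power iteration instead of Lanczos, the factor \(1/\mu\) is dropped from the gain, and all function names, the tolerance, and the toy matrix are assumptions.

```python
def leading_eigenvector(D, iters=500):
    """Shifted power iteration (a simple stand-in for Lanczos)."""
    n = len(D)
    c = max(sum(abs(t) for t in row) for row in D)  # Gershgorin bound
    if c == 0:
        return [1.0] * n
    u = [float(x + 1) for x in range(n)]
    for _ in range(iters):
        w = [sum(D[x][y] * u[y] for y in range(n)) + c * u[x] for x in range(n)]
        norm = sum(t * t for t in w) ** 0.5
        if norm == 0:
            break
        u = [t / norm for t in w]
    return u


def community_matrix(D, members):
    """B^(C): restrict D to `members`, then subtract each row sum from the diagonal."""
    sub = [[D[x][y] for y in members] for x in members]
    for a in range(len(sub)):
        sub[a][a] -= sum(sub[a])
    return sub


def recursive_bisection(D, tol=1e-9):
    """Split communities until no bisection gives a positive gain."""
    pending, communities = [list(range(len(D)))], []
    while pending:
        members = pending.pop()
        sub = community_matrix(D, members)
        z = [1 if t >= 0 else -1 for t in leading_eigenvector(sub)]
        n = len(members)
        # modularity gain of the split, up to the constant factor 1/mu
        gain = 0.5 * sum(sub[a][b] * z[a] * z[b]
                         for a in range(n) for b in range(n))
        if gain <= tol:
            communities.append(members)
        else:
            pending.append([m for m, t in zip(members, z) if t > 0])
            pending.append([m for m, t in zip(members, z) if t < 0])
    return communities


# Toy supra-modularity-matrix: two coupled layers with edges 0-1 and 2-3.
B = [[-0.25, 0.75, -0.25, -0.25],
     [0.75, -0.25, -0.25, -0.25],
     [-0.25, -0.25, -0.25, 0.75],
     [-0.25, -0.25, 0.75, -0.25]]
omega, N, S = 0.5, 4, 2
D = [[0.0] * (S * N) for _ in range(S * N)]
for s in range(S):
    for r in range(S):
        for i in range(N):
            for j in range(N):
                if s == r:
                    D[s * N + i][r * N + j] = B[i][j]
                elif i == j:
                    D[s * N + i][r * N + j] = omega

communities = recursive_bisection(D)
```

On the toy input the first bisection separates the copies of nodes 0 and 1 from those of nodes 2 and 3; neither half can be split further with a positive gain, so the procedure stops with two communities.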

Complexity analysis

The mSpec method is based on a linear mapping and a spectral decomposition. The time complexity of the linear mapping is \(O(\sum _{v=1}^FV_vN)\), where \(N\) is the number of nodes in a single layer and \(V_{v}\) is the number of layers within aspect \(v\). By applying the Lanczos algorithm (Freund et al. 1993), the dominant eigenvector can be found in \(O((\sum _{v=1}^FV_vN)^2)\) time (Newman 2006). Thus, if there are \(k\) divisions, the total calculation takes \(O(k(\sum _{v=1}^FV_vN)^2)\) time. The total number of divisions depends on the depth of the division tree, which is expected to be \(\log (\sum _{v=1}^FV_vN)\) on average. Thus the total complexity is \(O([F\bar{V}N]^2\log (F\bar{V}N))\), where \(\bar{V}=\frac{1}{F}\sum _{v=1}^FV_v\) is the average number of layers per aspect and \(F\) is the total number of aspects.

Experiments

In this section, we present community detection results using the proposed modularity in several multilayer networks. As we will demonstrate in the results, (1) the proposed method can be applied to a wide range of networks by flexibly adjusting the couplings and parameters and (2) the mSpec is more stable than the generalized Louvain method.

We conduct several experiments on a well-known benchmark network to discuss how the parameters influence the results of community detection. The proposed method is also applied to electroencephalograph (EEG) networks as a case study, and the result turns out to coincide with the functional division of the human brain. In order to evaluate the performance of the proposed modularity optimization method (mSpec), it is compared with baseline optimization methods. As will be reported, the proposed optimization performs more reliably as the coupling scale varies.

The networks we use in experiments are
  1. Parameter analysis data:
    • Zachary Karate Club network: network of friendships between 34 members of a karate club in a US university (Zachary et al. 1977).

  2. Comparison data:
    • CKM-Physicians Innovation multilayer network: a network of the physicians’ adoption of a new drug, tetracycline, in four towns (Coleman et al. 1957). There are 246 nodes and 3 layers (according to three questions asking about the relationship between the physicians).

    • CS-Aarhus social network: a multilayer social network consisting of five online and offline relationships (5 layers) between 61 employees of the Computer Science department at Aarhus (Magnani et al. 2013).

    • Kapferer Tailor Shop network: a time-varying network recording the interactions in a tailor shop in Zambia over 10 months (Kapferer 1972). The network consists of two layers according to the interaction types and 39 nodes.

    • Krackhardt High-Tech network: three kinds of social relationships (Advice, Friendship and “Reports to”) between 21 managers of a high-tech company (Krackhardt 1987).

    • London Transportation network: multilayer transportation network of 369 London train stations with three layers recording different types of connection (underground, overground, and DLR) (De Domenico et al. 2014). This network is relatively sparse.

    • Padgett Florentine Families network: the network of marriage alliances and business relationships between Florentine families in the Renaissance (Padgett and Ansell 1993). There are 16 nodes in total.

    • Vickers Class Relation network: the networks collected from 29 seventh-grade students in an Australian school about three questions on the classmate relationship (“Get on with,” “Best friend,” and “Prefer to work with”) (Vickers and Chan 1981).

  3. Case study data: EEG network
    • Signed multilayer network that characterizes the correlation of the testees’ brain regions during a visual stimulus test. The nodes are 128 scalp electrodes plus a standard control electrode, and the data of 11 testees, each with several test records, form a two-aspect multilayer network.


Parameter analysis

In order to study how the parameters (i.e., \(\gamma _s\) and \(\omega\)) in the proposed method influence the community detection results, we conduct experiments with settings similar to those of Mucha et al. (2010). We construct a ten-layer network with resolution parameter \(\gamma _s \in \{0.1, 0.2, \ldots , 1\}\), where the adjacency of each layer is the benchmark Zachary Karate Club network (Zachary et al. 1977), and we assume that a between-layer coupling exists between every node and each of its copies. We perform community detection on the generated network with different coupling strength parameters \(\omega \in \{0, 0.01, 0.1, 1, 10\}\), and the community assignment of each node in the ten layers is depicted with different colors.
Fig. 4

Community detection results with different parameters. The community assignment is distinguished by different colors. The network consists of ten identical layers, each of which is the network of Zachary Karate Club with resolution parameter \(\gamma _s \in \{0.1, 0.2, \ldots , 1\}\), and detection is performed with coupling strength parameter \(\omega = 0, 0.01, 0.1, 1, 10\), respectively

From Fig. 4 we can see that, when \(\omega = 0\), the layers diverge greatly because of the different values of the resolution parameter \(\gamma _s\). As \(\gamma _s\) grows, the network tends to split into subcommunities. Comparing with the standard community labels, we see that the detection results with \(\gamma _s\) from 0.5 to 0.9 match the ground truth, while there are misclassifications in the remaining layers. As \(\omega\) increases, the copies of a node in different layers tend to be assigned to the same community. When \(\omega = 1\), every node has the same community label as its copies in other layers, and the detection result is consistent with the ground truth.

We can then conclude that the resolution parameter \(\gamma _s\) controls the tendency to split and the coupling strength parameter \(\omega\) controls the consistency of the community assignment between layers. A \(\gamma _s\) that is too large or too small causes misclassification, which can, however, be corrected by the between-layer couplings. Meanwhile, an \(\omega\) that is too small isolates the layers from each other. When there is noise in the network data, the result can then be poor (as shown in Fig. 4a), since the cross-layer information is not fully utilized. Conversely, a large \(\omega\) damages the peculiarity of each layer (as shown in Fig. 4d).

In "Comparison results", we compare the performance of several algorithms as the coupling scale varies, where we introduce a parameter \(\rho\) to explicitly control the coupling density. However, since \(\rho\) reflects the density of the raw network data, it can be regarded as a hyperparameter that is fixed once the network is given.
Table 1

Comparison of modularity result of CKM-Physicians innovation network

The best mean values are marked in italics

Table 2

Comparison of modularity result of CS-Aarhus network

The best mean values are marked in italics

Table 3

Comparison of modularity result of Kapferer Tailor Shop network

The best mean values are marked in italics

Table 4

Comparison of modularity result of Krackhardt High-Tech network

The best mean values are marked in italics

Table 5

Comparison of modularity result of London Transportation network

The best mean values are marked in italics

Table 6

Comparison of modularity result of Padgett Florentine Families network

The best mean values are marked in italics

Table 7

Comparison of modularity result of Vickers Class Relation network

The best mean values are marked in italics

Comparison results

For comparison, several state-of-the-art approaches are used so as to evaluate the performance of the proposed optimization method (mSpec):
  1. mLouv: Multilayer Louvain-like method plus KL-swap improvement, which is the most widely adopted heuristic method for modularity optimization (Mucha et al. 2010);

  2. sMSpec (mean adjacency): Single-layer spectral optimization method applied to the mean of the adjacency matrices of all layers (Tang et al. 2010);

  3. sMSpec (per layer): Single-layer spectral optimization method applied to each layer separately (Newman and Girvan 2004; Newman 2006).

In order to examine the reliability of the proposed method, the detection is performed on seven datasets with different between-layer coupling densities \(\rho\). The parameter \(\rho\), which depends on the raw network data, reflects how closely connected any two layers are; in the experiments, we generate random between-layer couplings with probability \(\rho\). Each node is linked with all its copies in other layers when \(\rho = 1\), and there are no couplings at all when \(\rho = 0\). The result is evaluated by the modularity value Q computed according to Eq. (7) using the community assignment of each algorithm, as shown in Tables 1, 2, 3, 4, 5, 6, and 7. The variance and mean of the modularity value reflect the stability and reliability of each algorithm on networks with different between-layer coupling scales.
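
One way to generate such random couplings is sketched below. Drawing each coupling independently for every node and every unordered pair of layers is an assumption of this sketch (the exact sampling scheme is not specified above), as are the function name `random_couplings` and the `seed` parameter.

```python
import random


def random_couplings(N, S, rho, seed=0):
    """Symmetric 0/1 indicators C[i][s][r]: node i coupled between layers s and r.

    Each unordered layer pair of each node is coupled independently with
    probability rho (an illustrative sampling scheme).
    """
    rng = random.Random(seed)
    C = [[[0] * S for _ in range(S)] for _ in range(N)]
    for i in range(N):
        for s in range(S):
            for r in range(s + 1, S):
                if rng.random() < rho:  # random() lies in [0, 1)
                    C[i][s][r] = C[i][r][s] = 1
    return C


def count_couplings(C):
    """Number of coupled (unordered) layer pairs over all nodes."""
    return sum(sum(sum(row) for row in C_i) for C_i in C) // 2


full = random_couplings(N=5, S=3, rho=1.0)
empty = random_couplings(N=5, S=3, rho=0.0)
half = random_couplings(N=5, S=3, rho=0.5)
```

With \(\rho = 1\) every node is coupled to all of its copies, with \(\rho = 0\) the layers are isolated, and intermediate values interpolate between the two extremes.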

As the results suggest, the proposed method significantly outperforms the existing methods, achieving an \(18.65\%\) improvement over the second best in terms of mean modularity value while maintaining a relatively low variance. The mLouv method and the mean-adjacency sMSpec method show low Q when the couplings are sparse (small \(\rho\)) and high Q when the couplings are dense (large \(\rho\)), while the per-layer sMSpec method behaves in the opposite way. This is because the mLouv and mean-adjacency sMSpec methods tend to look for a global community label for all nodes and ignore the peculiarity of each layer, so that when the couplings are sparse (which implies high heterogeneity between layers), these algorithms fail to make a layer-specific assignment. Similarly, the performance of the per-layer sMSpec method degrades seriously when the couplings are dense, since it runs detection on each layer separately and does not take the consistency between layers into account. The proposed method is based on a supra-adjacency representation of the multilayer network, in which \(\omega\) governs the consistency. This supports the reliable performance of the proposed method on networks with different conditions of connection between layers. In a nutshell, the proposed method performs stably as the coupling density varies, so it is relatively reliable when the condition of the raw network is unclear.

Case study: EEG network

The event-related potential (ERP), measured by means of electroencephalography (EEG), is the brain response of a testee to a specific stimulus (Cahn and Polich 2006; Dietrich and Kanso 2010). Since EEG monitoring collects electrical impulse data from electrodes placed on the scalp, it is non-invasive in most cases, except when an invasive electrode is required for a specific application. Moreover, the monitoring process is silent, so that auditory disturbance is reduced to a very low level, and it is tolerant of subject movement. Owing to these advantages, EEG is widely adopted as an analysis tool for brain activity, especially for child testees. Nevertheless, the traditional output of EEG monitoring takes the form of waveforms, so its analysis is unintuitive and usually relies on the experience-based judgement of the EEG providers. In recent years, more and more research has concentrated on the analysis of EEG data, but almost all such work focuses on the average performance of similar testees, which may lose information about each distinct testee (Alexander-Bloch et al. 2012; Chen et al. 2008; Meunier et al. 2009). In this experiment, we apply the proposed method to the signed multilayer network generated from the EEG data to explore the functional behavior of the brain regions. We compare the detected result with a standard empirical division of brain functional regions and find a surprising match between clinical experience and graph data mining (Power et al. 2011).

We regard the 128 electrodes and a standard control electrode placed on the testee’s scalp as 129 nodes involved, and calculate the correlation coefficients between the ERPs recorded from each pair of electrodes when the testee is given a series of visual stimuli as the edge weights between them. Thus, we generate a single-layer network based on the EEG data of one test record of a specific testee. By combining the networks generated in this way from 11 testees and their several test records, we obtain a two-aspect EEG network that contains the information of the brain activities of all testees. Since the electrodes are placed identically for every testee, we assume the between-layer coupling exists between each pair of corresponding electrodes. We can adjust the parameter \(\gamma _s\) to control the resolution and \(\omega\) to control the consistency of the detection result of each testee. The detection results on the first four testees are shown in Fig. 5.
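
The construction of one signed layer from a set of recordings can be sketched as follows. The Pearson correlation coefficient is assumed as the correlation measure, self-loops are set to zero, and the three short signals are made-up values for illustration only; the function names are likewise hypothetical.

```python
def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def correlation_layer(signals):
    """Signed weighted adjacency of one layer: entry (i, j) is the correlation
    between the signals of electrodes i and j (zero on the diagonal)."""
    n = len(signals)
    return [[0.0 if i == j else pearson(signals[i], signals[j])
             for j in range(n)] for i in range(n)]


# Made-up recordings of three electrodes over four time samples.
signals = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
W = correlation_layer(signals)
```

Electrodes whose signals rise and fall together obtain a positive edge weight, and electrodes with opposing signals obtain a negative one, which yields the signed layers used in the case study.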
Fig. 5

The detection result of EEG network. We randomly pick four layers from the multilayer network. The standard brain region division is plotted with different symbols: (1) purple triangle prefrontal cortex that controls thinking, perception, information memory, and attention; (2) white square premotor cortex that controls eye movements; (3) blue circle auditory cortex that controls the audition; (4) green star somatosensory cortex that controls the sense of touch; (5) red diamond visual cortex that controls the sense of sight. The detected result is presented as the topographic map of the brain where we directly treat the dominant eigenvector \(\mathbf {u}_M\) as the community label. The blue region corresponds to the negative entries of \(\mathbf {u}_M\), while the yellow region corresponds to the positive entries, where the darkness indicates the magnitude of the corresponding label value

We find that the EEG networks are always divided into two communities, yellow and blue, in all experiments. By comparing the detection results with the corresponding adjacency, we observe that the edges with negative weights mainly lie between the two communities, while within each community the nodes are connected by edges with positive weights. Therefore, in order to better illustrate the brain terrain, we directly treat the dominant eigenvector \(\mathbf {u}_M\) of the modularity matrix as the detected community labels of the corresponding nodes for plotting, since such non-binary labels make it possible to picture the contour of the brain. For example, if the dominant eigenvector is (0.5, 0.2, −0.1, −1), the label vector is also (0.5, 0.2, −0.1, −1), so the last two nodes are colored blue (the darkness distinguishes the magnitude) and the first two are colored yellow. Meanwhile, since such treatment also maximizes the modularity function, the result is more accurate and reliable than discrete community labels. We present the continuous community label as the topographic map of the brain, where the two communities correspond to regions with different colors. By adding the standard division of brain functional regions to the figures, we find that the detection results reach a surprising match with the widely accepted brain functional partition. The visual cortex (red diamond), prefrontal cortex (purple triangle), and premotor cortex (white square) share the same community, while the auditory cortex, denoted with blue circles, belongs to the other community. The former is more or less related to vision and attention, while the latter is closely related to audition. The results coincide with the clinical experience that vision and audition always demonstrate relatively strong divergence and interaction.
Moreover, from the attached color bar, we notice that the magnitude of the continuous community label of the blue part, which corresponds to the visual brain region, is much higher than that of the yellow part, which corresponds to the auditory region. The magnitude of the continuous label indicates the contribution of the node to the global modularity value, which can imply how active the region is during the test. Therefore, we see that the visual region is much more active than the auditory region, which coincides with our intuition.

To sum up, this experiment on EEG network shows encouraging results about the feasibility of the proposed method on empirical networks. It also provides a new direction of the application of the proposed method and similar approaches.


In this paper, we discussed the representation of multilayer networks with multiple aspects and then derived the multilayer modularity based on an assumption about the contributions of the edges and couplings. The derivation shows that the modularity favors community structures in which the edges and couplings are densely distributed within the communities. We then proposed a spectral bisection method for optimizing the modularity based on the supra-adjacency representation. In “Experiments,” we reported how the proposed evaluation metric behaves as the parameters change and compared the method with several baseline methods. Finally, we applied the proposed method to a two-aspect EEG network as an attempted application, and the results coincide with the functional regions of the brain.



Authors' contributions

The authors jointly discussed the problem and the proposed solutions. All authors participated in drafting and revising the final manuscript. All authors read and approved the final manuscript.


We would like to thank Sun Yat-sen Memorial Hospital for providing the EEG data. This work was supported by NSFC (Nos. 61502543, 61573387), Guangdong Natural Science Funds for Distinguished Young Scholar (2016A030306014), and NSF through Grants III-1526499 and CNS-1115234.

Competing interests

The authors declare that they have no competing interests.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

School of Data and Computer Science, Sun Yat-sen University, Guangzhou Higher Education Mega Center
University of Illinois at Chicago




© The Author(s) 2017