We start with several general requirements that a quality function should satisfy as introduced in (Reichardt and Bornholdt 2006): (1) rewarding existing edges within a community, (2) penalizing non-existing edges within a community, (3) penalizing existing edges between two communities, and (4) rewarding non-existing edges between two communities. Thus, a general quality function takes the form
$$\begin{aligned} \mathcal {H}(g)&= -\sum _{i\ne {j}}a_{ij}\underbrace{A_{ij}\delta (g_i, g_j)}_{\text {Internal existing edges}}+ \sum _{i\ne {j}}b_{ij}\underbrace{(1-A_{ij})\delta (g_i, g_j)}_{\text {Internal non-existing edges}}\nonumber \\&\quad +\sum _{i\ne {j}}c_{ij}\underbrace{A_{ij}\big [1-\delta (g_i, g_j)\big ]}_{\text {External existing edges}}-\sum _{i\ne {j}}d_{ij}\underbrace{(1-A_{ij})\big [1-\delta (g_i, g_j)\big ]}_{\text {External non-existing edges}}, \end{aligned}$$
(1)
where \(A_{ij}\) is the edge strength between nodes i and j, \(g_i\) is the label of the community that node i belongs to, and \(a_{ij}\), \(b_{ij}\), \(c_{ij}\), \(d_{ij}\) are free parameters. The delta function \(\delta (x,y)\) equals 1 if \(x=y\) and 0 otherwise, so \(\delta (g_i, g_j)\) selects pairs of nodes in the same community and \(1-\delta (g_i, g_j)\) selects pairs in different communities. Multilayer networks contain three kinds of edges (within-layer edges, couplings, and between-layer edges), so we need to expand this function to cover the additional edge types. Between-layer edges are ignored in this paper because they blur the boundary between the multilayer model and a single-layer network (with them, neither model restricts where edges may appear). Similar constructions can nevertheless be designed to enable between-layer edges in this model. The expanded quality function can be written as
$$\begin{aligned} \mathcal {H}_{M}(g)&= -\sum _{i\ne {j}}\sum _{v=1}^{F}\sum _{s=1}^{V_v}\underbrace{a_{ijs}^{\{v\}}A_{ijs}^{\{v\}}\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)}_{\text {Within-layer internal existing edges}}\nonumber \\&\quad +\sum _{i\ne {j}}\sum _{v=1}^{F}\sum _{s=1}^{V_v}\underbrace{b_{ijs}^{\{v\}} \left(1-A_{ijs}^{\{v\}}\right)\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)}_{\text {Within-layer internal non-existing links}} \nonumber \\&\quad +\sum _{i\ne {j}}\sum _{v=1}^{F}\sum _{s=1}^{V_v}\underbrace{c_{ijs}^{\{v\}}A_{ijs}^{\{v\}} \left[1-\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)\right]}_{\text {Within-layer external existing links}} \nonumber \\&\quad -\sum _{i\ne {j}}\sum _{v=1}^{F}\sum _{s=1}^{V_v}\underbrace{d_{ijs}^{\{v\}} \left(1-A_{ijs}^{\{v\}} \right) \left[1-\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)\right]}_{\text {Within-layer external non-existing links}} \nonumber \\&\quad -\sum _{sv\ne {rw}}\sum _{i=1}^{N}\underbrace{e_{isr}^{\{vw\}}C_{isr}^{\{vw\}}\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)}_{\text {Between-layer internal existing couplings}} \nonumber \\&\quad +\sum _{sv\ne {rw}}\sum _{i=1}^{N}\underbrace{f_{isr}^{\{vw\}} \left(1-C_{isr}^{\{vw\}}\right) \delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)}_{\text {Between-layer internal non-existing couplings}} \nonumber \\&\quad +\sum _{sv\ne {rw}}\sum _{i=1}^{N}\underbrace{g_{isr}^{\{vw\}}C_{isr}^{\{vw\}} \left[1-\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)\right]}_{\text {Between-layer external existing couplings}}\nonumber \\&\quad -\sum _{sv\ne {rw}}\sum _{i=1}^{N}\underbrace{h_{isr}^{\{vw\}} \left(1-C_{isr}^{\{vw\}}\right) \left[1-\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)\right]}_{\text {Between-layer external non-existing couplings}}, \end{aligned}$$
(2)
where s and r index layers, and v and w index aspects. N is the total number of nodes within a layer, \(V_v\) is the total number of layers of aspect v, and F is the number of aspects. The matrices \(\mathbf {A}\), \(\mathbf {C}\) and \(\mathbf {g}\) denote the within-layer adjacency, the between-layer adjacency (couplings), and the community labels, respectively. Note that compared to Eq. (1), the number of parameters has doubled after taking between-layer couplings into account. Equation (1) gives the general form of an objective function for community detection and can be used to derive the Hamiltonian of a Potts model in statistical mechanics as well as the modularity (Wu 1982; Reichardt and Bornholdt 2006), while Eq. (2) states the requirements that an objective function should satisfy in the multilayer case. Since the parameters of Eq. (2) control the penalties and rewards and are free to choose, we can take \(a_{ijs}^{\{v\}}=c_{ijs}^{\{v\}} = 1-b_{ijs}^{\{v\}} = 1-d_{ijs}^{\{v\}} = 1-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\) and \(e_{isr}^{\{vw\}} = f_{isr}^{\{vw\}} = g_{isr}^{\{vw\}} = h_{isr}^{\{vw\}}\) to obtain a representation similar to the multilayer modularity (Mucha et al. 2010), where \(p_{ijs}^{\{v\}}\), known as the null model, is the penalty factor, and the resolution parameter \(\gamma _s^{\{v\}}\) balances the contributions of penalty and reward. Thus, we obtain the Hamiltonian function
$$\begin{aligned} \mathcal {H}_M(g)&= -\sum _{ijsv}\left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\right) \left[2\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right)-1\right] \nonumber \\&\quad -\sum _{isrvw}e_{isr}^{\{vw\}}\left[2\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}}\right)-1\right]\left(2C_{isr}^{\{vw\}}-1\right). \end{aligned}$$
(3)
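The step from Eq. (2) to Eq. (3) can be checked pair by pair. As shorthand introduced only for this check, write \(A\) for \(A_{ijs}^{\{v\}}\), \(\pi\) for \(\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\), and \(\delta\) for \(\delta (g_{is}^{\{v\}}, g_{js}^{\{v\}})\). Substituting \(a=c=1-\pi\) and \(b=d=\pi\) into the four within-layer terms gives
$$\begin{aligned} &-(1-\pi )A\delta +\pi (1-A)\delta +(1-\pi )A(1-\delta )-\pi (1-A)(1-\delta ) \nonumber \\&\quad =(2\delta -1)\big [\pi (1-A)-(1-\pi )A\big ] \nonumber \\&\quad =-(A-\pi )(2\delta -1). \end{aligned}$$
Likewise, writing \(C\) for \(C_{isr}^{\{vw\}}\), \(e\) for the common value of the four coupling parameters, and \(\delta\) for \(\delta (g_{is}^{\{v\}}, g_{ir}^{\{w\}})\), the four coupling terms give
$$\begin{aligned} e\big [-C\delta +(1-C)\delta +C(1-\delta )-(1-C)(1-\delta )\big ]=-e\,(2\delta -1)(2C-1). \end{aligned}$$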
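In Eq. (3), the terms that do not contain \(\delta\) are constant during the optimization process. For a null model that preserves the total edge weight of each layer, \(\sum _{ij}p_{ijs}^{\{v\}}=\sum _{ij}A_{ijs}^{\{v\}}=2m_{s}^{\{v\}}\), we can therefore rewrite \(\mathcal {H}_M(g)\) as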
$$\begin{aligned} \mathcal {H}_M(g)&= -2\sum _{ijsv} \left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\right)\delta \left(g_{is}^{\{v\}}, g_{js}^{\{v\}}\right) \nonumber \\&\quad -2\sum _{isrvw}e_{isr}^{\{vw\}}\left(2C_{isr}^{\{vw\}}-1 \right)\delta \left(g_{is}^{\{v\}}, g_{ir}^{\{w\}} \right) \nonumber \\&\quad +\sum _{sv} \left(1-\gamma _{s}^{\{v\}} \right)\cdot {2m_{s}^{\{v\}}} + \sum _{isrvw} \left(2C_{isr}^{\{vw\}}-1 \right)e_{isr}^{\{vw\}}. \end{aligned}$$
(4)
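The claim that the \(\delta\)-free terms are constant can be checked numerically. The sketch below is illustrative only: it assumes a single aspect, random binary layers, the Newman–Girvan null model, and a uniform amplitude \(e=\omega\). All names (`hamiltonian_eq3`, `modularity`, `offset`) are hypothetical. It evaluates Eq. (3) and the \(\delta\)-dependent sum for random partitions and compares \(\mathcal {H}_M + 2Q\) with the predicted constant.

```python
import numpy as np

rng = np.random.default_rng(0)
N, S = 6, 3                  # nodes per layer, number of layers (single aspect)
gamma, omega = 0.8, 0.5      # illustrative resolution and coupling amplitude

# Random symmetric binary layers with an empty diagonal.
upper = rng.integers(0, 2, size=(S, N, N))
upper[:, 0, 1] = 1           # guarantee at least one edge per layer
upper = upper * np.triu(np.ones((N, N), dtype=int), k=1)
A = (upper + upper.transpose(0, 2, 1)).astype(float)

k = A.sum(axis=2)            # degrees k_{is}, shape (S, N)
two_m = k.sum(axis=1)        # 2 m_s, shape (S,)
P = k[:, :, None] * k[:, None, :] / two_m[:, None, None]  # Newman-Girvan null model
B = A - gamma * P

# Binary couplings C[i, s, r]; entries with s == r are masked out.
C = rng.integers(0, 2, size=(N, S, S))
C_tilde = omega * (2 * C - 1) * (1 - np.eye(S))

def delta_within(g):
    """delta(g_is, g_js) for every layer s, shape (S, N, N)."""
    return (g.T[:, :, None] == g.T[:, None, :]).astype(float)

def delta_between(g):
    """delta(g_is, g_ir) for every node i, shape (N, S, S)."""
    return (g[:, :, None] == g[:, None, :]).astype(float)

def hamiltonian_eq3(g):
    """Eq. (3) with e = omega, summed over all i, j and all s != r."""
    return (-(B * (2 * delta_within(g) - 1)).sum()
            - (C_tilde * (2 * delta_between(g) - 1)).sum())

def modularity(g):
    """The delta-dependent sum that appears in Eqs. (5) and (6)."""
    return (B * delta_within(g)).sum() + (C_tilde * delta_between(g)).sum()

# Partition-independent terms of Eq. (4).
offset = (1 - gamma) * two_m.sum() + C_tilde.sum()

for _ in range(3):
    g = rng.integers(0, 3, size=(N, S))   # random labels g[i, s]
    print(hamiltonian_eq3(g) + 2 * modularity(g), offset)
```

Under these assumptions the two printed numbers should agree for every partition, which is the sense in which minimizing the Hamiltonian and maximizing the \(\delta\)-dependent sum select the same partition.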
By defining \(\tilde{C}_{isr}^{\{vw\}}=e_{isr}^{\{vw\}}\left(2C_{isr}^{\{vw\}}-1\right)\), we obtain the standard form of the Hamiltonian for a many-particle system
$$\begin{aligned} -\frac{1}{2}\mathcal {H}_{M}(g)&= \sum _{isjrvw}\bigg [\left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\right)\delta _{sr}\delta _{vw}+\tilde{C}_{isr}^{\{vw\}}\delta _{ij}\bigg ]\delta \left(g_{is}^{\{v\}}, g_{jr}^{\{w\}}\right)\nonumber \\&\quad -\frac{1}{2}\bigg \{\sum _{sv}\left(1-\gamma _{s}^{\{v\}}\right)\cdot {2m_{s}^{\{v\}}}+ \sum _{isrvw}\tilde{C}_{isr}^{\{vw\}}\bigg \}, \end{aligned}$$
(5)
where the first term is proportional to Mucha’s modularity (Mucha et al. 2010), except that the values taken by \(\tilde{C}_{isr}^{\{vw\}}\) differ from those of \(C_{jsr}\) in Mucha et al. (2010), as discussed in “Selection of \(\tilde{C}_{isr}^{\{vw\}}\)”. The last two terms can be interpreted as a bias that is linear in the network size (the number of edges and couplings) and constant during the minimization. Therefore, minimizing the Hamiltonian is equivalent to maximizing the modularity. Finally, we obtain the modularity representation
$$\begin{aligned} Q_M(g)&= \sum _{isjrvw} \left[ \left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\right)\delta _{sr}\delta _{vw}+\tilde{C}_{isr}^{\{vw\}}\delta _{ij}\right]\delta \left(g_{is}^{\{v\}}, g_{jr}^{\{w\}}\right). \end{aligned}$$
(6)
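Here \(p_{ijs}^{\{v\}}\) is the expected within-layer edge strength under the null model. Different null models can be chosen for different network types, such as directed networks, bipartite networks, etc. (Mucha et al. 2010; Bazzi et al. 2014). For an undirected network, the conventional choice is the Newman–Girvan null model \(\frac{k_{is}^{\{v\}}k_{js}^{\{v\}}}{2m_{s}^{\{v\}}}\), in which edges are placed at random while the expected node degrees are preserved. Here \(k_{is}^{\{v\}}\) is the degree (strength) of node i in layer s of aspect v and \(m_{s}^{\{v\}}\) is the total edge weight of that layer, so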
$$\begin{aligned} Q_M(g)&= \sum _{isjrvw} \left[ \left(A_{ijs}^{\{v\}}-\gamma _{s}^{\{v\}}\frac{k_{is}^{\{v\}}k_{js}^{\{v\}}}{2m_{s}^{\{v\}}}\right)\delta _{sr}\delta _{vw} +\tilde{C}_{isr}^{\{vw\}}\delta _{ij}\right] \nonumber \\& \quad \times \delta \left(g_{is}^{\{v\}}, g_{jr}^{\{w\}}\right). \end{aligned}$$
(7)
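A minimal sketch of how Eq. (7) could be evaluated for a single aspect is given below. It is not the implementation used in this paper. The array layout (`A[s, i, j]`, `C_tilde[i, s, r]`, `g[i, s]`), the function name `multilayer_modularity`, and the use of one scalar `gamma` for all layers are assumptions made for illustration, and every layer is assumed to contain at least one edge.

```python
import numpy as np

def multilayer_modularity(A, C_tilde, g, gamma=1.0):
    """Evaluate Eq. (7) for one aspect.

    A       : (S, N, N) symmetric within-layer adjacency, A[s, i, j]
    C_tilde : (N, S, S) coupling term, C_tilde[i, s, r]
    g       : (N, S) integer community labels, g[i, s]
    gamma   : scalar resolution parameter shared by all layers (assumption)
    """
    A = np.asarray(A, dtype=float)
    C_tilde = np.asarray(C_tilde, dtype=float)
    g = np.asarray(g)
    S = A.shape[0]

    k = A.sum(axis=2)                                   # k_{is}, shape (S, N)
    two_m = k.sum(axis=1)                               # 2 m_s, shape (S,)
    null = k[:, :, None] * k[:, None, :] / two_m[:, None, None]

    same_within = g.T[:, :, None] == g.T[:, None, :]    # delta(g_is, g_js)
    same_between = g[:, :, None] == g[:, None, :]       # delta(g_is, g_ir)
    off_diagonal = ~np.eye(S, dtype=bool)               # keep only s != r

    within = ((A - gamma * null) * same_within).sum()
    between = (C_tilde * same_between * off_diagonal).sum()
    return within + between

# Toy example: two identical layers, each with edges 0-1 and 2-3,
# and every node coupled to its copy in the other layer (C_tilde = 1).
layer = np.zeros((4, 4))
layer[0, 1] = layer[1, 0] = layer[2, 3] = layer[3, 2] = 1.0
A = np.stack([layer, layer])
C_tilde = np.ones((4, 2, 2))

g_pairs = np.array([[0, 0], [0, 0], [1, 1], [1, 1]])    # {0,1} and {2,3}
g_single = np.zeros((4, 2), dtype=int)                  # one big community

print(multilayer_modularity(A, C_tilde, g_pairs))
print(multilayer_modularity(A, C_tilde, g_single))
```

In this toy case the partition that follows the two connected pairs scores higher than the single-community partition, because the within-layer term of the single community sums to zero under the Newman–Girvan null model.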
We now take a closer look at the choice of the parameters in Eq. (2), namely
$$\begin{aligned} {\left\{ \begin{array}{ll} a_{ijs}^{\{v\}} &{}=c_{ijs}^{\{v\}} = 1-\gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\\ b_{ijs}^{\{v\}} &{}= d_{ijs}^{\{v\}} = \gamma _{s}^{\{v\}}p_{ijs}^{\{v\}}\\ e_{isr}^{\{vw\}} &{}= f_{isr}^{\{vw\}} = g_{isr}^{\{vw\}} = h_{isr}^{\{vw\}}, \end{array}\right. } \end{aligned}$$
(8)
which groups the edges into two types and assigns each a different penalty or reward. The parameter values are in fact the effective numbers of each type of edge (the difference between the edges of the current network structure and those expected under the null model). In other words, a positive modularity is obtained if the edges and couplings within communities outnumber those between different communities. A higher modularity is reached when the edges are more densely concentrated within the communities.
Selection of \(\tilde{C}_{isr}^{\{vw\}}\)
Mucha et al. (2010) proposed a multilayer modularity based on Laplacian dynamics defined on a multilayer network model
$$\begin{aligned} Q' = \frac{1}{2\mu '}\sum _{ijsr}\left[ \left( A_{ijs}-\gamma _s\frac{k_{is}k_{js}}{2m_s}\right) \delta _{sr}+C_{jsr}\delta _{ij}\right] \delta (g_{is}, g_{jr}). \end{aligned}$$
(9)
Although Eq. (9) is structurally similar to the proposed form in Eq. (7) (taking \(v = w\) in Eq. (7) gives a single-aspect representation of the modularity in this paper), Mucha et al. did not discuss the coupling strength \(C_{jsr}\) in depth. They let \(C_{jsr}\) take the binary values \(\{0, \omega \}\) to represent the absence and presence of couplings, with \(\omega\) controlling the contribution of the couplings. In the proposed form, \(\tilde{C}_{isr}^{\{vw\}} = e_{isr}^{\{vw\}}\cdot (2C_{isr}^{\{vw\}}-1)\), so if we take \(e_{isr}^{\{vw\}} = \omega\), then \(\tilde{C}_{isr}^{\{vw\}}\) takes the values \(\{-\omega , \omega \}\), representing the absence and presence of couplings within a specific community. Compared with Mucha’s modularity, the proposed form penalizes couplings that are absent, so absent couplings also provide information about the community structure. Additionally, since \(e_{isr}^{\{vw\}}\) is entirely free, the proposed form of multilayer modularity can be adapted to various types of multilayer networks. We use two typical types of network as examples.
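The difference between the two coupling conventions can be made concrete with a few values. In the sketch below, the helper name `coupling_tilde` and the example numbers are illustrative assumptions. `C` holds binary coupling indicators, and the two results show the \(\{0, \omega \}\) convention next to the \(\{-\omega , \omega \}\) values of \(\tilde{C}\).

```python
import numpy as np

def coupling_tilde(C, e):
    """Return C_tilde = e * (2C - 1) for binary coupling indicators C."""
    return np.asarray(e, dtype=float) * (2 * np.asarray(C) - 1)

omega = 0.5
C = np.array([0, 1, 1, 0])               # absent, present, present, absent

binary_style = omega * C                 # values in {0, omega}
proposed_style = coupling_tilde(C, omega)  # values in {-omega, +omega}

print(binary_style)
print(proposed_style)
```

With the \(\{-\omega , \omega \}\) values, placing two uncoupled copies of a node in the same community lowers the modularity, whereas with \(\{0, \omega \}\) it leaves the modularity unchanged.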
Unevenly distributed views
Consider a common type of multilayer network whose layers are unevenly distributed, i.e., the intervals between pairs of layers can be unequal. In this situation, letting \(\tilde{C}_{isr}^{\{vw\}}\) take the same value for every pair of layers, regardless of how close the layers are, can cause large errors. For instance, suppose we have a multilayer electroencephalogram network in which each person is treated as a layer. The age differences and genders of the patients will clearly have a strong influence on the result (Sharma et al. 2015; Repovs et al. 2011). Therefore, the proposed model should be able to handle networks with unevenly distributed layers. Since \(e_{isr}^{\{vw\}}\) is a free parameter governing the amplitude of \(\tilde{C}_{isr}^{\{vw\}}\), we can adjust \(e_{isr}^{\{vw\}}\) according to the closeness of the layers as
$$\begin{aligned} e_{isr}^{\{vw\}} = \omega \cdot \frac{M_{sr}^{\{vw\}}}{\max _{s, r, v, w}{M_{sr}^{\{vw\}}}}, \end{aligned}$$
(10)
where \(M_{sr}^{\{vw\}}\) measures the closeness of layers s and r. We still use \(\omega\) to control the coupling strength, and hence the balance between within-layer edges and between-layer couplings.
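As a sketch of Eq. (10) for a single aspect, the snippet below scales a closeness matrix by its maximum. The text does not prescribe a closeness measure, so the choice \(M_{sr} = 1/(1+|\text {age}_s-\text {age}_r|)\), the example ages, and the function name `coupling_amplitude` are all assumptions made purely for illustration.

```python
import numpy as np

def coupling_amplitude(M, omega):
    """Eq. (10) for one aspect: e_sr = omega * M_sr / max(M)."""
    M = np.asarray(M, dtype=float)
    return omega * M / M.max()

# Hypothetical example: three layers, one per person, with these ages.
ages = np.array([20.0, 22.0, 40.0])
gap = np.abs(ages[:, None] - ages[None, :])

M = 1.0 / (1.0 + gap)        # assumed closeness: larger when ages are similar
np.fill_diagonal(M, 0.0)     # a layer is not coupled to itself (s == r)

e = coupling_amplitude(M, omega=1.0)
print(e)
```

The closest pair of layers receives the full amplitude \(\omega\), and more distant pairs receive proportionally weaker couplings.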
Temporal networks
In some research, a temporal network is defined as a sequence of networks corresponding to successive time points, with between-layer couplings indicating continuity between adjacent layers (Holme and Saramäki 2012; Bazzi et al. 2014; Berlingerio et al. 2013). For example, suppose that in a temporal network of phone calls, two nodes are linked by an edge in two successive layers. If couplings connect the corresponding nodes across the two layers, the call lasted through both time points; otherwise, the two nodes had two separate calls, one at each time point. Between-layer couplings therefore appear only between adjacent layers in such temporal networks. To satisfy this, we let \(e_{isr}^{\{vw\}} = 0\) when \(|s-r| \ne 1\) or when the link between the nodes does not last across the two time points.
Note that the intervals between time slices can also be unequal. For example, a person’s Facebook social networks at ages 15 and 16 will be similar, but they may differ considerably from the network at age 20. This time-interval problem can be addressed in the same way as the unevenly distributed layers discussed above.
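The adjacency constraint on \(e_{isr}^{\{vw\}}\) can be sketched as a mask over pairs of layers. The function name `temporal_amplitude` is hypothetical, and the sketch covers only the \(|s-r| = 1\) condition for a single aspect. The additional condition on whether a link persists between two time points depends on the data and is not modeled here.

```python
import numpy as np

def temporal_amplitude(n_layers, omega):
    """Return e[s, r] = omega if |s - r| == 1, and 0 otherwise."""
    s = np.arange(n_layers)
    adjacent = np.abs(s[:, None] - s[None, :]) == 1
    return omega * adjacent.astype(float)

e = temporal_amplitude(4, omega=0.5)
print(e)
```

For unequal time intervals, this mask could be multiplied elementwise by a closeness weight of the kind used in Eq. (10).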
Signed networks
Connections in complex systems can reflect either positive or negative interactions between nodes, which can be modeled as signed networks containing edges with positive or negative weights (Doreian and Mrvar 2009; Yang et al. 2007). The effects of the two kinds of edges on the structure of such networks should be distinguished: the contribution of positive edges should be rewarded, while the contribution of negative edges should be penalized. Mucha et al. (2010) derive the modularity by using a Laplacian dynamics operator that contains the sign information. We can bring signed edges into the proposed metric by representing the adjacency \(A_{ijs}^{\{v\}}\) as well as the null model \(p_{ijs}^{\{v\}}\) in Eq. (2) as combinations of both kinds of edges:
$$\begin{aligned} A_{ijs}^{\{v\}} = A_{ijs}^{\{v\}+}-A_{ijs}^{\{v\}-}, \end{aligned}$$
$$\begin{aligned} \gamma _s^{\{v\}}p_{ijs}^{\{v\}} = \gamma _s^{\{v\}+}p_{ijs}^{\{v\}+}-\gamma _s^{\{v\}-}p_{ijs}^{\{v\}-} \end{aligned}$$
Thus, we obtain the signed version of the proposed metric
$$\begin{aligned} Q_M(g)&= \frac{1}{\mu }\sum _{isjrvw}\left\{ \left[\left(A_{ijs}^{\{v\}+}-\gamma _s^{\{v\}+}\frac{k_{is}^{\{v\}+}k_{js}^{\{v\}+}}{2m_s^{\{v\}+}}\right) \right. \right. \nonumber \\ &\quad \quad \left. \left. -\left(A_{ijs}^{\{v\}-}-\gamma _s^{\{v\}-}\frac{k_{is}^{\{v\}-}k_{js}^{\{v\}-}}{2m_s^{\{v\}-}}\right)\right]\delta _{sr}\delta _{vw} \right. \nonumber \\ &\quad \quad \left. +\tilde{C}_{isr}^{\{vw\}}\delta _{ij}\right \}\delta \left(g_{is}^{\{v\}}, g_{jr}^{\{w\}} \right). \end{aligned}$$
(11)
The positively and negatively weighted terms are equivalent to treating the within-layer modularity as the combination of two “networks” with opposite contributions. We conclude that the proposed metric can handle signed networks by treating the negative edges as an additional network in the within-layer modularity.
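The within-layer part of Eq. (11) can be sketched for a single layer as follows. The function names, the toy network, and the convention that the sign of an entry of `A` marks a positive or negative edge are assumptions for illustration. The normalization \(1/\mu\) and the coupling term are omitted.

```python
import numpy as np

def newman_girvan_null(W):
    """k_i k_j / 2m for a non-negative symmetric weight matrix W."""
    k = W.sum(axis=1)
    two_m = k.sum()
    if two_m == 0:
        return np.zeros_like(W)
    return np.outer(k, k) / two_m

def signed_within_layer_matrix(A, gamma_pos=1.0, gamma_neg=1.0):
    """Bracketed within-layer term of Eq. (11) for one layer."""
    A = np.asarray(A, dtype=float)
    A_pos = np.where(A > 0, A, 0.0)      # positive-edge "network"
    A_neg = np.where(A < 0, -A, 0.0)     # negative-edge "network" (magnitudes)
    return ((A_pos - gamma_pos * newman_girvan_null(A_pos))
            - (A_neg - gamma_neg * newman_girvan_null(A_neg)))

def within_layer_score(B, g):
    """Sum B_ij over pairs of nodes with the same community label."""
    g = np.asarray(g)
    return (B * (g[:, None] == g[None, :])).sum()

# Toy layer: positive edges 0-1 and 2-3, one negative edge 1-2.
A = np.zeros((4, 4))
A[0, 1] = A[1, 0] = 1.0
A[2, 3] = A[3, 2] = 1.0
A[1, 2] = A[2, 1] = -1.0

B = signed_within_layer_matrix(A)
print(within_layer_score(B, [0, 0, 1, 1]))   # negative edge cut between groups
print(within_layer_score(B, [0, 1, 1, 2]))   # negative edge kept inside a group
```

In this toy layer the partition that separates the endpoints of the negative edge scores higher than the one that groups them together, in line with the idea that negative edges inside a community are penalized.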