On degree-degree correlations in multilayer networks

We propose a generalization of the concept of assortativity based on the tensorial representation of multilayer networks, covering the definitions given in terms of Pearson and Spearman coefficients. Our approach can also be applied to weighted networks and provides information about correlations considering pairs of layers. By analyzing the multilayer representation of the airport transportation network, we show that contrasting results are obtained when the layers are analyzed independently or as an interconnected system. Finally, we study the impact of the level of assortativity and heterogeneity between layers on the spreading of diseases. Our results highlight the need of studying degree-degree correlations on multilayer systems, instead of on aggregated networks.


Introduction
The use of network science to study the structure and dynamics of complex systems has proved to be a successful approach to understand the organization and function of several natural and artificial systems [1][2][3][4]. The traditional framework used up to a few years ago represents the structure of complex systems as singlelayer (also referred to as monoplex) networks, in which only one type of connection is accounted for. However, this approach is limited because most natural and artificial systems such as the brain, our society or modern transportation networks [5,6], are made up by different constituents and/or different types of interaction. Indeed, their structure is organized in layers. For instance, in social networks individuals can be connected according to different social ties, such as friendship or family relationship (e.g. [7]). In transportation networks, routes of a single airline can be represented as a network, whose vertices (destinations) can be mapped into networks of several companies [8]. Gene co-expression networks consist of layers, each one representing a different signaling pathway or expression channel [9]. Therefore, mapping out the structure of these and similar systems as a monoplex network could lead to miss relevant information that could not be captured if the single layers are analyzed separately nor if all layers are collapsed altogether in an aggregated graph. Additionally, note that in most of these interconnected systems, the information travels not only among vertices of the same layer, but also between pairs of layers.
Degree-degree correlations is a fundamental property of singlelayer networks, impacting the spreading of diseases, synchronization phenomena and systems' resilience [19,3]. Additionally, it has been reported that different correlations arise in different kinds of networks: social networks are in general assortative, meaning that highly connected nodes tend to link with each other, whereas technological and biological systems have disassortative structures, in which high degree nodes are likely attached to low degree nodes [20].
For networks made up of more than one layer, only recently, Nicosia and Latora [21] considered the correlation between the degrees in two different layers. However, their methodology is only for node-aligned multiplex networks, which are special cases of multilayer networks (see [5]). In fact, multiplex networks are made up of N nodes that can be in one or more interacting layers. The links in each layer represent a given mode of interaction between the set of nodes belonging to that layer, whereas links connecting different layers stand for the different modes of interaction between objects involved in [5].
In this paper we study degree-degree correlations in multilayer systems and propose a way to generalize previous assortativity metrics by considering the tensorial formulation introduced in [10]. Our approach also covers a weighted version of assortativity [22] and the case in which the assortativity is given by the Spearman correlation coefficient, generalizing the definition in [23]. Aside from those, it worth mention the generalization for weighted and directed networks [24]. The study of a real dataset corresponding to the airport transportation network shows a contrasting behavior between the analyses of each layers independently and altogether, which reinforces the need for such a generalization of the assortativity measure. Finally, we study the influence of degree-degree correlations on epidemic spreading in multilayer networks. We verify that the impact of the disease depends on degree-degree correlations and also on the level of heterogeneity between the layers.

Assortativity in multilayer networks
Tensors are suitable for representation of multilayer networks. As showed in [10], tensors allow us to consider a branch of new relationships between nodes and layers, by encoding a multilayer network as a fourth order mixed tensor, M αδ βγ , i.e. 2-covariant and 2-contravariant basis, in the Euclidean space. Such representation is convenient for many operations, as discussed in [10]. We use the definition of the interlayer adjacency tensor C α β (hr) that is a second order tensor which has the information of the relationships between nodes in layersh andr. Note that C α β (rr) is the adjacency matrix for the layerr and belongs to R N×N space. Then, the multilayer adjacency tensor is expressed as the summation over all layers L of the tensorial product of the adjacency tensors, C α β (hr), and the canonical Euclidean basis. Mathematically, which belongs to R N×N×L×L space.
Following Einstein's summation convention, the assortativity coefficient can be written as where u is the 1-tensor, which is a tensor of rank 1 and has all elements equal to 1, W α β is a second order tensor that summarizes the information that is being extracted and M = W α β U β α is a normalization constant.
Let us explain in more details all terms appearing in the expression of ρ(W α β ). First, we define which is a 1-contravariant tensor and which is a 1-covariant tensor. Moreover, the indices are related to the direction of the relationships between nodes. Such a choice ensures a more general expression, capturing degree correlations on non-symmetric tensors and, consequently, in directed and weighted networks. Due to the multiplex nature of such systems we obtain different types of correlations, which can be uncovered by operating on the adjacency tensor. First of all, it is possible to extract a single layer by the operation called single layer extraction [10]. In this case, the adjacency tensor is defined as which is a simple projection on the canonical basis, Eδγ (rr). It is noteworthy that the results obtained from this projection are the same as those obtained by considering the layerr as a monoplex network and applying the traditional formulation of assortativity [20]. On the other hand, to consider all layers together, we can use the projected network, which is a weighted single-layer network. Formally it is given as Note that the projection presents self-edges and, as argued in [10], it is different from a weighted monoplex network, since self-edges code for inter-layer couplings between different replica of the same object. Thus they have a different meaning with respect to other edges. A version of the projection without self-edges is called overlay network and is given as the contraction over the layers [10], i.e., Observe that the overlay network does not consider the contribution of the interlayer connections, whereas the projection does. As we will see later, comparisons between the assortativity of those two different representations of the system reveal the key role of such inter-links. In both cases, i.e., for the overlay and the projected networks, we extract degree-degree correlations. Nodes with similar degrees connected in the same or different layers contribute positively to the assortativity coefficient. On the other hand, the connections between hubs and low degree nodes in the same or different layers decrease the assortativity. Self-edges always increase the assortativity, which yields different values of assortativity for the overlay and the projected networks. This gives information on the nature of the coupling between different replicas of the same object among different layers.
In some applications, it is interesting to calculate a pair-wise correlation between a set of nodes, for instance, between couple of layers. Thus, we propose a new operation, that we call selection, which is a projection over a selected set of layers: where Ωγ δ is a tensor used to select the set of layers we consider in the projection (L). The components of the tensor are equal to unity when the layersδ andγ are selected, and zero otherwise. Note that by selecting all layers together we recover the 1-tensor Uγ δ and consequently Eq. (6). Another special case isδ =γ , which yields Eq. (5), or the layer extraction. The tensor can also be generalized to weight different layers. In this case, each element of Ωγ δ contains the weight of the relationship between two layersδ andγ . Such projection is similar to the covariance matrix in statistics, which generalizes the concept of variance. The covariance between two variables is quantified in each entry of the matrix and the main diagonal has the variance of each variable. Thus, we can define a matrix that generalizes the assortativity in a similar manner as the covariance matrix generalizes the concept of variance, i.e.
which belongs to a R L×L space. We call S the P-assortativity matrix. Also in this case, a similar operation for the overlay network can be considered, yielding which can also be generalized in a similar way as Eqs. (8) and (9), resulting in the matrix We call Z the O-assortativity matrix. A similar interlayer correlation was also proposed in [21], where the authors suggested measuring the degree correlation between two different layers of the replica of the same object (or node). Furthermore, they proposed three different ways: the Pearson correlation coefficient, Spearman rank correlation and the Kendall's τ index. However, it is worth pointing out that such an approach does not consider the intralayer relationship because it is only for node-aligned multiplex networks [5]. Here, we generalize such a measure in terms of tensorial notation.
Up to now we have considered nodes, but if we extract the network of layers [10], the correlation between different layers can also be evaluated. We use where U β α is the second-order tensor whose components are all equal to one. It is important to stress that the components of this adjacency tensor are not binary, but weighted by the number of edges inter each layer. Moreover, also in this case, the resulting tensor presents self-edges that encode the information about the density of connections inside a single layer. Finally, we can consider only interlayer relationships over two different layers. Such information is extracted by projecting the adjacency tensor on the canonical base as Note that this is only applicable to multilayer networks and does not make sense in multiplex networks, since in the latter case the coupling is diagonal.
The assortativity coefficient can also be defined in terms of the Spearman rank correlation [23], since the traditional definition of this coefficient based on the Pearson correlation [20] can lead to incomplete results, as discussed in [23]. The generalization of assortativity coefficient proposed here allows to consider the Spearman rank correlation coefficient by changing Eqs. (3) and (4).
Specifically, instead of considering the values of Q α and Q β , one substitutes them by their respective ranks. 1 Such transformation is performed by using and where rank(X i ) is the rank of the tensor X i . We henceforth denote by ρ P (W α β ) and ρ S (W α β ) the Pearson and Spearman correlation coefficients, respectively. Furthermore, we adopt (S P )γ δ and (S S )γ δ for the pair-wise correlation matrices using the Pearson and Spearman correlation coefficients, respectively.
The same notation can be used for the matrices (Z P )γ δ and (Z S )γ δ . Monoplex assortativity, i.e. assortativity in single-layer networks [20], is recovered by considering the adjacency matrix, , and consequently Q α and Q β are analogous to in-degree and out-degree, respectively. Note that Q α = Q β for undirected networks. Moreover, M is equal to twice the number of edges, recovering the equation introduced in [20], which also captures correlations of weighted networks, as exposed in [22].
Each approach presented here gives a different descriptor of the multilayer structure. For instance, the projected and overlay networks gather the information of all layers into a single layer structure, aiming at describing the whole system using single descriptors. Aside from those, we also provide a pair-wise descriptor which gives another type of information. Besides, there is also the network of layers, that possesses information about yet another level of the system. In this way, our approach gives a set of metrics that capture information about the whole multilayer structure. However, it is worth mentioning that the interpretation and choices depend on the application.

Application to real data
We analyze the airport transportation network [25], whose multilayer representation was studied in [8]. The network comprises 450 airports and 37 companies, which are mapped as nodes and layers, respectively. More specifically, in each layer, the edges represent the directed flights operated by a given company and nodes, airports. Fig. 1 shows a representation of 12 layers of such multilayer network. The inter-layer connections link the airports shared by pairs of different companies. This approach gives us a multilayer network that is not a node-aligned multiplex network, since the latter considers a diagonal coupling between all nodes in all layers. Note that the way proposed in [8] to create the aggregated monoplex network is the union of all layers considering multiple edges as single ones. This is in contrast to our approach, because we consider the projections and overlay networks as weighted networks, thus retaining the information of the number of different connections between the same pair of airports.
Previous studies [25,8] showed that the airport transportation network presents the rich-club effect, which refers to the tendency of highly central nodes to be connected among themselves. This is also captured by the assortativity as shown in Table 1, where we verify that the projected network has positive assortativity coefficients, agreeing with previous analyses. However note that the projection has a positive value of the assortativity, whereas the overlay has a negative one. Thus, the assortativity of the projection 1 One may not confuse rank in this context with the tensorial rank. Here it is the position in the ordered set of values, whereas the rank of a tensor is the number of covariant and contravariant indices. ) - Fig. 1. Example of an airport transportation multilayer network. Each layer represents an airline, in which each node represents an airport and the edges are flights between two airports. This visualization was generated using MuxViz [26]. indicates that many companies share hubs airport, not that hubs connect between them. This apparent contradiction results from the fact that the rich-club effect is masked out in the overlay setup by the large number of peripheral nodes connecting to hubs. The analysis of each layer separately shows a different result, where most of the layers are disassortative. The only exception is the Netjet layer, which presents a positive coefficient for the rank correlation. Usually the companies focus their activities in one city or country, for example, Lufthansa in Germany or Air France in France, and have flights to other airports where their activity is lower. This leads to the disassortative behavior of each layer. Additionally, the disassortative correlations found in single layers is more pronounced than that of the overlay representation, which can be explained by noticing that hubs of a company are peripheral (or secondary) airports for other companies, but when the layers are collapsed they are also hubs in the overlay network and are connected. Fig. 2 shows the pair-wise correlation between layers. Interestingly, the latter is disassortative, in contrast to the results obtained for the projected network, but of the same sign as those computed for the overlay representation (see Table 1). Furthermore, our construction of the adjacency tensor leads to an assortative network of layers, suggesting that bigger companies tend to share similar airports. This analysis agrees with [8], where the authors argued that the main airports are connected to each other via directed flights. In addition, considering the Pearson correlations, the O-assortativity matrix presents lower values if compared to the P-assortativity matrix due to the intra-layer contributions, as discussed before.

Epidemic spreading in correlated multilayer networks
We investigate the effects of degree-degree correlations on epidemic spreading. To this end, we consider a classical SIS (Susceptible-Infected-Susceptible) model, in which nodes can be in one of two states, susceptible or infected [27]. Susceptible individuals can be infected if they are in contact with infected individuals, who have already caught the disease and are actively spreading it. Infected individuals get back to the susceptible state with probability µ. Here, we adopt the discrete formulation presented in [13], considering a fully reactive processes (RP). Such work is a generalization to multiplex networks of the Discrete-Time Markov Chain approach (DTMC) formerly presented in [28]. Furthermore, we remark that the DTMC formalism assumes that the state of two nodes is independent, which is an approximation. Fig. 2. Pair-wise assortativity coefficient using Spearman rank correlation, ρ S (S α β ) in (a) and ρ S (Z α β ) in (b). Observe that the main diagonal presents the same coefficient considering the layer extraction operation, ρ S (C α β (rr)). Besides, such approximation was shown to capture the main features of the process as observed in [28]. In addition to the numerical approach we also perform Monte Carlo simulations of the epidemic process.
As in [13], we consider intra and inter layer spreading. Let λ be the probability of spreading through an intra-layer contact and γ the spreading probability through an inter-layer contact. We also assume the possibility of one-step reinfection, that is, an infected individual can be cured and reinfected in the same time interval. Furthermore, it is convenient to consider the ratio between intralayer and inter-layer spreading probabilities as a constant [13], hence, we set η = γ λ = 2 on our numerical experiments. Other choices were also possible, for instance if η = 1 the intra and inter-edges are indistinguishable. Moreover, if η < 1 it is easier to spread over an intra-layer edge than over an inter-layer edge.
Besides, note that for η = 0 we have a set of decoupled layers. On the other hand, for η > 1 the opposite applies. The chosen value (η = γ λ = 2) enforces the multiplex structure, however, we have found similar results considering other ratios.
To obtain the expression of the macro-state variable in the tensorial notation, we redefine the supra-contact matrix as where Eσ η (γδ) ∈ R L×L indicates the tensor in the canonical basis and δγ δ is the Kronecker delta, which is equal to one ifγ =δ and zero otherwise. Such tensor encodes the relationship between individuals, weighting the inter-layer edges by γ λ , which implicitly impose that the spreading rate between layers is a constant times the intra-layer spreading ratio. It is worth mentioning that in [29] the authors study such tensor and its role in the epidemic spreading process considering following the tensorial formalism, however considering a continuous time approach.
Denoting the probability of the node β, on layerδ, becoming infected at time t as X βδ (t), the discrete time evolution equation for this probability is described as where the probability that a node will not be infected by any of its neighbors at time t is given as Observe that in Eq. (17), the indices βδ are not dummy and there is no summation on it. A more formal notation would be obtained substituting X βδ by X ησ E ησ (βδ). The implicit summation has only one term different from zero, which is X βδ . Finally, the macro-state variable is given as where U βδ ∈ R N×L is the all one tensor. In other words it is an average over all the individuals. Observe that our equations are exactly the same presented in [13], but here we consider the tensorial notation. In addition to the analytical approach we also perform Monte Carlo simulations to evaluate the influence of degree-degree correlations on epidemic spreading. The simulations are performed in a synchronous manner, i.e., every node changes its state at the same time and the events between t and t +1 are assumed to occur at the same time. In this way, at each time step, every spreader is cured with probability µ. After that, every infected individual contacts all its neighbors, thus representing fully reactive processes (RP). However, note that a cured spreader can still propagate the disease, because its state changes only at the end of the time step. This procedure enables the occurrence of one-step reinfections. After the contact, the disease spreading can occur in two different ways: (i) for inter-layers, where the spreading takes place with probability λ, or (ii) for intra-layers, where the spreading occurs with probability γ .
In order to quantify the effect of degree-degree correlations on the spreading process, we generate two scale-free networks, with degree distribution P(k) ≈ k −m , according to the configuration model [30]. The first layer has m ≈ 3 and ⟨k⟩ ≈ 17, whereas the other one we evaluate in two different configurations: (i) m ≈ 4.5 and ⟨k⟩ ≈ 13 and (ii) m ≈ 2.8 and ⟨k⟩ ≈ 12. Both networks are composed of N = 10 4 nodes.
On the other hand, to control the level of degree-degree correlations in random networks, we consider a simulated annealing algorithm [31]. This algorithm is based on two functions, i.e., (i) the perturbation function, which changes the system configuration, and (ii) the energy function, which is minimized. In our case, the perturbation function is a rewiring procedure that preserves the degree distribution of the network, but changes the large-scale degree-degree correlations. The energy function is defined as E t = c(ρ t + 1), where ρ t is the network assortativity at time t and c is a constant related to the level of degree-degree correlation, i.e., c = −1 if the goal is to obtain an assortative network or c = 1 if the goal is a disassortative network.
Given an initial network configuration, an initial temperature, T and a cooling factor α, the algorithm can be described by the following steps: (i) the energy function is initialized as E 0 ; (ii) while the number of iterations are less than a threshold or the optimal solution is not found (or good solution, given a tolerance) the following steps are performed: (iii) a rewiring preserving the degree distribution is executed, according to our perturbation function; (iv) the new energy function, E t+1 , is calculated; is a random number sampled from a uniform distribution in Observe that a worse state than the current one can be accepted with a probability exp This mechanism allows the system to avoid local minima. Following this procedure, we can generate random networks with a defined level of degree-degree correlation. Thus, using the simulated annealing algorithm above, we tune the assortativity on the individual layers. We can have three different configurations for each layer, i.e., (i) one assortative, (ii) one disassortative and (iii) one non-assortative. Those individual layers are connected, forming a multiplex network. In this case, we can have three different configurations: (i) assortative: densely connected nodes from one layer is connected to densely connected nodes in the other layer, (ii) disassortative: hubs in one layer are connected to low degree nodes in the other layer, and (iii) random: nodes in different layers are randomly connected. In this way, our dataset is composed by 27 multiplex networks presenting different levels of assortativity. We also consider η = γ λ = 2 and µ = 1. Similar results are found for different values of η. Fig. 3 shows the simulations of the SIS dynamics on top of multiplex networks with different levels of assortativity. We can see a good agreement between the Monte Carlo simulation and the theoretical macro-state variable (see Eq. (19)), although we assume that there is no correlation among the state of each random variable. Each network has different values of the epidemic threshold and also exhibits different behaviors near the threshold. Indeed, ) - The continuous lines are the analytical solution (Eq. (19)), while the symbols are obtained from Monte Carlo simulations, averaging over 10 2 runs. The standard deviation is of the size of the symbols. Each line was obtained as a result of a multiplex network whose assortative was tuned following the simulated annealing algorithm described on the text. The obtained assortativity is represented by the colors of the curves. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) the epidemic threshold for assortative networks is at a lower transmission probability. This happens because the disease has a faster initial growth rate and a shorter duration in assortative networks than in disassortative networks. Nevertheless, disassortative networks show higher values of φ for larger values of γ /λ. This result agrees with the analysis of single layer networks in [32]. Notice that the same behavior is observed for Fig. 3(a) and (b), although in (a) the network is more heterogeneous than in (b). In fact, for more heterogeneous networks, the influence of degree-degree correlations is reduced.

Conclusions
In this paper we have generalized the metric used to calculate assortativity of multilayer networks. Our approach consists of reducing the dimension of the adjacency tensor and applying the Pearson correlation coefficient on the extremes of each edge. We follow the tensorial approach, which help us to have a compact, algorithmic and general formulation, covering various topological representations and possibilities, such as overlay and projected networks, and also pair-wise measurements. The calculation of the Spearman rank correlation is also possible from our formulation.
In the study of the airport transportation network, we verified that the individual analysis of the overlay or the projected networks can yield misleading conclusions. Indeed, as shown in Table 1 the assortativity values for the projected and overlay networks are different. The overlay network shows a small disassortative behavior, while the projected graph is highly assortative. This indicates that the main contribution to the assortativity of the projected network is given by self-edges, i.e., hub airports that are present in many different layers. On the other hand, individually, each layer is disassortative, with a value much higher in module than the one obtained for the overlay network. This is because companies tend to have one big hub from which connections to many other airports are established, but, at the same time, they tend to have direct flights to the hubs of other companies. This interpretation of the data is also confirmed in Fig. 2, in which we observe a high negative value along the diagonal of the matrix, i.e., a strong disassortative behavior of isolated layers. In addition, we can see relative smaller negative values for the elements out of the diagonal, i.e., a relative weaker disassortative behavior, representing pair-wise correlations between different layers, i.e. companies. Furthermore, the comparison of the P-assortativity and the O-assortativity matrices also emphasizes the importance of the two different analysis, where the first is slightly higher due to the selfedges. Finally, the network of layers shows an assortative behavior, suggesting again that the main airline companies share similar airports.
Finally, we studied the effects of degree-degree correlations on epidemic spreading. The results obtained from Monte Carlo simulations and theoretical analysis using a Markov Chain formulation in terms of tensors show that the level of assortativity and heterogeneity between layers influence the spreading process. More specifically, we verified that assortative networks show a smaller epidemic threshold, and that the disease has a faster initial growth rate in these networks, but a shorter duration. On the contrary, the fraction of infected individuals is larger in disassortative networks. Finally, we have also shown that degree-degree correlations have a larger impact on the spreading dynamics when the coupled networks have similar levels of heterogeneity.