Analytical estimation of the correlation dimension of integer lattices

Recently [L. Lacasa and J. Gómez-Gardeñes, Phys. Rev. Lett. 110, 168703 (2013)], a fractal dimension has been proposed to characterize the geometric structure of networks. This measure is an extension to graphs of the so-called correlation dimension, originally proposed by Grassberger and Procaccia to describe the geometry of strange attractors in dissipative chaotic systems. The calculation of the correlation dimension of a graph is based on the local information retrieved from a random walker navigating the network. In this contribution we study this quantity for some limiting synthetic spatial networks and obtain analytical results in agreement with the previously reported numerics. In particular, we show that, up to first order, the correlation dimension $\beta$ of integer lattices $\mathbb{Z}^d$ coincides with the Hausdorff dimension of their coarsely-equivalent Euclidean spaces, $\beta=d$.

In this article we address the concept of correlation dimension, which has recently been extended to network theory in order to efficiently characterize and estimate the dimensionality and geometry of complex networks [1]. This extension is inspired by the Grassberger-Procaccia method [2][3][4], originally designed to quantify the fractal dimension of strange attractors in dissipative chaotic dynamical systems. When applied to networks, it proceeds by capturing the trajectory of a random walker diffusing over a network with well-defined dimensionality. From this trajectory, an estimate of the network correlation dimension is retrieved by looking at the scaling of the walker's correlation integral. Here we give analytical support to this methodology by obtaining the correlation dimension of synthetic networks representing well-defined limits of real networks. In particular, we explore fully connected networks and integer lattices, the latter being coarsely-equivalent [20] to Euclidean spaces. We show that their correlation dimension coincides with the Hausdorff dimension of the respective coarsely-equivalent Euclidean space.

I. INTRODUCTION
During the last decade the science of networks has shed light on the importance that the real architecture of the interactions among the constituents of complex systems has for the onset of collective behavior [5][6][7]. In this way it has contributed to advances in many branches of science, such as statistical physics and nonlinear dynamics, in which the understanding of collective phenomena is fundamental. While the structural aspects of networks have been largely explored by means of topological measures [8], their geometrical aspects have been mostly ignored, with the remarkable exception of a few attempts to characterize the dimensionality of their complex interaction backbone [9][10][11][12]. For instance, the box-counting technique, widely used for estimating the capacity dimension $D_0$ of an object, was extended in [12][13][14][15] as a box-covering algorithm aimed at characterizing the dimensionality of complex networks.
* Electronic address: l.lacasa@qmul.ac.uk; † Electronic address: gardenes@gmail.com
Recently [1], we proposed an extension of the concept of correlation dimension [16] to estimate the dimensionality of complex networks by using random walkers to explore the network topology. This extension builds on the well-known Grassberger-Procaccia method [2][3][4], originally designed to quantify the fractal dimension of strange attractors in dissipative chaotic dynamical systems. This approach relies on embedding a trajectory of the dynamical system in an m-dimensional space and calculating a correlation integral over this trajectory.
The rationale of the extension of the Grassberger-Procaccia method to the network realm is that the geometrical structure of the network restricts the movement of a random walker and, accordingly, a notion of dimensionality can be extracted from the properties of the walker's trajectory. In particular, if the trajectory evolves over some object with well-defined correlation dimension, such dimension, β, should be accessible experimentally through the scaling of the walker's correlation sum, defined in the next section. In addition to its novelty, the use of the Grassberger-Procaccia method together with the machinery of random walks provides another nice example of the use of walkers to capture the structure and organization of a complex network, such as the centrality of nodes [17], its community structure [18] or the existence of degree correlations [19].
In [1] we showed numerical estimates of the correlation sum for walkers navigating a set of synthetic and real-world networks, finding a range of dimensions 1 < β < 3 (comprising integer and fractal values) for systems such as the world-wide air transportation network, road and energy networks or the Internet. On the other hand, other networks lack a scaling regime for the correlation sum, which distinguishes those systems whose structure has a strong degree of self-similarity from others in which such fundamental symmetry is missing. In the present contribution we give some analytical support to the findings and conjectures shown in [1]. We first address fully connected networks, which intuitively can only be embedded in infinite-dimensional spaces, and show that the correlation dimension is indeed a diverging quantity. Then we address integer lattices, which are coarsely-equivalent [20] to Euclidean spaces, giving analytical evidence that their correlation dimension coincides with the Hausdorff dimension of the respective coarsely-equivalent Euclidean space.

II. CORRELATION DIMENSION FROM RANDOM WALKS IN NETWORKS
Let us start by briefly reviewing the generalization of the Grassberger-Procaccia method to the computation of the correlation dimension of a complex network. We denote by G a spatially embedded undirected network with N nodes and L links, so that each node i of G is labelled by a generic vector $v_i$ that uniquely determines the location of node i in the underlying space ($v_i \in \mathbb{R}^d$, or $v_i \in \mathbb{Z}^d$ when the space is discrete). The network topology is given by the so-called N × N adjacency matrix A, whose elements are defined (for undirected and unweighted graphs) as $A_{ij} = A_{ji} = 1$ when nodes i and j are connected and $A_{ij} = A_{ji} = 0$ otherwise.
Once the network is defined, we must define the dynamical evolution of a random walker on network G. In the time-discrete version of a random walk, at each time step t the walker at some node i hops to one of its neighbors j with equal probability. Accordingly, the transition matrix M gives the probability that a walker at node i at time t is at node j at time t + 1 as
$$M_{ij} = \frac{A_{ij}}{k_i}, \qquad (1)$$
where $k_i = \sum_{l} A_{il}$ is the degree of node i. Thus, by initially placing a walker at some randomly chosen node, one iterates the dynamics prescribed by matrix M and follows the trajectory of the walker. Note that in practice one does not need to store M: only local information about the neighbors of the current node is needed at each time step, which makes the method useful for practical situations involving arbitrarily large networks, e.g., the Internet.
Now consider a trajectory of length n generated by an ergodic random walker navigating the network G as described above. The trajectory can be described as the sequence of visited nodes. In the case of spatial networks, the trajectory can be cast as the series $\{v_1, v_2, \ldots, v_n\}$, which is embedded in $\mathbb{R}^{m \cdot d}$ (where m is the embedding dimension) by defining the vector-valued series {V(i)}, where $V(i) \in \mathbb{R}^{m \cdot d}$ is defined as
$$V(i) = (v_i, v_{i+1}, \ldots, v_{i+m-1}). \qquad (2)$$
Finally, the correlation sum function $C_m(r)$ is defined as the fraction of pairs of vectors whose distance is smaller than some similarity scale r ∈ R:
$$C_m(r) = \frac{2}{(n-m)(n-m+1)} \sum_{i<j} \Theta\big(r - \|V(i) - V(j)\|\big), \qquad (3)$$
where Θ(x) is the Heaviside step function and $\|\cdot\|$ is a norm. Here, without loss of generality, we choose for convenience $\|\cdot\|$ as the $L_\infty$ norm, $\|V(i) - V(j)\| = \max_{l = 0, \ldots, m-1} \|v_{i+l} - v_{j+l}\|_\infty$, i.e., the largest absolute coordinate difference. Note that, although the seminal Grassberger-Procaccia method proposed the use of the Euclidean norm [2][3][4], the $L_\infty$ norm was later adopted by Takens in [21]; in any case, the results obtained should be norm invariant [16].
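The procedure just described (an unbiased walker that uses only local neighbor lists, a delay embedding of the visited positions, and the fraction of close pairs under the L∞ norm) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in [1]: the function names and the small chain network in the usage lines are hypothetical, and the strict inequality implements "distance smaller than r".

```python
import random

def random_walk(neighbors, start, n, rng):
    """Trajectory of n visited nodes; only local neighbor lists are needed."""
    path = [start]
    for _ in range(n - 1):
        path.append(rng.choice(neighbors[path[-1]]))
    return path

def embed(series, m):
    """Delay vectors V(i) = (v_i, ..., v_{i+m-1}); each v_i is a coordinate tuple."""
    return [series[i:i + m] for i in range(len(series) - m + 1)]

def linf_distance(V, W):
    """Largest absolute coordinate difference between two embedded vectors."""
    return max(abs(a - b) for v, w in zip(V, W) for a, b in zip(v, w))

def correlation_sum(series, m, r):
    """Fraction of pairs of embedded vectors whose L-infinity distance is below r."""
    vectors = embed(series, m)
    pairs = [(i, j) for i in range(len(vectors)) for j in range(i + 1, len(vectors))]
    close = sum(1 for i, j in pairs if linf_distance(vectors[i], vectors[j]) < r)
    return close / len(pairs)

# Hypothetical example: a chain of 6 nodes, node k located at coordinate (k,).
neighbors = {k: [j for j in (k - 1, k + 1) if 0 <= j < 6] for k in range(6)}
positions = {k: (k,) for k in range(6)}
path = random_walk(neighbors, start=0, n=50, rng=random.Random(1))
series = [positions[node] for node in path]
c = correlation_sum(series, m=2, r=1.5)
```

The pair loop is quadratic in the trajectory length, which is acceptable for a sketch; a production estimate on long trajectories would need a neighbor-search structure.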
The main scaling conjecture that was proposed and addressed numerically in [1] states that, when the series is extracted from the trajectory of a random walker navigating a network G with well-defined dimension, for sufficiently long series and sufficiently small values of r, $C_m(r)$ evidences a scaling regime such that
$$C_m(r) \sim r^{\beta_m} \quad \text{as } r \to 0,\ n \to \infty. \qquad (4)$$
The value $\beta_m$ approaches a constant value, $\beta_m \to \beta$, for sufficiently large embedding dimension m. This latter value β constitutes the estimate of the network's 'correlation dimension'. Notice that, in practice, the limit r → 0 should be substituted by a sufficiently small r which depends on the characteristic space labeling, i.e., if nodes are labelled by integer-valued vectors then the limit r → 0 should be substituted by $r \ll n$.
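In practice the exponent in the scaling conjecture is read off as the slope of log C_m(r) against log r over the scaling regime. The following sketch shows one common way to do this, an ordinary least-squares fit; the function name is hypothetical and the input values are synthetic, chosen only to show that an exact power law returns its exponent.

```python
import math

def scaling_exponent(radii, c_values):
    """Least-squares slope of log C_m(r) versus log r."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(c) for c in c_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic check on an exact power law C(r) = 0.01 * r**2 (made-up values).
radii = [1.0, 2.0, 4.0, 8.0]
beta = scaling_exponent(radii, [0.01 * r ** 2 for r in radii])
```

With measured correlation sums, the fit should be restricted to radii inside the scaling regime, since finite-size saturation at large r flattens the curve.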

III. FULLY CONNECTED NETWORK
After introducing the basis for the calculation of the correlation dimension of spatial graphs, we begin our study with the simple case of a fully connected network. This network, also termed complete, is a graph G in which each node is connected to the remaining N − 1 nodes, and thus the adjacency matrix reads $A_{ij} = 1 - \delta_{ij}$ (with $\delta_{ij} = 1$ if i = j and $\delta_{ij} = 0$ otherwise). Note that the fully connected network can be understood as the dense limit ($L \to N^2$) of a real network.
A fully connected network can be embedded in a Euclidean space with diverging dimensionality, where each node i is in turn labeled by an infinite-dimensional vector $v_i$ built from coefficients $\alpha_i \in \mathbb{N}$ and orthonormal basis vectors, $e_i \cdot e_j = \delta_{ij}$. In order to prove that the correlation dimension β of such an object diverges, we need to find that $\beta_m$ is a monotonically increasing function of the embedding dimension m.
In what follows we prove the above claim. First, notice that the transition matrix M (Eq. (1)) of a random walker navigating a fully connected network reads $M_{ij} = (1 - \delta_{ij})/(N - 1)$, showing that the walker can hop between any pair of nodes i and j with equal probability. This makes the infinite-dimensional labeling above arbitrary for any practical purpose. Thus, for convenience and without loss of generality, we label each node by a random number x extracted from a uniform distribution U[0, 1]. Accordingly, a random walker navigating this fully connected network generates a trajectory which is a sequence of n independent and identically distributed random variables, $\{x(1), x(2), \ldots, x(n)\}$. Consider now the distance between two embedded vectors, $\xi = \|V(i) - V(j)\|$, as a non-negative random variable itself, i.e., extracted from some unknown probability density, $\xi \sim \rho(x)$, $x \geq 0$. After dropping irrelevant constants, the correlation sum (Eq. 3) reduces to the probability $C_m(r) \sim P(\xi < r) = \int_0^r \rho(x)\,dx$. Our program is based on the calculation of ρ(x).
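Before turning to ρ(x), the uniform hopping probabilities quoted above can be checked directly from the definition of the transition matrix (each adjacency row divided by the node degree). The sketch below is only a sanity check on a small complete graph; the function name and the size N = 5 are arbitrary choices for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def transition_matrix(A):
    """M_ij = A_ij / k_i, with k_i the degree (row sum) of node i."""
    M = []
    for row in A:
        k = sum(row)
        M.append([Fraction(a, k) for a in row])
    return M

N = 5  # arbitrary small size for the check
A = [[0 if i == j else 1 for j in range(N)] for i in range(N)]  # A_ij = 1 - delta_ij
M = transition_matrix(A)
```

Every off-diagonal entry equals 1/(N − 1) and the diagonal vanishes, so the next node is chosen uniformly among the other N − 1 nodes regardless of the current one.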
Let us begin with embedding dimension m = 1. In this case $V(i) = v_i = x(i)$ and, according to the $L_\infty$ norm, $\xi = |x(i) - x(j)|$, where we recall that x(i) and x(j) are uniformly distributed random variables. The absolute difference of two independent U[0, 1] variables follows the triangular distribution $f(x) = 2(1 - x)$, $0 \leq x \leq 1$. Hence $\rho(x) = f(x)$ and $C_1(r) = \int_0^r 2(1 - x)\,dx = 2r - r^2$. For small values of r the scaling is linear, $C_1(r) \sim r$; that is, up to first order we find $\beta_1 = 1$.
In a second step let us consider the case m ≥ 2, for which $V(i) = (x(i), x(i+1), \ldots, x(i+m-1))$ and
$$\xi = \max\{|x(i+l) - x(i+l+\alpha)|;\ l = 0, \ldots, m-1\}, \qquad (12)$$
with α = j − i, where each of the random variables of the form |x(i) − x(j)| is now distributed following the triangular distribution f(x). Our problem thus lies in deriving how ξ is distributed. Note that this problem reduces to an extreme value problem, which can be solved using order statistics such that $\rho(x) = m\,F(x)^{m-1} f(x)$, where $F(x) = \int_0^x f(x')\,dx' = 2x - x^2$ is the cumulative distribution function of f(x). Therefore, in this general case the correlation sum yields $C_m(r) = \int_0^r \rho(x)\,dx = F(r)^m = (2r - r^2)^m \sim r^m$. Thus, we conclude that, up to first order, the correlation sum of a random walker navigating a fully connected network evidences a so-called trivial scaling with the similarity distance r: the exponent of the scaling $\beta_m$ increases linearly with the embedding dimension without saturation, $\beta_m = m$. This result is reminiscent of the infinite-dimensional attractor of white noise in the original Grassberger-Procaccia procedure [2][3][4], and, applied to the network realm, it corresponds to an infinite correlation dimension.
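The order-statistics argument can be compared with a direct simulation. The sketch below draws i.i.d. uniform labels, as the walker on the fully connected network is assumed to generate, and compares the measured fraction of close pairs with the prediction (2r − r²)^m. The sample size, seed and radius are arbitrary choices for illustration, and agreement is only expected up to sampling fluctuations.

```python
import random

def correlation_sum_iid(x, m, r):
    """Fraction of pairs of delay vectors with L-infinity distance below r."""
    n_vec = len(x) - m + 1
    close = 0
    for i in range(n_vec):
        for j in range(i + 1, n_vec):
            if max(abs(x[i + l] - x[j + l]) for l in range(m)) < r:
                close += 1
    return close / (n_vec * (n_vec - 1) / 2)

def predicted(m, r):
    """Order-statistics prediction F(r)**m with F(r) = 2r - r**2."""
    return (2 * r - r * r) ** m

rng = random.Random(0)
x = [rng.random() for _ in range(600)]  # i.i.d. labels visited by the walker
r = 0.3
results = {m: (correlation_sum_iid(x, m, r), predicted(m, r)) for m in (1, 2, 3)}
```

Each entry of `results` pairs the measured value with the prediction for one embedding dimension; the measured values drop roughly geometrically with m, which is the absence of saturation described above.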

IV. INTEGER LATTICES
In what follows we address integer lattices $\mathbb{Z}^d$, which are coarsely-equivalent [20] to Euclidean spaces with Hausdorff dimension d. For d ≤ 2 these lattices are, for instance, the regular limit of road or infrastructure networks (in this limit, all nodes have the same degree $k_i = 2d$ ∀i = 1, ..., N and are homogeneously located in the underlying space, tiling it in a regular way), and for d ≥ 3 these lattices represent discretizations of the Euclidean space.

A. 1D Lattice
A 1D lattice is simply a chain graph which, intuitively, tends to an object of Hausdorff dimension one as the distance between nodes shrinks continuously to zero. In what follows we propose two alternative proofs, a ballistic approximation and a calculation based on the unbiased motion of random walkers, both showing that the correlation dimension of 1D lattices is β = 1.

Ballistic approximation
As an approximation (relaxed below), let us first consider the case of a ballistic (deterministic) walker in the 1D lattice. If this lattice is labeled without loss of generality by integers (where two adjacent nodes are labeled as i and i + 1, and $A_{ij} = \delta_{j,i+1} + \delta_{j,i-1}$), then a typical walker produces the string {i, i + 1, i + 2, i + 3, i + 4, ...} or, by symmetry, {i, i − 1, i − 2, i − 3, i − 4, ...}. Both cases are equivalent and therefore yield equivalent results. We shall therefore address the former for concreteness.
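Let us start with embedding dimension m = 1. Then $\xi = |x(i) - x(j)| = |i - j|$ is a deterministic variable, and therefore the correlation sum can be explicitly calculated as $C_1(r) = \frac{2}{n(n-1)} \sum_{i<j} \Theta(r - |i - j|) \approx 2r/n$ for $r \ll n$, which is linear in r. Now, for an arbitrary embedding dimension m, the embedded vectors are of the form $V(i) = (i, i+1, \ldots, i+m-1)$, and according to the $L_\infty$ norm we obtain $\|V(i) - V(j)\| = \max_{l = 0, \ldots, m-1} |(i + l) - (j + l)| = |i - j|$. Thus, the arbitrary m case reduces to the case m = 1, so that $\beta_m = 1$ ∀m, showing, under the ballistic assumption, a correlation dimension β = 1.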
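The ballistic case is simple enough to verify by brute force. The sketch below computes the correlation sum of the trajectory 0, 1, ..., n − 1 for any embedding dimension and compares it with a direct count that uses only the fact that embedded vectors i and j are at distance |i − j|. Both function names are hypothetical, and the counting formula is a plain enumeration written for this illustration rather than an expression taken from the source.

```python
import math

def ballistic_correlation_sum(n, m, r):
    """Brute-force correlation sum for the trajectory 0, 1, ..., n-1."""
    vectors = [tuple(range(i, i + m)) for i in range(n - m + 1)]
    close = 0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if max(abs(a - b) for a, b in zip(vectors[i], vectors[j])) < r:
                close += 1
    return close / (len(vectors) * (len(vectors) - 1) / 2)

def counted_directly(n, m, r):
    """Same quantity from counting: n_vec - k pairs sit at distance k."""
    n_vec = n - m + 1
    k_max = min(math.ceil(r) - 1, n_vec - 1)  # largest integer distance below r
    close = sum(n_vec - k for k in range(1, k_max + 1))
    return close / (n_vec * (n_vec - 1) / 2)

values = [ballistic_correlation_sum(50, m, 5.5) for m in (1, 2, 3, 4)]
```

The entries of `values` differ only through the number of embedded vectors, n − m + 1, which is the finite-size counterpart of the statement that the general m case reduces to m = 1.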

Random walker
Now we relax the ballistic approximation and address the correlation dimension derived from the motion of a random walker. First, we again label, without loss of generality, the nodes of the 1D lattice by consecutive integers, and start by considering an embedding dimension m = 1. In this case the random walker performs a simple walk in $\mathbb{Z}$, and $\xi = |x(i) - x(j)|$. To analyze how ξ is distributed, notice that the difference x(i) − x(j) is the sum of |j − i| random variables, each extracted from {−1, +1}, which tends to a normal distribution with zero mean and variance |j − i| by virtue of the central limit theorem. Therefore ξ, the absolute value of this sum, tends to a folded normal distribution built from a normal with zero mean and variance |j − i|. After dropping irrelevant constants we obtain $\rho(x) \sim \exp\big(-x^2/(2|j - i|)\big)$, so that $C_1(r) \sim \int_0^r \rho(x)\,dx \sim \mathrm{erf}\big(r/\sqrt{2|j - i|}\big)$, where erf(x) is the error function, which fulfills $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{k!\,(2k+1)}$ and is well approximated by its first-order term, linear in r, for $r \leq 10^{-3} n$ (see the left panel of Fig. 1). Therefore $C_1(r) \sim r + \text{h.o.t.}$, i.e., up to first order $\beta_1 = 1$ for sufficiently large n and sufficiently small r.
As a second step, consider an embedding dimension m = 2. In this situation, $\xi = \max\{|x(i) - x(j)|, |x(i+1) - x(j+1)|\}$, where, since every step is ±1, the second term takes one of the values $|x(i) - x(j)|$, $|x(i) - x(j) + 2|$ or $|x(i) - x(j) - 2|$. Now, the important point is that these three random variables are completely correlated: they are not independent realizations but, on the contrary, all three depend on a single realization of the pair {x(i), x(j)}. Therefore, we do not need to apply order statistics in this case: ξ is again folded-normally distributed. The argument then proceeds as for m = 1, such that $C_2(r) \sim r + \text{h.o.t.}$
A similar argument holds for a general embedding dimension m, and therefore we conclude that for a 1D lattice an unbiased random walker generates a correlation sum which, in an embedding dimension m, reads $C_m(r) \sim r + \text{h.o.t.}$ ∀m; that is to say, up to first order the predicted correlation dimension of the 1D lattice is again β = 1.
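The claim that embedding adds no independent randomness can be made concrete: because each step is ±1, the L∞ distance between two delay vectors of length m can exceed |x(i) − x(j)| by at most 2(m − 1), a bound that follows from the triangle inequality. The sketch below checks this on one simulated walk; the walk length, seed and m = 4 are arbitrary choices, and the function names are hypothetical.

```python
import random

def walk_1d(n, rng):
    """Unbiased +-1 walk on the integers, started at 0."""
    x = [0]
    for _ in range(n - 1):
        x.append(x[-1] + rng.choice((-1, 1)))
    return x

def embedded_distance(x, i, j, m):
    """L-infinity distance between delay vectors V(i) and V(j)."""
    return max(abs(x[i + l] - x[j + l]) for l in range(m))

rng = random.Random(7)
x = walk_1d(300, rng)
m = 4
n_vec = len(x) - m + 1
worst_excess = max(
    embedded_distance(x, i, j, m) - abs(x[i] - x[j])
    for i in range(n_vec) for j in range(i + 1, n_vec)
)
```

Since the excess is bounded by a constant independent of the lag |j − i|, it becomes negligible against the typical distance between well-separated visits, which is why the m > 1 distance inherits the folded-normal form.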

B. 2D Lattice
We now consider a random walker in a 2D lattice. This is a regular network that tiles $\mathbb{Z}^2$ and in which all nodes have degree $k_i = 4$. In what follows we prove that, up to first order, the correlation dimension of this network is β = 2.
In this case each node of the lattice is labelled by a two-dimensional vector (x, y), where x, y ∈ $\mathbb{Z}$. Accordingly, a random walker generates a trajectory of the form $\{(x(i), y(i)), (x(i+1), y(i+1)), (x(i+2), y(i+2)), \ldots, (x(n), y(n))\}$, where the initial x(i) and y(i) are uncorrelated random variables extracted from a uniform discrete distribution U(1, n) and the trajectory is the result of the Markov process defined as
$$x(i+1) = \begin{cases} x(i) + 1, & \text{with probability } 1/2 \\ x(i) - 1, & \text{with probability } 1/2 \end{cases} \qquad (25)$$
and
$$y(i+1) = \begin{cases} y(i) + 1, & \text{with probability } 1/2 \\ y(i) - 1, & \text{with probability } 1/2 \end{cases} \qquad (26)$$
Let us begin by analyzing the case of embedding dimension m = 1. In this case $\xi = \max\{|x(i) - x(j)|, |y(i) - y(j)|\} = \max\{\eta_1, \eta_2\}$, where $\eta_1$ and $\eta_2$ are random variables with a probability distribution f(x) which reduces to the case of a 1D lattice, i.e., by dropping irrelevant terms, $f(x) \sim \exp\big(-x^2/(2|j - i|)\big)$. Therefore, according to order statistics, we find that $\rho(x) = 2\,F(x) f(x)$, with $F(x) = \int_0^x f(x')\,dx'$, and finally the correlation sum reads $C_1(r) = \int_0^r \rho(x)\,dx = F(r)^2 \sim \mathrm{erf}^2\big(r/\sqrt{2|j - i|}\big) \sim r^2 + \text{h.o.t.}$, i.e., up to a first-order expansion in r, the correlation sum for m = 1, $C_1(r)$, scales quadratically (see the right panel of Fig. 1 for a numerical check).
Finally, in the general case m > 1, one can follow an argument similar to the one used for a random walker in a 1D lattice, finding that $C_m(r) \sim r^2 + \text{h.o.t.}$ ∀m, i.e., the exponent $\beta_m$ saturates to the correlation dimension β = 2.
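The 2D walker can also be simulated directly. The sketch below implements the two-coordinate Markov process described above (each coordinate moves by ±1 independently at every step) and measures the m = 1 correlation sum both for the full 2D positions and for the x coordinate alone. As simplifying assumptions, the walk starts at the origin instead of a uniformly drawn node (the distances are translation invariant), and the walk length, seed and radii are arbitrary.

```python
import random

def walk_2d(n, rng):
    """Each coordinate moves by +-1 independently at every step."""
    pos = [(0, 0)]
    for _ in range(n - 1):
        x, y = pos[-1]
        pos.append((x + rng.choice((-1, 1)), y + rng.choice((-1, 1))))
    return pos

def correlation_sum_m1(points, r):
    """Embedding dimension m = 1: fraction of pairs with L-infinity distance below r."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(abs(a - b) for a, b in zip(points[i], points[j])) < r:
                close += 1
    return close / (n * (n - 1) / 2)

rng = random.Random(3)
traj = walk_2d(400, rng)
xs_only = [(x,) for x, _ in traj]
for r in (2.5, 4.5, 8.5):
    # columns: r, C_1(r) for the x coordinate alone, C_1(r) for the 2D walk
    print(r, correlation_sum_m1(xs_only, r), correlation_sum_m1(traj, r))
```

Because the L∞ distance is the larger of the two coordinate distances, the 2D value can never exceed the 1D one at the same r, so it must fall off faster as r decreases. A trajectory this short gives only a rough picture of the exponent.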

C. dD Lattice
To round off, we now prove that in the general case of integer lattices (for a general value of d), the correlation dimension of the lattice coincides, up to first order, with the Hausdorff dimension of the coarsely-equivalent Euclidean space, β = d.
First, the trajectory generated by the walker in a d-dimensional lattice, where each node is labelled by a d-dimensional vector $(x_1, x_2, \ldots, x_d)$, $x_l \in \mathbb{Z}$ ∀l = 1, 2, ..., d, is $\{(x_1(i), \ldots, x_d(i)), (x_1(i+1), \ldots, x_d(i+1)), \ldots, (x_1(n), \ldots, x_d(n))\}$, and therefore, for a one-dimensional embedding (m = 1) we have $\xi = \max\{\eta_1, \eta_2, \ldots, \eta_d\}$, where $\eta_l = |x_l(i) - x_l(j)|$ are random variables with a probability distribution f(x). Finding the probability density of ξ is again an extreme value problem, where order statistics predicts $\rho(x) = d\,F(x)^{d-1} f(x)$, with $F(x) = \int_0^x f(x')\,dx'$ the cumulative distribution function of f(x). Therefore, the correlation sum for m = 1 reads $C_1(r) = \int_0^r \rho(x)\,dx = F(r)^d \sim r^d$ up to a first-order expansion in r. In the general case m > 1, an argument similar to the one used for a random walker in a 1D lattice holds, thus finding that indeed $C_m(r) \sim r^d + \text{h.o.t.}$ ∀m, i.e., the correlation sum scales as $r^d$ and thus the correlation dimension of a dD lattice is β = d.
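The order-statistics step can be checked without any sampling for a fixed lag k = |j − i|. For independent ±1 steps the displacement along one coordinate is exactly binomial, and independence of the d coordinates gives P(max η_l < r) = P(η < r)^d. The sketch below evaluates the local log-log slope of this probability between two radii; the lag, the radii and the function names are illustrative assumptions, and fixing a single lag is a simplification of the pair average in the correlation sum.

```python
import math

def prob_distance_below(k, r):
    """Exact P(|S_k| < r) for a sum S_k of k independent +-1 steps."""
    total = 0
    for s in range(-k, k + 1, 2):          # S_k has the parity of k
        if abs(s) < r:
            total += math.comb(k, (k + s) // 2)
    return total / 2 ** k

def local_exponent(k, d, r1, r2):
    """Log-log slope of P(max of d independent copies < r) between r1 and r2."""
    p1 = prob_distance_below(k, r1) ** d
    p2 = prob_distance_below(k, r2) ** d
    return math.log(p2 / p1) / math.log(r2 / r1)

k = 10_000                                  # lag |j - i| between the two visits
exponents = {d: local_exponent(k, d, 5, 21) for d in (1, 2, 3)}
```

For radii well below the square root of k the slope is close to d, and by construction the d-dimensional slope is exactly d times the one-dimensional one.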

V. CONCLUSION
Recently, the notion of fractal dimensionality has been investigated numerically within networks [1,12,13,15]. The techniques used have borrowed concepts from measure theory and dynamical systems such as the capacity and correlation dimension respectively. To this aim the corresponding techniques, such as the classical box-counting algorithm and the Grassberger-Procaccia method, have been generalized to the network realm.
In this manuscript we have focused on the latter of these techniques to show that the correlation dimension of some synthetic networks, as defined in [1] and in equations 3 and 4, coincides with the Hausdorff dimension of their coarsely-equivalent Euclidean spaces [20]. Note that a network and a Euclidean space are very different objects at small scales (their topology is entirely different) but they resemble each other at large scales. Therefore our results, although desired and expected, are nontrivial.
In addition, the analytical calculations shown in this manuscript support the validity of the numerical results shown in [1] for more sophisticated synthetic and real-world networks. However, finding similar analytical evidence in the case of empirical networks is quite a difficult task. A slightly easier problem, which is left for future work, is to address the correlation dimension of spatially embedded complex network ensembles with robust statistical properties, i.e., the so-called annealed graphs [22][23][24][25].