Continuity Scaling: A Rigorous Framework for Detecting and Quantifying Causality Accurately

Data-based detection and quantification of causation in complex, nonlinear dynamical systems is of paramount importance to science, engineering, and beyond. Inspired by the widely used methodology in recent years, the cross-map-based techniques, we develop a general framework to advance towards a comprehensive understanding of dynamical causal mechanisms, which is consistent with the natural interpretation of causality. In particular, instead of measuring the smoothness of the cross-map as conventionally implemented, we define causation through measuring the scaling law for the continuity of the investigated dynamical system directly. The uncovered scaling law enables accurate, reliable, and efficient detection of causation and assessment of its strength in general complex dynamical systems, outperforming those existing representative methods. The continuity scaling-based framework is rigorously established and demonstrated using datasets from model complex systems and the real world.


Introduction
Identifying and ascertaining causal relations are a problem of paramount importance to science and engineering with broad applications [1][2][3]. For example, accurate detection of causation is the key to identifying the origin of diseases in precision medicine [4] and is important to fields such as psychiatry [5]. Traditionally, associational concepts are often misinterpreted as causation [6,7], while causal analysis in fact goes one step further beyond association in a sense that, instead of using static conditions, causation is induced under changing conditions [8]. The principle of Granger causality The cross-map is originated from nonlinear time series analysis [37][38][39][40][41][42]. A brief understanding of such a map is as follows. Consider two subsystems: X and Y. In the reconstructed phase space of X, if for any state vector at a time a set of neighboring vectors can be found, the set of the cross-mapped vectors, which are the partners with equal time of X, could be available in Y. The cross-map underlying the reconstructed spaces can be written as Y t = ΦðX t Þ (where X t and Y t are delay coordinates with sufficiently large dimensions) for the case of Y unidirectionally causing X while, mathematically, its inverse map does not exist [34]. In practice, using the prior knowledge on the true causality in toy models or/and the assumption on the expanding property of Φ (representing by its Jacobian's singular value larger than one in the topological causality framework [24]), scientists developed many practically useful techniques based on the cross-map for causality detection. For instance, the "activity" method, originally designed to measure the continuity of the inverse of the cross-map, compares the divergence of the cross-mapped vectors to the state vector in X with the divergence of the independentlyselected neighboring vectors to the same state vector [22,23]. The topological causality measures the divergence rate of the cross-mapped vectors from the state vectors in Y [24], and the convergent cross-mapping (CCM), increasing the length of time series, compares the true state vector Y with the average of the cross-mapped vectors, as the estimation of Y [21,[25][26][27][28][29][30][31][32][33][34][35][36]. Then, the change of the divergence or the accuracy of the estimation is statistically evaluated for determining the causation from Y to X. Inversely, the causation from X to Y can be evaluated in an analogous manner. The above evaluations [21,24,[26][27][28][29][30][31][32][33][34][35][36] can be understood at a conceptional and qualitative level and perform well in many demonstrations.
In this work, striving for a comprehensive understanding of causal mechanisms and inspired by the cross-map-based techniques, we develop a mathematically rigorous framework for detecting causality in nonlinear dynamical systems, turning eyes towards investigating the original systems from their cross-maps, which is also logically consistent with the natural interpretation of causality as functional dependences [2,8]. The skills used in cross-map-based methods are assimilated in our framework, while we directly study the original dynamical systems or the reconstructed systems instead of the cross-maps. The foundation of our framework is the scaling law for the changing relation of ε with δ arising from the continuity for the investigated system, henceforth the term "continuity scaling". In addition to providing a theory, we demonstrate, using synthetic and real-world data, that our continuity scaling framework is accurate, computationally efficient, widely applicable, showing advantages over the existing methods.

Continuity Scaling Framework
To explain the mathematical idea behind the development of our framework, we use the following class of discrete time dynamical systems: x t+1 = fðx t , y t Þ and y t+1 = gðx t , y t Þ for t ∈ ℕ, where the state variables fx t g t∈ℕ , fy t g t∈ℕ evolve in the compact manifolds M, N of dimension D M , D N under sufficiently smooth map f, g, respectively. We adopt the common recognition of causality in dynamical systems. Definition 1. If the dependence of fðx, yÞ on y is nontrivial (i. e., a directional coupling exists), a variation in y results in a change in the value of fðx, yÞ for any given x, which, according to the natural interpretation of causality [2,43], admits that y : fy t g t∈ℕ has a direct causal effect on x : fx t g t∈ℕ , denoted by y↪x, as shown in the upper panel of Figure 1(a).
We now interpret the causal relationship stipulated by the continuity of a function. Let f x g ð·Þ ≜ fðx g , ·Þ for a given point x g ∈ M. For any y P ∈ N , we denote its image under the given function by x I ≜ f x g ðy P Þ. Applying the logic statement of a continuous function to f x g ð·Þ, we have that, for any neighborhood Oðx I , εÞ centered at x I and of radius ε > 0, there exists a neighborhood Oðy P , δÞ centered at y P of radius δ > 0, such that f x g ðOðy P , δÞÞ ⊂ Oðx I , εÞ. The neighborhood and its radius are defined by where dist S ð·, · Þ represents an appropriate metric describing the distance between two given points in a specified manifold S with S = M or N . The meaning of this mathematical statement is that, if we have a neighborhood of the resulting variable x I first, we can then find a neighborhood for the causal variable y P to satisfy the above mapping and inclusion relation. This operation of "first-ε-then-δ" provides a rigorous base for the principle that the information about the resulting variable can be used to estimate the information of the causal variable and therefore to ascertain causation, as indicated by the long arrow in the middle panels of Figure 1(a). Note that, the existence of the δ > 0 neighborhood is always guaranteed for a continuous map f x g . In fact, due to the compactness of the manifold N , a largest value of δ exists. However, if y P does not have an explicit causal effect on the variable x I , i.e., f x g is independent of y P , the existence of δ is still assured but it is independent of the value of ε, as shown in the upper panel of Figure 1(b). This means that merely determining the existence of a δneighborhood is not enough for inferring causation -it is necessary to vary ε systematically and to examine the scaling relation between δ and ε. In the following we discuss a number of scenarios.
Case I. Dynamical variables fðx t , y t Þg t∈ℕ are fully measurable. For any given constant ε x > 0, the set fx τ ∈ Mjτ ∈ I t x ðε x Þg can be used to approximate the neighborhood Oðx t+1 , ε x Þ, where the time index set is The radius δ t y = δ t y ðε x Þ of the neighborhood Oðy t , δ t y Þ 2 Research where #½· is the cardinality of the given set and the index set The strict mathematical steps for estimating δ t y are given in Section II of Supplementary Information (SI). We emphasize that here correspondence between x t+1 and y t is investigated, differing from the cross-map-based methods, with one-step time difference naturally arising. This consideration yields a key condition [DD], which is only need when considering the original iteration/flow and whose detailed description and universality are demonstrated in SI. We reveal a linear scaling law between hδ t y i t∈ℕ and ln ε x , as shown in the lower panels of Figure 1, whose slope s y↪x is an indicator of the correspondent relation between ε and δ and hence the causal relation y↪x. Here, h·i t∈ℕ denotes the average over time. In particular, a larger slope value implies a stronger causation in the direction from y to x as represented by the map functions fðx t , y t Þ (Figure 1(a)), while a near zero slope indicates null causation in this direction ( Figure 1(b)). Likewise, possible causation in the reversed direction, x↪y, as represented by the function gð x t , y t Þ, can be assessed analogously. And the unidirectional case when fðx, yÞ = f 0 ðxÞ independent of y is uniformly considered in Case II. We summarize the consideration below and an argument for the generic existence of the scaling law is provided in Section II of SI.

Theorem 2.
For dynamical variables fðx t , y t Þg t∈ℕ measured directly from the dynamical systems, if the slope s y↪x defined above is zero, no causation exists from y to x. Otherwise, a directional coupling can be confirmed from y to x and the slope s y↪x increases monotonically with the coupling strength.
Case II. The dynamical variables fðx t , y t Þg t∈ℕ are not directly accessible but measurable time series fu t g t∈ℕ and fv t g t∈ℕ are available, where u t = uðx t Þ and v t = vðy t Þ with u: M ⟶ ℝ r u and v: N ⟶ ℝ r v being smooth observational functions. To assess causation from y to x, we assume onedimensional observational time series (for simplicity): r u = r v = 1, and use the classical delay-coordinate embedding method [37][38][39][40][41][42]44] to reconstruct the phase space: where τ u,v is the delay time and d u,v > 2ðD M + D N Þ is the embedding dimension that can be determined using some standard criteria [45]. As illustrated in Figure 2, the dynamical evolution of the reconstructed states fðu t , v t Þg t∈ℕ is governed by The map functions can be calculated asfðu, vÞ ≜ E u ∘ ½ f, gðΠ 1 Figure 1: Illustration of causal relation between two sets of dynamical variables. (a) Existence of causation from y in N to x in M, where each correspondence from x t+1 to y t is one-to-one, represented by the line or the arrow, respectively, in the upper and the middle panels. In this case, a change in ln ε x results in a direct change in δ y (the lower panel) with ε x and δ y denoting the neighborhood size of the resulting variable x and of the causal variable y, respectively. (b) Absence of causation from y to x, where every point on each trajectory, fy t g, in N could be the correspondent point from x t+1 in M (the upper panel) and thus every point in N belongs to the largest δ-neighborhood of y t (the middle panel). In this case, δ y does not depend on ε x (the lower panel). Also refer to the supplemental animation for illustration.

Research
with the inverse function E −1 s defined onL s , ½f, g k representing the kth iteration of the map and the projection mappings as Π 1 ðx, yÞ = x and Π 2 ðx, yÞ = y. Case II has now been reduced to Case I, and our continuity scaling framework can be used to ascertain the causation from v to u based on the measured time series with the indices I t u ðε u Þ, δ t v ðε u Þ and s v↪u (equations (2) and (3)).
Does the causation from v to u imply causation from y to x? The answer is affirmative, which can be argued, as follows. If the original map function f is independent of y: fðx, yÞ = f 0 ðxÞ, there is no causation from y to x. In this case, the embedding E u ðx, yÞ becomes independent of y, degenerating into the form of E u ðx, yÞ = E u0 ðxÞ, a diffeomorphism from M toL u0 = E u0 ðMÞ only. As a result, equation (4) becomes u0 ð uÞ and the resulting mappingf 0 is independent of v. The independence can be validated by computing the slope s v↪u associated with the scaling relation between hδ t v i t∈ℕ and ln ε u , where a zero slope indicates null causation from v to u and hence null causation from y to x. Conversely, a finite slope signifies causation between the variables. Thus, any type of causal relation (unidirectional or bi-directional) detected between the reconstructed state variables fðu t , v t Þg t∈ℕ implies the same type of causal relation between the internal but inaccessible variables x and y of the original system.
Case III. The structure of the internal variables is completely unknown. Given the observational functionsũ,ṽ: M × N ⟶ ℝ withũ t =ũðx t , y t Þ andṽ t =ṽðx t , y t Þ, we first reconstruct the state space:ũ t = ðũ t ,ũ t+τ ,⋯,ũ t+ðd−1Þτ Þ T andṽ t = ðṽ t ,ṽ t+τ ,⋯,ṽ t+ðd−1Þτ Þ T . To detect and quantify causation fromṽ toũ (or vice versa), we carry out a continuity scaling analysis with the modified indices I tũ ðεũÞ, δ t v ðεũÞ and sṽ ↪ũ . Differing from Case II, here, due to the lack of knowledge about the correspondence structure between the internal and observational variables, a causal relation for the latter does not definitely imply the same for the former.
Case IV. Continuous-time dynamical systems possessing a sufficiently smooth flow fS t ; t ∈ ℝg on a compact manifold H : dS t ðu 0 Þ/dt = χðS t ðu 0 ÞÞ, where χ is the vector field. Let fû t=ωn+ν g n∈ℤ and fv t=ωn+ν g n∈ℤ be two respective time series from the smooth observational functionsû,v: H ⟶ ℝ witĥ u t =ûðS t Þ andv t =vðS t Þ, where 1/ω is the sampling rate and ν is the time shift. Defining Ξ ≜ S ω : H ⟶ H andŜ n ≜ S ωn+ν ðu 0 Þ, we obtain a discrete-time system asŜ n+1 = ΞðŜ n Þ with the observational functions asû n =ûðŜ n Þ andv n =vð S n Þ, reducing the case to Case III and rendering applicable our continuity scaling analysis to unveil and quantify the causal relation between fû t=ωn+ν g n∈ℤ and fv t=ωn+ν g n∈ℤ . If the domains ofû andv have their own restrictions on some particular subspaces, e.g.,û: the case is further reduced to Case II, so the detected causal relation between the observational variables imply causation between the internal variables belonging to their respective subspaces.

Demonstrations: From Complex Dynamical Models to Real-World Networks
To demonstrate the efficacy of our continuity scaling framework and its superior performance, we have carried out extensive numerical tests with a large number of synthetic and empirical datasets, including those from gene regulatory networks as well as those of air pollution and hospital admission. The practical steps of the continuity scaling framework together with the significance test procedures are described in Methods. We present three representative examples here, while leaving others of significance to SI. The first example is an ecological model of two unidirectionally interacting species: x 1,t+1 = x 1,t ð3:8 − 3:8x 1,t − μ 12 x 2,t Þ and x 2,t+1 = x 2,t ð3:7 − 3:7x 2,t − μ 21 x 1,t Þ. With time series fðx 1,t , x 2,t Þg t∈ℕ obtained from different values of the coupling parameters, our continuity scaling framework yields correct results of different degree of unidirectional causation, as shown in Figures 3(a) and 3(b). In all cases, there exists a reasonable range of ln ε x 2 (neither too small nor too large) from which the slope s x 1 ↪x 2 of the linear scaling can be extracted. The statistical significance of the estimated slope values and consequently the strength of causation can be assessed with the standard p-value test [46] (Methods and SI). An ecological model with bidirectional coupling has also been tested (see Section III of SI). Figures 3(c) and 3 igure 2: Illustration of system dynamics before and after embedding for Case II. In the left panel, the arrows describe how the original systems ðf, gÞ is equivalent to the system ðf,gÞ after embedding. In the right panel, causation between the internal variables x and y can be ascertained by detecting the causation between the variables u and v reconstructed from measured time series. 4 Research show the results from ecological networks of five mutually interacting species on a ring and on a tree structure, respectively, where the color-coded slope values reflect accurately the interaction patterns in both cases. The second example is the coupled Lorenz system: _ We use time series fy 1,t , y 2,t g t=nω for detecting different configurations of causation (see Section III of SI). Figure 4 presents the overall result, where the color-coded estimated values of the slope from the continuity scaling are shown for different combinations of the sampling rate 1/ω and coupling strength. Even with relatively low sampling rate, our continuity scaling framework can successfully detect and quantify the strength of causation. Note that the accuracy does not vary monotonously with the sampling rate, indicating the potential of our framework to ascertain and quantify causation even with rare data. Moreover, the proposed index can accurately reflect the true causal strength (denoting by the coupling parameter), which is also evidenced by numerical tests in Sections III and IV of SI. Robustness tests against different noise perturbations are provided in Section III of SI demonstrating the practical usefulness of our framework. Additionally, analogous to the first example, we present in SI several examples on causation detection in the coupled Lorenz system with nonlinear couplings, and the Rössler-Lorenz system, etc., which further demonstrates the generic efficacy of our framework.
In addition, we present study on several real-world dataset, which brings new insights to the evolutionary mechanism of underlying systems. We study gene expression data from DREAM4 in silico Network Challenge [47,48], whose intrinsic gene regulatory networks (GRNs) are known for verification (Figure 5(a) and Figure S17 of SI). Applying 5 Research our framework to these data, we ascertain the causations between each pair of genes by using the continuity scaling framework. The corresponding ROC curves for five different networks as well as their AUROC values are shown in Figure 5(b), which indicates a high detection accuracy in dealing with real-world data.
We then test the causal relationship in a marine ecosystem consisting of Pacific sardine landings, northern anchovy landings and sea surface temperature (SST). We reveal new findings to support the competing relationship hypothesis stated in [49] which cannot be detected by CCM [25]. As pointed out in Figure 6, while common influence from SST to both species is verified with both methods, our continuity scaling additionally illuminates notable influence from anchovy to sardine with its reverse direction being less significant. While competing relationship plays an important role in ecosystems, continuity scaling can reveal more essential interaction mechanism. See Section III.E of SI for more details.
Moreover, we study the transmission mechanism of the recent COVID-19 pandemic. Particularly, we analyze the daily new cases of COVID-19 of representative countries for two stages: day 1 (January 22 nd 2020) to day 100 (April 30 th 2020) and day 101 (May 1 st 2020) to day 391 (February 15 th 2021). Our continuity scaling is pairwisely applied to reconstruct the transmission causal network. As shown in Figure 7, China shows a significant effect on a few countries at the first stage and this effect disappears at the second stage. However, other countries show a different situation with China, whose external effect lasts as shown in Section III.E and Figure S18 of SI. Our results accord with that China holds stringent epidemic control strategies with  Table S9 of SI for all the other parameters including the time series lengths used in the simulations.  Research sporadic domestic infections, as evidenced by official daily briefings, demonstrating the potential of continuity scaling in detecting causal networks for ongoing complex systems. Additionally, We emphasize that day 100 is a suitable critical day to distinguish the early severe stage and the late well-under-control stage of the pandemic (see Figure S18(a) of SI), while slight change of the critical day will not nullify our result. As shown in Figure S18(b) of SI, when the critical day varies from day 94 to day 106, no significant change (less than 5%) of the detected causal links occurs at both stages, and the number of countries under influence of China at Stage 2 remains zero. See more details in Section III.E of SI. Additional real world examples including air pollutants and hospital admission record from Hong Kong are also shown in Section III of SI.

Discussion
To summarize, we have developed a novel framework for data-based detection and quantification of causation in complex dynamical systems. On the basis of the widely used cross-map-based techniques, our framework enjoys a rigorous foundation, focusing on the continuity scaling law of the concerned system directly instead of only investigating the continuity of its cross-map. Therefore, our framework is consistent with the standard interpretation of causality, and works even in the typical cases where several existing typical methods do not perform that well or even they fail (see the comparison results in Section IV of SI). In addition, the mathematical reasoning leading to the core of our framework, the continuity scaling, helps resolve the long-standing issue associated with techniques directly using cross-map that information about the resulting variables is required to project the dynamical behavior of the causal variables, whereas several works in the literature [50], which directly studied the continuity or the smoothness of the cross-map, likely yielded confused detected results on causal directions.
Computational complexity. The computational complexity of the algorithm is OðT 2 N ε Þ, which is relatively smaller than the CCM method, whose computational complexity is OðT 2 log TÞ.
Limitations and future works. Nevertheless, there are still some spaces for improving the presently proposed framework. First, currently, only bivariate detection algorithm is designed, so generalization to multivariate network inference requires further considerations, as analogous to those works presented in Refs. [51][52][53]. Second, the causal time delay has not been taken into account in the current framework, so it also could be further investigated, similar to the work reported in Ref. [33]. Also, more advanced algorithms, such as the one developed in Ref. [54], could be  7 Research integrated into this framework for detecting those temporal causal structures. Definitely, we will settle these questions in our future work.
Detecting causality in complex dynamical systems has broad applications not only in science and engineering, but also in many aspects of the modern society, demanding accurate, efficient, and rigorously justified and hence trustworthy methodologies. Our present work provides a vehicle along this feat and indeed resolves the puzzles arising in the use of those influential methods.

Methods
Continuity scaling framework: a detailed description of algorithms. Let fu t g t=1,2,⋯,T and fv t g t=1,2,⋯,T be two experimentally measured time series of internal variables fðx t , y t Þg t∈ℕ . Typically, if the dynamical variables fðx t , y t Þg t∈ℕ are accessible, fðu t , v t Þg reduce to one-dimensional coordinate of the internal system. The key computational steps of our continuity scaling framework are described, as follows.
We reconstruct the phase space using the classical method of delay coordinate embedding [37] with the optimal embedding dimension d z and time lag τ z determined by the methods in Refs. [55,56] (i.e., the false nearest neighbors and the delayed mutual information, respectively): where z = u, v, T 0 = min fT − ðd z − 1Þτ z jz = u, vg, and Euclidean distance is used for both L u,v . We present the steps for causation detection using the case of v↪u as an example.
We calculate the respective diameters for L u,v as where z = u, v, and z = u, v. We set up a group of numbers, fε u,j g j=1,⋯,N ε , as ε u,1 = e · D u , ε u,N ε = D u , with the other elements satisfying for j = 2, ⋯, N ε − 1. Then, in light of (2) with (3), we have with where numerically, ε u alters its value successively from the set fε u,j g j=1,⋯,N ε , and the threshold E is a positive number chosen to avoid the situation where the nearest neighboring points are induced by the consecutive time order only. As defined, hδ t v ðε u Þi t∈ℕ is the average of δ t v ðε u Þ over all possible time t. We use a finite number of pairs fðhδ t v ðε u,j Þi t∈ℕ T 0 , ln ε u,j Þg j=1,⋯,N ε to approximate the scaling relation between hδ t v ðε u Þi t∈ℕ and ln ε u , where ℕ T 0 = f1, 2,⋯,T 0 g. Theoretically, a larger value of N ε and a smaller value of e will result in a more accurate approximation of the scaling relation. In practice, the accuracy is determined by the length of the observational time series, the sampling duration, and different types of noise perturbations. In numerical simulations, we set e = 0:001 and N ε = 33. In addition, a too large or a too small value of ε u can induce insufficient data to restore the neighborhood and/or the entire manifold. We thus set δ t v ðε u,j Þ = δ t v ðε u,j+1 Þ as a practical technique as the number of points is limited practically in a small neighborhood. As a result, near zero slope values would appear on both sides of the scaling curve hδ t v ðε u Þi t∈ℕ -ln ε u , as demonstrated in Figure 3 and in SI. In such a case, to estimate the slope of the scaling relation, we take the following approach.
Define a group of numbers by where j = 1, ⋯, N ε − 1, sort them in a descending order, from which we determine the ½N ε + 1/2 largest numbers, collect their subscriptsj's together as an index setĴ, and set H ≜ fj, j + 1jj ∈Ĵg. Applying the least squares method to the linear regression model: with the dataset fðhδ t v ðε u,j Þi t∈ℕ T 0 , ln ε u,j Þg j∈H , we get the optimal values ðŜ,bÞ for the parameters ðS, bÞ in (12) and finally obtain the slope of the scaling relation as s v↪u ≜Ŝ.
For the other causal direction from u to v, these steps are equally applicable to estimating the slope s u↪v .
To assess the statistical significance of the numerically determined causation, we devise the following surrogate test using the case of v causing u as an illustrative example.
Divide the time series fuðtÞg t∈ℕ T 0 into N G consecutive segments of equal length (except for the last segment -the shortest segment). Randomly shuffle these segments and then regroup them into a surrogate sequence fûðtÞg t∈ℕ T 0 . Applying such a random permutation method to fvðtÞg t∈ℕ T 0 generates another surrogate sequence fvðtÞg t∈ℕ T 0 . Carrying out the slope computation yields sv ↪û . The procedure can be repeated for a sufficient number of times, say Q, which consequently yields a group of estimated slopes, denoted as fs where normcdf ½· is the cumulative Gaussian distribution function. The principle of statistical hypothesis testing guarantees the existence of causation from v to u if p s v↪u < 0:05.
In simulations, we set the number of segments to be N G = 25 and the number of times for random permutations to be Q ≥ 20.

Additional Points
Code Availability. The source codes for our CS framework are available at https://github.com/bianzhiyu/ContinuityScaling.

Conflicts of Interest
The authors declare no competing interests.