| Title: | Cross-Sectional Dependence Models |
|---|---|
| Description: | Provides estimators and utilities for large panel-data models with cross-sectional dependence, including mean group (MG), common correlated effects (CCE) and dynamic CCE (DCCE) estimators, and cross-sectionally augmented ARDL (CS-ARDL) specifications, plus related inference and diagnostics. |
| Authors: | Joao Claudio Macosso [aut, cre] (ORCID: <https://orcid.org/0009-0006-5051-9312>) |
| Maintainer: | Joao Claudio Macosso <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0.1 |
| Built: | 2026-05-23 08:46:49 UTC |
| Source: | https://github.com/macosso/csdm |
Computes Pesaran CD, CDw, CDw+, and CD* tests for cross-sectional dependence
in panel residuals. The implementation supports residual matrices or fitted
csdm_fit objects and provides consistent handling of unbalanced panels.

    cd_test(object, ...)

    ## Default S3 method:
    cd_test(
      object,
      type = c("CD", "CDw", "CDw+", "CDstar", "all"),
      n_pc = 4L,
      seed = NULL,
      min_overlap = 2L,
      na.action = c("drop.incomplete.times", "pairwise"),
      ...
    )

    ## S3 method for class 'csdm_fit'
    cd_test(
      object,
      type = c("CD", "CDw", "CDw+", "CDstar", "all"),
      n_pc = 4L,
      seed = NULL,
      min_overlap = 2L,
      na.action = c("drop.incomplete.times", "pairwise"),
      ...
    )

    ## S3 method for class 'cd_test'
    print(x, digits = 3, ...)

| Argument | Description |
|---|---|
| object | A fitted csdm_fit object or a numeric matrix of residuals. |
| ... | Additional arguments passed to methods. |
| type | Which test(s) to compute: one of "CD", "CDw", "CDw+", "CDstar", or "all". |
| n_pc | Number of principal components for CD* (default 4). |
| seed | Integer seed for weight draws in CDw/CDw+ (default NULL = no seed set). |
| min_overlap | Minimum number of overlapping time periods required for a unit pair to be included in CD/CDw/CDw+ (default 2). |
| na.action | How to handle missing data: "drop.incomplete.times" or "pairwise". |
| x | An object of class cd_test. |
| digits | Number of digits to print (default 3). |

Let $E$ be the $N \times T$ residual matrix, with $N$ cross-sectional units and
$T$ time periods. For each unit pair $(i, j)$, let $T_{ij}$ be the number of
overlapping time periods and $\hat\rho_{ij}$ the pairwise correlation.
CDw: random sign flips are applied to the residuals before computing
correlations. The statistic is CD applied to the sign-flipped data.
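As a rough illustration of the two statistics described so far, the base-R sketch below computes the textbook CD statistic, sqrt(2 / (N(N - 1))) times the sum over unit pairs of sqrt(T_ij) * rho_ij, and then a sign-flipped variant. It is not the csdm implementation: the helper names cd_sketch and cdw_sketch are hypothetical, and details such as how skipped pairs enter the normalization are assumptions.

```r
# Hypothetical helper, not part of csdm: textbook CD on an N x T residual
# matrix E (rows = units), using pairwise-complete observations.
cd_sketch <- function(E, min_overlap = 2L) {
  N <- nrow(E)
  total <- 0
  for (i in seq_len(N - 1L)) {
    for (j in (i + 1L):N) {
      ok <- !is.na(E[i, ]) & !is.na(E[j, ])   # periods observed for both units
      T_ij <- sum(ok)
      if (T_ij < min_overlap) next            # skip pairs with too little overlap
      total <- total + sqrt(T_ij) * cor(E[i, ok], E[j, ok])
    }
  }
  # Assumption: normalize by all N(N - 1)/2 pairs, as in the textbook formula
  stat <- sqrt(2 / (N * (N - 1))) * total
  c(statistic = stat, p.value = 2 * pnorm(-abs(stat)))
}

# Hypothetical helper: flip the sign of each unit's residuals at random,
# then apply the same CD formula (the idea behind CDw as described above).
cdw_sketch <- function(E, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  w <- sample(c(-1, 1), nrow(E), replace = TRUE)  # one sign per unit
  cd_sketch(E * w)                                # row i is multiplied by w[i]
}

set.seed(1)
E_indep <- matrix(rnorm(100), nrow = 10)
E_dep <- matrix(rnorm(10), nrow = 10, ncol = 10, byrow = TRUE)  # identical rows

cd_sketch(E_indep)
cd_sketch(E_dep)
cdw_sketch(E_dep, seed = 1)
```

Because every row of E_dep is the same series, all pairwise correlations equal one and the CD statistic is large. After sign flipping, the correlations become +1 or -1 depending on the drawn signs and partly cancel in the sum.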
CDw+: power enhancement adds a sparse thresholding term to CDw. The power
term sums contributions only from unit pairs that exceed the threshold.
CD*: CD is computed on residuals after removing n_pc principal components
from $E$. This provides a bias-corrected test under multifactor errors.
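The defactoring step can be pictured with a singular value decomposition. The sketch below only removes the leading principal components from a balanced residual matrix; it does not reproduce the bias correction of Pesaran and Xie (2021) or the package's internals, and remove_pcs is a hypothetical helper name.

```r
# Hypothetical helper, not part of csdm: remove the first n_pc principal
# components from a balanced N x T residual matrix via the SVD.
remove_pcs <- function(E, n_pc = 4L) {
  s <- svd(E)
  k <- seq_len(min(n_pc, length(s$d)))
  common <- s$u[, k, drop = FALSE] %*%
    diag(s$d[k], nrow = length(k)) %*%
    t(s$v[, k, drop = FALSE])
  E - common
}

set.seed(1)
n_units <- 10
n_time <- 20
loadings <- matrix(rnorm(n_units), n_units, 1)
factor_t <- matrix(rnorm(n_time), 1, n_time)

# One strong common factor plus small idiosyncratic noise
E <- loadings %*% factor_t + matrix(rnorm(n_units * n_time, sd = 0.1), n_units, n_time)
E_defactored <- remove_pcs(E, n_pc = 1)

# Share of the original sum of squares left after defactoring
sum(E_defactored^2) / sum(E^2)
```

A CD-type statistic would then be computed on E_defactored rather than on E.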
CD, CDw, and CDw+ always use pairwise-complete observations: each pairwise
correlation is computed from the periods in which both units are observed.
CD* requires a balanced panel. By default, na.action = "drop.incomplete.times"
removes any time period with missing observations. With na.action = "pairwise",
CD* returns NA with a warning when missing values are present.
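The difference between the two missing-data strategies can be seen on a toy matrix. This is a base-R illustration of the idea only, with made-up numbers and a hypothetical overlap helper; it does not call the package.

```r
# Toy 3 x 4 residual matrix (rows = units, columns = time) with two gaps
E <- matrix(c(0.1,  NA, 0.3, 0.4,
              0.2, 0.5,  NA, 0.1,
              0.3, 0.2, 0.6, 0.2), nrow = 3, byrow = TRUE)

# Idea behind "drop.incomplete.times": keep only periods observed for every unit
complete_t <- colSums(is.na(E)) == 0
E_balanced <- E[, complete_t, drop = FALSE]
dim(E_balanced)

# Idea behind pairwise-complete correlations: each pair keeps its own overlap
overlap <- function(i, j) sum(!is.na(E[i, ]) & !is.na(E[j, ]))
c(pair_1_2 = overlap(1, 2), pair_1_3 = overlap(1, 3), pair_2_3 = overlap(2, 3))
```

Dropping incomplete periods yields a balanced matrix at the cost of discarding data, while the pairwise approach retains more observations but gives each pair a different effective sample size.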
An object of class cd_test with fields tests, type,
N, T, na.action, and call. The tests list
contains one or more test results, each with statistic and p.value.
Pesaran MH (2015). “Testing weak cross-sectional dependence in large panels.” Econometric Reviews, 34(6-10), 1089–1117.
Pesaran MH (2021). “General diagnostic tests for cross-sectional dependence in panels.” Empirical Economics, 60(1), 13–50.
Juodis A, Reese S (2021). “The incidental parameters problem in testing for remaining cross-sectional correlation.” Journal of Business and Economic Statistics, 40(3), 1191–1203.
Fan J, Liao Y, Yao J (2015). “Power Enhancement in High-Dimensional Cross-Section Tests.” Econometrica, 83(4), 1497–1541.
Pesaran MH, Xie Y (2021). “A bias-corrected CD test for error cross-sectional dependence in panel models.” Econometric Reviews, 41(6), 649–677.

    library(csdm)

    # Simulate independent and dependent panels
    set.seed(1)
    E_indep <- matrix(rnorm(100), nrow = 10)
    E_dep <- matrix(rnorm(10), nrow = 10, ncol = 10, byrow = TRUE)

    # Compute all tests
    cd_test(E_indep, type = "all")
    cd_test(E_dep, type = "all")

    # Specific test with parameters
    cd_test(E_indep, type = "CDstar", n_pc = 2)

    # From a fitted csdm model
    data(PWT_60_07, package = "csdm")
    df <- PWT_60_07
    ids <- unique(df$id)[1:10]
    df_small <- df[df$id %in% ids & df$year >= 1970, ]
    fit <- csdm(
      log_rgdpo ~ log_hc + log_ck + log_ngd,
      data = df_small, id = "id", time = "year",
      model = "cce",
      csa = csdm_csa(vars = c("log_rgdpo", "log_hc", "log_ck", "log_ngd"))
    )
    cd_test(fit, type = "all")

Returns estimated mean-group coefficients from a csdm_fit object. For
model = "cs_ardl", the returned vector includes short-run mean-group
coefficients, the adjustment coefficient (named lr_<y>), and long-run
coefficients when available.

    ## S3 method for class 'csdm_fit'
    coef(object, ...)

| Argument | Description |
|---|---|
| object | A fitted object of class csdm_fit. |
| ... | Currently unused. |

A named numeric vector of estimated coefficients.
summary.csdm_fit(), vcov.csdm_fit()
Estimate heterogeneous panel data models with optional cross-sectional augmentation and dynamic structure. The interface supports Mean Group (MG), Common Correlated Effects (CCE), Dynamic CCE (DCCE), and Cross-Sectionally Augmented ARDL (CS-ARDL) estimators with a consistent specification workflow for cross-sectional averages, lag structure, and variance-covariance estimation.

    csdm(
      formula,
      data,
      id,
      time,
      model = c("mg", "cce", "dcce", "cs_ardl", "cs_ecm", "cs_dl"),
      csa = csdm_csa(),
      lr = csdm_lr(),
      pooled = csdm_pooled(),
      trend = c("none", "unit", "pooled"),
      fullsample = FALSE,
      mgmissing = FALSE,
      vcov = csdm_vcov(),
      ...
    )

| Argument | Description |
|---|---|
| formula | Model formula of the form y ~ x1 + x2. |
| data | A data.frame. |
| id, time | Column names (strings) for the unit and time indexes. If … |
| model | Estimator to fit. One of "mg", "cce", "dcce", "cs_ardl", "cs_ecm", or "cs_dl". |
| csa | Cross-sectional-average specification, created by csdm_csa(). |
| lr | Long-run or dynamic specification, created by csdm_lr(). |
| pooled | Pooled specification (reserved for future use), created by csdm_pooled(). |
| trend | One of "none", "unit", or "pooled". |
| fullsample | Logical; reserved for future extensions. |
| mgmissing | Logical; reserved for future extensions. |
| vcov | Variance-covariance specification, created by csdm_vcov(). |
| ... | Reserved for future extensions. |

Let $i = 1, \dots, N$ index cross-sectional units and
$t = 1, \dots, T$ index time. A baseline heterogeneous panel model is

$$y_{it} = \alpha_i + \beta_i' x_{it} + u_{it}.$$

Here $\alpha_i$ is a unit-specific intercept, $x_{it}$ is a vector
of regressors, $\beta_i$ is a vector of unit-specific slopes, and
$u_{it}$ is an error term that may exhibit cross-sectional dependence.
Cross-sectional averages are specified through csdm_csa() and dynamic or
long-run structure is specified through csdm_lr(). This keeps the model
interface consistent across estimators while allowing the degree of
cross-sectional augmentation and lag structure to vary by application.
Implemented estimators
MG (Pesaran and Smith, 1995)
The Mean Group estimator fits separate regressions for each unit and averages the resulting coefficients:

$$\hat\beta_{MG} = \frac{1}{N} \sum_{i=1}^{N} \hat\beta_i.$$

This estimator accommodates slope heterogeneity but does not explicitly model cross-sectional dependence.
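The averaging step is simple enough to sketch in base R. The example below simulates a small panel with heterogeneous slopes, fits one regression per unit, and averages the coefficients. The simulated data, the variable names, and the standard-error formula (cross-unit standard deviation divided by sqrt(N), a common nonparametric choice for mean-group estimators) are assumptions for illustration, not a description of what csdm() does internally.

```r
set.seed(42)
n_units <- 20
n_time <- 30

# Simulated panel: each unit has its own slope, centred on 1
panel <- do.call(rbind, lapply(seq_len(n_units), function(i) {
  x <- rnorm(n_time)
  beta_i <- 1 + rnorm(1, sd = 0.2)
  data.frame(id = i, t = seq_len(n_time), x = x,
             y = 0.5 + beta_i * x + rnorm(n_time, sd = 0.1))
}))

# One OLS regression per unit; rows = units, columns = coefficients
unit_coefs <- t(sapply(split(panel, panel$id),
                       function(d) coef(lm(y ~ x, data = d))))

# Mean-group estimate and a nonparametric standard error
mg_coef <- colMeans(unit_coefs)
mg_se <- apply(unit_coefs, 2, sd) / sqrt(n_units)
cbind(estimate = mg_coef, se = mg_se)
```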
CCE (Pesaran, 2006)
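Regressions are augmented with cross-sectional averages to proxy unobserved common factors:

$$y_{it} = \alpha_i + \beta_i' x_{it} + \gamma_i' \bar z_t + e_{it}.$$

A common choice is

$$\bar z_t = (\bar y_t, \bar x_t')'$$

with

$$\bar y_t = \frac{1}{N} \sum_{i=1}^{N} y_{it}, \qquad \bar x_t = \frac{1}{N} \sum_{i=1}^{N} x_{it}.$$

More generally, $\bar z_t$ collects the cross-sectional averages
specified in csa.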
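To make the augmentation concrete, the base-R sketch below simulates a panel driven by one unobserved common factor, builds the cross-sectional averages with ave(), and compares unit-by-unit OLS with and without them. Everything here (data-generating process, variable names, parameter values) is an assumption chosen for illustration; it does not reproduce how csdm() constructs or applies the averages.

```r
set.seed(123)
n_units <- 30
n_time <- 40
common <- rnorm(n_time)   # unobserved common factor

panel <- do.call(rbind, lapply(seq_len(n_units), function(i) {
  x <- 0.8 * common + rnorm(n_time)         # regressor loads on the factor
  beta_i <- 1 + rnorm(1, sd = 0.1)          # heterogeneous slope around 1
  lambda_i <- 1 + rnorm(1, sd = 0.3)        # heterogeneous factor loading
  y <- 0.5 + beta_i * x + lambda_i * common + rnorm(n_time, sd = 0.2)
  data.frame(id = i, t = seq_len(n_time), x = x, y = y)
}))

# Cross-sectional averages: mean over units within each time period
panel$y_bar <- ave(panel$y, panel$t)
panel$x_bar <- ave(panel$x, panel$t)

by_unit <- split(panel, panel$id)
slopes_plain <- sapply(by_unit, function(d) coef(lm(y ~ x, data = d))[["x"]])
slopes_cce <- sapply(by_unit, function(d) coef(lm(y ~ x + y_bar + x_bar, data = d))[["x"]])

c(mg_without_csa = mean(slopes_plain), mg_with_csa = mean(slopes_cce))
```

In this simulation the factor enters both x and y, so the unaugmented slopes absorb part of the factor's effect, while the augmented regressions should average close to the true mean slope of 1.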
DCCE (Chudik and Pesaran, 2015)
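Dynamic CCE extends CCE by allowing lagged dependent variables and lagged cross-sectional averages:

$$y_{it} = \alpha_i + \sum_{p=1}^{P} \phi_{ip}\, y_{i,t-p} + \sum_{q=0}^{Q} \beta_{iq}'\, x_{i,t-q} + \sum_{s=0}^{S} \gamma_{is}'\, \bar z_{t-s} + e_{it}.$$
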
In the package implementation, lagged dependent variables and distributed
lags of regressors are controlled through lr, while contemporaneous
and lagged cross-sectional averages are controlled through csa.
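Choosing how many lags of the cross-sectional averages to include is left to the user. A commonly cited rule of thumb from Chudik and Pesaran (2015) is the integer part of T^(1/3). The sketch below computes that value and shows what a block of lagged averages looks like. The helper names csa_lag_order and lag_vec are hypothetical, the numbers are made up, and nothing here reflects how the package builds its lags internally.

```r
# Rule-of-thumb lag order. The small epsilon guards against floating-point
# error at perfect cubes (e.g. 64^(1/3) evaluating just below 4).
csa_lag_order <- function(n_periods) floor(n_periods^(1 / 3) + 1e-8)

# Shift a time-ordered vector back by k periods, padding with NA
lag_vec <- function(z, k) if (k == 0) z else c(rep(NA, k), head(z, -k))

csa_lag_order(38)   # e.g. annual data covering 1970-2007

z_bar <- c(1.0, 1.2, 1.1, 1.4, 1.3)   # made-up cross-sectional averages by period
csa_lags <- sapply(0:2, function(k) lag_vec(z_bar, k))
colnames(csa_lags) <- paste0("z_bar_L", 0:2)
csa_lags
```

Each additional lag costs one usable time period per unit, which is one reason the dynamic estimators need a longer time dimension than static CCE.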
CS-ARDL (Chudik and Pesaran, 2015)
In the package implementation, model = "cs_ardl" is obtained by first
estimating a cross-sectionally augmented ARDL-style regression in levels,
using the same dynamic specification as model = "dcce", and then
transforming the unit-specific coefficients into adjustment and long-run
parameters.
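The underlying unit-level regression is of the form

$$y_{it} = \alpha_i + \sum_{p=1}^{P} \phi_{ip}\, y_{i,t-p} + \sum_{q=0}^{Q} \beta_{iq}'\, x_{i,t-q} + \sum_{s=0}^{S} \gamma_{is}'\, \bar z_{t-s} + e_{it}.$$

From this dynamic specification, the package recovers the implied error-correction form

$$\Delta y_{it} = \alpha_i + \varphi_i \left( y_{i,t-1} - \theta_i' x_{it} \right) + \sum_{p=1}^{P-1} \phi_{ip}^{*}\, \Delta y_{i,t-p} + \sum_{q=0}^{Q-1} \beta_{iq}^{*\prime}\, \Delta x_{i,t-q} + \sum_{s=0}^{S} \gamma_{is}'\, \bar z_{t-s} + e_{it},$$

where $\varphi_i = -\left(1 - \sum_{p=1}^{P} \phi_{ip}\right)$ is the adjustment coefficient,
$\theta_i = \sum_{q=0}^{Q} \beta_{iq} / \left(1 - \sum_{p=1}^{P} \phi_{ip}\right)$ is
the implied long-run relationship, and $\phi_{ip}^{*}$ and $\beta_{iq}^{*}$ are functions
of the lag coefficients. In the current implementation, these
quantities are computed from the estimated lag polynomials rather than from a
direct ECM regression.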
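The mapping from lag coefficients to adjustment and long-run parameters is a short calculation. The sketch below applies the standard ARDL formulas (adjustment equal to minus one plus the sum of the autoregressive coefficients, long-run coefficient equal to the summed distributed-lag coefficients divided by one minus that sum) to made-up numbers for a single unit. The helper ardl_to_long_run is hypothetical and the sign convention is an assumption; the package's exact output may differ.

```r
# Hypothetical helper, not part of csdm.
# phi  : coefficients on lags of y (length P)
# beta : matrix of coefficients on x and its lags
#        (rows = lags 0..Q, columns = regressors)
ardl_to_long_run <- function(phi, beta) {
  beta <- as.matrix(beta)
  denom <- 1 - sum(phi)
  list(adjustment = -denom,
       long_run = colSums(beta) / denom)
}

# Made-up ARDL(1, 0) coefficients for one unit
phi <- 0.6
beta <- matrix(c(0.20, 0.10), nrow = 1,
               dimnames = list("lag0", c("log_hc", "log_ck")))

ardl_to_long_run(phi, beta)
```

With an autoregressive coefficient of 0.6, short-run effects of 0.20 and 0.10 translate into long-run effects of 0.50 and 0.25, and the adjustment coefficient is -0.4.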
Identification and assumptions
MG requires sufficient time-series variation within each unit.
CCE relies on cross-sectional averages acting as proxies for latent common factors, together with adequate cross-sectional and time dimensions.
DCCE additionally requires enough time periods to support lagged dependent variables, distributed lags, and lagged cross-sectional averages.
CS-ARDL requires sufficient time length for the distributed-lag structure and is intended for applications where both short-run dynamics and long-run relationships are of interest in the presence of common factors.
An object of class csdm_fit containing estimated coefficients,
residuals, variance-covariance estimates, model metadata, and diagnostics.
Use summary(), coef(), residuals(), vcov(), and
cd_test() to access standard outputs.
Pesaran MH, Smith R (1995). “Estimating long-run relationships from dynamic heterogeneous panels.” Journal of Econometrics, 68(1), 79–113.
Pesaran MH (2006). “Estimation and inference in large heterogeneous panels with multifactor error structure.” Econometrica, 74(4), 967–1012.
Chudik A, Pesaran MH (2015). “Common correlated effects estimation of heterogeneous dynamic panel data models with weakly exogenous regressors.” Journal of Econometrics, 188(2), 393–420.

    library(csdm)
    data(PWT_60_07, package = "csdm")
    df <- PWT_60_07

    # Keep examples fast but fully runnable
    keep_ids <- unique(df$id)[1:10]
    df_small <- df[df$id %in% keep_ids & df$year >= 1970, ]

    # Mean Group (MG)
    mg <- csdm(
      log_rgdpo ~ log_hc + log_ck + log_ngd,
      data = df_small, id = "id", time = "year",
      model = "mg"
    )
    summary(mg)

    # Common Correlated Effects (CCE)
    cce <- csdm(
      log_rgdpo ~ log_hc + log_ck + log_ngd,
      data = df_small, id = "id", time = "year",
      model = "cce",
      csa = csdm_csa(vars = c("log_rgdpo", "log_hc", "log_ck", "log_ngd"))
    )
    summary(cce)

    # Dynamic CCE (DCCE)
    dcce <- csdm(
      log_rgdpo ~ log_hc + log_ck + log_ngd,
      data = df_small, id = "id", time = "year",
      model = "dcce",
      csa = csdm_csa(vars = c("log_rgdpo", "log_hc", "log_ck", "log_ngd"), lags = 3),
      lr = csdm_lr(type = "ardl", ylags = 1, xdlags = 0)
    )
    summary(dcce)

    # CS-ARDL
    cs_ardl <- csdm(
      log_rgdpo ~ log_hc + log_ck + log_ngd,
      data = df_small, id = "id", time = "year",
      model = "cs_ardl",
      csa = csdm_csa(vars = c("log_rgdpo", "log_hc", "log_ck", "log_ngd"), lags = 3),
      lr = csdm_lr(type = "ardl", ylags = 1, xdlags = 0)
    )
    summary(cs_ardl)

Specification: Cross-sectional averages (CSA)

    csdm_csa(
      vars = "_all",
      lags = 0,
      scope = c("estimation", "global", "cluster"),
      cluster = NULL
    )

| Argument | Description |
|---|---|
| vars | Character. One of "_all", "_none", or a character vector of variable names. |
| lags | Integer. Either a scalar integer >= 0 applied to all CSA variables, or a named integer vector giving per-variable maximum lags. |
| scope | Character vector. One or more of c("estimation","global","cluster"). |
| cluster | Reserved for future use. |

A spec object (list) used by csdm().

    library(csdm)

    # Cross-sectional averages (CSA) configuration for DCCE
    csa <- csdm_csa(
      vars = c("log_rgdpo", "log_hc", "log_ck", "log_ngd"),
      lags = 3
    )
    csa

Specification: Long-run configuration

    csdm_lr(
      vars = NULL,
      type = c("none", "ecm", "ardl", "csdl"),
      ylags = 0,
      xdlags = 0,
      options = list()
    )

| Argument | Description |
|---|---|
| vars | Reserved for future use. |
| type | One of c("none","ecm","ardl","csdl"). |
| ylags | Integer >= 0. Within-unit lags of the dependent variable to include when supported by the chosen model/type. |
| xdlags | Integer >= 0. Scalar distributed lags to apply to each RHS regressor when supported by the chosen model/type. |
| options | Reserved for future use. |

A spec object (list) used by csdm().

    library(csdm)

    # Long-run / dynamic configuration (ARDL-style lags)
    lr <- csdm_lr(type = "ardl", ylags = 1)
    lr

    # Minimal end-to-end DCCE example (kept small for speed)
    data(PWT_60_07, package = "csdm")
    df <- PWT_60_07
    keep_ids <- unique(df$id)[1:10]
    df_small <- df[df$id %in% keep_ids & df$year >= 1970, ]
    fit <- csdm(
      log_rgdpo ~ log_hc + log_ck + log_ngd,
      data = df_small, id = "id", time = "year",
      model = "dcce",
      csa = csdm_csa(vars = c("log_rgdpo", "log_hc", "log_ck", "log_ngd"), lags = 3),
      lr = csdm_lr(type = "ardl", ylags = 1)
    )
    summary(fit)

Specification: Pooled constraints (stub)

    csdm_pooled(vars = NULL, constant = FALSE, trend = FALSE)

| Argument | Description |
|---|---|
| vars | Reserved for future use. |
| constant | Logical; pooled constant. |
| trend | Logical; pooled trend. |

A spec object (list) used by csdm().
Specification: Variance-covariance for MG output (stub)

    csdm_vcov(type = c("mg", "np", "nw", "wpn", "ols"), ...)

| Argument | Description |
|---|---|
| type | One of c("mg","np","nw","wpn","ols"). |
| ... | Reserved for future use. |

A spec object (list) used by csdm().
Produces fitted values (index "xb") when available, or returns model
residuals. Prediction on new data is not yet implemented.

    ## S3 method for class 'csdm_fit'
    predict(object, newdata = NULL, type = c("xb", "residuals"), ...)

| Argument | Description |
|---|---|
| object | A fitted object of class csdm_fit. |
| newdata | Optional new data (not yet supported). |
| type | One of "xb" or "residuals". |
| ... | Currently unused. |

A numeric matrix of fitted values or residuals, depending on
type.
residuals.csdm_fit(), summary.csdm_fit()
Prints a concise overview of a fitted csdm_fit object, including the
model type, formula, panel dimensions, and a coefficient table with standard
errors when available.

    ## S3 method for class 'csdm_fit'
    print(x, digits = 4, ...)

| Argument | Description |
|---|---|
| x | A fitted object of class csdm_fit. |
| digits | Number of printed digits. |
| ... | Currently unused. |

Invisibly returns x.
summary.csdm_fit(), coef.csdm_fit(), residuals.csdm_fit()
Formats and prints a summary.csdm_fit object. Output adapts to model
type and includes coefficient tables, selected goodness-of-fit diagnostics,
and compact model metadata.

    ## S3 method for class 'summary.csdm_fit'
    print(x, digits = 4, ...)

| Argument | Description |
|---|---|
| x | A summary.csdm_fit object. |
| digits | Number of digits to print. |
| ... | Further arguments passed to methods. |

The printout includes classic Pesaran CD diagnostics from the summary object.
For a full CD diagnostic panel (CD, CDw, CDw+, CD*), use cd_test() on the
fitted model.
Invisibly returns x.
A panel of 93 countries (unit id) observed annually over 1960-2007 (time/year), with the log-transformed variables used in xtdcce2-style examples.

    PWT_60_07

A data frame with 4464 rows and 6 variables:

| Variable | Description |
|---|---|
| id | Unit identifier (country id). |
| year | Time identifier (year, 1960-2007). |
| log_rgdpo | Log real GDP (output). |
| log_hc | Log human capital index. |
| log_ck | Log capital stock. |
| log_ngd | Log (net) government debt (or similar), used as a covariate/control. |

Penn World Table (PWT). This dataset is included as a small, convenient panel for examples and tests.
Returns residuals as an $N \times T$ matrix (rows are units, columns are time).
This method is designed for panel diagnostics and downstream tools such as
cd_test().

    ## S3 method for class 'csdm_fit'
    residuals(object, type = c("e", "u"), ...)

| Argument | Description |
|---|---|
| object | A fitted object of class csdm_fit. |
| type | Residual type. Currently only … |
| ... | Currently unused. |

A numeric matrix of residuals with dimensions $N \times T$.
get_residuals(), cd_test(), predict.csdm_fit()
Computes post-estimation summaries for csdm_fit objects, including
mean-group coefficient inference, model-level diagnostics, and model-specific
summary tables (for example, short-run and long-run blocks for CS-ARDL).

    ## S3 method for class 'csdm_fit'
    summary(object, digits = 4, ...)

| Argument | Description |
|---|---|
| object | A fitted model object of class csdm_fit. |
| digits | Number of digits to print. |
| ... | Further arguments passed to methods. |

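For each coefficient $\hat\beta_k$, the summary reports standard errors,
$z$-statistics, and two-sided normal-approximation p-values:

$$z_k = \frac{\hat\beta_k}{\mathrm{se}(\hat\beta_k)}, \qquad p_k = 2 \left( 1 - \Phi(|z_k|) \right).$$
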
The printed summary shows the classic Pesaran CD diagnostic by default. Extended
diagnostics (CDw, CDw+, CD*) are available through cd_test().
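The normal-approximation inference described above takes only a few lines of base R. The estimates and standard errors below are made-up numbers used purely for illustration; with a fitted model they would come from the coefficient vector and the square roots of the diagonal of the covariance matrix.

```r
# Made-up mean-group estimates and standard errors
est <- c(log_hc = 0.50, log_ck = 0.30)
se <- c(log_hc = 0.25, log_ck = 0.10)

z <- est / se                      # z-statistics
p <- 2 * (1 - pnorm(abs(z)))       # two-sided p-values under N(0, 1)

data.frame(estimate = est, std.error = se, z = z, p.value = p)
```

Here a z-statistic of 2 gives a p-value of about 0.046, and a z-statistic of 3 gives about 0.003.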
An object of class summary.csdm_fit with core metadata
(call/formula/model/N/T), coefficient tables, fit statistics, and
model-specific components for printing and downstream inspection.
print.summary.csdm_fit(), cd_test(), coef.csdm_fit(), vcov.csdm_fit()

    library(csdm)
    data(PWT_60_07, package = "csdm")
    df <- PWT_60_07
    ids <- unique(df$id)[1:10]
    df_small <- df[df$id %in% ids & df$year >= 1970, ]
    fit <- csdm(
      log_rgdpo ~ log_hc + log_ck + log_ngd,
      data = df_small, id = "id", time = "year",
      model = "cce",
      csa = csdm_csa(vars = c("log_rgdpo", "log_hc", "log_ck", "log_ngd"))
    )
    s <- summary(fit)
    s

Extract coefficient covariance matrix from a fitted csdm model

    ## S3 method for class 'csdm_fit'
    vcov(object, ...)

| Argument | Description |
|---|---|
| object | A fitted object of class csdm_fit. |
| ... | Currently unused. |

A numeric variance-covariance matrix aligned with coef(object)
for models where this is available.
coef.csdm_fit(), summary.csdm_fit()