Package 'xtsum'

Title: Summary Statistics for Panel Data
Description: Computes summary statistics for a panel data set, modeled on the 'Stata' xtsum command. It generates overall, between-group, and within-group statistics for specified variables in a panel data set, as presented in S. Porter (2023) <https://stephenporter.org/files/xtsum_handout.pdf> and StataCorp (2023) <https://www.stata.com/manuals/xtxtsum.pdf>.
Authors: Joao Claudio Macosso
Maintainer: Joao Claudio Macosso <[email protected]>
License: GPL-3
Version: 0.1.0
Built: 2025-03-09 03:56:53 UTC
Source: https://github.com/macosso/xtsum

Help Index


Compute the between-group maximum

Description

This function calculates the between-group maximum of a variable in a panel data set.

Usage

between_max(data, variable, id = NULL, t = NULL, na.rm = FALSE)

Arguments

data

A data.frame or pdata.frame object containing the panel data.

variable

Name of the variable for which the between-group maximum is calculated.

id

(Optional) Name of the individual identifier variable.

t

(Optional) Name of the time identifier variable.

na.rm

Logical. Should missing values be removed? Default is FALSE.

Value

The maximum between-group effect.

Examples

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- plm::pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
between_max(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
between_max(Crime, variable = "crmrte", id = "county", t = "year")
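
# Manual cross-check in base R

For intuition, the between-group maximum can be reproduced by hand. This is a minimal sketch, assuming the package follows the 'Stata' xtsum definition in which between-group statistics are computed on the individual-level means. The toy data frame and the name manual_between_max are illustrative and are not part of the package.

```r
# Hypothetical toy panel: 3 individuals observed over 3 periods.
toy <- data.frame(
  id = rep(c("a", "b", "c"), each = 3),
  t  = rep(1:3, times = 3),
  x  = c(1, 2, 3, 4, 6, 8, 10, 10, 10)
)

# Between-group statistics use one mean per individual.
group_means <- tapply(toy$x, toy$id, mean)

# The largest individual-level mean (assumed definition of the between maximum).
manual_between_max <- max(group_means)
manual_between_max
```

Replacing max() with min() gives the corresponding between-group minimum under the same assumption.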

Compute the between-group minimum

Description

This function calculates the between-group minimum of a variable in a panel data set.

Usage

between_min(data, variable, id = NULL, t = NULL, na.rm = FALSE)

Arguments

data

A data.frame or pdata.frame object containing the panel data.

variable

Name of the variable for which the between-group minimum is calculated.

id

(Optional) Name of the individual identifier variable.

t

(Optional) Name of the time identifier variable.

na.rm

Logical. Should missing values be removed? Default is FALSE.

Value

The minimum between-group effect.

Examples

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- plm::pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
between_min(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
between_min(Crime, variable = "crmrte", id = "county", t = "year")

Compute the between-group standard deviation

Description

This function calculates the between-group standard deviation of a variable in a panel data set.

Usage

between_sd(data, variable, id = NULL, t = NULL, na.rm = FALSE)

Arguments

data

A data.frame or pdata.frame object containing the panel data.

variable

Name of the variable for which the between-group standard deviation is calculated.

id

(Optional) Name of the individual identifier variable.

t

(Optional) Name of the time identifier variable.

na.rm

Logical. Should missing values be removed? Default is FALSE.

Value

The standard deviation of between-group effects.

Examples

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- plm::pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
between_sd(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
between_sd(Crime, variable = "crmrte", id = "county", t = "year")
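
# Manual cross-check in base R

The sketch below illustrates what a between-group standard deviation measures in an unbalanced panel: each individual contributes exactly one mean, however many periods it is observed. It assumes the 'Stata' xtsum definition (the sample standard deviation of the individual-level means). The data and the name manual_between_sd are hypothetical.

```r
# Hypothetical unbalanced panel: "a" has 2 rows, "b" has 3, "c" has 1.
toy <- data.frame(
  id = c("a", "a", "b", "b", "b", "c"),
  t  = c(1, 2, 1, 2, 3, 1),
  x  = c(1, 3, 4, 6, 8, 10)
)

# One mean per individual, regardless of how many rows it has.
group_means <- tapply(toy$x, toy$id, mean)

# Sample standard deviation across the individual-level means.
manual_between_sd <- sd(group_means)
manual_between_sd
```

Because the calculation runs over individuals rather than rows, an individual observed once weighs as much as one observed many times.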

Compute the within-group maximum for panel data

Description

This function computes the within-group maximum of a variable in a panel data set.

Usage

within_max(data, variable, id = NULL, t = NULL, na.rm = FALSE)

Arguments

data

A data.frame or pdata.frame object containing the panel data.

variable

Name of the variable for which the within-group maximum is calculated.

id

(Optional) Name of the individual identifier variable.

t

(Optional) Name of the time identifier variable.

na.rm

Logical. Should missing values be removed? Default is FALSE.

Value

The maximum within-group effect.

Examples

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- plm::pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
within_max(Gas, variable = "lgaspcar")


# Using regular data.frame with id and t specified
data("Crime", package = "plm")
within_max(Crime, variable = "crmrte", id = "county", t = "year")
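
# Manual cross-check in base R

Within-group statistics are easier to interpret once the underlying transformation is visible. This sketch assumes the 'Stata' xtsum convention, in which each observation is replaced by its deviation from the individual's own mean plus the overall mean. The toy data and the names x_within and manual_within_max are illustrative only.

```r
# Hypothetical toy panel: 3 individuals observed over 3 periods.
toy <- data.frame(
  id = rep(c("a", "b", "c"), each = 3),
  t  = rep(1:3, times = 3),
  x  = c(1, 2, 3, 4, 6, 8, 10, 10, 10)
)

# ave() returns each row's individual-level mean (its default FUN is mean).
# Adding back the overall mean keeps the result on the scale of x.
x_within <- toy$x - ave(toy$x, toy$id) + mean(toy$x)

# Largest within-transformed value (assumed definition of the within maximum).
manual_within_max <- max(x_within)
manual_within_max
```

Individual "c" never changes over time, so all of its transformed values equal the overall mean and it contributes no within variation.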

Compute the within-group minimum for panel data

Description

This function computes the within-group minimum of a variable in a panel data set.

Usage

within_min(data, variable, id = NULL, t = NULL, na.rm = FALSE)

Arguments

data

A data.frame or pdata.frame object containing the panel data.

variable

Name of the variable for which the within-group minimum is calculated.

id

(Optional) Name of the individual identifier variable.

t

(Optional) Name of the time identifier variable.

na.rm

Logical. Should missing values be removed? Default is FALSE.

Value

The minimum within-group effect.

Examples

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- plm::pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
within_min(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
within_min(Crime, variable = "crmrte", id = "county", t = "year")

Compute the within-group standard deviation for panel data

Description

This function computes the within-group standard deviation of a variable in a panel data set.

Usage

within_sd(data, variable, id = NULL, t = NULL, na.rm = FALSE)

Arguments

data

A data.frame or pdata.frame object containing the panel data.

variable

Name of the variable for which the within-group standard deviation is calculated.

id

(Optional) Name of the individual identifier variable.

t

(Optional) Name of the time identifier variable.

na.rm

Logical. Should missing values be removed? Default is FALSE.

Value

The standard deviation of within-group effects.

Examples

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- plm::pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
within_sd(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
within_sd(Crime, variable = "crmrte", id = "county", t = "year")
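
# Manual cross-check in base R

A hand calculation can help when checking a within-group standard deviation. The sketch assumes the statistic is the ordinary sample standard deviation of the within-transformed variable (deviation from the individual mean plus the overall mean), as described for 'Stata' xtsum. Whether the package uses exactly this denominator is an assumption here, and the names below are hypothetical.

```r
# Hypothetical toy panel: 3 individuals observed over 3 periods.
toy <- data.frame(
  id = rep(c("a", "b", "c"), each = 3),
  t  = rep(1:3, times = 3),
  x  = c(1, 2, 3, 4, 6, 8, 10, 10, 10)
)

# Within transformation: remove each individual's mean, add the overall mean.
x_within <- toy$x - ave(toy$x, toy$id) + mean(toy$x)

# Sample standard deviation over all transformed observations.
manual_within_sd <- sd(x_within)
manual_within_sd
```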

Calculate summary statistics for panel data

Description

This function computes summary statistics for panel data, including overall statistics, between-group statistics, and within-group statistics.

Usage

xtsum(
  data,
  variables = NULL,
  id = NULL,
  t = NULL,
  na.rm = FALSE,
  return.data.frame = FALSE,
  dec = 3
)

Arguments

data

A data.frame or pdata.frame object representing panel data.

variables

(Optional) Vector of variable names for which to calculate statistics. If not provided, all numeric variables in the data will be used.

id

(Optional) Name of the individual identifier variable.

t

(Optional) Name of the time identifier variable.

na.rm

Logical. Should missing values be removed when calculating statistics? Default is FALSE.

return.data.frame

Logical. Should the result be returned as a data.frame? Default is FALSE.

dec

Number of significant digits to report. Default is 3.

Value

A table summarizing statistics for each variable, including Mean, SD, Min, and Max, broken down into Overall, Between, and Within dimensions.

Examples

# Using a data.frame and specifying variables, id, t, na.rm, dec
data("nlswork", package = "sampleSelection")
xtsum(nlswork, "hours", id = "idcode", t = "year", na.rm = TRUE, dec = 6)

# Using pdata.frame object without specifying a variable
data("Gasoline", package = "plm")
Gas <- plm::pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
xtsum(Gas)


# Using regular data.frame with id and t specified
data("Crime", package = "plm")
xtsum(Crime, variables = c("crmrte", "prbarr"), id = "county", t = "year")

# Specifying variables to include in the summary
xtsum(Gas, variables = c("lincomep", "lgaspcar"))
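
# Manual cross-check in base R

To see how the three dimensions of the summary relate to one another, the table can be assembled by hand for a single variable. This is a sketch under the assumption that the package follows the 'Stata' xtsum definitions (between statistics on individual-level means, within statistics on deviations from the individual mean plus the overall mean). The layout of manual_summary is illustrative and does not claim to match the object returned by xtsum().

```r
# Hypothetical toy panel: 3 individuals observed over 3 periods.
toy <- data.frame(
  id = rep(c("a", "b", "c"), each = 3),
  t  = rep(1:3, times = 3),
  x  = c(1, 2, 3, 4, 6, 8, 10, 10, 10)
)

# Between: one mean per individual.
group_means <- tapply(toy$x, toy$id, mean)

# Within: deviation from the individual mean, recentred on the overall mean.
x_within <- toy$x - ave(toy$x, toy$id) + mean(toy$x)

# One row per dimension; the mean is reported for the overall row only.
manual_summary <- data.frame(
  dimension = c("overall", "between", "within"),
  mean      = c(mean(toy$x), NA, NA),
  sd        = c(sd(toy$x), sd(group_means), sd(x_within)),
  min       = c(min(toy$x), min(group_means), min(x_within)),
  max       = c(max(toy$x), max(group_means), max(x_within))
)
manual_summary
```

In this toy panel most of the variation sits between individuals: the between standard deviation is larger than the within standard deviation.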