Title: | Summary Statistics for Panel Data |
---|---|
Description: | Computes summary statistics for a panel data set, modeled on the 'Stata' xtsum command. It generates overall, between-group, and within-group statistics for specified variables, as presented in S. Porter (2023) <https://stephenporter.org/files/xtsum_handout.pdf> and StataCorp (2023) <https://www.stata.com/manuals/xtxtsum.pdf>. |
Authors: | Joao Claudio Macosso |
Maintainer: | Joao Claudio Macosso |
License: | GPL-3 |
Version: | 0.1.0 |
Built: | 2025-03-09 03:56:53 UTC |
Source: | https://github.com/macosso/xtsum |
This function calculates the maximum of the between-group component of a variable in a panel data set.
between_max(data, variable, id = NULL, t = NULL, na.rm = FALSE)
data |
A data.frame or pdata.frame object containing the panel data. |
variable |
The variable for which the maximum between-group effect is calculated. |
id |
(Optional) Name of the individual identifier variable. |
t |
(Optional) Name of the time identifier variable. |
na.rm |
Logical. Should missing values be removed? Default is FALSE. |
The maximum between-group effect.
library(xtsum)
library(plm)

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
between_max(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
between_max(Crime, variable = "crmrte", id = "county", t = "year")
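To make the "between-group" idea concrete, here is a minimal base R sketch. It assumes the Stata xtsum convention, in which the between component is the set of individual means. The toy data and the name `between_max_by_hand` are hypothetical and not part of the package.

```r
# Toy balanced panel: 3 individuals observed over 2 periods (made-up numbers).
panel <- data.frame(
  id = c("a", "a", "b", "b", "c", "c"),
  t  = c(1, 2, 1, 2, 1, 2),
  x  = c(1, 3, 4, 8, 10, 20)
)

# Between component: one mean per individual (a = 2, b = 6, c = 15).
group_means <- tapply(panel$x, panel$id, mean)

# Largest individual mean.
between_max_by_hand <- max(group_means)
between_max_by_hand
```

Under that assumption, `between_max(panel, variable = "x", id = "id", t = "t")` should report the same quantity.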
This function calculates the minimum of the between-group component of a variable in a panel data set.
between_min(data, variable, id = NULL, t = NULL, na.rm = FALSE)
data |
A data.frame or pdata.frame object containing the panel data. |
variable |
The variable for which the minimum between-group effect is calculated. |
id |
(Optional) Name of the individual identifier variable. |
t |
(Optional) Name of the time identifier variable. |
na.rm |
Logical. Should missing values be removed? Default is FALSE. |
The minimum between-group effect.
library(xtsum)
library(plm)

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
between_min(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
between_min(Crime, variable = "crmrte", id = "county", t = "year")
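The role of `na.rm` can be illustrated with base R alone. The sketch below uses hypothetical toy data and shows only how missing values propagate through group means in base R. Whether between_min handles incomplete groups in exactly this way internally is an assumption.

```r
# Toy panel with one missing observation for individual "a" (made-up numbers).
panel_na <- data.frame(
  id = c("a", "a", "b", "b"),
  x  = c(1, NA, 4, 8)
)

# Without NA removal, the mean of any group containing an NA is itself NA.
means_keep_na <- tapply(panel_na$x, panel_na$id, mean)

# With NA removal, each mean uses only the observed values.
means_drop_na <- tapply(panel_na$x, panel_na$id, mean, na.rm = TRUE)

# The NA propagates into the minimum unless it was dropped beforehand.
min_keep_na <- min(means_keep_na)
min_drop_na <- min(means_drop_na)
```

This is why `na.rm = TRUE` is usually needed when the panel contains missing values.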
This function calculates the standard deviation of the between-group component of a variable in a panel data set.
between_sd(data, variable, id = NULL, t = NULL, na.rm = FALSE)
data |
A data.frame or pdata.frame object containing the panel data. |
variable |
The variable for which the standard deviation of between-group effects is calculated. |
id |
(Optional) Name of the individual identifier variable. |
t |
(Optional) Name of the time identifier variable. |
na.rm |
Logical. Should missing values be removed? Default is FALSE. |
The standard deviation of between-group effects.
library(xtsum)
library(plm)

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
between_sd(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
between_sd(Crime, variable = "crmrte", id = "county", t = "year")
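As a rough cross-check, the between-group standard deviation can be sketched in base R as the sample standard deviation of the individual means, with n - 1 in the denominator, where n is the number of individuals. This follows the Stata xtsum convention and is an assumption about what between_sd reports. The toy data and `between_sd_by_hand` are hypothetical.

```r
# Toy balanced panel (made-up numbers).
panel <- data.frame(
  id = c("a", "a", "b", "b", "c", "c"),
  t  = c(1, 2, 1, 2, 1, 2),
  x  = c(1, 3, 4, 8, 10, 20)
)

# One row per individual holding that individual's mean of x.
individual_means <- aggregate(x ~ id, data = panel, FUN = mean)

# Sample standard deviation across the individual means.
between_sd_by_hand <- sd(individual_means$x)
between_sd_by_hand
```

Note that the spread is taken across individuals, so the number of time periods per individual does not enter the denominator.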
This function computes the maximum of the within-group component of a variable in a panel data set.
within_max(data, variable, id = NULL, t = NULL, na.rm = FALSE)
data |
A data.frame or pdata.frame object containing the panel data. |
variable |
The variable for which the maximum within-group effect is calculated. |
id |
(Optional) Name of the individual identifier variable. |
t |
(Optional) Name of the time identifier variable. |
na.rm |
Logical. Should missing values be removed? Default is FALSE. |
The maximum within-group effect.
library(xtsum)
library(plm)

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
within_max(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
within_max(Crime, variable = "crmrte", id = "county", t = "year")
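The within-group component can be sketched in base R as well. The sketch assumes the Stata xtsum convention, in which each observation is replaced by its deviation from the individual mean plus the overall mean. The toy data and the `*_by_hand` names are hypothetical.

```r
# Toy balanced panel (made-up numbers).
panel <- data.frame(
  id = c("a", "a", "b", "b", "c", "c"),
  t  = c(1, 2, 1, 2, 1, 2),
  x  = c(1, 3, 4, 8, 10, 20)
)

# Each row receives the mean of its own individual.
individual_mean <- ave(panel$x, panel$id)

# Overall mean across all observations.
overall_mean <- mean(panel$x)

# Within transformation: deviation from the individual mean,
# re-centered on the overall mean.
within_x <- panel$x - individual_mean + overall_mean

within_max_by_hand <- max(within_x)
within_min_by_hand <- min(within_x)
```

Adding the overall mean back keeps the within values on the same scale as the original variable, which is why the within minimum and maximum are not centered on zero.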
This function computes the minimum of the within-group component of a variable in a panel data set.
within_min(data, variable, id = NULL, t = NULL, na.rm = FALSE)
data |
A data.frame or pdata.frame object containing the panel data. |
variable |
The variable for which the minimum within-group effect is calculated. |
id |
(Optional) Name of the individual identifier variable. |
t |
(Optional) Name of the time identifier variable. |
na.rm |
Logical. Should missing values be removed? Default is FALSE. |
The minimum within-group effect.
library(xtsum)
library(plm)

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
within_min(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
within_min(Crime, variable = "crmrte", id = "county", t = "year")
This function computes the standard deviation of the within-group component of a variable in a panel data set.
within_sd(data, variable, id = NULL, t = NULL, na.rm = FALSE)
data |
A data.frame or pdata.frame object containing the panel data. |
variable |
The variable for which the standard deviation of within-group effects is calculated. |
id |
(Optional) Name of the individual identifier variable. |
t |
(Optional) Name of the time identifier variable. |
na.rm |
Logical. Should missing values be removed? Default is FALSE. |
The standard deviation of within-group effects.
library(xtsum)
library(plm)

# Example using pdata.frame
data("Gasoline", package = "plm")
Gas <- pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
within_sd(Gas, variable = "lgaspcar")

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
within_sd(Crime, variable = "crmrte", id = "county", t = "year")
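A hand computation of the within-group standard deviation might look like the base R sketch below. It assumes the statistic is the sample standard deviation of the within-transformed variable, with the total number of observations minus one in the denominator, as in Stata's xtsum. The toy data and variable names are hypothetical.

```r
# Toy balanced panel (made-up numbers).
panel <- data.frame(
  id = c("a", "a", "b", "b", "c", "c"),
  t  = c(1, 2, 1, 2, 1, 2),
  x  = c(1, 3, 4, 8, 10, 20)
)

# Deviations of each observation from its individual mean.
centered <- panel$x - ave(panel$x, panel$id)

# Re-centering on the overall mean shifts every value by the same constant.
within_x <- centered + mean(panel$x)

within_sd_by_hand <- sd(within_x)

# A constant shift does not change a standard deviation.
same_as_centered <- isTRUE(all.equal(sd(centered), within_sd_by_hand))
```

The last line shows that adding the overall mean matters for the within minimum and maximum but not for the within standard deviation.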
This function computes summary statistics for panel data, including overall statistics, between-group statistics, and within-group statistics.
xtsum(data, variables = NULL, id = NULL, t = NULL, na.rm = FALSE, return.data.frame = FALSE, dec = 3)
data |
A data.frame or pdata.frame object representing panel data. |
variables |
(Optional) Vector of variable names for which to calculate statistics. If not provided, all numeric variables in the data will be used. |
id |
(Optional) Name of the individual identifier variable. |
t |
(Optional) Name of the time identifier variable. |
na.rm |
Logical. Should missing values be removed when calculating statistics? Default is FALSE. |
return.data.frame |
Logical. Should the result be returned as a data.frame? Default is FALSE. |
dec |
Number of significant digits to report. Default is 3. |
A table summarizing statistics for each variable, including Mean, SD, Min, and Max, broken down into Overall, Between, and Within dimensions.
library(xtsum)
library(plm)

# Using a data.frame and specifying variables, id, t, na.rm, dec
data("nlswork", package = "sampleSelection")
xtsum(nlswork, "hours", id = "idcode", t = "year", na.rm = TRUE, dec = 6)

# Using pdata.frame object without specifying a variable
data("Gasoline", package = "plm")
Gas <- pdata.frame(Gasoline, index = c("country", "year"), drop.index = TRUE)
xtsum(Gas)

# Using regular data.frame with id and t specified
data("Crime", package = "plm")
xtsum(Crime, variables = c("crmrte", "prbarr"), id = "county", t = "year")

# Specifying variables to include in the summary
xtsum(Gas, variables = c("lincomep", "lgaspcar"))
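For readers who want to see how the three dimensions fit together, the following base R sketch assembles an overall/between/within table for a single variable. The helper `xtsum_by_hand` is hypothetical and is not the package's implementation. It assumes the Stata xtsum definitions, drops missing values unconditionally, and does no rounding.

```r
# Hypothetical helper: overall, between, and within statistics for one variable.
xtsum_by_hand <- function(x, id) {
  keep <- !is.na(x)                      # drop missing observations
  x <- x[keep]
  id <- as.character(id[keep])

  group_means <- tapply(x, id, mean)     # between component
  within_x <- x - ave(x, id) + mean(x)   # within component

  stat <- function(v) c(SD = sd(v), Min = min(v), Max = max(v))
  out <- rbind(
    overall = stat(x),
    between = stat(group_means),
    within  = stat(within_x)
  )

  # The mean is reported once, on the overall row, as in Stata's layout.
  data.frame(
    Dimension = rownames(out),
    Mean = c(mean(x), NA, NA),
    out,
    row.names = NULL
  )
}

# Toy balanced panel (made-up numbers).
panel <- data.frame(
  id = c("a", "a", "b", "b", "c", "c"),
  x  = c(1, 3, 4, 8, 10, 20)
)

res <- xtsum_by_hand(panel$x, panel$id)
print(res)
```

Comparing `res` with `xtsum(panel, variables = "x", id = "id", ...)` on data of your own is a quick way to check which conventions the package follows.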