Specification curve object. Uses a model to estimate every variant of a specification and stores the results of those regressions in a tidy-format pandas DataFrame. Plots the regressions in a chart that can optionally be saved. Iterates over multiple inputs for the exogenous and endogenous variables. Note that categorical variables that are expanded cannot be mutually excluded from other categorical variables that are expanded.
The class can be initialized in two mutually exclusive ways:
1. Using a formula string (e.g., "y ~ x1 + x2")
2. Using separate y_endog, x_exog, controls, and always_include parameters
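To make "all variants of a specification" concrete, the following is a minimal sketch of how a set of inputs might expand into individual formulas. It is not the package's implementation: the helper `enumerate_specifications` is hypothetical, and it assumes that every subset of `controls` is tried while `always_include` terms appear in every specification.

```python
from itertools import chain, combinations, product


def enumerate_specifications(y_endog, x_exog, controls, always_include=()):
    """Hypothetical helper: list one formula string per specification.

    Assumption: each specification pairs one y with one x, keeps every
    always_include term, and adds one subset of the optional controls.
    """
    # Every subset of the controls, from the empty set to the full set.
    control_subsets = list(
        chain.from_iterable(
            combinations(controls, r) for r in range(len(controls) + 1)
        )
    )
    specs = []
    for y, x, subset in product(y_endog, x_exog, control_subsets):
        rhs = [x, *always_include, *subset]
        specs.append(f"{y} ~ {' + '.join(rhs)}")
    return specs


specs = enumerate_specifications(
    y_endog=["y1", "y2"],
    x_exog=["x1"],
    controls=["c1", "c2"],
    always_include=["c0"],
)
for formula in specs:
    print(formula)
```

Under these assumptions, two outcomes, one regressor of interest, and two optional controls give 2 × 1 × 4 = 8 specifications, so the number of regressions grows quickly as controls are added.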
Refits all of the specifications under the null of \(y_{i(k)}^* = y_{i(k)} - b_k x_{i(k)}\), where i indexes rows and k indexes specifications. i is a function of k because the rows of y and x can change with k when there are multiple y_endog and multiple x_exog. In each bootstrap, a fraction f_sample of the rows is sampled and a new specification curve is fit under the null. Summary statistics are then computed by specification, along with statistical tests.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| n_boot | int | Number of bootstraps. | 30 |
| f_sample | float | Fraction of rows to sample in each bootstrap. | 0.1 |
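The null refit described above can be sketched for a single specification with one regressor. This is an illustration under stated assumptions, not the package's code: `ols_slope` and `null_bootstrap_slopes` are hypothetical helpers, rows are sampled without replacement (the source does not say which scheme is used), and controls are omitted to keep the arithmetic visible.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def null_bootstrap_slopes(x, y, n_boot=30, f_sample=0.1, seed=0):
    """Hypothetical sketch of refitting one specification under the null."""
    rng = random.Random(seed)
    b = ols_slope(x, y)
    # Impose the null: y* = y - b * x removes the estimated effect of x.
    y_star = [yi - b * xi for xi, yi in zip(x, y)]
    n_rows = max(2, int(f_sample * len(x)))
    slopes = []
    for _ in range(n_boot):
        # Assumption: each bootstrap draws rows without replacement.
        idx = rng.sample(range(len(x)), n_rows)
        slopes.append(
            ols_slope([x[i] for i in idx], [y_star[i] for i in idx])
        )
    return b, y_star, slopes


# Synthetic data with a true slope of 2.
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(500)]
y = [2.0 * xi + data_rng.gauss(0, 1) for xi in x]

b, y_star, null_slopes = null_bootstrap_slopes(x, y, n_boot=30, f_sample=0.1)
print(f"estimated slope: {b:.3f}")
print(f"bootstraps under the null: {len(null_slopes)}")
```

Because y* has the estimated effect subtracted, the bootstrap slopes scatter around zero. Their spread gives a reference distribution against which the original estimate can be compared.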
Makes plot of fitted specification curve. Optionally returns figure and axes for onward adjustment.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| save_path | string or path | Exported figure filename. | None |
| pretty_plots | bool | Whether to use this package's figure formatting. | True |
| preferred_spec | list | Preferred specification. | [] |
| show_null_curve | bool | Whether to include the curve under the null. | False |
| return_fig | bool | Whether to return the figure and axes objects. | False |
| `**kwargs` | dict | Additional arguments passed to .fit_null() when show_null_curve is True. Accepted keys: n_boot (int), the number of bootstrap iterations for the null curve calculation, e.g. `**{"n_boot": 5}`; and f_sample (float), the fraction of rows to sample in each bootstrap, which defaults to 0.1. | |
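The forwarding of `**kwargs` can be illustrated with a toy stand-in. `ToyCurve` below is hypothetical and is not the package's class. It only mirrors the documented behaviour that extra keyword arguments reach `.fit_null()` when `show_null_curve` is True, and it assumes the `fit_null` defaults listed in the parameter table.

```python
class ToyCurve:
    """Hypothetical stand-in used only to show keyword forwarding."""

    def __init__(self):
        self.null_args = None

    def fit_null(self, n_boot=30, f_sample=0.1):
        # Record the arguments instead of running a real bootstrap.
        self.null_args = {"n_boot": n_boot, "f_sample": f_sample}

    def plot(self, show_null_curve=False, **kwargs):
        if show_null_curve:
            # Extra keyword arguments are passed straight through.
            self.fit_null(**kwargs)
        # Drawing is omitted in this sketch.


toy = ToyCurve()
toy.plot(show_null_curve=True, **{"n_boot": 5})
print(toy.null_args)
```

Any key left out of the dictionary, here f_sample, keeps its default.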