skim()
Skim a pandas or polars dataframe and return visual summary statistics on it.
Usage
skim(df_in)
skim is an alternative to pandas.DataFrame.describe() that gives a quick overview of a dataframe as a table displayed in the console. The summary statistics it reports depend on each column's type, so you may get better results by setting the dtypes you want in your dataframe before running skim.
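As a minimal sketch of setting dtypes up front, assume a pandas dataframe whose columns all arrived with the generic object dtype. The column names and values below are made up for illustration, and the block uses only pandas, not skim itself:

```python
import pandas as pd

# Hypothetical data where every column starts out with the generic object dtype.
df = pd.DataFrame(
    {
        "name": ["Philip", "Turanga", "bob"],
        "joined": ["2021-01-05", "2021-03-17", "2022-11-30"],
        "team": ["a", "b", "a"],
    },
    dtype="object",
)

# Give each column the specific dtype it should be summarised as.
df = df.astype({"name": "string", "team": "category"})
df["joined"] = pd.to_datetime(df["joined"])

print(df.dtypes)
```

A frame prepared this way can then be passed to skim in the usual manner.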
Note that columns of unknown or mixed type will not be processed.
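One way to spot mixed-type columns before skimming is to ask pandas what it infers from the values. This is only a sketch: the frame is hypothetical, and the check relies on pandas' own infer_dtype labels rather than on whatever logic skim applies internally.

```python
import pandas as pd
from pandas.api.types import infer_dtype

# Hypothetical frame: "clean" holds only integers, "mixed" holds integers and a string.
df = pd.DataFrame({"clean": [1, 2, 3], "mixed": [1, "two", 3]})

# infer_dtype inspects the values themselves; its label starts with "mixed"
# when it finds more than one kind of value in a column.
mixed_cols = [
    col for col in df.columns if infer_dtype(df[col], skipna=True).startswith("mixed")
]

print(mixed_cols)
```

Columns flagged this way can be cleaned or cast to a single dtype before calling skim.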
Parameters
df_in: pd.DataFrame | pl.DataFrame
Dataframe to skim.
Raises
NotImplementedError
If the dataframe has a MultiIndex column structure.
Examples
Skim a dataframe
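>>> import pandas as pd
>>> from skimpy import skim  # assumes skim is provided by the skimpy package
>>> df = pd.DataFrame(
...     {
...         'col1': ['Philip', 'Turanga', 'bob'],
...         'col2': [50, 100, 70],
...         'col3': [False, True, True],
...     }
... )
>>> # Use the dedicated string dtype rather than the generic object dtype.
>>> df["col1"] = df["col1"].astype("string")
>>> skim(df)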
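Flatten MultiIndex columns before skimming

Because a MultiIndex column structure raises NotImplementedError, one option is to collapse the column levels into plain strings first. The following is a sketch using only pandas, with a made-up frame and an underscore separator chosen for illustration:

```python
import pandas as pd

# Hypothetical frame whose columns form a two-level MultiIndex.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=pd.MultiIndex.from_tuples(
        [("size", "min"), ("size", "max"), ("weight", "mean")]
    ),
)

# Join each tuple of level labels into one string so the columns become a flat Index.
df.columns = ["_".join(map(str, levels)) for levels in df.columns]

print(list(df.columns))  # ['size_min', 'size_max', 'weight_mean']
```

After this step the dataframe no longer has MultiIndex columns, which is the condition documented above as raising the error.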