----------------------------------------------------------------------
This is the API documentation for the skimpy library.
----------------------------------------------------------------------


## Functions

Utility functions


clean_columns(df: 'pd.DataFrame | pl.DataFrame', case: 'str' = 'snake', replace: 'dict[str, str] | None' = None, remove_accents: 'bool' = True) -> 'pd.DataFrame | pl.DataFrame'

Clean messy column names in a pandas or polars dataframe.

Args:
    df (pd.DataFrame | pl.DataFrame): Dataframe from which column names are to be cleaned.
    case (str, optional): The desired case style of the column name. Defaults to "snake".

            - 'snake' produces 'column_name';
            - 'kebab' produces 'column-name';
            - 'camel' produces 'columnName';
            - 'pascal' produces 'ColumnName';
            - 'const' produces 'COLUMN_NAME';
            - 'sentence' produces 'Column name';
            - 'title' produces 'Column Name';
            - 'lower' produces 'column name';
            - 'upper' produces 'COLUMN NAME';

    replace (dict[str, str] | None, optional): Values to replace in the column names. Defaults to None.

            - {'old_value': 'new_value'}

    remove_accents (bool, optional): If True, strip accents from the column names. Defaults to True.

Raises:
    ValueError: If case is not valid.

Returns:
    pd.DataFrame | pl.DataFrame: Dataframe with cleaned column names.

Examples:
    Clean column names by converting the names to camel case style, removing accents,
    and correcting a misspelling.

    >>> df = pd.DataFrame(
    ...     {
    ...         'FirstNom': ['Philip', 'Turanga'],
    ...         'lastName': ['Fry', 'Leela'],
    ...         'Téléphone': ['555-234-5678', '(604) 111-2335'],
    ...     }
    ... )

    >>> clean_columns(df, case='camel', replace={'Nom': 'Name'})
      firstName lastName       telephone
    0    Philip      Fry    555-234-5678
    1   Turanga    Leela  (604) 111-2335
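
To make the case, replace, and remove_accents options more concrete, the sketch below mimics the 'snake' and 'camel' styles using only the standard library. It illustrates the idea and is not skimpy's implementation; the helper names strip_accents, to_snake, and to_camel are hypothetical.

```python
import re
import unicodedata


def strip_accents(name: str) -> str:
    """Drop combining accent marks, e.g. 'Téléphone' becomes 'Telephone'."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_snake(name: str) -> str:
    """Split a name into words and join them in lower case with underscores."""
    plain = strip_accents(name)
    # Add a break wherever a lowercase letter or digit is followed by a capital.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", plain)
    words = re.findall(r"[A-Za-z0-9]+", spaced)
    return "_".join(word.lower() for word in words)


def to_camel(name: str) -> str:
    """Lower-case the first word and capitalize each following word."""
    words = to_snake(name).split("_")
    return words[0] + "".join(word.capitalize() for word in words[1:])


# Same column names as the example above, with the 'Nom' -> 'Name' replacement
# applied as a plain string substitution before the case conversion.
columns = ["FirstNom", "lastName", "Téléphone"]
cleaned = [to_camel(col.replace("Nom", "Name")) for col in columns]
print(cleaned)  # ['firstName', 'lastName', 'telephone']
```

Applying the replacement before the case conversion is a choice made for this sketch; the source does not say in which order clean_columns applies its steps.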

generate_test_data() -> 'pd.DataFrame'

Generate a pandas dataframe with several different datatypes.

For testing skimpy, it's convenient to have a dataset with many different
data types. This function creates that dataframe.

Returns:
    pd.DataFrame: Dataframe with columns spanning several data types.

Examples:
    Generate test data to demonstrate how skimpy works.

    >>> df = generate_test_data()

skim(df_in: 'pd.DataFrame | pl.DataFrame') -> 'None'

Skim a pandas or polars dataframe and print visual summary statistics on it.

skim is an alternative to pandas.DataFrame.describe() that gives a quick
overview of a dataframe as a table displayed in the console. The summary
statistics it computes depend on the type of each column, so you may get
better results if you set the datatypes you want in your dataframe
before running skim.

Note that any unknown column types, or mixed column types, will not be
processed.

Args:
    df_in (pd.DataFrame | pl.DataFrame): Dataframe to skim.

Raises:
    NotImplementedError: If the dataframe has a MultiIndex column structure.
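
Because MultiIndex columns raise NotImplementedError, one workaround is to flatten them before skimming. The sketch below uses only pandas; the helper name flatten_columns and the "_" separator are assumptions made for illustration, not part of skimpy.

```python
import pandas as pd


def flatten_columns(df: pd.DataFrame, sep: str = "_") -> pd.DataFrame:
    """Return df unchanged if its columns are flat, else a copy with joined names."""
    if not isinstance(df.columns, pd.MultiIndex):
        return df
    flat = df.copy()
    # Each MultiIndex column label is a tuple such as ('price', 'min').
    flat.columns = [sep.join(str(level) for level in column) for column in df.columns]
    return flat


wide = pd.DataFrame(
    [[1, 2], [3, 4]],
    columns=pd.MultiIndex.from_tuples([("price", "min"), ("price", "max")]),
)
flat = flatten_columns(wide)
print(list(flat.columns))  # ['price_min', 'price_max']
```

The flattened dataframe has ordinary string column names and can then be passed to skim.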

Examples:
    Skim a dataframe.

    >>> df = pd.DataFrame(
    ...     {
    ...         'col1': ['Philip', 'Turanga', 'bob'],
    ...         'col2': [50, 100, 70],
    ...         'col3': [False, True, True],
    ...     }
    ... )
    >>> df["col1"] = df["col1"].astype("string")
    >>> skim(df)
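
The advice about setting datatypes before skimming can be followed with plain pandas. The sketch below is one way to do it; the column names and values are made up for illustration, and the skim call is left as a comment so that the block depends only on pandas.

```python
import pandas as pd

# Every column starts out with a generic, inferred dtype.
raw = pd.DataFrame(
    {
        "name": ["Philip", "Turanga", "Bender"],
        "species": ["human", "mutant", "robot"],
        "joined": ["2021-01-05", "2021-02-17", "2021-03-09"],
    }
)

# Give each column the dtype that matches its meaning.
typed = raw.assign(
    name=raw["name"].astype("string"),
    species=raw["species"].astype("category"),
    joined=pd.to_datetime(raw["joined"]),
)
print(typed.dtypes)

# With skimpy installed, the typed dataframe can then be summarized:
# from skimpy import skim
# skim(typed)
```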

skim_get_data(df_in: 'pd.DataFrame | pl.DataFrame') -> 'JSON | str'

Skim a pandas or polars dataframe and return the summary statistics as a dictionary, without printing to the console.

skim is an alternative to pandas.DataFrame.describe() that gives a quick
overview of a dataframe as a table of summary statistics. The summary
statistics it computes depend on the type of each column, so you may get
better results if you set the datatypes you want in your dataframe
before running skim.

Note that any unknown column types, or mixed column types, will not be
processed.

Args:
    df_in (pd.DataFrame | pl.DataFrame): Dataframe to get summary statistics on.

Returns:
    JSON | str: Dictionary of summary statistics.

skim_get_figure(df_in: 'pd.DataFrame | pl.DataFrame', save_path: 'os.PathLike | str', format: 'str' = 'svg') -> 'None'

Skim a pandas or polars dataframe, print the summary statistics to the console, and save a version of the table as an SVG, HTML, or text file.

skim is an alternative to pandas.DataFrame.describe() that gives a quick
overview of a dataframe as a table of summary statistics. The summary
statistics it computes depend on the type of each column, so you may get
better results if you set the datatypes you want in your dataframe
before running skim.

Note that any unknown column types, or mixed column types, will not be
processed.

Args:
    df_in (pd.DataFrame | pl.DataFrame): Dataframe to skim.
    save_path (os.PathLike | str): Path to save figure to (include extension).
    format (str, optional): Output format, one of "svg", "html", or "text". Defaults to "svg".

Raises:
    ValueError: If the format is not one of svg, html, or text.
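
Since save_path should include an extension and format must be one of three values, it can help to derive one from the other. The sketch below is a hypothetical helper, not part of skimpy; in particular, the mapping from ".txt" to "text" is an assumption, as the source does not say which extension a text file should use.

```python
from pathlib import Path

# Assumed mapping from file extension to the format names listed above.
EXTENSION_TO_FORMAT = {".svg": "svg", ".html": "html", ".txt": "text"}


def format_for_path(save_path) -> str:
    """Pick a format name from the extension of save_path (str or os.PathLike)."""
    suffix = Path(save_path).suffix.lower()
    if suffix not in EXTENSION_TO_FORMAT:
        expected = ", ".join(sorted(EXTENSION_TO_FORMAT))
        raise ValueError(f"Cannot infer a format from {suffix!r}; expected one of: {expected}")
    return EXTENSION_TO_FORMAT[suffix]


print(format_for_path("reports/summary.html"))  # html

# With skimpy installed, the result can be passed straight through:
# from skimpy import skim_get_figure
# skim_get_figure(df, save_path=path, format=format_for_path(path))
```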


## Constants

Module-level constants and data


CASE_STYLES

Type: set

COMPLETE_COL

Type: str

DATE_COL_FIRST

Type: str

DATE_COL_LAST

Type: str

HIST_BINS

Type: int

MAX_COL_WIDTH

Type: int

MIN_COL_WIDTH

Type: int

MISSING_COL

Type: str

NULL_VALUES

Type: set

NUM_COL_MEAN

Type: str

QUANTILES

Type: list


----------------------------------------------------------------------
This is the CLI documentation for the package.
----------------------------------------------------------------------

## CLI: skimpy

```
Usage: skimpy [OPTIONS] INPUT

  The skimpy command line interface. Usage refers only to command line.

  Args:     input (str): Path of data file (csv, parquet, or sqlite)     table
  (str | None): Table name for sqlite files; shows available tables if not
  provided

Options:
  --version         Show the version and exit.
  -t, --table TEXT  Table name (required for sqlite files). If not provided,
                    shows available tables.
  --help            Show this message and exit.
```
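
When the CLI needs to be driven from a script, the usage line above translates into a small command builder. This is a sketch under stated assumptions: the function names and the file names data.csv and data.sqlite are hypothetical, and only the --table option shown in the help text is used.

```python
import shutil
import subprocess


def build_skimpy_command(input_path, table=None):
    """Build the argument list for `skimpy [OPTIONS] INPUT`."""
    command = ["skimpy"]
    if table is not None:
        # --table names the table to read from a sqlite file.
        command += ["--table", table]
    command.append(str(input_path))
    return command


def run_skimpy(input_path, table=None):
    """Run the CLI and raise if it is missing or exits with an error."""
    if shutil.which("skimpy") is None:
        raise FileNotFoundError("the skimpy executable was not found on PATH")
    return subprocess.run(build_skimpy_command(input_path, table), check=True)


print(build_skimpy_command("data.csv"))
# ['skimpy', 'data.csv']
print(build_skimpy_command("data.sqlite", table="users"))
# ['skimpy', '--table', 'users', 'data.sqlite']
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.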