clean_columns()
Clean messy column names in a pandas or polars dataframe.
Usage
clean_columns(
df,
case="snake",
replace=None,
remove_accents=True,
)
Parameters
df: pd.DataFrame | pl.DataFrame-
Dataframe from which column names are to be cleaned.
case: str = "snake"-
The desired case style of the column name. Defaults to “snake”.
- 'snake' produces 'column_name'
- 'kebab' produces 'column-name'
- 'camel' produces 'columnName'
- 'pascal' produces 'ColumnName'
- 'const' produces 'COLUMN_NAME'
- 'sentence' produces 'Column name'
- 'title' produces 'Column Name'
- 'lower' produces 'column name'
- 'upper' produces 'COLUMN NAME'
replace: dict[str, str] | None = None-
Values to replace in the column names. Defaults to None.
- {'old_value': 'new_value'}
remove_accents: bool = True-
If True, strip accents from the column names. Defaults to True.
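The three options above can be illustrated on a single column name. The following is a minimal sketch, not the library's implementation: `sketch_clean_name`, `strip_accents`, and `split_words` are hypothetical helpers, only the 'snake' and 'camel' styles are covered, and the order of operations (replace, then strip accents, then change case) is an assumption.

```python
import re
import unicodedata


def strip_accents(name: str) -> str:
    """Drop combining marks after NFD decomposition, e.g. 'é' -> 'e'."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def split_words(name: str) -> list[str]:
    """Split on non-word characters, underscores, and lower-to-upper boundaries."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return [w.lower() for w in re.split(r"[\W_]+", spaced) if w]


def sketch_clean_name(name, case="snake", replace=None, remove_accents=True):
    # Hypothetical helper. Assumed order: replace -> strip accents -> change case.
    for old, new in (replace or {}).items():
        name = name.replace(old, new)
    if remove_accents:
        name = strip_accents(name)
    words = split_words(name)
    if not words:
        return ""
    if case == "snake":
        return "_".join(words)
    if case == "camel":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    raise ValueError(f"unsupported case in this sketch: {case!r}")
```

Under these assumptions, `sketch_clean_name('FirstNom', case='camel', replace={'Nom': 'Name'})` gives `'firstName'`, and `sketch_clean_name('Téléphone', case='camel')` gives `'telephone'`.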
Raises
ValueError-
If case is not valid.
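A check of this kind might look like the sketch below. `VALID_CASES` and `check_case` are hypothetical names, and the error message is an assumption; only the list of accepted styles comes from the parameter description above.

```python
VALID_CASES = (
    "snake", "kebab", "camel", "pascal", "const",
    "sentence", "title", "lower", "upper",
)


def check_case(case: str) -> str:
    """Return `case` unchanged if it is a supported style, else raise ValueError."""
    if case not in VALID_CASES:
        raise ValueError(f"case must be one of {VALID_CASES}, got {case!r}")
    return case


# Callers can catch the error and fall back to the default style.
try:
    style = check_case("Snake")  # wrong capitalisation, so this is rejected
except ValueError:
    style = "snake"
```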
Returns
pd.DataFrame | pl.DataFrame-
Dataframe with cleaned column names.
Examples
Clean column names by converting them to camel case, removing accents, and correcting a misspelling.
>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         'FirstNom': ['Philip', 'Turanga'],
...         'lastName': ['Fry', 'Leela'],
...         'Téléphone': ['555-234-5678', '(604) 111-2335']
...     })
>>> clean_columns(df, case='camel', replace={'Nom': 'Name'})
   firstName  lastName       telephone
0     Philip       Fry    555-234-5678
1    Turanga     Leela  (604) 111-2335