clean_columns
clean_columns(df, case='snake', replace=None, remove_accents=True)
Clean messy column names in a pandas or polars dataframe.
Parameters
Name | Type | Description | Default |
df | Union[pd.DataFrame, pl.DataFrame] | Dataframe from which column names are to be cleaned. | required |
case | str | The desired case style of the column name. Defaults to 'snake'. Options: 'snake' produces 'column_name'; 'kebab' produces 'column-name'; 'camel' produces 'columnName'; 'pascal' produces 'ColumnName'; 'const' produces 'COLUMN_NAME'; 'sentence' produces 'Column name'; 'title' produces 'Column Name'; 'lower' produces 'column name'; 'upper' produces 'COLUMN NAME'. | 'snake' |
replace | Optional[Dict[str, str]] | Values to replace in the column names, given as {'old_value': 'new_value'}. Defaults to None. | None |
remove_accents | bool | If True, strip accents from the column names. Defaults to True. | True |
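To make the `case` options concrete, here is a minimal sketch of how such styles could be derived once a name has been split into words. The helpers `split_words` and `to_case` and the `STYLES` table are hypothetical names introduced for illustration; this is not the library's implementation, and its tokenization rules may differ.

```python
import re
from typing import Callable, Dict, List


def split_words(name: str) -> List[str]:
    """Split a column name into lowercase words (hypothetical tokenizer)."""
    # Insert a space at lower-to-upper boundaries: 'columnName' -> 'column Name'.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Treat any run of non-alphanumeric characters as a separator.
    return [w.lower() for w in re.split(r"[^A-Za-z0-9]+", spaced) if w]


# One joiner per case style listed in the parameter table.
STYLES: Dict[str, Callable[[List[str]], str]] = {
    "snake": lambda ws: "_".join(ws),
    "kebab": lambda ws: "-".join(ws),
    "camel": lambda ws: "".join(
        w if i == 0 else w.capitalize() for i, w in enumerate(ws)
    ),
    "pascal": lambda ws: "".join(w.capitalize() for w in ws),
    "const": lambda ws: "_".join(ws).upper(),
    "sentence": lambda ws: " ".join(ws).capitalize(),
    "title": lambda ws: " ".join(w.capitalize() for w in ws),
    "lower": lambda ws: " ".join(ws),
    "upper": lambda ws: " ".join(ws).upper(),
}


def to_case(name: str, case: str = "snake") -> str:
    """Convert a single name to the requested style (illustrative only)."""
    if case not in STYLES:
        # Assumption: mirrors the documented ValueError for an invalid case.
        raise ValueError(f"Unknown case style: {case!r}")
    return STYLES[case](split_words(name))


for style in STYLES:
    print(f"{style:>8}: {to_case('Column Name', style)}")
```

Keeping tokenization separate from joining means each style is only a different way of recombining the same word list.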
Raises
ValueError | If case is not valid. |
Returns
Union[pd.DataFrame, pl.DataFrame] | Dataframe with cleaned column names. |
Examples
Clean column names by converting them to camel case, removing accents, and correcting a misspelling.
>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         'FirstNom': ['Philip', 'Turanga'],
...         'lastName': ['Fry', 'Leela'],
...         'Téléphone': ['555-234-5678', '(604) 111-2335']
...     })
>>> clean_columns(df, case='camel', replace={'Nom': 'Name'})
   firstName  lastName       telephone
0     Philip       Fry    555-234-5678
1    Turanga     Leela  (604) 111-2335
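The `replace` and `remove_accents` steps in the example above can be approximated with the standard library alone. The sketch below assumes plain substring replacement and Unicode decomposition. The source does not state the library's actual matching rules or order of operations, and `strip_accents` and `apply_replacements` are hypothetical helper names.

```python
import unicodedata
from typing import Dict, Optional


def strip_accents(text: str) -> str:
    """Drop combining marks after decomposition, so 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def apply_replacements(text: str, replace: Optional[Dict[str, str]] = None) -> str:
    """Apply each old -> new substitution (assumed to be substring-based)."""
    for old, new in (replace or {}).items():
        text = text.replace(old, new)
    return text


columns = ["FirstNom", "lastName", "Téléphone"]
cleaned = [strip_accents(apply_replacements(c, {"Nom": "Name"})) for c in columns]
print(cleaned)  # ['FirstName', 'lastName', 'Telephone']
```

Case conversion is left out here, so the names keep their original capitalization. In the documented example, `case='camel'` is what lowercases the leading letters.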