Pandas Cleaning Empty Cells

In Pandas, empty cells are missing values, usually represented as NaN (Not a Number). None and NaT (for datetimes) are treated as missing too. Cleaning empty cells is an essential step toward accurate data analysis.


1. Detect Empty Cells

df.isnull() # True for empty cells
df.isnull().sum() # Count empty cells per column

(isna() is an alias of isnull(); both return the same result.)
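As a minimal sketch of the detection step, the snippet below builds a small made-up DataFrame (the column names and values are illustrative assumptions, not taken from a real dataset) and counts missing cells per column and overall.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; None and np.nan both count as missing
df = pd.DataFrame({
    "Name": ["Amit", "Riya", None],
    "Age": [22, np.nan, 25],
})

mask = df.isnull()                         # Boolean DataFrame, True where a cell is missing
per_column = mask.sum()                    # Missing count per column
total_missing = int(mask.sum().sum())      # Missing count for the whole DataFrame
rows_with_missing = df[mask.any(axis=1)]   # Rows that contain at least one missing cell

print(per_column)
print(total_missing)
```

Looking at rows_with_missing before cleaning is a quick way to see which records would be affected.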


2. Remove Rows with Empty Cells

Remove rows containing any empty cell

df.dropna() # Returns a new DataFrame; df itself is unchanged

Remove rows where all values are empty

df.dropna(how="all")
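The difference between the two calls is easier to see on data. The sketch below uses an invented four-row DataFrame and also shows the subset and thresh parameters of dropna(), which the text above does not cover.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 is partly empty, row 2 is completely empty
df = pd.DataFrame({
    "Age": [22, np.nan, np.nan, 23],
    "City": ["Delhi", "Mumbai", None, "Pune"],
})

any_dropped = df.dropna()                  # Drops rows 1 and 2 (any missing cell)
all_dropped = df.dropna(how="all")         # Drops only row 2 (every cell missing)
age_required = df.dropna(subset=["Age"])   # Only checks the Age column
min_two = df.dropna(thresh=2)              # Keeps rows with at least 2 non-missing cells

print(len(df), len(any_dropped), len(all_dropped))  # 4 2 3
```

None of these calls modifies df; each returns a new DataFrame, so assign the result if you want to keep it.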

3. Remove Columns with Empty Cells

df.dropna(axis=1) # Drops every column that contains at least one empty cell
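Dropping columns can remove more than intended, because a single empty cell is enough to discard the whole column. The sketch below, on made-up data, contrasts the default with how="all", which only removes columns that are entirely empty.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Age has one gap, Notes is completely empty
df = pd.DataFrame({
    "Name": ["Amit", "Riya", "Karan"],
    "Age": [22, np.nan, 25],
    "Notes": [np.nan, np.nan, np.nan],
})

no_missing_cols = df.dropna(axis=1)            # Keeps only Name
no_empty_cols = df.dropna(axis=1, how="all")   # Drops only Notes

print(list(no_missing_cols.columns))  # ['Name']
print(list(no_empty_cols.columns))    # ['Name', 'Age']
```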

4. Fill Empty Cells

Fill with a Constant Value

df.fillna(0)
df.fillna("Unknown")

Fill with Mean / Median / Mode

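df["Age"] = df["Age"].fillna(df["Age"].mean()) # fillna() returns a new Series, so assign it back
df["Salary"] = df["Salary"].fillna(df["Salary"].median())
df["City"] = df["City"].fillna(df["City"].mode()[0]) # mode() returns a Series; [0] takes the first value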
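A small worked example may help here. The DataFrame below is invented for illustration; the comments give the statistic that fills each gap.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one gap per column
df = pd.DataFrame({
    "Age": [20, np.nan, 30, 40],
    "Salary": [1000, 2000, np.nan, 9000],
    "City": ["Delhi", None, "Delhi", "Pune"],
})

df["Age"] = df["Age"].fillna(df["Age"].mean())             # Mean of 20, 30, 40 is 30
df["Salary"] = df["Salary"].fillna(df["Salary"].median())  # Median of 1000, 2000, 9000 is 2000
df["City"] = df["City"].fillna(df["City"].mode()[0])       # Most frequent value is "Delhi"

print(df)
```

The median is often the safer choice for skewed columns such as Salary: here the mean of the known values would be 4000, pulled up by the single large value.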

5. Forward Fill & Backward Fill

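df.ffill() # Forward fill: copy the last valid value into the gap
df.bfill() # Backward fill: copy the next valid value into the gap
# fillna(method="ffill") and fillna(method="bfill") are deprecated since pandas 2.1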

(Useful for time-series data.)
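To illustrate on a time series, the sketch below uses a hypothetical run of daily readings (dates and values are made up) and shows the limit parameter, which caps how many consecutive gaps are filled.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings with gaps
readings = pd.Series(
    [np.nan, 10.0, np.nan, np.nan, 13.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

forward = readings.ffill()          # NaN, 10, 10, 10, 13 (nothing earlier to copy into the first slot)
backward = readings.bfill()         # 10, 10, 13, 13, 13
limited = readings.ffill(limit=1)   # NaN, 10, 10, NaN, 13 (fills at most one gap in a row)

print(forward)
```

Forward fill leaves a leading gap untouched because there is no earlier value to carry forward, so it is worth checking isnull().sum() again afterwards.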


6. Replace Empty Strings with NaN

Sometimes empty cells are not NaN but empty strings "", which isnull() and dropna() do not treat as missing. Convert them to NaN first.

import numpy as np

df.replace("", np.nan, inplace=True)
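Cells that contain only spaces are a common variant of the same problem. One possible approach, sketched below on made-up data, is to pass a regular expression to replace() so that both empty and whitespace-only strings become NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical column with an empty string and a whitespace-only string
df = pd.DataFrame({"Name": ["Amit", "", "   ", "Karan"]})

# regex=True makes replace() treat the first argument as a pattern:
# ^\s*$ matches strings that are empty or consist only of whitespace
cleaned = df.replace(r"^\s*$", np.nan, regex=True)

print(cleaned["Name"].isnull().sum())  # 2
```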


7. Fill Empty Cells Column-wise

df.fillna({
    "Age": df["Age"].mean(),
    "City": "Unknown"
})
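The dictionary form only touches the columns it names. The sketch below, using an invented DataFrame with an extra Score column, shows that unlisted columns keep their gaps and that the original DataFrame is left unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical data; Score is deliberately left out of the fill dictionary
df = pd.DataFrame({
    "Age": [20, np.nan, 40],
    "City": ["Delhi", None, "Pune"],
    "Score": [np.nan, 7.5, 8.0],
})

filled = df.fillna({"Age": df["Age"].mean(), "City": "Unknown"})

print(filled)
```

In filled, Age and City have no gaps, while Score still has its missing value in the first row.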

8. Real-World Example

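import pandas as pd
import numpy as np

data = {
    "Name": ["Amit", "Riya", "Karan", ""],
    "Age": [22, None, 25, 23],
    "City": ["Delhi", "Mumbai", None, "Pune"]
}

df = pd.DataFrame(data)

# Turn empty strings into NaN so they count as missing
df = df.replace("", np.nan)
# Assign the result back: fillna(inplace=True) on a selected column may not update df
df["Age"] = df["Age"].fillna(df["Age"].mean())
df["City"] = df["City"].fillna("Unknown")

# Name still has one NaN in the last row; fill or drop it as needed
print(df)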


9. Best Practices

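  • Use dropna() when only a small share of rows have missing values and losing them will not bias the analysis

  • Use fillna() when the affected rows carry information you cannot afford to discard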

  • Always analyze missing data before removing it
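One way to analyze missing data before deciding is to compute the share of missing values per column. The sketch below uses made-up data, and the 50% cutoff is an arbitrary assumption chosen for illustration, not a recommended rule.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Age is mostly complete, City is mostly empty
df = pd.DataFrame({
    "Age": [22, np.nan, 25, 23],
    "City": ["Delhi", None, None, None],
})

# The mean of a Boolean mask is the fraction of True values
missing_pct = df.isnull().mean() * 100   # Age 25.0, City 75.0

# Assumed cutoff: flag columns with more than 50% missing values
mostly_empty = missing_pct[missing_pct > 50].index.tolist()

print(missing_pct)
print(mostly_empty)  # ['City']
```

Columns flagged this way are candidates for dropping, while columns with only a few gaps are usually better filled.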


Conclusion

Cleaning empty cells in Pandas makes your dataset more complete, consistent, and reliable. Knowing when to remove and when to fill missing values is key to good data analysis.
