DataFrame.dropna() removes missing values (NaN or None) from a DataFrame. It can drop entire rows or columns, depending on the axis and threshold you specify, and is commonly used during data cleaning to eliminate incomplete records before analysis.
For Example:
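A minimal sketch of the default behavior, assuming a small three-row DataFrame. The values are illustrative, chosen so that only row 0 is complete:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): rows 1 and 2 each contain one missing value
df = pd.DataFrame({"A": [1, np.nan, 3], "B": [4, 5, None]})

# With no arguments, dropna() drops every row that has at least one missing value
res = df.dropna()
print(res)
```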
Output:

         A    B
    0  1.0  4.0
Explanation: By default, dropna() removes rows with any missing values. Row 0 has no missing data, so it's kept. Rows 1 and 2 contain NaN or None, so they're dropped. Only row 0 remains.
Syntax: DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
Parameters:
| Parameter | Description |
|---|---|
| axis | 0 (default) drops rows; 1 drops columns |
| how | 'any' (default) drops a row/column if any value is missing; 'all' drops it only if all values are missing |
| thresh | Minimum number of non-NA values a row/column must have to be kept; recent pandas versions do not allow combining it with how |
| subset | Labels along the other axis to check for missing values (for example, column names when dropping rows) |
| inplace | If True, modifies the original DataFrame; if False (default), returns a new one |
Returns: A new DataFrame with the specified rows or columns removed. If inplace=True, the original DataFrame is modified and None is returned instead.
Example 1: We drop rows only if all values are missing.
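A minimal sketch of how='all', assuming illustrative data in which the first two rows are entirely missing and the third is complete:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): rows 0 and 1 contain only missing values
df = pd.DataFrame({"A": [np.nan, None, 3], "B": [None, np.nan, 4]})

# how='all' drops a row only when every value in it is missing
res = df.dropna(how="all")
print(res)
```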
Output:

         A    B
    2  3.0  4.0
Explanation: The first two rows consist entirely of missing values, so how='all' drops them. The third row is kept because it contains valid values.
Example 2: We drop columns that contain any missing values by setting axis=1.
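A minimal sketch of column-wise dropping, assuming illustrative data in which each column has exactly one missing value:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): 'A' is missing at row 1, 'B' at row 2
df = pd.DataFrame({"A": [1, np.nan, 3], "B": [4, 5, None]})

# axis=1 checks columns instead of rows; any column with a missing value is dropped
res = df.dropna(axis=1)
print(res)
```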
Output:

    Empty DataFrame
    Columns: []
    Index: [0, 1, 2]
Explanation: Since both columns 'A' and 'B' have at least one missing value (NaN or None), using dropna(axis=1) drops them. This leaves an empty DataFrame with only row indices and no columns.
Example 3: We use thresh to keep rows that have at least 2 non-null values.
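A minimal sketch of thresh, assuming illustrative three-row data in which every row has exactly one non-null value:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): each row has one valid value and one NaN
df = pd.DataFrame({"A": [1, np.nan, 3], "B": [np.nan, 5, np.nan]})

# thresh=2 keeps only rows with at least 2 non-null values
res = df.dropna(thresh=2)
print(res)
```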
Output:

    Empty DataFrame
    Columns: [A, B]
    Index: []
Explanation: thresh=2 keeps only rows that have at least 2 non-null values. Each row in the DataFrame has only 1 non-null value, so all rows are dropped.
Example 4: In this example, we drop rows that have missing values only in a specific column ('A') using subset.
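A minimal sketch of subset, assuming illustrative data in which column 'A' is missing at row 1 and column 'B' is missing at row 2:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): 'A' is missing at row 1, 'B' at row 2
df = pd.DataFrame({"A": [1, np.nan, 3], "B": [4, 5, np.nan]})

# subset=['A'] restricts the missing-value check to column 'A' only
res = df.dropna(subset=["A"])
print(res)
```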
Output:

         A    B
    0  1.0  4.0
    2  3.0  NaN
Explanation: Only rows where column 'A' is NaN are dropped. Missing values in other columns, such as the NaN in 'B' at row 2, are ignored.
Example 5: In this example, we use inplace=True to modify the DataFrame directly.
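A minimal sketch of in-place dropping, assuming illustrative data with columns 'X' and 'Y' in which only the last row is complete:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): rows 0 and 1 each contain one missing value
df = pd.DataFrame({"X": [1, np.nan, 3], "Y": [np.nan, 5, 6]})

# inplace=True modifies df itself; the call returns None
result = df.dropna(inplace=True)
print(df)
```

Because the call returns None, assigning its result (as `result` does here) is only useful for demonstrating that nothing is returned; in practice the call is written on its own line.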
Output:

         X    Y
    2  3.0  6.0
Explanation: Only the last row has no missing values, so it is the only one kept. With inplace=True, df is modified directly and dropna() returns None rather than a new DataFrame.