Reindexing in Pandas conforms a DataFrame to a new set of row or column labels. This is useful for aligning data, adding missing labels, or reordering and dropping rows or columns. Labels that already exist keep their data; if the new index includes labels not present in the original DataFrame, Pandas fills them with NaN by default. For example, if we try adding a new row using reindex():
         A    B
    0  1.0  4.0
    1  2.0  5.0
    2  3.0  6.0
    3  NaN  NaN
As you can see, index 3 wasn’t present in the original DataFrame, so it's filled with NaN.
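The output above can be reproduced with a minimal sketch like the following. The input DataFrame is an assumption inferred from the printed values (columns A and B holding 1 to 3 and 4 to 6); the original code is not shown in this section.

```python
import pandas as pd

# Assumed input, inferred from the output shown above
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Label 3 does not exist in df, so its row is filled with NaN
new_df = df.reindex([0, 1, 2, 3])
print(new_df)
```

Because NaN is a floating-point value, the integer columns are upcast to float64, which is why the existing values print as 1.0, 2.0, and so on.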
When you reindex a DataFrame, you provide a new set of labels (indices) for either the rows or columns. If any of these new labels are not present in the original DataFrame, Pandas will assign NaN as the value for those missing indices. The syntax for reindex() is as follows:
DataFrame.reindex(labels=None, index=None, columns=None, axis=None, method=None, copy=True, fill_value=NaN)
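- labels: New labels to conform to, applied to the axis selected by axis.
- index/columns: New row/column labels.
- fill_value: Value to use for entries whose label is missing from the original (default is NaN).
- method: Method for filling holes in the reindexed result (ffill, bfill, nearest). It only applies when the existing index is monotonically increasing or decreasing.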
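As a rough illustration of how these parameters relate, the sketch below shows that passing labels together with axis is equivalent to using the index or columns keyword. The DataFrame and the labels 5 and "Z" are made up for the example.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"A": [10, 20, 30], "B": [40, 50, 60]})

# Row labels: the index keyword and labels + axis=0 are equivalent
by_keyword = df.reindex(index=[2, 0, 5])
by_axis = df.reindex([2, 0, 5], axis=0)

# Column labels: the columns keyword and labels + axis=1 are equivalent
cols_keyword = df.reindex(columns=["B", "Z"])
cols_axis = df.reindex(["B", "Z"], axis=1)

print(by_keyword.equals(by_axis))
print(cols_keyword.equals(cols_axis))
```

Note that rows come back in the order requested (2, 0, 5), and label 1, which is absent from the new index, is dropped.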
You can change or expand row indices using reindex(). Any new index not found in the DataFrame will be assigned NaN, unless you provide a fill_value.
        A   B
    0  10  40
    1  20  50
    2  30  60
    3   0   0

Reindexed DataFrame:

        A   B
    0  10  40
    1  20  50
    2  30  60
    3   0   0
    4   0   0
    5   0   0
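Both outputs above are consistent with a sketch along these lines. The input values and the choice of fill_value=0 are assumptions inferred from the printed results, since the original code is not included here.

```python
import pandas as pd

# Assumed input, inferred from the output shown above
df = pd.DataFrame({"A": [10, 20, 30], "B": [40, 50, 60]})

# Add one new row label; missing entries get 0 instead of NaN
four_rows = df.reindex([0, 1, 2, 3], fill_value=0)
print(four_rows)

# Expand further to labels 0 through 5
six_rows = df.reindex(list(range(6)), fill_value=0)
print("Reindexed DataFrame:")
print(six_rows)
```

Supplying a fill_value avoids introducing NaN into the new rows, which matches the integer values printed above.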
You can change the order of columns or add new columns by passing the columns parameter to reindex().
Reindexed DataFrame:

        B   A    C
    0  40  10  100
    1  50  20  100
    2  60  30  100
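A sketch that would produce this result is shown below. The input data and fill_value=100 are assumptions read off the output (the new column C holds 100 in every row); the original code is not shown.

```python
import pandas as pd

# Assumed input, inferred from the output shown above
df = pd.DataFrame({"A": [10, 20, 30], "B": [40, 50, 60]})

# Reorder A and B, and add a new column C filled with 100
reordered = df.reindex(columns=["B", "A", "C"], fill_value=100)
print("Reindexed DataFrame:")
print(reordered)
```

Without fill_value, the new column C would contain NaN in every row.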
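When new labels introduce NaN values, you can handle them using the method parameter: ffill (forward fill) propagates the last valid row forward to the new labels, while bfill (backward fill) uses the next valid row: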
          A     B
    0  10.0  40.0
    1  20.0  50.0
    2  30.0  60.0
    3  30.0  60.0
    4  30.0  60.0

          A     B
    0  10.0  40.0
    1  20.0  50.0
    2  30.0  60.0
    3   NaN   NaN
    4   NaN   NaN
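The first output above matches what forward fill produces, and the second is what backward fill yields when no later rows exist to fill from. The sketch below illustrates both under assumed inputs: float-valued columns (to match the printed 10.0, 40.0, ...) and new labels 3 and 4. Whether the original used the method parameter exactly this way is not confirmed by this section.

```python
import pandas as pd

# Assumed input: float columns, inferred from the output shown above
df = pd.DataFrame({"A": [10.0, 20.0, 30.0], "B": [40.0, 50.0, 60.0]})
new_index = [0, 1, 2, 3, 4]

# Forward fill: labels 3 and 4 take the values of the last existing row (2)
forward = df.reindex(new_index, method="ffill")
print(forward)

# Backward fill: there is no row after 3 or 4 to pull from, so they stay NaN
backward = df.reindex(new_index, method="bfill")
print(backward)
```

The method parameter requires a monotonically increasing or decreasing index; the default integer index used here satisfies that.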