Coalesce in Python Pandas
Posted August 23, 2022 by Rohith ‐ 1 min read
In SQL, the COALESCE function returns the first non-null value from a list of columns. In this article, we will perform the same coalesce operation on a pandas DataFrame.
TL;DR
Use the pandas combine_first() method.
Example
Let’s walk through an example, starting with a sample DataFrame.
Create Sample Dataframe
Here, we create a DataFrame with two columns, a and b. Column a will have NaN values in every other row. The goal is to create a column c that coalesces a and b: it takes the value from a where one is present and falls back to b otherwise.
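import pandas as pd
import numpy as np
# Random integers 0-9, so the exact values differ on every run
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=list('ab'))
# Cast a to float so it can hold NaN (recent pandas versions warn on implicit upcasting)
df['a'] = df['a'].astype(float)
# Blank out every other row of column a
df.loc[::2, 'a'] = np.nan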
Output:
>>> df
     a  b
0  NaN  8
1  7.0  4
2  NaN  8
3  0.0  9
4  NaN  4
5  1.0  6
6  NaN  7
7  7.0  2
8  NaN  9
9  7.0  5
Apply Coalesce Operation
df['c'] = df.a.combine_first(df.b)
Output:
     a  b    c
0  NaN  8  8.0
1  7.0  4  7.0
2  NaN  8  8.0
3  0.0  9  0.0
4  NaN  4  4.0
5  1.0  6  1.0
6  NaN  7  7.0
7  7.0  2  7.0
8  NaN  9  9.0
9  7.0  5  7.0
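SQL's COALESCE accepts any number of columns, while combine_first() takes only one other Series at a time. One way to extend the approach above is to chain the calls from left to right. The following is a minimal sketch under a few assumptions: the coalesce helper, the third column d, and the fixed sample values are all hypothetical and used only for illustration.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical fixed data (not from the example above) so the result is reproducible.
df = pd.DataFrame(
    {
        "a": [np.nan, 7.0, np.nan, np.nan],
        "b": [8.0, 4.0, np.nan, np.nan],
        "d": [1.0, 2.0, 3.0, np.nan],
    }
)


def coalesce(*columns: pd.Series) -> pd.Series:
    """Hypothetical helper: first non-null value per row, checking columns left to right."""
    return reduce(lambda left, right: left.combine_first(right), columns)


# Per row: use a if present, otherwise b, otherwise d; NaN if all three are missing.
df["c"] = coalesce(df["a"], df["b"], df["d"])
print(df)
```

With this data, column c picks up 8.0 from b in the first row, 7.0 from a in the second, 3.0 from d in the third, and stays NaN in the last row, where every input is missing.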