Coalesce in Python Pandas

Posted August 23, 2022 by Rohith ‐ 1 min read

In SQL, the COALESCE function returns the first non-null value from a list of columns or expressions. In this article, we will perform the same operation on a pandas DataFrame in Python.
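To make the "first non-null value" rule concrete, here is a minimal plain-Python sketch of it for scalar values. The helper name coalesce is hypothetical and is not part of pandas. Treating NaN as null alongside None is an assumption made here to mirror how pandas handles missing values.

```python
import math


def coalesce(*values):
    """Return the first value that is neither None nor NaN, else None.

    Hypothetical helper for illustration only; not a pandas function.
    """
    for value in values:
        if value is None:
            continue
        if isinstance(value, float) and math.isnan(value):
            continue
        return value
    return None


# The first two arguments are null, so the third one is returned.
print(coalesce(None, float("nan"), 3, 5))
```

Note that 0 and empty strings are not null, so they are returned like any other value.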

TL;DR

Use the pandas combine_first() method.

Example

Let’s walk through an example, starting with a sample DataFrame.

Create Sample Dataframe

Here, we create a DataFrame with two columns, a and b. Every other row of column a is set to NaN. The goal is to create a column c that holds the coalesced value of a and b: the value from a where it is present, and the value from b otherwise.

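import pandas as pd
import numpy as np

# Build a 10x2 frame of random integers in [0, 10); values differ on every run.
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=list('ab'))
# Cast a to float so it can hold NaN, then blank out every other row.
df['a'] = df['a'].astype(float)
df.loc[::2, 'a'] = np.nan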

Output:

>>> df
     a  b
0  NaN  8
1  7.0  4
2  NaN  8
3  0.0  9
4  NaN  4
5  1.0  6
6  NaN  7
7  7.0  2
8  NaN  9
9  7.0  5

Apply Coalesce Operation

# Take a where it is not null, otherwise fall back to b.
df['c'] = df['a'].combine_first(df['b'])

Output:

     a  b    c
0  NaN  8  8.0
1  7.0  4  7.0
2  NaN  8  8.0
3  0.0  9  0.0
4  NaN  4  4.0
5  1.0  6  1.0
6  NaN  7  7.0
7  7.0  2  7.0
8  NaN  9  9.0
9  7.0  5  7.0
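SQL's COALESCE accepts any number of arguments, while combine_first() takes one other Series at a time. One way to extend the approach above to several columns is to chain combine_first() with functools.reduce, as in the sketch below. The third column d and the sample values are made up for illustration, and the bfill variant is shown only as one possible alternative.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Small deterministic frame; column d is a hypothetical extra fallback column.
df = pd.DataFrame({
    "a": [np.nan, 7.0, np.nan, np.nan],
    "b": [8.0, 4.0, np.nan, np.nan],
    "d": [1.0, 2.0, 3.0, np.nan],
})

# Columns in priority order: a first, then b, then d.
columns = ["a", "b", "d"]

# Chain combine_first left to right: a.combine_first(b).combine_first(d).
df["c"] = reduce(
    lambda left, right: left.combine_first(right),
    (df[col] for col in columns),
)

# Alternative: back-fill across each row, then keep the first column.
alt = df[columns].bfill(axis=1).iloc[:, 0]

print(df)
```

A row stays NaN in c only when every listed column is NaN for that row, which matches COALESCE returning NULL when all of its arguments are NULL.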
