Blog
date_trunc in Python Pandas
date_trunc is a date function that truncates a date or datetime value to the start of a given unit of time. It can truncate a date column to the year, decade, century, quarter, month, week, day, hour, minute, second, or millisecond. In this article, we will truncate a date column using Python pandas.
Posted August 23, 2022 by Rohith ‐ 1 min read
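As a rough illustration of the date_trunc idea summarized above, here is a minimal sketch in pandas. The helper name `date_trunc` and the sample timestamps are assumptions made for this example, not code from the article, and only a few of the units are covered.

```python
import pandas as pd


def date_trunc(series: pd.Series, unit: str) -> pd.Series:
    """Truncate a datetime Series to the start of `unit` (hypothetical helper)."""
    # Calendar units vary in length, so go through a Period and back.
    period_alias = {"year": "Y", "quarter": "Q", "month": "M"}
    if unit == "day":
        # normalize() resets the time part to midnight.
        return series.dt.normalize()
    if unit in period_alias:
        return series.dt.to_period(period_alias[unit]).dt.to_timestamp()
    raise ValueError(f"unsupported unit: {unit}")


# Sample data, made up for this sketch.
df = pd.DataFrame(
    {"ts": pd.to_datetime(["2022-08-23 14:37:55", "2022-11-05 09:12:01"])}
)
df["ts_month"] = date_trunc(df["ts"], "month")
df["ts_quarter"] = date_trunc(df["ts"], "quarter")
print(df)
```

Going through `to_period` keeps the month, quarter, and year cases correct even though those units do not have a fixed length.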
Coalesce in Python Pandas
In SQL, the coalesce function returns the first non-null value from a list of given columns. In this article, we will perform a coalesce operation on a Python pandas DataFrame.
Posted August 23, 2022 by Rohith ‐ 1 min read
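One possible way to get coalesce-like behavior in pandas is to chain `Series.combine_first` across the columns, as sketched below. The `coalesce` helper, the column names, and the sample values are all assumptions made for this example.

```python
from functools import reduce

import pandas as pd


def coalesce(df: pd.DataFrame, columns: list) -> pd.Series:
    """Return the first non-null value per row across `columns` (hypothetical helper)."""
    # combine_first fills nulls in the left Series with values from the right one.
    return reduce(
        lambda left, right: left.combine_first(right),
        (df[col] for col in columns),
    )


# Sample data, made up for this sketch.
df = pd.DataFrame(
    {
        "home_phone": [None, "555-0101", None],
        "work_phone": ["555-0202", "555-0303", None],
        "mobile": ["555-0404", None, None],
    }
)
df["phone"] = coalesce(df, ["home_phone", "work_phone", "mobile"])
print(df["phone"])
```

Column order sets the priority, just like the argument order in SQL's COALESCE. A row stays null only when every listed column is null.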
Get Parquet Schema Using Python
Parquet is widely used in data transformations. Every Parquet file has a schema associated with it. Because Parquet is a binary format, we cannot read the data with a text editor. In this article, we use the pyarrow Python package to extract the Parquet schema.
Posted August 17, 2022 by Rohith ‐ 1 min read
Parquet Avro Type Mapping
Avro and Parquet are used in many data processing frameworks, such as Kafka and Spark. It is important to know which data types the Avro and Parquet formats support. In this article, we will list the mapping between Avro and Parquet data types.
Posted August 17, 2022 by Rohith ‐ 1 min read
Data Types in Parquet
Parquet is used in many data processing frameworks, such as Apache Flink and Spark. It is important to know which data types the Parquet format supports.
Posted August 17, 2022 by Rohith ‐ 1 min read