blog

date_trunc in Python Pandas

date_trunc is a date function that truncates a date or datetime value to the start of a given unit of time. It can truncate a date column to the year, decade, century, quarter, month, week, day, hour, minute, second, or millisecond. In this article, we truncate a date column using Python pandas.
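As a rough illustration of the idea (not necessarily the approach the article takes), the sketch below truncates a datetime column to a few units. The DataFrame and the column names `ts`, `day`, `month`, `quarter`, and `year` are made up for this example.

```python
import pandas as pd

# Hypothetical sample data; the column name "ts" is an assumption.
df = pd.DataFrame(
    {"ts": pd.to_datetime(["2022-08-23 14:35:10", "2022-11-05 09:01:59"])}
)

# Fixed-width units such as day can be truncated with floor.
df["day"] = df["ts"].dt.floor("D")

# Calendar units (month, quarter, year) vary in length, so convert to a
# period and back; to_timestamp() returns the start of each period.
df["month"] = df["ts"].dt.to_period("M").dt.to_timestamp()
df["quarter"] = df["ts"].dt.to_period("Q").dt.to_timestamp()
df["year"] = df["ts"].dt.to_period("Y").dt.to_timestamp()

print(df)
```

The period round trip is one of several ways to get the start of a calendar unit; it is used here because it reads the same way for month, quarter, and year.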

Posted August 23, 2022 by Rohith ‐ 1 min read

quick-references blog python pandas

Coalesce in Python Pandas

The coalesce function in SQL returns the first non-null value from a list of columns. In this article, we perform a coalesce operation on a pandas DataFrame in Python.
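A minimal sketch of a coalesce across three columns is shown below, assuming `Series.combine_first` is an acceptable stand-in for SQL's coalesce. The DataFrame and its column names are invented for the example and may differ from what the article uses.

```python
import pandas as pd

# Hypothetical data; column names are assumptions for illustration.
df = pd.DataFrame(
    {
        "home_phone": [None, "555-0101", None],
        "work_phone": ["555-0199", "555-0102", None],
        "mobile": ["555-0300", None, None],
    }
)

# combine_first keeps the caller's value and fills its nulls from the
# argument, so chaining it walks the columns in priority order.
df["phone"] = (
    df["home_phone"]
    .combine_first(df["work_phone"])
    .combine_first(df["mobile"])
)

print(df)
```

Rows where every column is null stay null in the result, which matches how coalesce behaves in SQL.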

Posted August 23, 2022 by Rohith ‐ 1 min read

quick-references blog python pandas

Get Parquet Schema Using Python

Parquet is widely used in data transformations. Every Parquet file has a schema associated with it. Because Parquet is a binary format, its contents cannot be read with a text editor. In this article, we use the pyarrow Python package to extract the Parquet schema.

Posted August 17, 2022 by Rohith ‐ 1 min read

quick-references python transformations parquet blog

Parquet Avro Type Mapping

Avro and Parquet are used in many data processing frameworks such as Kafka and Spark. It is important to know the data types supported by the Avro and Parquet data formats. In this article, we list the mapping between Avro and Parquet data types.

Posted August 17, 2022 by Rohith ‐ 1 min read

avro parquet data-types transformations

Data Types in Parquet

Parquet is used in many data processing frameworks such as Apache Flink and Spark. It is important to know the data types supported by the Parquet data format.

Posted August 17, 2022 by Rohith ‐ 1 min read

parquet data-types transformations
