Feather — A Fast On-Disk Format for R and Python Data Frames

AI & Data Engineering
2 min read · Apr 18, 2019

R and Python are two of the languages most widely used by data analysts and data scientists, so an easy way to exchange data between them is valuable. This is where Feather comes in: a fast, lightweight, language-agnostic, and easy-to-use binary file format for storing data frames.

Feather provides binary columnar serialization for data frames, designed to make reading and writing them efficient. It uses the Apache Arrow columnar memory specification to represent binary data on disk.

Installation

Installation in Python:

conda install -c conda-forge feather-format
OR
pip install feather-format

Installation in R (the feather package installs without problems on R version ≥ 3.3.0):

install.packages(pkgs = "feather", dependencies = TRUE)
OR
devtools::install_github("wesm/feather/R")

Now, let's explore Feather through a small example.

In step 1, we create a pandas DataFrame and write its contents to disk in Feather format.

In step 2, we read the same data in R.

Figure 1: Writing a pandas DataFrame to a Feather file
Figure 2: Reading the pandas DataFrame in R from the Feather file

Reference

http://wesmckinney.com/blog/feather-arrow-future/
https://blog.rstudio.com/2016/03/29/feather/


AI & Data Engineering

A Data Enthusiast, Lead Architect with 17 yrs experience in the field of AI Engineering, BI, Data Warehousing, Dimensional modeling, ML and Big Data