Ankur Ranjan – Medium

Apache Hudi: Copy on Write(CoW) Table

As Data Engineers, we frequently encounter the tedious task of performing multiple UPSERT (update + insert) and DELETE operations in batch…

Oct 6, 2023

Supercharging Apps with Polyglot Persistence: A Simple Guide

After working for more than 4 years on data-intensive applications at a startup, a consultancy and product-based companies, I think that the…

Sep 4, 2023

Solve Small File Problem using Apache Hudi

One of the biggest pains for Data Engineers is the small file problem.

Aug 25, 2023

Published in
Dev Genius

Optimize MERGE job in BigQuery

I love BigQuery and think it is one of the best products ever made by Google Cloud Platform.

Jul 31, 2023

Stateful transformations in Spark Streaming — Part 1 | Spark Streaming Session 3

In the previous article of this series, i.e. Spark Streaming in layman’s terms, we understood the following things.

Feb 26, 2023

Spark Streaming: Session 2

In the first article of the Spark Streaming series, we understood the following important concepts.

Feb 5, 2023

Spark Streaming — Part 1

A few months back, I was given a codebase that used Spark Streaming and was written in Scala. We were supposed to make major changes…

Feb 5, 2023

Published in
Towards Dev

Disaster Recovery in Kafka Servers

I recently tried onboarding disaster management for our streaming pipeline, which involves Kafka, Spark Streaming and MongoDB, in one of our…

Dec 25, 2022

File Formats in Big Data World — Part 1

One of the most fundamental decisions to make in the Data Engineering world is to choose the proper file formats in different zones of the…

Sep 12, 2022

Demystify different compression codec in big data

When working with big data file formats like Parquet, ORC, Avro, etc., you will mostly come across different compression codecs like…

Mar 22, 2022

LeetCode Curated SQL Solutions and Discussion — Week 1

SQL is a must when you work in the data domain. Be it Data Engineering, Big Data, Data Analysis or BI Development, everyone who is…

Mar 21, 2022

Published in
Analytics Vidhya

Apache Sqoop — One smart tool for Big Data World.

When we talk about the big data world, three things are always involved: storage, processing and scalability. Here we…

Apr 1, 2021

I think there is one mistake in the definition of Executors i.e.

And I must agree this is really very well written.

Mar 25, 2021

Published in
Analytics Vidhya

Apache Airflow — Part 1

Every programmer loves automating their stuff. Learning and using any automation tool is fun for us. A few months ago, I came across a…

Jul 28, 2020

Published in
Analytics Vidhya

Part 2- A Beginners Guide to Time profiling in Python.

Hello folks, welcome back. If you are joining from my last blog, then much of the context about time profiling has already been set. If you…

Jul 5, 2020

Published in
Analytics Vidhya

A Beginners Guide to Time profiling in Python.

Writing Python code gives us great power to showcase our ideas by easily programming them, but as someone has rightly said, “With great…

Jun 27, 2020

Ankur Ranjan

Data Engineer III @Walmart | Contributor of The Big Data Show
