Ankur Ranjan
50 Followers

Sep 4

Supercharging Apps with Polyglot Persistence: A Simple Guide

After working for more than 4 years on data-intensive applications at a startup, a consultancy and product-based companies, I think that the most important design decision of any successful project is choosing the right database and data storage at different layers of the application. Whenever I am discussing my design…

Database

3 min read

Aug 25

Solve Small File Problem using Apache Hudi

One of the biggest pains for data engineers is the small file problem. Let me tell you a short story and explain how one efficient tool solves this problem. A few days ago, while upserting data to my Apache Hudi table, I was observing the pipeline result and noticed that my small files were being compacted and converted into larger files…

Data Lakehouse

5 min read

Published in Dev Genius · Jul 31

Optimize MERGE job in BigQuery

I love BigQuery. As someone who works in data engineering, I have found it to be one of the best products offered by the Google Cloud Platform. …

Bigquery

6 min read

Feb 26

Stateful transformations in Spark Streaming — Part 1 | Spark Streaming Session 3

In the previous article of this series, Spark Streaming in layman’s terms, we covered the following topics: different streaming sources, and stateful vs. stateless transformations. For those who are reading this article without reading the previous article of this series, I recommend reading the last article or watching the…

Spark

6 min read

Feb 5

Spark Streaming: Session 2

In the first article of the Spark Streaming series, we covered important concepts and wrote word count code to understand them. For those who are reading this article without reading the first article of this series, I recommend reading the previous article or watching…

4 min read

Feb 5

Spark Streaming — Part 1

A few months back, I was given a codebase that used Spark Streaming and was written in Scala. We were supposed to make major changes and move to a new version of the project. I had worked with Spark Structured Streaming before, but not with Spark Streaming. I found…

6 min read

Published in Towards Dev · Dec 25, 2022

Disaster Recovery in Kafka Servers

I recently tried onboarding disaster management for our streaming pipeline, which involves Kafka, Spark Streaming and MongoDB, in one of our use cases at Walmart Global Tech. The last few weeks were a good learning curve for me and I really enjoyed all these awesome implementations of the streaming pipeline…

Kafka

8 min read

Sep 12, 2022

File Formats in Big Data World — Part 1

One of the most fundamental decisions in the data engineering world is choosing the proper file format for each zone of the big data pipeline. The right choice helps the team fetch data faster and lowers the cost of the project. …

Data Engineering

7 min read

Mar 22, 2022

Demystify different compression codec in big data

When working with big data file formats like Parquet, ORC and Avro, you will often come across different compression codecs such as Snappy, LZO, gzip and bzip2. In this article, we will try to understand some of these compression codecs and discuss the fundamental differences between them. Before starting…

Big

4 min read

Mar 21, 2022

LeetCode Curated SQL Solutions and Discussion — Week 1

SQL is a must when you work in the data domain. Whether you are in data engineering, big data, data analysis or BI development, everyone who works with data should have a good understanding of SQL. I feel that reading about SQL in theory is not going to help that much. So…

Leetcode

6 min read

Data Engineer III @Walmart | Contributor of The Big Data Show

Following
  • Sadrach Pierre, Ph.D.
  • Maya Shavin
  • Xinran Waibel
  • 💡Mike Shakhomirov
  • Nikita Chaudhary
