Intro to Hadoop and Its Core Components
One person can’t do every job. To get work done faster, you divide the work among many people.
Before looking at what Hadoop is, answer this question:
“Can your computer handle GBs or TBs of data?”
Business Scenario 1:
Suppose we have 900 MB of data and want to run Data Analytics and Machine Learning on it. Is traditional hardware capable of handling 900 MB of data?
As a Data Engineer or Data Scientist, how do you handle this situation?
This is where Hadoop helps us store and process huge amounts of data.
Hadoop is open-source software that helps us store and process large amounts of data. It provides a software framework for distributed storage and processing of big data using its core components: it allows for the distributed processing of large data sets across clusters of computers using simple programming models.
Wait… if it stores data, is it like a DATABASE?
Is Hadoop a database like MySQL, PostgreSQL, Oracle, MongoDB, etc.?
No. Hadoop is not a database. The principle behind Hadoop is a distributed file system that stores and processes huge amounts of data across clusters of computers.
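To make the “distributed file system” idea concrete, here is a toy sketch in plain Python of how HDFS-style storage works: a file is split into fixed-size blocks (128 MB by default in HDFS) and each block is copied to several machines (3 by default). The function names, tiny block size, and round-robin placement policy are mine for illustration; this is not Hadoop’s actual code.

```python
# Toy simulation of HDFS-style block splitting and placement.
# Real HDFS uses 128 MB blocks and a replication factor of 3;
# here the numbers are shrunk so the output is readable.

BLOCK_SIZE = 8          # bytes, tiny for illustration (HDFS default: 128 MB)
REPLICATION = 3         # copies of each block (HDFS default: 3)
NODES = ["node1", "node2", "node3", "node4"]

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Cut the file into fixed-size blocks, like HDFS does."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes, replication: int = REPLICATION):
    """Assign each block to `replication` distinct nodes (toy round-robin policy)."""
    placement = {}
    for i, block in enumerate(blocks):
        chosen = [nodes[(i + r) % len(nodes)] for r in range(replication)]
        placement[i] = {"data": block, "nodes": chosen}
    return placement

file_data = b"hello hadoop distributed file system"
blocks = split_into_blocks(file_data)
placement = place_blocks(blocks, NODES)
for idx, info in placement.items():
    print(idx, info["data"], info["nodes"])
```

Because every block lives on several nodes, losing one machine loses no data, and processing can run on whichever machines already hold the blocks.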
Core Components of Hadoop
Hadoop Ecosystem:
- HDFS (Hadoop Distributed File System)
- MapReduce
- YARN
- HBase
- Pig
- Hive
- Sqoop
- Flume
- Kafka
- Zookeeper
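Of these components, HDFS is the storage core and MapReduce is the processing core. Before the in-depth posts, here is a minimal sketch of the MapReduce “simple programming model”, simulated locally in plain Python with a classic word count (no Hadoop cluster involved; the phase function names are mine):

```python
# Toy word count in the MapReduce style, run locally in plain Python.
# This sketches the programming model only, not a real Hadoop job.
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework
    does between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hadoop stores big data", "hadoop processes big data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)
```

On a real cluster the map tasks run in parallel on the machines that hold the data blocks, and the framework handles the shuffle and the reduce tasks; the programmer only writes the map and reduce logic.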
Thanks for reading. In the next blog we will build an in-depth understanding of each component, with a business use case for each.