What is Hadoop?
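Hadoop is a collection of open-source software tools for solving problems that involve massive amounts of data and computation. It does so by spreading storage and processing across a network of many computers. It grew out of work begun by Doug Cutting and Mike Cafarella in 2006, with Yahoo as a major early contributor, and became a top-level Apache project in 2008.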
The project is maintained by the Apache Software Foundation and was one of the first free and generally accessible big-data approaches. Hadoop can store and quickly process large amounts of data. Its processing power comes from its distributed computing model: work is divided among the nodes of a cluster, and clusters can scale to thousands of nodes.
Additionally, it is flexible (data can be stored without pre-processing) and fault-tolerant (the system keeps running if a node goes down, because its work is redirected to other nodes).
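To make the distributed model and the fault tolerance described above more concrete, here is a minimal sketch in plain Python. It does not use Hadoop or any of its APIs; it only imitates the general pattern of splitting data into chunks, processing each chunk on a worker, and merging the partial results. The function names (map_chunk, run_job), the worker names, and the simple rescheduling rule are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def map_chunk(chunk):
    """Count the words in one chunk of lines (the per-worker step)."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def run_job(lines, workers, chunk_size=2):
    """Split lines into chunks, assign each to a worker, and merge the results.

    `workers` maps a hypothetical worker name to True (healthy) or False (down).
    A chunk whose first-choice worker is down is reassigned to a healthy one.
    """
    healthy = [name for name, up in workers.items() if up]
    if not healthy:
        raise RuntimeError("no healthy workers available")

    names = list(workers)
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]

    total = Counter()
    assignments = []
    for index, chunk in enumerate(chunks):
        worker = names[index % len(names)]            # first-choice worker
        if not workers[worker]:                       # simulated node failure
            worker = healthy[index % len(healthy)]    # reschedule elsewhere
        assignments.append(worker)
        total += map_chunk(chunk)                     # merge partial counts
    return total, assignments


lines = ["big data big clusters", "data nodes", "big nodes fail", "clusters recover"]
workers = {"node-a": True, "node-b": False, "node-c": True}  # node-b is down
counts, assignments = run_job(lines, workers)
print(counts.most_common(3))
print(assignments)
```

In this toy run the job still completes even though one worker is marked as down, because the chunk it would have handled is given to another worker. A real Hadoop cluster handles scheduling, data placement, and recovery in a far more sophisticated way; the sketch only conveys the basic idea.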