Hadoop – Parallel computing and distributed data storage


For my next blog post I will write about Hadoop: what it is, why so many organizations use it, and why it is becoming the standard for data storage and processing.

Hadoop has many advantages over traditional database and computing platforms: it is very scalable and offers rapid data analysis.

The way I understand Hadoop, it offers organizations decentralized storage and computing power. With Hadoop, an organization can spread computing tasks across many nodes in a cluster. Each job is divided into smaller, more manageable tasks, and each task is completed by a node in the cluster where the data it needs is stored. Because the pieces run in parallel and close to the data, this offers increased performance.

Hadoop also has HDFS (Hadoop Distributed File System), which stores data across multiple nodes for better file read and write speeds. It also has data redundancy built in: by default it creates 2 additional copies of the data on separate nodes in the cluster. When the data needs to be processed, it can be processed from any of the three copies, and writes are propagated so that the copies stay in sync.
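To make the ideas above a bit more concrete, here is a small toy sketch in plain Python. It does not use Hadoop or any Hadoop API; it only imitates the behaviour described above on a single machine. The function names (`split_into_blocks`, `place_replicas`, `run_word_count`), the node names, the round-robin placement, and the tiny block size are all my own assumptions for illustration. Real HDFS uses much larger blocks and a smarter, rack-aware placement policy.

```python
from collections import Counter

REPLICATION = 3  # 1 original + 2 additional copies, as described above
BLOCK_SIZE = 4   # words per block; tiny on purpose, real HDFS blocks are far larger


def split_into_blocks(words, block_size=BLOCK_SIZE):
    """Cut the input into fixed-size blocks, like a file split across a cluster."""
    return [words[i:i + block_size] for i in range(0, len(words), block_size)]


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Assign every block to `replication` distinct nodes (simple round-robin).

    This placement rule is an assumption for the sketch, not HDFS's real policy.
    """
    if len(nodes) < replication:
        raise ValueError("need at least as many nodes as replicas")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def run_word_count(blocks, placement):
    """Count words per block on a node that holds a copy, then merge the results."""
    load = {}      # number of tasks each node has been given
    partials = []  # one partial word count per block (the "map" step)
    for block_id, block in enumerate(blocks):
        # Any of the replica holders can do the work; pick the least busy one.
        node = min(placement[block_id], key=lambda n: load.get(n, 0))
        load[node] = load.get(node, 0) + 1
        partials.append(Counter(block))

    total = Counter()  # combine the partial counts (the "reduce" step)
    for partial in partials:
        total.update(partial)
    return total, load


if __name__ == "__main__":
    words = "big data needs big clusters and big clusters need data".split()
    blocks = split_into_blocks(words)
    placement = place_replicas(len(blocks), ["node1", "node2", "node3", "node4"])
    totals, load = run_word_count(blocks, placement)
    print("replica placement:", placement)
    print("tasks per node:", load)
    print("word counts:", dict(totals))
```

Running the script prints where each block's three copies live, how many tasks each node received, and the merged word counts. The point of the sketch is only that the work for a block can go to any node holding one of its copies, so the tasks spread out across the cluster.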

More to be updated…

