In the next few posts I will focus on the Unix knowledge required as a foundation for Hadoop. I will cover the commands needed for the Hadoop ecosystem, as well as a bit of shell programming, which is required at times to simplify running the long commands (which Hadoop is famous for) at the command line.
Tuesday, April 28, 2015
Sunday, April 26, 2015
Five Best Sources for learning Hadoop
Apart from the Apache community, I would recommend the sources and books below for learning Hadoop:
1. Data-Intensive Text Processing with MapReduce: This is available free of cost at this link:
https://lintool.github.io/MapReduceAlgorithms/MapReduce-book-final.pdf
The book focuses on the theoretical aspects of MapReduce rather than the programming aspects. It takes an algorithmic approach and delves into the various paradigms of MapReduce. This is a must-read book to start with MapReduce. It is only 175 pages long, but it serves as an excellent source for understanding the roots of MapReduce.
2. Yahoo Developer Website:
These are excellent tutorials that focus on the basics of Hadoop. Simple and concise, they are a must for beginners.
https://developer.yahoo.com/hadoop/tutorial/
There is an option to download the tutorials for offline study.
3. Hadoop: The Definitive Guide
Author: Tom White
This book does not dwell on the bare basics of Hadoop; instead it emphasises programming and covers the ecosystem in detail. The author takes up complex examples to explore the various nuances of Hadoop. Hence it is best studied once you are thorough with the Yahoo Developer tutorials.
4. Hadoop in Action
The book contains several Hadoop examples in a problem-solution format. However, it assumes some basic knowledge of Hadoop and MapReduce. Also, all the examples are Java based, so readers need to be thorough in basic Java. But it is an excellent source for exploring Hadoop.
5. Kaggle.com
Want to be a champ? Then this site is the ultimate place to test your Hadoop skills in competition. It is challenging and should spark a deep interest in Big Data.
Saturday, April 25, 2015
Gearing up for Hadoop
For learning Hadoop, Java is a must, as most MapReduce programming is done in Java. The basics of Java should be sufficient. If you are a beginner, the book I would recommend is, of course, Java: The Complete Reference by Herbert Schildt; otherwise you can go for Effective Java by Joshua Bloch.