Set up a Hadoop cluster at home and configure R to analyze your data
Here are the steps to set up a single-node Hadoop cluster with Hive and R:
- Install the JDK (Version: latest, in my case JDK v7 update 45)
- Install Hadoop (Version: Hadoop-1.2.1)
- Install Hive (Version: Hive-0.11.0)
- Install R with shared libraries
- Configure and install the required packages for RHadoop and RHive
- Test RHadoop and RHive
Note:
- Check whether your system is 32-bit or 64-bit (in my case it is 64-bit)
- Keep track of the version of each application you install
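A quick way to check the system type from a terminal is sketched below. The helper name arch_bits is made up for this illustration, and the list of machine names it recognizes is an assumption that covers only the common cases; uname -m and getconf LONG_BIT are standard commands on Ubuntu.

```shell
#!/bin/sh
# arch_bits is a hypothetical helper (not part of any package): it maps a
# machine name, as printed by `uname -m`, to a word size.
# The list of names is an assumption and is not exhaustive.
arch_bits() {
    case "$1" in
        x86_64|amd64|aarch64) echo 64 ;;
        i386|i486|i586|i686)  echo 32 ;;
        *)                    echo unknown ;;
    esac
}

machine=$(uname -m)
echo "Machine type: $machine ($(arch_bits "$machine")-bit)"

# getconf LONG_BIT reports the word size of the installed userland,
# which is what matters when choosing 32-bit or 64-bit packages.
echo "Userland word size: $(getconf LONG_BIT)"
```

If the two lines disagree (for example a 32-bit Ubuntu installed on 64-bit hardware), the userland word size is probably the one to go by when downloading the JDK.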
Start from a fresh Ubuntu installation (in my case version 13.10, Saucy Salamander, running on Oracle VirtualBox), fully upgraded over the internet, and install a few basic applications:
sudo apt-get install vim # Vim, an improved vi editor
sudo apt-get install ssh # Hadoop uses SSH to manage its cluster resources
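After the two installs, it may be worth confirming that the tools are actually on the PATH before moving on. The following is only a minimal sketch: the helper name have is hypothetical, and the assumption that the ssh package provides both the ssh client and the sshd server binary should be checked on your own system.

```shell
#!/bin/sh
# have is a hypothetical helper: it succeeds if the named command
# can be found on the current PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Assumption: the packages installed above provide these commands.
for tool in vim ssh sshd; do
    if have "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```

A "missing" result for sshd does not necessarily mean the install failed, because the server binary may live in a directory that is not on a regular user's PATH.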