Hadoop Linux commands every developer will use!

Intro to HDFS

The Hadoop Distributed File System (HDFS) is where your data lives. You can use HUE to change a few things here and there, but learning the HDFS shell commands will help you get things done more quickly.

Here is the list of Hadoop commands:

Shell Command | Meaning | Example
hadoop fs -ls | List the files in a directory | hadoop fs -ls /user/hadooper/
hadoop fs -mkdir | Make a directory | hadoop fs -mkdir /user/hadooper/newdir/
hadoop fs -cp | Copy a file to another location | hadoop fs -cp /user/hadooper/crackinghadoop.txt /user/hadooper/newdir/
hadoop fs -mv | Move a file to a new directory | hadoop fs -mv /user/hadooper/crackinghadoop.txt /user/hadooper/newdir/
hadoop fs -put | Copy a file or directory from the local Linux machine to HDFS | hadoop fs -put localdir/samplefile.txt /user/hadooper/
hadoop fs -rm | Remove all files in newdir/ | hadoop fs -rm /user/hadooper/newdir/*
hadoop fs -cat | Look at the records in a specific file | hadoop fs -cat /user/hadooper/crackinghadoop.txt
sudo -u hdfs hadoop fs -chown NewOwner[:NewGroupName] | Change the owner (and optionally the group) of a file | sudo -u hdfs hadoop fs -chown prod_hadooper /user/hadooper/crackinghadoop.txt
hadoop fsck / | Run the HDFS filesystem checking utility | hadoop fsck /
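
To see how a few of these commands fit together, here is a minimal sketch of a typical upload workflow: make a directory, put a local file into it, list it, and print the file. The helper names stage_file and run, and the DRY_RUN variable, are made up for this example and are not part of Hadoop. By default the script only prints the commands it would run, so it is safe to try on a machine without a cluster.

```shell
#!/usr/bin/env bash
# Sketch: upload a local file into a new HDFS directory and show its contents.
# DRY_RUN is an assumption of this example: it defaults to 1, so commands are
# only printed. Set DRY_RUN=0 to actually execute them against your cluster.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# stage_file is a hypothetical helper name, not a Hadoop command.
# $1 = local file, $2 = target HDFS directory (no trailing slash)
stage_file() {
  local src="$1" dest="$2"
  run hadoop fs -mkdir "$dest" || return 1        # create the target directory
  run hadoop fs -put "$src" "$dest/" || return 1  # copy the local file into HDFS
  run hadoop fs -ls "$dest" || return 1           # confirm the file arrived
  run hadoop fs -cat "$dest/$(basename "$src")"   # print its records
}
```

In dry-run mode, calling stage_file localdir/samplefile.txt /user/hadooper/newdir prints the four hadoop fs commands in order without touching HDFS. With DRY_RUN=0, the function stops at the first command that fails, for example if the target directory already exists.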

 

These HDFS shell commands are very useful on a daily basis. However, there is much more you can do. For further reference, see the Apache documentation: User Commands, or this PDF file: File System Shell Guide.