HADOOP FS SHELL COMMANDS
Hadoop file system (fs) shell commands perform file operations such as copying files, changing permissions, viewing file contents, changing file ownership, and creating directories.
The syntax of the fs shell command is:
hadoop fs <args>
All the fs shell commands take path URIs as arguments. The format of a URI is scheme://authority/path. The scheme and authority are optional. For HDFS the scheme is hdfs, and for the local file system the scheme is file. If you do not specify a scheme, the default scheme is taken from the configuration file. An hdfs directory can therefore be written either as a full URI, hdfs://namenodehost/dir1/dir2, or simply as /dir1/dir2 when the configuration points to that namenode.
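To make the three URI parts concrete, here is a minimal sketch in plain shell. The function name split_uri is hypothetical, and the helper only does string splitting; it does not look at any Hadoop configuration, so a missing scheme is simply reported as "(default)".

```shell
# Hypothetical helper: split a path argument into scheme, authority, and path.
# Pure parameter expansion; it does not contact any file system.
split_uri() (
  uri=$1
  case "$uri" in
    *://*)
      scheme=${uri%%://*}        # text before "://"
      rest=${uri#*://}           # text after "://"
      authority=${rest%%/*}      # up to the first "/"
      case "$rest" in
        */*) path=/${rest#*/} ;;
        *)   path=/ ;;
      esac
      ;;
    *)
      # No scheme given: Hadoop would fall back to its configured default.
      scheme="(default)"
      authority="(default)"
      path=$uri
      ;;
  esac
  echo "scheme=$scheme authority=$authority path=$path"
)

split_uri hdfs://namenodehost/dir1/dir2
split_uri /dir1/dir2
split_uri file:///tmp/sales
```

The first call reports the hdfs scheme with namenodehost as the authority, the second falls back to the defaults, and the third shows a local file URI with an empty authority.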
The hadoop fs commands closely resemble their unix counterparts. Let us look at each of the fs shell commands in detail, with examples:
hadoop fs ls:
The hadoop ls command is used to list out the directories and files. An example is shown below:
> hadoop fs -ls /user/hadoop/employees
Found 1 items
-rw-r--r-- 2 hadoop hadoop 2 2012-06-28 23:37 /user/hadoop/employees/000000_0
The above command lists out the files in the employees directory.
> hadoop fs -ls /user/hadoop/dir
Found 1 items
drwxr-xr-x - hadoop hadoop 0 2013-09-10 09:47 /user/hadoop/dir/products
The output of the hadoop fs ls command closely resembles that of the unix ls command. The only difference is in the second field. For a file, the second field shows the number of replicas; for a directory, it shows a dash (-).
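As a rough illustration of how the second field can be read in a script, the sketch below feeds the two sample listing lines from this section through awk. The function name summarize_listing is hypothetical, and the parsing assumes paths without spaces.

```shell
# Hypothetical helper: classify each listing line and report its replication.
# Field 1 holds the permissions, field 2 the replica count ("-" for a directory),
# and the last field the path. The "Found N items" line is skipped.
summarize_listing() {
  awk '$1 != "Found" && NF >= 8 {
    kind = (substr($1, 1, 1) == "d") ? "directory" : "file"
    repl = ($2 == "-") ? "n/a" : $2
    print kind, "replication=" repl, $NF
  }'
}

# Sample lines copied from the -ls examples in this section.
summarize_listing <<'EOF'
Found 1 items
-rw-r--r-- 2 hadoop hadoop 2 2012-06-28 23:37 /user/hadoop/employees/000000_0
drwxr-xr-x - hadoop hadoop 0 2013-09-10 09:47 /user/hadoop/dir/products
EOF
```

On a machine with a Hadoop client, the same function could presumably be fed from a pipe, for example hadoop fs -ls /user/hadoop/dir | summarize_listing.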
hadoop fs lsr:
The hadoop lsr command recursively displays the directories, sub directories and files in the specified directory. The usage example is shown below:
> hadoop fs -lsr /user/hadoop/dir
Found 2 items
drwxr-xr-x - hadoop hadoop 0 2013-09-10 09:47 /user/hadoop/dir/products
-rw-r--r-- 2 hadoop hadoop 1971684 2013-09-10 09:47 /user/hadoop/dir/products/products.dat
The hadoop fs lsr command is similar to the ls -R command in unix. In newer Hadoop releases, -lsr is deprecated in favor of hadoop fs -ls -R.
hadoop fs cat:
Hadoop cat command is used to print the contents of the file on the terminal (stdout). The usage example of hadoop cat command is shown below:
> hadoop fs -cat /user/hadoop/dir/products/products.dat
cloudera book by amazon
cloudera tutorial by ebay
hadoop fs chgrp:
The hadoop chgrp command changes the group association of files. Optionally, you can use the -R option to apply the change recursively through the directory structure. The usage of hadoop fs -chgrp is shown below:
hadoop fs -chgrp [-R] <NewGroupName> <file or directory name>
hadoop fs chmod:
The hadoop chmod command is used to change the permissions of files. The -R option can be used to recursively change the permissions of a directory structure. The usage is shown below:
hadoop fs -chmod [-R] <mode | octal mode> <file or directory name>
hadoop fs chown:
The hadoop chown command is used to change the ownership of files. The -R option can be used to recursively change the owner of a directory structure. The usage is shown below:
hadoop fs -chown [-R] <NewOwnerName>[:NewGroupName] <file or directory name>
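The chgrp, chmod, and chown sections give only the syntax, so here is a small sketch with concrete arguments. The group name analysts is made up for illustration, and the paths reuse the ones from the listing examples. By default the block is a dry run that only prints each command; the HADOOP variable is an assumption of this sketch, not part of Hadoop.

```shell
# Dry run by default: HADOOP expands to "echo hadoop", so each line only prints
# the command. Export HADOOP=hadoop to run against a real cluster.
HADOOP=${HADOOP:-echo hadoop}

permission_examples() {
  # Hand the whole tree to the (hypothetical) group "analysts".
  $HADOOP fs -chgrp -R analysts /user/hadoop/dir
  # Octal mode: owner read/write, group read, others nothing.
  $HADOOP fs -chmod 640 /user/hadoop/dir/products/products.dat
  # Change owner and group in one step, recursively.
  $HADOOP fs -chown -R hadoop:analysts /user/hadoop/dir
}

permission_examples
```

Changing ownership on a real cluster generally requires sufficient privileges, so the dry run is the safer way to read through the commands first.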
hadoop fs mkdir:
The hadoop mkdir command is for creating directories in the hdfs. You can use the -p option for creating parent directories. This is similar to the unix mkdir command. The usage example is shown below:
> hadoop fs -mkdir /user/hadoop/hadoopdemo
The above command creates the hadoopdemo directory in the /user/hadoop directory.
> hadoop fs -mkdir -p /user/hadoop/dir1/dir2/demo
The above command creates the nested directories dir1/dir2/demo under /user/hadoop, including any parent directories that do not exist yet.
hadoop fs copyFromLocal:
The hadoop copyFromLocal command is used to copy a file from the local file system to the hadoop hdfs. The syntax and usage example are shown below:
Syntax:
hadoop fs -copyFromLocal <localsrc> URI

Example:
Check the data in the local file.
> cat sales
2000,iphone
2001, htc

Now copy this file to hdfs.
> hadoop fs -copyFromLocal sales /user/hadoop/hadoopdemo

View the contents of the hdfs file.
> hadoop fs -cat /user/hadoop/hadoopdemo/sales
2000,iphone
2001, htc
hadoop fs copyToLocal:
The hadoop copyToLocal command is used to copy a file from the hdfs to the local file system. The syntax and usage example are shown below:
Syntax:
hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>

Example:
hadoop fs -copyToLocal /user/hadoop/hadoopdemo/sales salesdemo
The -ignorecrc option copies files even if they fail the CRC check. The -crc option copies the files along with their CRC.
hadoop fs cp:
The hadoop cp command is for copying the source into the target. The cp command can also be used to copy multiple files into the target. In this case the target should be a directory. The syntax is shown below:
hadoop fs -cp /user/hadoop/SrcFile /user/hadoop/TgtFile
hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 hdfs://namenodehost/user/hadoop/TgtDirectory
hadoop fs put:
The hadoop put command copies one or more sources from the local file system to the destination file system. The put command can also read its input from stdin and write it to the destination. The different syntaxes for the put command are shown below:
Syntax1: copy a single file to hdfs
hadoop fs -put localfile /user/hadoop/hadoopdemo

Syntax2: copy multiple files to hdfs
hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdemo

Syntax3: read the input from stdin
hadoop fs -put - hdfs://namenodehost/user/hadoop/hadoopdemo
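Reading from stdin is easiest to see with a pipe. The sketch below wraps the stdin form of put in a small function; the name put_from_stdin and the target file name sales_from_stdin are hypothetical, and the block only defines the helper so that nothing is written unless you call it yourself.

```shell
# Hypothetical helper: stream whatever arrives on stdin into an HDFS file.
# The "-" source tells put to read the data from stdin instead of a local file.
put_from_stdin() {
  hadoop fs -put - "$1"
}

# Example call (needs a working Hadoop client, so it is left commented out):
#   printf '2000,iphone\n2001,htc\n' | put_from_stdin /user/hadoop/hadoopdemo/sales_from_stdin
```

This form is handy when the data is produced by another command and never needs to exist as a local file.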
hadoop fs get:
Hadoop get command copies the files from hdfs to the local file system. The syntax of the get command is shown below:
hadoop fs -get /user/hadoop/hadoopdemo/hdfsFileName localFileName
hadoop fs getmerge:
The hadoop getmerge command concatenates the files in the source directory into a single destination file on the local file system. The syntax of the getmerge shell command is shown below:
hadoop fs -getmerge <src> <localdst> [addnl]
The optional addnl argument adds a newline character at the end of each file.
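To picture what the merged file looks like, here is a local-filesystem analogy written in plain shell. It is not how Hadoop implements getmerge; the function merge_dir and the part file names are made up for the illustration.

```shell
# Local analogy only. On a cluster the equivalent call would be:
#   hadoop fs -getmerge <src> <localdst> addnl
merge_dir() (
  src_dir=$1
  dst=$2
  addnl=${3:-}
  : > "$dst"                              # start with an empty destination file
  for f in "$src_dir"/*; do
    [ -f "$f" ] || continue
    cat "$f" >> "$dst"                    # append each source file in name order
    if [ "$addnl" = "addnl" ]; then
      printf '\n' >> "$dst"               # newline after each file, like addnl
    fi
  done
)

work=$(mktemp -d)
mkdir "$work/parts"
printf 'a' > "$work/parts/part-00000"
printf 'b' > "$work/parts/part-00001"
merge_dir "$work/parts" "$work/merged" addnl
cat "$work/merged"
rm -rf "$work"
```

Without the addnl argument the two parts would run together on one line, which is why the option matters when the source files do not end with a newline.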
hadoop fs moveFromLocal:
The hadoop moveFromLocal command moves a file from local file system to the hdfs directory. It removes the original source file. The usage example is shown below:
> hadoop fs -moveFromLocal products /user/hadoop/hadoopdemo
hadoop fs mv:
The hadoop mv command moves files from a source to a destination within hdfs. It can also move multiple source files into a target, in which case the target should be a directory. The syntax is shown below:
hadoop fs -mv /user/hadoop/SrcFile /user/hadoop/TgtFile
hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2 hdfs://namenodehost/user/hadoop/TgtDirectory
hadoop fs du:
The du command displays the aggregate length of the files contained in a directory, or the length of the file if the path is a file. The syntax and usage are shown below:
hadoop fs -du hdfs://namenodehost/user/hadoop
hadoop fs dus:
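The hadoop dus command prints a summary of file lengths. In newer Hadoop releases, -dus is deprecated in favor of hadoop fs -du -s. An example is shown below:
> hadoop fs -dus hdfs://namenodehost/user/hadoop
hdfs://namenodehost/user/hadoop 21792568333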
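The summary is reported in bytes, which is hard to read at this size. The sketch below converts a byte count to GiB with awk; the function name bytes_to_gib is hypothetical, and the input is the figure from the example output.

```shell
# Hypothetical helper: convert a byte count to GiB (1 GiB = 1024^3 bytes).
bytes_to_gib() {
  awk -v bytes="$1" 'BEGIN { printf "%.2f GiB\n", bytes / (1024 * 1024 * 1024) }'
}

bytes_to_gib 21792568333   # prints 20.30 GiB
```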
hadoop fs expunge:
Used to empty the trash. The usage of expunge is shown below:
hadoop fs -expunge
hadoop fs rm:
Removes the specified list of files and empty directories. An example is shown below:
hadoop fs -rm /user/hadoop/file
hadoop fs rmr:
Recursively deletes files and sub directories. In newer Hadoop releases, -rmr is deprecated in favor of hadoop fs -rm -r. The usage of rmr is shown below:
hadoop fs -rmr /user/hadoop/dir
hadoop fs setrep:
The hadoop setrep command changes the replication factor of a file. Use the -R option to change the replication factor recursively. The -w option makes the command wait until the replication completes. An example is shown below:
hadoop fs -setrep -w 4 -R /user/hadoop/dir
hadoop fs stat:
The hadoop stat command returns stat information about a path. The syntax and a usage example are shown below:
hadoop fs -stat URI
> hadoop fs -stat /user/hadoop/
2013-09-24 07:53:04
hadoop fs tail:
The hadoop tail command prints the last kilobyte of the file. The -f option can be used in the same way as in unix.
> hadoop fs -tail /user/hadoop/sales.dat
12345 abc
2456 xyz
hadoop fs test:
The hadoop test is used for file test operations. The syntax is shown below:
hadoop fs -test -[ezd] URI
Here "e" checks whether the file exists, "z" checks whether the file is zero length, and "d" checks whether the path is a directory. Following the usual shell convention, the test command exits with status 0 when the check is true and with a non-zero status otherwise.
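Because the result comes back as an exit status, the test command fits naturally into shell conditionals. The sketch below is one way to combine the three flags; the function name describe_path is hypothetical, and it assumes that an exit status of 0 means the check passed.

```shell
# Hypothetical helper: describe an HDFS path using the exit status of -test.
# Assumes status 0 means the check is true.
describe_path() {
  if ! hadoop fs -test -e "$1"; then
    echo "missing"
  elif hadoop fs -test -d "$1"; then
    echo "directory"
  elif hadoop fs -test -z "$1"; then
    echo "empty file"
  else
    echo "file"
  fi
}

# Example call (needs a working Hadoop client, so it is left commented out):
#   describe_path /user/hadoop/hadoopdemo/sales
```

Checking existence first avoids asking whether a missing path is a directory or an empty file.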
hadoop fs text:
The hadoop text command displays the source file in text format. The allowed source file formats are zip and TextRecordInputStream. The syntax is shown below:
hadoop fs -text <src>
hadoop fs touchz:
The hadoop touchz command creates a zero byte file. This is similar to the touch command in unix. The syntax is shown below:
hadoop fs -touchz /user/hadoop/filename