Introduction To The Linux Command Line (CLI)

This document borrows heavily from The University of Hull's HPC Website and the lab manual for CSE 350 at Auburn University, 1990, by Eric Richards.

Introduction

A computer operating system is a lot like an ice cream cone. There are many flavors of things that you want to try (documents, audio, video, internet), and the kernel of an operating system is the cone that helps the hardware do what you want to do.

Like most supercomputers today, the KSU HPC cluster is built on the Linux kernel, which provides a parallel computing platform vital to supporting complex workloads. Many HPC environments choose Linux because of its stability, reliability, and security. The KSU HPC cluster runs the Red Hat Enterprise Linux Server operating system.

The KSU HPC cluster consists of a management computer connected to many compute (worker) nodes. All these nodes reside on the same fast network, which provides connections to the storage resources and carries the interconnect traffic needed to run connected jobs across these nodes. The management node helps provide resource management, job scheduling, development tools, applications, and libraries.

While many computers today offer a graphical user interface (GUI), Linux servers are traditionally accessed from a command-line interface (CLI). Linux is modeled after Unix operating systems, so having an extensive set of command line tools and utilities is part of its rich heritage.

Work In Progress

This article is a work in progress. Chances are, information is either incomplete or just plain missing.

Command Line

A Linux distribution's command line interface is a program known as a shell that takes keyboard commands and passes them to the operating system to be carried out with the system's available resources. We will use the bash shell for the following sections, but there are others. The name "bash" is an acronym for "Bourne Again Shell".

The Prompt

Once you are logged in to the KSU HPC cluster, you should see your command line prompt. It should look something like this:

[NETID@hpc ~]$

It should contain your username@machinename, followed by the current working directory (~ if you are in your home directory) and lastly a dollar sign.

Cursor Keys and History

The Up and Down keys allow you to browse your command history. If you press the Up key, you can move backward through the history, while the Down key will move you forwards until you get back to the current empty prompt. The Left and Right keys allow you to edit the currently displayed command. So, pressing the Up followed by the Right would let you edit a previous command. Once you are done editing the command, press Enter to execute the command.

Ending a Session

You can end a shell session by closing your terminal application window or typing exit followed by Enter at your shell prompt.

[yourNETID@hpc scratch]$ exit

Tip

Remember to press the Enter key at the end of each command that you run from the shell. This will be the last time we remind you...

Man Pages

Linux comes with a large set of built-in documentation for most of the commands available on the system. These are known as man pages (also called manual pages). To view the man page for a particular command, type man followed by the name of the command, such as:

[barney@hpc ~]$ man man
# displays the manual page for the command man
[barney@hpc ~]$ man whois
# displays the manual page for the command whois
[barney@hpc ~]$ man lftp.conf
# displays the manual page for the lftp configuration file

To stop viewing the man page, press Q.

Many commands accept the --help or --usage parameters to display built-in documentation of how they work.

[barney@hpc ~]$ man --help
# displays a (somewhat) brief listing of the command-line options to the man command

Working With Directories

Much like other operating systems such as macOS or Microsoft Windows, the Linux filesystem is based around files and directories. You can use several commands to work with the filesystem, such as pwd, cd, ls, mkdir, and rmdir.

About The Linux Filesystem

graph TD
A["/"] --- B[bin]
A --- C[data]
A --- D[etc]
A --- E[tmp]
A --- F[usr]
C --- G[home]
F --- H[bin]
F --- I[lib]
G --- J[barney]
G --- K[ted]
H --- L([ls])
H --- M([who])
J --- N([data1])
J --- O([data2])

The Linux filesystem is sort of like an upside-down tree. The very top of the tree is called the "root" and is represented by the / character. Linux uses directories, which are analogous to folders in Windows and macOS. The / character also separates each part of the name as you traverse the tree, and the resulting representation is called a path. For example, in the diagram above, the full path to the who command is /usr/bin/who, and the file data1 is located at /data/home/barney/data1.

pwd

To find out what directory you are currently in, use the pwd command:

[barney@hpc ~]$ pwd
/data/home/barney

cd

To change the directory you are currently working in, you use the cd command:

[barney@hpc ~]$ cd /tmp
[barney@hpc tmp]$ pwd
/tmp
[barney@hpc tmp]$ cd /data/home
[barney@hpc home]$ pwd
/data/home

You'll notice that, by default, part of your prompt will change to reflect the last part of your current directory.

There is a shortcut to return to your home directory, so you don't have to remember its full path. You can use the ~ (known as the tilde) character after the cd command:

[barney@hpc home]$ cd ~
[barney@hpc ~]$ pwd
/data/home/barney

Sometimes you need to go to the directory above wherever you are (known as the parent directory). An easy way of doing that is to use .., such as:

[barney@hpc home]$ pwd
/data/home
[barney@hpc home]$ cd ..
[barney@hpc data]$ pwd
/data

It's important to note that you can use the . to refer to the current directory. Granted, this isn't particularly useful for the cd command, but it will be helpful later (for instance, when copying files to your current directory).

Absolute v. Relative Paths

As a quick rule of thumb, absolute paths start with either a / or a ~. If the path starts with anything else, it's a relative path. An example of using cd with an absolute path looks like this:

[barney@hpc ~]$ pwd
/data/home/barney
[barney@hpc ~]$ cd /usr/bin
[barney@hpc bin]$ pwd
/usr/bin

On the other hand, an example of using cd with a relative path could look something like:

[barney@hpc bin]$ cd ..
[barney@hpc usr]$ pwd
/usr
[barney@hpc usr]$ cd lib
[barney@hpc lib]$ pwd
/usr/lib
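
As a rough sketch of the difference, the following script fragment (written without the prompt) reaches the same directory once by an absolute path and once by a relative path. The scratch directory created with mktemp and the names project and results are made up for illustration.

```shell
# Create a throwaway directory tree (all names here are hypothetical).
scratch=$(mktemp -d)
mkdir "$scratch/project" "$scratch/project/results"

# Absolute path: works no matter where you currently are.
cd "$scratch/project/results"
via_absolute=$(pwd)

# Relative path: interpreted starting from the current directory.
cd "$scratch"
cd project/results
via_relative=$(pwd)

# Both routes end in the same place.
echo "$via_absolute"
echo "$via_relative"

# .. is a relative path too: it moves up one level.
cd ..
parent=$(pwd)
echo "$parent"
```

The first two lines printed should be identical, and the third should end in project.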

Tab Completion

The command line is capable of helping you type path names. If you type part of a name and hit the Tab key, the shell will try to complete what you were typing. If it can't guess what you are trying to type, it will beep at you. If you hit Tab again, it will give you a list of matches to select from. Here's an example of using tab completion. Let's assume you have several directories under your account that start with the letter d and you want to access one called directory2:

[barney@hpc ~]$ ls
directions  directory1  directory2  directory3  documents
[barney@hpc ~]$ cd di

If you hit Tab now, the shell will complete as much as it can and then it should beep at you. If you hit Tab a second time, it'll show something like:

[barney@hpc ~]$ cd direct
directions/ directory1/ directory2/ directory3/
[barney@hpc ~]$ cd direct

Now, if you hit the O key and then Tab two more times, the following should happen:

[barney@hpc ~]$ cd directory
directory1/ directory2/ directory3/
[barney@hpc ~]$ cd directory

Since you wanted directory2, now you can just hit the 2 key and then Enter to finish the command:

[barney@hpc ~]$ cd directory2
[barney@hpc directory2]$ pwd
/data/home/barney/directory2

ls

The ls command will list the files and directories in a directory. By default, it lists the contents of your current directory, although you can specify a different directory:

[barney@hpc ~]$ ls
directions  directory1  directory2  directory3  documents  file1  file2  results.out
[barney@hpc ~]$ ls directory2
file3  file4  subdir1  subdir2  subdir3

The ls command has some useful options that change the information it provides. The -l option gives you a more detailed listing of the directory contents. The -a option shows all the files and directories, including those hidden. The -A works the same as -a except that it ignores . and .. (since they appear in EVERY directory). You can also combine options:

[barney@hpc ~]$ ls
directions  directory1  directory2  directory3  documents  file1  file2  results.out
[barney@hpc ~]$ ls -a
.  ..  .bash_profile  .bashrc  directions  directory1  directory2  directory3  documents  file1  file2  results.out  .vimrc
[barney@hpc ~]$ ls -A
.bash_profile  .bashrc  directions  directory1  directory2  directory3  documents  file1  file2  results.out  .vimrc
[barney@hpc ~]$ ls -l
total 544
drwxrwxr-x 2 barney barney   4096 Apr 26 13:54 directions
drwxrwxr-x 2 barney barney  32768 Apr 26 14:30 directory1
drwxrwxr-x 5 barney barney   4096 Apr 26 14:18 directory2
drwxrwxr-x 2 barney barney  98304 Apr 26 14:27 directory3
drwxrwxr-x 2 barney barney   4096 Apr 26 13:55 documents
-rw-rw-r-- 1 barney barney 332964 Apr 26 14:29 file1
-rw-rw-r-- 1 barney barney  37160 Apr 26 14:29 file2
-rw-rw-r-- 1 barney barney      0 Apr 26 14:15 results.out
[barney@hpc ~]$ ls -Al
total 544
-rw-rw-r-- 1 barney barney      0 Apr 26 14:15 .bash_profile
-rw-rw-r-- 1 barney barney    572 Apr 26 14:25 .bashrc
drwxrwxr-x 2 barney barney   4096 Apr 26 13:54 directions
drwxrwxr-x 2 barney barney  32768 Apr 26 14:30 directory1
drwxrwxr-x 5 barney barney   4096 Apr 26 14:18 directory2
drwxrwxr-x 2 barney barney  98304 Apr 26 14:27 directory3
drwxrwxr-x 2 barney barney   4096 Apr 26 13:55 documents
-rw-rw-r-- 1 barney barney 332964 Apr 26 14:29 file1
-rw-rw-r-- 1 barney barney  37160 Apr 26 14:29 file2
-rw-rw-r-- 1 barney barney      0 Apr 26 14:15 results.out
-rw-rw-r-- 1 barney barney   2642 Apr 26 14:25 .vimrc

mkdir

The mkdir command creates a new directory. For example:

[barney@hpc ~]$ mkdir directory4
[barney@hpc ~]$ cd directory4
[barney@hpc directory4]$ pwd
/data/home/barney/directory4
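
A plain mkdir only creates the last directory in a path, so it fails if a parent directory is missing. The -p option, which is not covered above, creates any missing parents as well. A minimal sketch, assuming a scratch directory and the made-up path runs/2019/april:

```shell
# Work in a throwaway directory (hypothetical names throughout).
scratch=$(mktemp -d)
cd "$scratch"

# Without -p, mkdir refuses because runs/2019 does not exist yet.
mkdir runs/2019/april 2>/dev/null || echo "plain mkdir failed: parent directories are missing"

# With -p, the missing parent directories are created as well.
mkdir -p runs/2019/april
cd runs/2019/april
pwd
```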

rmdir

Danger

If you aren't careful, the rmdir command can cause you to lose data. Ensure that the directory you are trying to remove is the one that you want to remove.

The rmdir command removes a directory. The directory you are removing must be empty, and you cannot be in that directory when you try to remove it:

[barney@hpc directory4]$ ls -A
[barney@hpc directory4]$ cd ..
[barney@hpc ~]$ rmdir directory4

Working With Files

Wildcards

Needs Improvement

The Wildcards section needs some work to improve its clarity.

Wildcards are special characters Linux shells use on the command line to represent other characters. You can use them with most commands, such as ls or rm, to list or remove files based on some criteria. There are three major types of wildcards that you will commonly run into:

An asterisk (*) will match zero or more occurrences of any character (letters, numbers, or symbols).
A question mark (?) will match exactly one occurrence of any character.
Bracketed characters ([ ]) will match any single occurrence of a character enclosed in the square brackets. It is possible to use different types of characters (alphanumeric characters) as well as ranges of characters.

Asterisk

Use the asterisk (*) when working with files that all start or end with (or even contain) certain characters. For instance, if you want to see all of the files in the current directory that start with the letter l, you could do something like:

[barney@hpc ~]$ ls
cast.dat  data.00  data.02  data.04  data.06  data.08  data.10   feast.dat  list.dat      lost.dat  New.data
cost.dat  data.01  data.03  data.05  data.07  data.09  fast.dat  last.dat   list.dat.old  new.data
[barney@hpc ~]$ ls l*
last.dat  list.dat  list.dat.old  lost.dat

You could also pull up a list of all of the files that end with .dat, like:

[barney@hpc ~]$ ls *.dat
cast.dat  cost.dat  fast.dat  feast.dat  last.dat  list.dat  lost.dat

Question Mark

The question mark (?) is useful if you want to work with files that differ only by a single character, such as:

[barney@hpc ~]$ ls l?st.dat
last.dat  list.dat  lost.dat
[barney@hpc ~]$ ls data.??
data.00  data.01  data.02  data.03  data.04  data.05  data.06  data.07  data.08  data.09  data.10

Brackets

Brackets are similar to the question mark, but they further limit what you get based on the characters you list in the brackets. For instance, if you wanted to work with last.dat and list.dat but not with lost.dat, you could do something like:

[barney@hpc ~]$ ls l[ai]st.dat
last.dat  list.dat

You could work with all of the files starting with data.0 and ending with an even number, such as:

[barney@hpc ~]$ ls data.0[24680]
data.00  data.02  data.04  data.06  data.08

You can also specify ranges by putting a hyphen (-) between two characters, such as:

[barney@hpc ~]$ ls data.0[2-6]
data.02  data.03  data.04  data.05  data.06
[barney@hpc ~]$ ls l[a-i]st.dat
last.dat  list.dat

Putting Them Together

You can even combine them on the same command, something like:

[barney@hpc ~]$ ls l?st.*
last.dat  list.dat  list.dat.old  lost.dat
[barney@hpc ~]$ ls ?[eia]?[st]*.dat*
cast.dat  fast.dat  feast.dat  last.dat  list.dat  list.dat.old
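
Because the shell expands wildcards before the command runs, one cautious habit is to preview a pattern with echo before handing it to something destructive like rm. The sketch below recreates a few of the file names from the listings above in a scratch directory (the directory itself is an assumption) and captures what each pattern expands to.

```shell
# Recreate some of the example files in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"
touch cast.dat cost.dat fast.dat feast.dat last.dat list.dat list.dat.old lost.dat

# echo simply prints the names the shell produced from each pattern.
star=$(echo l*)              # names starting with l
question=$(echo l?st.dat)    # exactly one character between l and st.dat
bracket=$(echo l[ai]st.dat)  # that one character must be a or i

echo "$star"
echo "$question"
echo "$bracket"
```

If a pattern expands to more than you expected, you find out before anything has been deleted.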

file

The file command helps you determine the file type of a specified file. Unlike Windows, Linux doesn't use file extensions to determine the file type. Instead, it examines the contents of the file itself:

[barney@hpc ~]$ file logo.png
logo.png: PNG image data, 225 x 221, 8-bit/color RGBA, non-interlaced
[barney@hpc ~]$ file index.php
index.php: PHP script, ASCII text
[barney@hpc ~]$ file seepath
seepath: POSIX shell script, ASCII text executable

touch

The touch command changes the timestamp on a file to the current time and date. As a side effect (and probably why most people use the touch command), if the file does not already exist, it will create it:

[barney@hpc ~]$ ls -l
total 544
-rw-rw-r-- 1 barney barney 332964 Apr 26 14:29 file1
-rw-rw-r-- 1 barney barney  37160 May 23  2018 file2
[barney@hpc ~]$ touch file2
[barney@hpc ~]$ touch file3
[barney@hpc ~]$ ls -l
total 544
-rw-rw-r-- 1 barney barney 332964 Apr 26 14:29 file1
-rw-rw-r-- 1 barney barney  37160 Apr 26 15:08 file2
-rw-rw-r-- 1 barney barney      0 Apr 26 15:08 file3

rm

Danger

If you aren't careful, the rm command can cause you to lose data. Ensure that the file(s) and directories you try to remove are what you want to remove! There is no undelete command on the server!

The rm command removes a file (and possibly directories, depending on the options given).

[barney@hpc ~]$ ls 
file1  file2
[barney@hpc ~]$ rm file2
[barney@hpc ~]$ ls
file1

To be a little bit safer, you can add the -i option to have the system prompt you to confirm each file you delete, as in:

[barney@hpc ~]$ ls 
file1  file1.bak  file2  file2.bak
[barney@hpc ~]$ rm -i file1.bak file2.bak
rm: remove regular file ‘file1.bak’? y
rm: remove regular file ‘file2.bak’? y
[barney@hpc ~]$ ls
file1  file2

cp

Use the cp command to copy files or directories from a source to a destination. You can specify one or more source files or directories, but you can only specify one destination. For example:

[barney@hpc ~]$ ls 
backup  file1  file2  file3
[barney@hpc ~]$ ls backup
[barney@hpc ~]$ cp file1 backup
[barney@hpc ~]$ ls backup
file1
[barney@hpc ~]$ cp file2 file3 backup
[barney@hpc ~]$ ls backup
file1  file2  file3

You can also copy directories by specifying the -r option to make cp perform a recursive copy.

[barney@hpc ~]$ cp -r mydir1 mydir2
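
One detail worth knowing is that the result of cp -r depends on whether the destination already exists. The following sketch, run in a scratch directory with made-up contents, shows both cases.

```shell
# Build a small directory tree to copy (hypothetical names).
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p mydir1/subdir
touch mydir1/file1 mydir1/subdir/file2

# mydir2 does not exist yet, so cp -r creates it as a copy of mydir1.
cp -r mydir1 mydir2
ls -R mydir2

# backup already exists, so mydir1 is copied *into* it as backup/mydir1.
mkdir backup
cp -r mydir1 backup
ls -R backup
```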

mv

The mv command is for moving files and directories to a new destination. It is also used to rename a file (think of it as moving a file to a new name). For example:

[barney@hpc ~]$ ls
file1  file2  mydir1
[barney@hpc ~]$ mv file1 file3
[barney@hpc ~]$ ls
file2  file3  mydir1

You can also rename a directory, such as:

[barney@hpc ~]$ mv mydir1 mydir2
[barney@hpc ~]$ ls
file2  file3  mydir2

Or, you can move a file (or a directory) to another directory:

[barney@hpc ~]$ mv file2 /tmp
[barney@hpc ~]$ ls
file3  mydir2
[barney@hpc ~]$ ls /tmp
file2

Working With File Contents

head

The head command is used to display the first lines of a file. By default, it shows the first ten lines.

[barney@hpc ~]$ head /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin

Use the -n K option to show the first K lines of the file.

tail

The tail command is used to display the last lines of a file. By default, it shows the last ten lines.

[barney@hpc ~]$ tail /etc/passwd
rpc:x:32:32:Rpcbind Daemon:/var/lib/rpcbind:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
nscd:x:28:28:NSCD Daemon:/:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
nslcd:x:65:55:LDAP Client User:/:/sbin/nologin
ldap:x:55:55:OpenLDAP server:/var/lib/ldap:/sbin/nologin
named:x:25:25:Named:/var/named:/sbin/nologin
apache:x:48:48:Apache:/usr/share/httpd:/sbin/nologin
mysql:x:27:27:MariaDB Server:/var/lib/mysql:/sbin/nologin
avahi:x:70:70:Avahi mDNS/DNS-SD Stack:/var/run/avahi-daemon:/sbin/nologin

Similar to the head command, you can use the -n K option to show the last K lines of the file.
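
As a small illustration of the -n option for both commands, the sketch below builds a 20-line file with seq (the file name numbers.txt is made up) and then takes lines from each end.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"

# Build a sample file containing the numbers 1 to 20, one per line.
seq 1 20 > numbers.txt

first_three=$(head -n 3 numbers.txt)   # lines 1, 2, 3
last_two=$(tail -n 2 numbers.txt)      # lines 19, 20

echo "$first_three"
echo "$last_two"
```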

cat

The cat command allows you to view the contents of a file (or files). It also allows you to concatenate multiple files into one larger one. Here are a few examples:

[barney@hpc ~]$ cat /etc/issue
***** NOTICE *****
Access to this system is for authorized users of Kennesaw State
University only!  Unauthorized access may be a violation of Federal
and State of Georgia Law, including 18 U.S.C. S1030 and O.C.G.A.
16-9-90 et seq.  Use of this system constitutes consent to monitoring
of such use.  Violators will be prosecuted.
***** NOTICE *****

[barney@hpc ~]$ cat file1 file2 file3 > allfiles

One useful option for the cat command is the -n option, which causes cat to number its output. For instance,

[barney@hpc ~]$ cat -n /etc/issue
     1  ***** NOTICE *****
     2  Access to this system is for authorized users of Kennesaw State
     3  University only!  Unauthorized access may be a violation of Federal
     4  and State of Georgia Law, including 18 U.S.C. S1030 and O.C.G.A.
     5  16-9-90 et seq.  Use of this system constitutes consent to monitoring
     6  of such use.  Violators will be prosecuted.
     7  ***** NOTICE *****
     8

tac

The tac command is similar to cat, except it prints the lines in reverse order. For instance:

[barney@hpc ~]$ tac /etc/issue

***** NOTICE *****
of such use.  Violators will be prosecuted.
16-9-90 et seq.  Use of this system constitutes consent to monitoring
and State of Georgia Law, including 18 U.S.C. S1030 and O.C.G.A.
University only!  Unauthorized access may be a violation of Federal
Access to this system is for authorized users of Kennesaw State
***** NOTICE *****

The tac command doesn't support the -n option like cat, unfortunately, but to fake it, you can do something like:

[barney@hpc ~]$ tac /etc/issue | cat -n
     1
     2  ***** NOTICE *****
     3  of such use.  Violators will be prosecuted.
     4  16-9-90 et seq.  Use of this system constitutes consent to monitoring
     5  and State of Georgia Law, including 18 U.S.C. S1030 and O.C.G.A.
     6  University only!  Unauthorized access may be a violation of Federal
     7  Access to this system is for authorized users of Kennesaw State
     8  ***** NOTICE *****

more

The more command is used for displaying files that are longer than one screen. It will allow you to see the file's contents, page by page. Use Space to see the next page or Q to quit.

less

The less command is similar to more, but with additional features. Navigation is basically the same, except you can also go backward through a file using the B key. Some people describe the less command as "less is more more".

Basic Linux Tools

find

The find command is used for finding files in the filesystem. It can find files based on name, age, or size, among other options. Some useful examples are below, but consult the man page for more detailed instructions.

[barney@hpc ~]$ find /usr
# returns a list of all files under the /usr directory
[barney@hpc ~]$ find . -name "*.c"
# finds all files that end in .c starting in the current directory
[barney@hpc ~]$ find . -newer Makefile
# finds all files in the current directory (and below) that are newer than Makefile
[barney@hpc ~]$ find /usr >usrfiles.txt
# lists all files under /usr as in the first example, but redirects the list into the file usrfiles.txt
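
To experiment with find without wading through a large system directory, you can try it on a small tree of your own. This is a sketch under assumed names (src, docs, and the files inside them are invented); it uses -name as above plus the -type test, which restricts results to directories (d) or regular files (f).

```shell
# Build a small tree to search (hypothetical names).
scratch=$(mktemp -d)
cd "$scratch"
mkdir src docs
touch src/main.c src/util.c src/util.h docs/notes.txt

# Quote the pattern so the shell passes it to find unchanged
# instead of expanding it first.
c_files=$(find . -name "*.c")

# -type d limits the results to directories.
dirs=$(find . -type d)

echo "$c_files"
echo "$dirs"
```

The order in which find prints its results is not guaranteed to be alphabetical.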

locate

The locate command is another tool for finding files on a Linux system. Instead of searching the filesystem, it looks at a pre-generated index to find the files. Since it doesn't have to scan the filesystem every time you run it, locate is significantly faster than the find command. However, this speed has a trade-off; it can also be out of date if the index hasn't been updated recently.

[barney@hpc ~]$ locate lndir
/usr/bin/lndir
/usr/share/man/man1/lndir.1.gz

date

The date command displays the system time and date by default. It is also used to format the time and date into customizable formats:

[barney@hpc ~]$ date
Mon Apr 29 10:22:23 EDT 2019
[barney@hpc ~]$ date +"Today is %A, %B %d, %Y.  The time is %l:%M %P."
Today is Monday, April 29, 2019.  The time is 10:30 am.

Information about the different formatting options is available in the date man page (man date).
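
One common use of a custom format is stamping file names with the date. The sketch below uses the %Y, %m, and %d format codes from the date man page; the file name results-....out is only an example.

```shell
# %Y-%m-%d gives a year-month-day stamp, which sorts chronologically.
stamp=$(date +%Y-%m-%d)

# Hypothetical output file name built from the stamp.
logfile="results-$stamp.out"
echo "$logfile"
```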

cal

The cal command, by default, displays the current month. You can give it a year to have it print the calendar for that entire year or a month and a year to print that particular month.

[barney@hpc ~]$ cal
     April 2019
Su Mo Tu We Th Fr Sa
    1  2  3  4  5  6
 7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30

[barney@hpc ~]$ cal 2000
                               2000

       January               February                 March
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
                   1          1  2  3  4  5             1  2  3  4
 2  3  4  5  6  7  8    6  7  8  9 10 11 12    5  6  7  8  9 10 11
 9 10 11 12 13 14 15   13 14 15 16 17 18 19   12 13 14 15 16 17 18
16 17 18 19 20 21 22   20 21 22 23 24 25 26   19 20 21 22 23 24 25
23 24 25 26 27 28 29   27 28 29               26 27 28 29 30 31
30 31
        April                   May                   June
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
                   1       1  2  3  4  5  6                1  2  3
 2  3  4  5  6  7  8    7  8  9 10 11 12 13    4  5  6  7  8  9 10
 9 10 11 12 13 14 15   14 15 16 17 18 19 20   11 12 13 14 15 16 17
16 17 18 19 20 21 22   21 22 23 24 25 26 27   18 19 20 21 22 23 24
23 24 25 26 27 28 29   28 29 30 31            25 26 27 28 29 30
30
        July                  August                September
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
                   1          1  2  3  4  5                   1  2
 2  3  4  5  6  7  8    6  7  8  9 10 11 12    3  4  5  6  7  8  9
 9 10 11 12 13 14 15   13 14 15 16 17 18 19   10 11 12 13 14 15 16
16 17 18 19 20 21 22   20 21 22 23 24 25 26   17 18 19 20 21 22 23
23 24 25 26 27 28 29   27 28 29 30 31         24 25 26 27 28 29 30
30 31
       October               November               December
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7             1  2  3  4                   1  2
 8  9 10 11 12 13 14    5  6  7  8  9 10 11    3  4  5  6  7  8  9
15 16 17 18 19 20 21   12 13 14 15 16 17 18   10 11 12 13 14 15 16
22 23 24 25 26 27 28   19 20 21 22 23 24 25   17 18 19 20 21 22 23
29 30 31               26 27 28 29 30         24 25 26 27 28 29 30
                                              31

[barney@hpc ~]$ cal 7 2000
      July 2000
Su Mo Tu We Th Fr Sa
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31

sleep

The sleep command is commonly used in scripts to wait a set number of seconds before continuing. The following example pauses for 5 seconds.

[barney@hpc ~]$ sleep 5
[barney@hpc ~]$

sort

The sort command will sort lines of text files. The output will go to the screen, but you can use redirection to send its output to a file or another command.

[barney@hpc ~]$ cat testfile.txt
orange
banana
apple
kiwi
pineapple
[barney@hpc ~]$ sort testfile.txt
apple
banana
kiwi
orange
pineapple

uniq

The uniq command takes a sorted file as input and strips out repeated lines. It can also show only the repeated lines.

[barney@hpc ~]$ cat testfile.txt
apple
banana
banana
kiwi
kiwi
kiwi
kiwi
orange
pineapple
[barney@hpc ~]$ uniq testfile.txt
apple
banana
kiwi
orange
pineapple
[barney@hpc ~]$ uniq --repeated testfile.txt
banana
kiwi
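
The reason uniq wants sorted input is that it only compares neighboring lines. The following sketch, using a made-up file called unsorted.txt, shows what happens with and without sorting first.

```shell
# Work in a throwaway directory with a small unsorted sample file.
scratch=$(mktemp -d)
cd "$scratch"
printf 'kiwi\nbanana\nkiwi\nkiwi\napple\n' > unsorted.txt

# uniq only collapses *adjacent* duplicates, so on unsorted input
# the first kiwi and the later run of kiwi both survive.
uniq unsorted.txt > adjacent_only.txt

# Sorting first brings all the duplicates together.
sort unsorted.txt > sorted.txt
uniq sorted.txt > deduplicated.txt

cat adjacent_only.txt
cat deduplicated.txt
```

The first listing still contains kiwi twice; the second contains each fruit exactly once.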

time

The time command displays how long a command takes to run. In this example, the date command runs in a short amount of time.

[barney@hpc ~]$ time date
Mon Apr 29 11:11:57 EDT 2019

real    0m0.001s
user    0m0.000s
sys     0m0.001s

tar

The tar program bundles a set of files for archiving, similar to a zip archive in Windows. The most significant difference between tar and zip is that a file created by tar isn't compressed; it is only a bundle of other files. The following example creates an archive.

[barney@hpc ~]$ ls
file1  file2  file3
[barney@hpc ~]$ tar -cvf archive.tar file1 file2 file3
file1
file2
file3
[barney@hpc ~]$ ls
archive.tar  file1  file2  file3

You can also view the contents of an archive created by tar:

[barney@hpc ~]$ tar -tvf archive.tar
-rw-r--r-- barney/barney  1374 2019-04-26 15:22 file1
-rw-rw-r-- barney/barney  1374 2019-04-26 15:23 file2
-rw-rw-r-- barney/barney     0 2019-04-29 11:31 file3

And finally, you can extract the contents of an archive:

[barney@hpc ~]$ ls
archive.tar
[barney@hpc ~]$ tar -xvf archive.tar
file1
file2
file3
[barney@hpc ~]$ ls
archive.tar  file1  file2  file3

gzip/gunzip

gzip and gunzip are a set of compression and decompression programs (similar to zip and unzip, which are also available in Linux). Unlike the zip programs, gzip can't bundle several files together and compress them; it only compresses one file at a time. However, you can use tar to create an archive first and then compress that archive.

[barney@hpc ~]$ gzip archive.tar
[barney@hpc ~]$ ls
archive.tar.gz  file1  file2  file3
[barney@hpc ~]$ gunzip archive.tar.gz
[barney@hpc ~]$ ls
archive.tar  file1  file2  file3
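
The two steps can also be combined: tar accepts a -z option that runs the archive through gzip as it is written or read. A minimal sketch, assuming a scratch directory and made-up file contents:

```shell
# Work in a throwaway directory with two small files.
scratch=$(mktemp -d)
cd "$scratch"
printf 'some data\n' > file1
printf 'more data\n' > file2

# -c create, -z compress with gzip, -f name of the archive to write.
tar -czf archive.tar.gz file1 file2

# -t lists the contents without extracting anything.
listing=$(tar -tzf archive.tar.gz)
echo "$listing"

# Extract into a separate directory (-C) to show the round trip.
mkdir restored
tar -xzf archive.tar.gz -C restored
ls restored
```

The result is the same kind of .tar.gz file that running tar and then gzip would produce.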

bzip2/bunzip2

The bzip2 and bunzip2 tools behave similarly to the gzip and gunzip commands from above. The main difference is that the bzip2 tools compress somewhat better than gzip, but they're also slower.

[barney@hpc ~]$ bzip2 archive.tar
[barney@hpc ~]$ ls
archive.tar.bz2  file1  file2  file3
[barney@hpc ~]$ bunzip2 archive.tar.bz2
[barney@hpc ~]$ ls
archive.tar  file1  file2  file3

zip / unzip

The zip / unzip commands are compression and archiving tools compatible with other zip programs, such as those found in Microsoft Windows and other operating systems.

[barney@hpc ~]$ ls
file1  file2  file3
[barney@hpc ~]$ zip archive.zip file1 file2 file3
  adding: file1 (deflated 53%)
  adding: file2 (deflated 53%)
  adding: file3 (stored 0%)
[barney@hpc ~]$ ls
archive.zip  file1  file2  file3

To view the archive, use the unzip command with the -l option:

[barney@hpc ~]$ unzip -l archive.zip
Archive:  archive.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     1374  04-26-2019 15:22   file1
     1374  04-26-2019 15:23   file2
        0  04-29-2019 11:31   file3
---------                     -------
     2748                     3 files

And finally, to extract the files from a zip archive, use the unzip command:

[barney@hpc ~]$ ls
archive.zip
[barney@hpc ~]$ unzip archive.zip
Archive:  archive.zip
  inflating: file1
  inflating: file2
 extracting: file3
[barney@hpc ~]$ ls
archive.zip  file1  file2  file3

grep

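The grep command searches a file and prints every line that contains a given string:

[barney@hpc ~]$ grep bin /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
# ...plus every other line that contains bin anywhere, for example as part of /sbin/nologin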

Like other Linux commands, grep has a lot of options to tailor its behavior:

[barney@hpc ~]$ grep -i bin /etc/passwd
# search for bin in /etc/passwd ignoring upper and lower case
[barney@hpc ~]$ grep -r bin /etc/
# search for bin recursively in /etc and all the files and directories below it
[barney@hpc ~]$ grep -v bin /etc/passwd
# search for any lines in /etc/passwd that don't have the string bin in them
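
To see the effect of these options on a file small enough to check by eye, here is a sketch using a made-up file called menu.txt. It also uses -c, which prints the number of matching lines instead of the lines themselves.

```shell
# Work in a throwaway directory with a four-line sample file.
scratch=$(mktemp -d)
cd "$scratch"
printf 'Apple pie\nbanana bread\napple sauce\ncherry tart\n' > menu.txt

exact=$(grep apple menu.txt)          # case-sensitive: only "apple sauce"
any_case=$(grep -i apple menu.txt)    # matches Apple and apple
inverted=$(grep -v apple menu.txt)    # lines without lowercase "apple"
count=$(grep -c -i apple menu.txt)    # number of matching lines

echo "$exact"
echo "$any_case"
echo "$inverted"
echo "$count"
```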

wc

The wc command returns the number of lines, words, and bytes in a text file. The -l, -w, and -c options limit the output to just the line, word, or byte count.

[barney@hpc ~]$ wc /etc/passwd
  60  137 2345 /etc/passwd
[barney@hpc ~]$ wc -l /etc/passwd
60 /etc/passwd
[barney@hpc ~]$ wc -w /etc/passwd
137 /etc/passwd
[barney@hpc ~]$ wc -c /etc/passwd
2345 /etc/passwd

Standard Input / Output Paths

Every program running under Linux has three pathways associated with it to handle Input, Output, and Errors. These are known as Standard In (STDIN), Standard Out (STDOUT), and Standard Error (STDERR). Normally, STDIN comes from the keyboard, while STDOUT and STDERR go to the display. However, Linux allows you to modify where STDIN, STDOUT, and STDERR point, using two features known as Redirection and Pipes.

Redirection

Redirection allows you to point one or more of the Input/Output paths to a file. For instance, you can have a command that normally expects keyboard entry read its input from a file, have a command send its output to a file, or have a command send error messages to one file and its normal output to another file. To redirect the input path of a command, you use the < character; to redirect the output path, you use the > character; and to redirect the error path, you use 2>. If you are redirecting STDOUT and STDERR to the same file, you can use >& (bash also accepts &>). When you redirect STDOUT or STDERR to a file that already exists, bash overwrites the file by default; if the noclobber shell option is turned on (set -o noclobber), the shell instead generates an error and does not let you do it. In that case you can use a different file, delete the file (rm), or append the output instead of overwriting it. To append the output, use >> instead of > (or 2>> instead of 2>).

[barney@hpc ~]$ wc -l < /etc/motd
# This causes wc to read the file /etc/motd and return the number of lines.
[barney@hpc ~]$ ls > /tmp/filelist.txt
# STDOUT gets redirected to /tmp/filelist.txt.  Any error messages would still go to the display
[barney@hpc ~]$ find /data/home -name testfile.txt > outputfile 2> errorfile
# STDOUT gets redirected to outputfile while STDERR gets redirected to errorfile
[barney@hpc ~]$ find /data/home -name testfile.txt >& outputfile
# STDOUT and STDERR both get redirected to outputfile
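
The following is a minimal bash sketch of these redirections, using the operators as bash documents them (2> for STDERR, >> to append, and 2>&1 to send STDERR wherever STDOUT is going). The report function is a hypothetical stand-in for any command that writes to both streams, and the file names are made up.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch"

# Stand-in command (hypothetical) that writes one line to each stream.
report() {
    echo "normal output"
    echo "error message" >&2
}

# STDOUT and STDERR go to separate files.
report > out.txt 2> err.txt

# >> appends instead of overwriting.
echo "first run"  > log.txt
echo "second run" >> log.txt

# 2>&1 points STDERR at wherever STDOUT currently goes, so order matters:
# redirect STDOUT to the file first, then STDERR.
report > both.txt 2>&1

# < feeds a file to a command's STDIN.
lines=$(wc -l < log.txt)
echo "$lines"
```

After this runs, out.txt and err.txt each hold one line, both.txt holds both lines, and log.txt holds two lines because the second echo appended to it.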

Pipes

Instead of sending the output of a command to a file, you can process the command's output with another command. Chaining together multiple commands in sequence uses a feature known as pipes. When you create a pipe between two programs, you send the STDOUT of the first program to the STDIN of the second program. To do this, separate the two commands you want to run with the | character. A fairly typical example: when you do a long directory listing (ls -l) of a large directory (let's look at /usr/bin), it's nice to be able to see the output one screen at a time (using less or more, for instance):

[barney@hpc ~]$ ls -l /usr/bin | less
total 455296
-rwxr-xr-x    1 root root       41544 Dec  4  2017 [
-rwxr-xr-x    1 root root       10568 Jan 15  2018 411toppm
-rwxr-xr-x    1 root root          40 Feb  6  2018 7z
-rwxr-xr-x    1 root root          41 Feb  6  2018 7za
-rwxr-xr-x    1 root root          41 Feb  6  2018 7zG
-rwxr-xr-x    1 root root      107856 Mar  2  2017 a2p
-rwxr-xr-x    1 root root       52728 Jan  8  2018 ab
-rwxr-xr-x    1 root root        1661 Jul  2  2015 abs2rel
-rwxr-xr-x    2 root root       36734 Dec 27  2013 aclocal
-rwxr-xr-x    2 root root       36734 Dec 27  2013 aclocal-1.13
-rwxr-xr-x    1 root root       11488 Sep 16  2017 acyclic
-rwxr-xr-x    1 root root       29200 May 30  2018 addr2line
-rwxr-xr-x    1 root root       39312 Sep 22  2015 afm2tfm
-rwxr-xr-x    1 root root       19712 Apr 23  2018 agentxtrap
:

At this point, less is waiting for you to press Space to go to the next page, or q to quit.

Another common use is finding out how many unique lines are in an unsorted file. We've looked at sort, which sorts a file; uniq, which returns the unique lines in a sorted file; and wc, which counts the number of lines. We could sort the file to a temporary file, then run uniq on that temporary file to produce a second temporary file, and finally run wc on that second file to get the number of lines:

[barney@hpc ~]$ cat textfile
orange
kiwi
banana
kiwi
kiwi
apple
kiwi
banana
pineapple
[barney@hpc ~]$ sort textfile > tempfile1.txt
[barney@hpc ~]$ uniq tempfile1.txt > tempfile2.txt
[barney@hpc ~]$ wc -l tempfile2.txt
5 tempfile2.txt

Or we could string them all together with one command:

[barney@hpc ~]$ sort textfile | uniq | wc -l
5
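
The same idea extends to longer pipelines. The sketch below, which assumes the standard sort, uniq, head, and awk utilities and recreates the fruit list in a temporary file, counts the unique lines with sort -u (which sorts and removes duplicates in one step) and then finds the most frequent line with uniq -c.

```shell
# Recreate the sample list in a temporary file.
fruit=$(mktemp)
printf '%s\n' orange kiwi banana kiwi kiwi apple kiwi banana pineapple > "$fruit"

# sort -u combines "sort | uniq" into a single command.
unique_count=$(sort -u "$fruit" | wc -l)

# uniq -c prefixes each unique line with its count; sort -rn puts the
# largest count first; head keeps that line; awk prints just the name.
most_common=$(sort "$fruit" | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

echo "unique lines: $unique_count, most common: $most_common"
```

No temporary files are needed for the intermediate results, because each command reads directly from the one before it.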

File Permissions

Introduction

Linux was not really designed to be a heavily protected system. Protections for files and directories exist, allowing the owner to set some access permissions. There are three types of access permissions (they mean slightly different things for files vs directories):

Permission	Effect on Files	Effect on Directories
Read	The file can be opened for read (and thus copied)	The contents of the directory can be read (you can do an `ls` on it)
Write	The file can be opened for write (and can be modified or emptied)	Files in the directory can be created, renamed, or deleted
Execute	The file can be executed	You can `cd` into the directory

Info

Note that having write permission on a file does not permit you to delete the file. You have to have write permission to the directory it's in to actually delete it. You can, however, overwrite the file and remove whatever is in the file.
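
One half of this rule can be checked in a scratch directory: a file you cannot write to can still be deleted when its directory is writable. This is a minimal sketch, assuming the standard mktemp utility; the file name is made up for illustration.

```shell
# mktemp -d creates a private directory that we can write to.
workdir=$(mktemp -d)
echo "original" > "$workdir/notes.txt"

# Make the file read-only for everyone, including ourselves.
chmod 444 "$workdir/notes.txt"

# The delete still works, because it changes the directory, not the file.
# -f skips the confirmation prompt rm shows for write-protected files.
rm -f "$workdir/notes.txt"

if [ -e "$workdir/notes.txt" ]; then
    still_there=yes
else
    still_there=no
fi
echo "file still exists: $still_there"
```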

These permissions are independent of one another (something can be granted read access but not write or execute). The permissions also exist in three different levels:

Level	Description
User	Access privileges for the file's (or directory's) owner
Group	Access privileges for other members of the file's (or directory's) group
Other	Access privileges for everyone else using the system

Therefore, the file (or directory) owner can decide what read, write, and execute privileges can be given to him or herself, what privileges are allowed to other members in the file's group, and what privileges are allowed to everyone else in the system. Typically, the owner has total access, the group has read access, and the world has read access as well. Execute access on files really only applies to files that are either binary programs or shell scripts.

To find out what privileges have been set for a file or directory, use the ls -l command:

[barney@hpc ~]$ ls -l
total 576
-rw-r--r-- 1 barney barney 332964 Apr 29 14:57 file1
-rw-rw-r-- 1 barney barney  37160 Apr 29 14:58 file2
-rw-rw-r-- 1 barney barney 147393 Apr 29 14:58 file3
-rw-rw-r-- 1 barney barney     57 Apr 29 13:05 file4

The first group of characters on each line represents the access privileges for each file. The first character tells you whether the entry is a regular file (-), a directory (d), or some other special type of file. The next three characters refer to the User level of permissions, the next three are the Group level, and the final three are the Other level. Within each group of three, the characters represent the Read (r), Write (w), and Execute (x) permissions, in that order. If there is a hyphen (-) in any of those positions, that permission is denied.
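
To make the layout of that ten-character string concrete, the following sketch splits it into its four parts. It assumes the GNU stat command found on Linux systems (its %A format prints the same string that ls -l shows) and uses a temporary file with a known mode.

```shell
# Create a scratch file and give it a known mode.
sample=$(mktemp)
chmod 640 "$sample"

# %A prints the permissions in the same form as the first column of ls -l.
mode=$(stat -c '%A' "$sample")

# cut -c selects character positions: 1 is the type, then three per level.
file_type=$(printf '%s\n' "$mode" | cut -c1)
user_perms=$(printf '%s\n' "$mode" | cut -c2-4)
group_perms=$(printf '%s\n' "$mode" | cut -c5-7)
other_perms=$(printf '%s\n' "$mode" | cut -c8-10)

echo "type=$file_type user=$user_perms group=$group_perms other=$other_perms"
```

For this file the owner can read and write, the group can only read, and everyone else has no access.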

Info

You can assign less restrictive permissions to the Other level than the Group level (or even User). This is useful if you want to prevent a certain group of people from accessing a file.

id

The id command displays your username and the groups that you are a member of.

[barney@hpc ~]$ id
uid=1234(barney) gid=1234(barney) groups=1234(barney),20(games),100(users)

This indicates that the user barney's primary group (gid) is barney, but he's also a member of the games and users groups.
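
When only one piece of this information is needed, for instance inside a script, id can print it on its own. The sketch below uses the -u, -g, and -G options together with -n (print names instead of numbers). The membership check at the end is just one possible way to test for a group, and the wanted variable is a placeholder that you would replace with a real group name such as users.

```shell
user_name=$(id -un)        # login name only
primary_group=$(id -gn)    # primary group name only
all_groups=$(id -Gn)       # every group name, separated by spaces

# Placeholder: substitute the group you are interested in.
wanted=$primary_group

# Pad with spaces so that only whole group names match.
case " $all_groups " in
    *" $wanted "*) is_member=yes ;;
    *)             is_member=no ;;
esac
echo "$user_name is a member of $wanted: $is_member"
```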

chgrp

The chgrp command allows you to change the group ownership of any file or directory you own to any group you are a member of. If we want to change the group of a file to the users group, we could do something like this:

[barney@hpc ~]$ ls -l
total 416
-rw-r--r-- 1 barney barney 332964 Apr 29 14:57 file1
-rw-r--r-- 1 barney barney  37160 Apr 29 14:58 file2
-rwxr--r-- 1 barney barney     44 Apr 30 11:22 file3
-rw-r--r-- 1 barney barney     57 Apr 29 13:05 file4
[barney@hpc ~]$ chgrp users file3
[barney@hpc ~]$ ls -l
total 416
-rw-r--r-- 1 barney barney 332964 Apr 29 14:57 file1
-rw-r--r-- 1 barney barney  37160 Apr 29 14:58 file2
-rwxr--r-- 1 barney users      44 Apr 30 11:22 file3
-rw-r--r-- 1 barney barney     57 Apr 29 13:05 file4

chmod

The chmod command is used to change permissions. The basic format of the command is chmod permissions file(s). There are two different ways of representing permissions, either symbolically or numerically.

Symbolic

The symbolic representation of permissions is best if you want to add or remove a permission instead of just setting a permission explicitly. Sometimes you are working on several files that might have different permissions and you want to add (or remove) a permission without modifying the other permissions.

The permissions argument in this version is composed of three different parts:

The level(s) you want to change the permission of
The action you want to take (add or remove a permission)
The permission(s) you want to change.

The levels are represented by one or more of the following options: u for User, g for Group, o for Other, and a for all three levels. The action is represented by either a + for adding or a - for removing a permission. Finally, the permissions are represented by the following options: r for Read, w for Write, and x for Execute. For example, say we have a file that everyone can read, but only the owner can write. If we wanted to allow everyone in the group to write to it, we could do something like the following:

[barney@hpc ~]$ ls -l
total 416
-rw-r--r-- 1 barney barney 332964 Apr 29 14:57 file1
-rw-r--r-- 1 barney barney  37160 Apr 29 14:58 file2
-rwxr--r-- 1 barney barney     44 Apr 30 11:22 file3
-rw-r--r-- 1 barney barney     57 Apr 29 13:05 file4
[barney@hpc ~]$ chmod g+w file1
[barney@hpc ~]$ ls -l 
total 416
-rw-rw-r-- 1 barney barney 332964 Apr 29 14:57 file1
-rw-r--r-- 1 barney barney  37160 Apr 29 14:58 file2
-rwxr--r-- 1 barney barney     44 Apr 30 11:22 file3
-rw-r--r-- 1 barney barney     57 Apr 29 13:05 file4

If we wanted to add the execute permission to the group and other levels of file3, we could do something like this:

[barney@hpc ~]$ chmod go+x file3
[barney@hpc ~]$ ls -l 
total 416
-rw-rw-r-- 1 barney barney 332964 Apr 29 14:57 file1
-rw-r--r-- 1 barney barney  37160 Apr 29 14:58 file2
-rwxr-xr-x 1 barney barney     44 Apr 30 11:22 file3
-rw-r--r-- 1 barney barney     57 Apr 29 13:05 file4

And if we later change our mind and decide that we don't want ANYBODY to have the write or execute permissions to file3, we could do something like this:

[barney@hpc ~]$ chmod a-wx file3
[barney@hpc ~]$ ls -l 
total 416
-rw-rw-r-- 1 barney barney 332964 Apr 29 14:57 file1
-rw-r--r-- 1 barney barney  37160 Apr 29 14:58 file2
-r--r--r-- 1 barney barney     44 Apr 30 11:22 file3
-rw-r--r-- 1 barney barney     57 Apr 29 13:05 file4
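
The same sequence can be replayed on a scratch file to watch each change take effect. This is a minimal sketch, assuming GNU stat for reading back the permission string. The last step uses the = operator and a comma-separated list of clauses, two chmod features not covered above: = sets a level to exactly the listed permissions rather than adding or removing them.

```shell
demo=$(mktemp)
chmod 744 "$demo"                  # start from -rwxr--r--

chmod go+x "$demo"                 # add execute for group and other
after_add=$(stat -c '%A' "$demo")

chmod a-wx "$demo"                 # remove write and execute for everyone
after_remove=$(stat -c '%A' "$demo")

chmod u=rw,go=r "$demo"            # set each level to exactly these permissions
after_set=$(stat -c '%A' "$demo")

echo "$after_add -> $after_remove -> $after_set"
```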

Numeric

Using the numeric representation of permissions is best when you explicitly want to set the permission but don't care what the current permissions are. This requires a (very) little math to use. Numeric permissions are represented as a three-digit number, with the first (left-most) digit representing the user level, the middle digit representing the group level, and the last digit representing the other level. Each permission is assigned a value, as shown in the table below:

Permission	Value
Execute	1
Write	2
Read	4

To assign a permission, add the appropriate values in the table for each level. For instance, let's say we have a file that we want the user to have Read, Write, and Execute permission, the group to have Read and Execute, and everyone else to have just Execute permission. You could use something like:

[barney@hpc ~]$ chmod 751 file3
[barney@hpc ~]$ ls -l 
total 416
-rw-rw-r-- 1 barney barney 332964 Apr 29 14:57 file1
-rw-r--r-- 1 barney barney  37160 Apr 29 14:58 file2
-rwxr-x--x 1 barney barney     44 Apr 30 11:22 file3
-rw-r--r-- 1 barney barney     57 Apr 29 13:05 file4
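
The arithmetic behind 751 can be spelled out with shell arithmetic. In this sketch the variable names r, w, and x are just labels for the values in the table, and GNU stat is assumed for reading the result back (%a prints the numeric mode, %A the symbolic one).

```shell
# Values from the table above.
r=4; w=2; x=1

user_digit=$((r + w + x))    # Read + Write + Execute
group_digit=$((r + x))       # Read + Execute
other_digit=$((x))           # Execute only
mode="$user_digit$group_digit$other_digit"

# Apply the mode to a scratch file and read it back both ways.
demo=$(mktemp)
chmod "$mode" "$demo"
numeric=$(stat -c '%a' "$demo")
symbolic=$(stat -c '%A' "$demo")

echo "$mode -> $numeric $symbolic"    # 751 -> 751 -rwxr-x--x
```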

If we wanted to change all of the files in this directory to give ourselves just Read and Write permission, the group Read, and no permission to everyone else, we could do:

[barney@hpc ~]$ chmod 640 *
[barney@hpc ~]$ ls -l 
total 416
-rw-r----- 1 barney barney 332964 Apr 29 14:57 file1
-rw-r----- 1 barney barney  37160 Apr 29 14:58 file2
-rw-r----- 1 barney barney     44 Apr 30 11:22 file3
-rw-r----- 1 barney barney     57 Apr 29 13:05 file4
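
When a directory also contains subdirectories, applying a mode such as 640 to everything in it would strip the execute permission that lets you cd into those subdirectories. One way around this, sketched below under the assumption that the find command is available (as it is on typical Linux systems), is to let find select only regular files. The directory and file names here are made up for illustration.

```shell
# Build a scratch directory with two files and one subdirectory.
workdir=$(mktemp -d)
touch "$workdir/data1.txt" "$workdir/data2.txt"
mkdir "$workdir/results"
chmod 755 "$workdir/results"

# -type f limits the change to regular files;
# -maxdepth 1 keeps find from descending into subdirectories.
find "$workdir" -maxdepth 1 -type f -exec chmod 640 {} +

file_mode=$(stat -c '%a' "$workdir/data1.txt")
dir_mode=$(stat -c '%a' "$workdir/results")
echo "file: $file_mode  directory: $dir_mode"
```

The files end up with mode 640 while the subdirectory keeps its original 755.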