Command line shortcuts
Up/Down arrows: Previous commands
!!: Reruns previous command
Tab: Auto complete
Tab+Tab: All available options
Ctrl+a: Move cursor to start of line
Ctrl+e: Move cursor to end of line
Alt+: Alternates between terminals
Ctrl+l: Clear screen (or Command+k on Mac)
Ctrl+c: Terminates the running program
Ctrl+z: Suspends the running program
Ctrl+w: Removes the previous word
Ctrl+d: Log out (on an empty line)
Ctrl+d (in a command): Removes the character under the cursor
Ctrl+u: Removes everything from the cursor to the beginning of the line
Linux is an open-source operating system (OS) built around the kernel created by Linus Benedict Torvalds. In the last two decades, Linux has gained much popularity and is now used on many platforms. Nowadays, everything from high-end servers to mobile phones (Android OS) runs on some variant of Linux.
Linux computers/servers are set up for multi-user usage. In this course, we will work on a high performance cluster machine running Ubuntu server edition. Most of the commands specified in this manual can be used in any other distribution (e.g., CentOS, Debian, etc.) of the Linux operating system.
To install a desktop edition of Ubuntu on personal computers, please follow the instructions in the following link. Install Ubuntu
We use the terminal (also known as the command line interface) to interact with the operating system. By default, the terminal runs one of the “shells”. A shell is a program that sits between the user and the kernel: it reads the commands you type (text), interprets them, and asks the operating system to run the corresponding programs. The advantages of the command line are greater control and flexibility over the system or software, and the ability to save multiple commands in a file and execute them as a program.
The most common shells are:
Bourne Shell
Bourne Again Shell – BASH (variant is Z Shell)
C Shell (variant is T Shell)
Korn Shell (ksh)
Among these Bourne Again Shell (BASH) is the most popular one. This is the default shell on the system, and we will be using it throughout this course.
In this course, we will be using the MobaXterm application to access the Ubuntu OS. Please use the provided username and password to log in to your account.
Open MobaXterm -> Sessions -> New session -> ssh -> add remote host and username -> OK -> Enter password -> Don’t save password
Files can be downloaded and uploaded into the server
When you open a terminal in Linux, (MobaXterm by default opens a terminal), you will see a command prompt, ready to take commands. The default location on the terminal is your “home directory”. It is represented with ~ (tilde) symbol.
Copy the command below and paste it into your command line to copy the contents of the directory Linux to your home directory.
cp -R /home4/VBG_data/Linux .
All Linux commands are single words (can be alpha-numeric), with optional parameters followed by arguments. For historical reasons, some of the early commands are only two letters long. Commands are case sensitive. Most command options (also called flags) are single letters. They should be specified after the command and before any input.
ls -l Linux
“ls” is the command to list the contents of the directory, “-l” is the option for long listing and “Linux” is the input, which is optional in this case. Without the input, “ls” shows all the contents of the current directory (Type ls -l).
To clear the terminal screen,
clear
Directories are the Unix equivalent of folders on a PC or a Mac. They are organized in a hierarchy, so directories can have sub-directories and so on. Directories, like folders, are useful to keep your data files organized. The location or directory you are currently in, is called the current working directory. The location or “full pathname” of the file SARS-CoV-2.fa in the ‘Linux’ directory can be expressed as:
Do not type this - won't work
/home_location/username/Linux/SARS-CoV-2.fa
Typing out longer file names can be boring, and you are likely to make typos that will, at best, make your command fail with a strange error and at worst, overwrite some of your carefully crafted analysis.
Tab completion is a trick that normally reduces this risk significantly. Instead of typing out “ls Interesting_stuff/”, try typing “ls Int” and press the Tab button (instead of Enter). The rest of the folder/file names that begin with “Int” should be listed. If you have two folders/files with similar names (e.g., my_awesome_scripts/ and my_awesome_results/) then you might need to give your terminal a bit of a hand to work out which one you want. In this case if you type “ls -l m”, when you press Tab the terminal would read “ls -l my_awesome_”. You could then type “s” followed by another press of the Tab button and it would figure out that you meant “my_awesome_scripts/”.
Terminology | Description |
---|---|
Linux | Unix-like operating system, the most popular Unix variant |
OS | Software that controls the hardware and makes the computer work |
Ubuntu | Free Linux distribution (distro) based on Debian (one of the oldest operating systems based on the Linux kernel) |
Kernel | Core interface between a computer’s hardware and its processes, manages available resources |
ssh | Program for logging in to a remote machine specified with a host name |
PC | A personal computer |
Mac | A Macintosh computer |
Linux commands are case sensitive and are always single words
Options follow the command; they start with a single hyphen (-) and a character, or a double hyphen (--) and a word
Single character options can be combined
Arguments are the inputs to a command; there can be one or more
You can write more than one command on a line by separating them with a semicolon (;). You can use “tab” to auto-fill the command
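The rules above can be tried out with a short sketch. It assumes GNU ls (as on Ubuntu) for the long option; the scratch directory and the file names are made up for illustration.

```shell
# Hypothetical scratch directory so the example does not touch real data
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch visible.txt .hidden.txt

# Single-character options given separately ...
separate=$(ls -l -a)
# ... and the same options combined behind one hyphen give identical output
combined=$(ls -la)

# A long option uses a double hyphen and a word (--all is the GNU spelling of -a)
long_form=$(ls --all)

# Two commands on one line, separated by a semicolon
pwd; ls -a
```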
(a) ls
Lists information about the files/directories. Default is the current directory. Sorts entries alphabetically.
Commonly used options:
-l long list
-a show all files (including hidden files)
-t sort based on last modified time
ls -l
Information (from left to right):
• File permissions
• Number of links
• Owner name
• Group name
• Number of bytes
• Abbreviated month, last modified date and time
• File/Directory name
(b) pwd
Returns the path of the current working directory (print working directory) to the standard output.
pwd
(c) cd
Change current working directory to the specified directory.
cd Linux/
pwd
We are now in the directory “Linux”. Typing the command “cd ..” changes it to the parent directory from which the previous command was typed in. Typing “cd” will change the current directory to the home directory.
cd
cd Linux/
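As a minimal sketch of moving around, the lines below build a throwaway directory tree (the names are hypothetical) and record where each form of cd ends up. The “cd -” form, which returns to the previous directory, is not covered above but is a standard shell feature.

```shell
# Hypothetical directory tree under a scratch location
base=$(mktemp -d)
mkdir -p "$base/Linux/Practice"

cd "$base/Linux/Practice"
cd ..                    # parent directory: now in .../Linux
parent=$(pwd)

cd                       # no argument: jump to the home directory
home_now=$(pwd)

cd - > /dev/null         # "cd -" returns to the previous directory (.../Linux)
back=$(pwd)

echo "parent: $parent"
echo "home:   $home_now"
echo "back:   $back"
```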
(d) mkdir
This command creates a directory in the current working directory if no directory exists with the specified name.
mkdir Practice
ls -l
(e) rmdir
This command is used to remove directories.
rmdir Practice
ls -l
(f) touch
touch changes a file’s time-stamp. However, it can also be used to create an empty file, which is a quick way to check whether the current user has write permission in a directory.
touch temp-file
ls -l
(g) rm
rm is used for removing files and directories.
rm temp-file
ls -l
[!WARNING] To remove directories use “-r” option. Please remember once a file or directory is deleted, it will not go to “Recycle bin” in Linux and there is no way you can recover it.
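The difference between rmdir and “rm -r” can be seen in a small sketch. It works inside a scratch directory with made-up names, so nothing of value is deleted.

```shell
# Hypothetical scratch area with a non-empty directory
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p old_results/run1
touch old_results/run1/output.txt

# rmdir only removes empty directories, so this fails and leaves everything in place
rmdir old_results 2>/dev/null || echo "rmdir refused: directory is not empty"

# Look at what would be deleted before deleting it
ls -R old_results

# -r removes the directory and everything below it; there is no undo
rm -r old_results
```

Listing a directory before removing it is a cheap habit that guards against typos in the path.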
(h) cp
Copies the content of the source file/directory to the target file/directory. To copy directories, use “-r” option.
touch temp1
cp temp1 temp2
ls -l
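Copying a whole directory might look like the sketch below. The directory and file names are placeholders created in a scratch location.

```shell
# Hypothetical project directory with one file in it
scratch=$(mktemp -d)
cd "$scratch"
mkdir project
echo "ACGT" > project/seq.txt

# With -r, cp copies the directory and everything inside it
cp -r project project_backup

# The copy is independent: appending to it leaves the original untouched
echo "TTTT" >> project_backup/seq.txt

cat project/seq.txt
cat project_backup/seq.txt
```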
(i) mv
To move/rename a file or a directory.
mkdir temp
mv temp1 temp/.
mv temp2 temp3
ls -l
The second command moves the “temp1” file into the directory “temp”. The “.” (dot) at the end of the command retains the name of the file, whereas the third command renames the file “temp2” to “temp3”.
(j) ln
Link command is used to make links to files/directories. We encourage you to create links rather than copying data in order to save space.
ln -s temp/temp1 .
ls -l
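A short sketch of how a symbolic link behaves, using made-up file names in a scratch directory. The readlink command, which prints the target of a link, is an addition to the commands covered here.

```shell
# Hypothetical data directory with one file
scratch=$(mktemp -d)
cd "$scratch"
mkdir data
echo ">virus1" > data/genome.fa

# Create a symbolic link in the current directory pointing at data/genome.fa
ln -s data/genome.fa .

# ls -l shows the link and its target after "->"
ls -l genome.fa
target=$(readlink genome.fa)

# Reading the link reads the original file; no data is duplicated
content=$(cat genome.fa)

# Removing the link leaves the original file in place
rm genome.fa
ls -l data/genome.fa
```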
(a) cat
The concatenate command combines files (sequentially) and prints on the screen (standard output).
cat SARS-CoV-2.fa
(b) more/less
These commands are used for viewing the contents of files one screen at a time. They are faster than text editors on large input files because they do not read the entire file at the start.
more SARS-CoV-2.fa
Press “Enter” to view lines further and “q” to quit the program
(c) head/tail
These commands show first/last 10 lines (default) respectively from a file.
head SARS-CoV-2.fa
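The default of 10 lines can be changed with the “-n” option. The sketch below uses a generated 12-line file (not one of the course files) to show head, tail, and a combination of the two that extracts lines from the middle.

```shell
# Hypothetical 12-line sample file: line1 ... line12
sample=$(mktemp)
printf 'line%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$sample"

first_three=$(head -n 3 "$sample")     # line1, line2, line3
last_two=$(tail -n 2 "$sample")        # line11, line12

# Lines 5-6: take the first 6 lines, then the last 2 of those
middle=$(head -n 6 "$sample" | tail -n 2)

echo "$first_three"
echo "$last_two"
echo "$middle"
```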
There are many non-graphical text editors like ed, emacs, vi and nano available on most Linux distributions. Some of them are very sophisticated (e.g., vi) and for advanced users.
Nano (a free replacement for the older pico editor) is like any graphical editor without a mouse. All commands are executed using the keyboard.
At the bottom of the screen, there are commands with a ^ symbol in front. The ^ symbol tells you that you need to hold down the Control (Ctrl) key, and then press the corresponding letter of the command you wish to use.
Ctrl+X will exit nano and return you to the command line.
Nano quick reference
Ctrl+X: Exit the editor. If you’ve edited text without saving, you’ll be prompted as to whether you really want to exit.
Ctrl+O: Write (output) the current contents of the text buffer to a file. A filename prompt will appear; press Ctrl+T to open the file navigator.
Ctrl+R: Read a text file into the current editing session. At the filename prompt, hit Ctrl+T for the file navigator.
Ctrl+K: Cut a line into the clipboard. You can press this repeatedly to cut multiple lines, which are then stored as one chunk.
Ctrl+J: Justify (fill out) a paragraph of text. By default, this reflows text to match the width of the editing window.
Ctrl+U: Uncut text, or rather, paste it from the clipboard. Note that after a Justify operation, this turns into unjustify.
Ctrl+T: Check spelling.
Ctrl+W: Find a word or phrase. At the prompt, use the cursor keys to go through previous search terms, or hit Ctrl+R to move into replace mode. Alternatively, you can hit Ctrl+T to go to a specific line.
Ctrl+C: Show current line number and file information.
Ctrl+G: Get help; this provides information on navigating through files and common keyboard commands
All Linux commands have manual pages. To access them, use “man” or “info” command. The manual page gives a detailed explanation of the command, all available options and sometimes, also provides examples. For example, to view the manual page for “ls” command:
man ls
Please explore manual pages of all the above commands for available options.
(a) cut
The cut command is a command line utility to cut a section from a file. Please see “man cut” for available options.
To cut a section of file use “-c” (characters)
cut -c1-10 SARS-CoV-2.fa
The option “-c1-10” will output first 10 characters from the input file.
A few options:
-c: cut based on character position
-d: cut based on delimiter
-f: field number
We have a file named “human_viruses.txt” with some information including the names of the viruses, GenBank ids and genome length. These fields are separated by the “|” symbol.
head human_viruses.txt
To get a list of the GenBank id,
cut -d "|" -f2 human_viruses.txt
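To see how “-d”, “-f” and “-c” behave, here is a sketch on a tiny made-up file. The names, ids and lengths are placeholders, not real GenBank records, and the three-column layout is an assumption rather than the layout of “human_viruses.txt”.

```shell
# Hypothetical three-column sample: name|id|length (placeholder values)
sample=$(mktemp)
cat > "$sample" <<'EOF'
Virus A|ID0001|900
Virus B|ID0002|12000
Virus C|ID0003|3500
EOF

# Second field only
ids=$(cut -d "|" -f2 "$sample")

# Several fields at once: first and third, still joined by the delimiter
names_lengths=$(cut -d "|" -f1,3 "$sample")

# Character positions ignore the delimiter entirely
first_chars=$(cut -c1-5 "$sample")

echo "$ids"
echo "$names_lengths"
echo "$first_chars"
```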
(b) sort
The sort command is used to sort the input content.
A few options:
-t: field separator
-n: numeric sort
-k: sort with a key (field)
-r: reverse sort
-u: print unique entries
sort -t "|" -nrk6 human_viruses.txt
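The “-n” option matters because sort compares text character by character by default. The sketch below shows the difference on a made-up three-column file (placeholder values, length in field 3). It writes the key as “-k3,3” so that only field 3 is used for comparison.

```shell
# Hypothetical sample: name|id|length (placeholder values)
sample=$(mktemp)
cat > "$sample" <<'EOF'
Virus A|ID0001|900
Virus B|ID0002|12000
Virus C|ID0003|3500
EOF

# Text sort on field 3: "12000" comes before "3500" because '1' < '3'
as_text=$(sort -t "|" -k3,3 "$sample" | cut -d "|" -f3)

# Numeric sort on field 3: 900, 3500, 12000
as_number=$(sort -t "|" -k3,3n "$sample" | cut -d "|" -f3)

# Numeric and reversed: largest genome first
largest_first=$(sort -t "|" -k3,3nr "$sample" | cut -d "|" -f1)

echo "$as_text"
echo "$as_number"
echo "$largest_first"
```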
(c) grep
grep searches the input for a given pattern.
A few options:
-A: after context
-B: before context
-C: before and after context
-c: count
-l: file with match
-i: ignore case
-o: only match
-v: invert match
-w: word match
To get the list of all Influenza D viruses from ‘human_viruses.txt’ file,
grep "Influenza D" human_viruses.txt
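A few of the options listed above, tried on a small made-up file (the names and ids are placeholders):

```shell
# Hypothetical sample with mixed capitalisation
sample=$(mktemp)
cat > "$sample" <<'EOF'
Influenza A virus|ID0001
influenza d virus|ID0002
Influenza D virus|ID0003
Zika virus|ID0004
EOF

# -c counts matching lines; the pattern is case sensitive by default
exact=$(grep -c "Influenza D" "$sample")

# -i ignores case, so the lower-case line matches as well
any_case=$(grep -ci "influenza d" "$sample")

# -v inverts the match: lines that do NOT contain the pattern
not_flu=$(grep -vi "influenza" "$sample")

# -o prints only the matching part of each line
ids_only=$(grep -o "ID[0-9]*" "$sample")

echo "$exact $any_case"
echo "$not_flu"
echo "$ids_only"
```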
(d) wc
The “wc” command counts lines, words or characters. It can be used in two ways: with a file name as an argument, or reading from standard input through a pipe.
wc -l outbreak.csv
cat outbreak.csv | wc -l
(e) uniq
The uniq command extracts unique lines from the input. It is usually used in combination with sort to count unique values in the input.
To get the list of countries that have had an outbreak in 2023:
cut -d, -f3 outbreak.csv | sort | uniq
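uniq only collapses identical lines that are next to each other, which is why it is paired with sort. The sketch below uses a made-up CSV with placeholder values (the real “outbreak.csv” layout may differ) and also shows “uniq -c”, which prefixes each line with its count.

```shell
# Hypothetical CSV: record,disease,country (placeholder values)
sample=$(mktemp)
cat > "$sample" <<'EOF'
1,diseaseX,CountryB
2,diseaseY,CountryA
3,diseaseX,CountryB
4,diseaseZ,CountryA
5,diseaseX,CountryC
6,diseaseY,CountryB
EOF

# Without sort, no duplicates are adjacent, so uniq removes nothing
without_sort=$(cut -d, -f3 "$sample" | uniq | wc -l | tr -d ' ')

# With sort, duplicates become adjacent and collapse to three countries
with_sort=$(cut -d, -f3 "$sample" | sort | uniq)

# uniq -c adds a count; a numeric reverse sort puts the most frequent first
most_frequent=$(cut -d, -f3 "$sample" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2}')

echo "$without_sort"
echo "$with_sort"
echo "$most_frequent"
```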
Other text processing commands worth looking at are: tr, rev, sed and paste.
When you run a command, the output is usually sent to standard output (stdout), i.e., the terminal. However, we can redirect the standard output to a file using “>”.
ls > list
cat list
The first command creates a new file called list with all the file names in the directory. If there exists a file already named “list”, it is overwritten with the output of the command. Instead, we can append to a file using “>>” redirection.
Another kind of output that is generated by programs is standard error. We must use “2>” to redirect it.
ls foo 2> error
To redirect stdout and stderr to a file use “&>”.
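The redirections described above can be compared side by side in a scratch directory (file names are made up). The sketch uses “> file 2>&1”, which is the portable spelling of bash’s “&>” shorthand, so it also runs in shells other than bash.

```shell
# Hypothetical scratch directory with two files
scratch=$(mktemp -d)
cd "$scratch"
touch a.txt b.txt

# ">" creates or overwrites the file
ls > list

# ">>" appends to the end of an existing file
echo "extra line" >> list

# "2>" captures only error messages; ls fails here because the file does not exist
ls missing_file 2> error || true

# stdout and stderr into the same file (same effect as "&> both" in bash)
ls a.txt missing_file > both 2>&1 || true

cat list
cat error
cat both
```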
Piping in Linux is a very powerful and efficient way to combine commands. Pipes (|) act as connecting links between commands: a pipe sends the output of the first command as input to the next command. We can chain as many commands as we want using pipes. Because the commands in a pipeline run concurrently and no intermediate files are written, pipes keep the command flow smooth and can reduce execution time.
To print 10 smallest viruses,
sort -t"|" -nk6 human_viruses.txt | head -10
We will be working on other examples during the course, where we use pipes to combine more than two commands.
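As an illustration of a pipeline with more than two commands, the sketch below sorts a made-up file by length, keeps the two smallest entries, and prints only their names. The data and the three-column layout are placeholders.

```shell
# Hypothetical sample: name|id|length (placeholder values)
sample=$(mktemp)
cat > "$sample" <<'EOF'
Virus A|ID0001|900
Virus B|ID0002|12000
Virus C|ID0003|3500
Virus D|ID0004|150
EOF

# sort numerically on field 3 -> keep the first two lines -> keep only the name
smallest_two=$(sort -t "|" -k3,3n "$sample" | head -n 2 | cut -d "|" -f1)

echo "$smallest_two"
```

Each stage only sees the output of the stage before it, so the order of the commands matters: running head before sort would pick the first two lines of the file rather than the two smallest entries.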
Some commands take time to finish the assigned job. For example, if you would like to compress a huge file with gzip command that takes a few minutes to finish running, you can run it in the background by appending the command with “&” (Another way is to suspend a command by pressing Ctrl+Z and typing “bg”). The completion of the task is indicated by “Done”.
gzip list &
We can get list of currently running jobs in the terminal by “jobs” command. This will give you all the background jobs running in the current terminal. If you want to see all the running processes in the system, use “top”. You can get user specific details in top using “-U” option.
top
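A few of the important columns in the top output are PID (process id), USER (owner of the process), %CPU, %MEM, TIME+ (CPU time used) and COMMAND.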
If you want to stop a running background job use “kill” command followed by the process id.
kill 1234
This command kills the job with the process id 1234. As a user you can kill only your jobs. You do not have permission to run this command on the process ids of other users.
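Rather than looking up the process id by hand, a script can read it from the shell variable “$!” right after starting a background job. The sketch below uses sleep as a stand-in for a long-running command; “kill -0”, which only checks that a process exists without sending a signal, is an addition to what is described above.

```shell
# Start a long-running placeholder command in the background
sleep 300 &
bg_pid=$!                 # $! holds the process id of the most recent background job

jobs                      # list the background jobs of this shell

# kill -0 sends no signal; it only checks that the process exists
kill -0 "$bg_pid" && echo "process $bg_pid is running"

kill "$bg_pid"            # send the default TERM signal to stop the job
wait "$bg_pid" 2>/dev/null || true   # collect it; the status is non-zero because it was killed
```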