Unlocking the Power of Shell: Loops, Scripts, and More

Delve into the world of Unix shell with a workshop covering loops, shell scripts, and essential commands like ls, pwd, mkdir, rm, wc, sort, cat, head, and tail. Explore the versatility of loops for repetitive tasks and the efficiency of shell scripts for streamlining commands.

  • Unix Shell
  • Loops
  • Shell Scripts
  • Command Line
  • Programming

Presentation Transcript


  1. UNIX SHELL WORKSHOP PART 2: LOOPS, SHELL SCRIPTS, AND FINDING THINGS

  2. QUICK REVIEW OF UNIX COMMANDS
       • ls: lists all files and directories in the current directory
       • pwd: shows the current location (directory)
       • mkdir name: creates a directory called name
       • rm filename or dirname: removes files; with -r, removes directories and all files inside them
       • wc: gets the line, word, and character count of a file; with -l it returns only the number of lines (wc -l filename)
       • sort: sorts the content of a file or the output of another command
       • cat: outputs the content of a file to the terminal
       • head / tail: show the first few or the last few lines of a file, respectively

  3. LOOPS Loops allow us to run the same set of commands many times without having to retype them. Imagine having several hundred genome files with the same endings. The wildcard (*) operator will not work with some commands because the shell expands it before the command runs. For example, cp *.dat original-*.dat becomes cp basilisk.dat unicorn.dat original-*.dat, which is not what we want: original-*.dat matches no existing file, so it is passed to cp unchanged. Instead, we can use a loop.
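
The copy that the wildcard cannot express can be sketched with a loop. This is a minimal illustration, not part of the original slides: the scratch directory and the empty placeholder files are assumptions made so the example runs anywhere, and only the two .dat filenames come from the workshop.

```bash
# Scratch directory with two empty placeholder files (assumption for the demo).
workdir=$(mktemp -d)
touch "$workdir/basilisk.dat" "$workdir/unicorn.dat"

# The *.dat pattern is expanded once, before the loop starts, so the
# copies created inside the loop are not picked up as new inputs.
for path in "$workdir"/*.dat
do
    name=$(basename "$path")
    cp "$path" "$workdir/original-$name"
done

ls "$workdir"
# Remove the scratch directory with: rm -rf "$workdir"
```

Each pass through the loop handles exactly one source file and one destination, which is the form cp expects.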

  4. LOOPS Type the following:
         $ for filename in basilisk.dat unicorn.dat
         > do
         >     head -n 3 $filename
         > done

  5. LOOPS What does this loop do?

  6. LOOPS Output:
         COMMON NAME: basilisk
         CLASSIFICATION: basiliscus vulgaris
         UPDATED: 1745-05-02
         COMMON NAME: unicorn
         CLASSIFICATION: equus monoceros
         UPDATED: 1738-11-24

  7. LOOPS What else can loops be used for? Well, anything really! Now to the fun part!

  8. SHELL SCRIPTS Now you will see the true power of the shell. We will write our first shell script. The purpose is to condense many shell commands into a single place.

  9. SHELL SCRIPTS Navigate back to the top directory (type cd at the prompt) and then type:
         cd molecules
         nano middle.sh

  10. SHELL SCRIPTS Inside nano, type:
         head -n 15 octane.pdb | tail -n 5
       Press Ctrl-O to save (confirm the filename with Enter), then Ctrl-X to exit nano. Check the directory to confirm that middle.sh exists. Run this simple script with the following command:
         bash middle.sh

  11. SHELL SCRIPTS Now, what if we wanted to pass the filename to the script instead of having it hard-coded? Open middle.sh with nano and change the line in the file to:
         head -n 15 $1 | tail -n 5
       Here $1 stands for the first argument given to the script. Now you run the script with:
         bash middle.sh octane.pdb
       Or on a different file with:
         bash middle.sh pentane.pdb

  12. SHELL SCRIPTS However, we still have to edit middle.sh every time we want to change the range of lines it prints! To make everything as easy as possible, we should change the script to take the range values as inputs as well. Open it again in nano and change the command to:
         head -n $2 $1 | tail -n $3

  13. SHELL SCRIPTS Now the script can be executed like this:
         bash middle.sh octane.pdb 15 5
       You can pass any other filename and any other range in the same way.
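
As a self-contained sketch of the three-argument version of middle.sh: the sample file of 20 numbered lines is an assumption that stands in for octane.pdb, and the double quotes around "$1", "$2", and "$3" are an addition (not on the slides) that keeps filenames containing spaces in one piece.

```bash
# Scratch directory and a 20-line sample file (assumptions for the demo).
workdir=$(mktemp -d)
seq 1 20 > "$workdir/sample.txt"

# Write the script. Usage: bash middle.sh FILENAME END_LINE NUM_LINES
cat > "$workdir/middle.sh" <<'EOF'
# Take the first END_LINE lines of FILENAME, then keep the last NUM_LINES of those.
head -n "$2" "$1" | tail -n "$3"
EOF

# Lines 11 through 15: the last 5 of the first 15.
middle_output=$(bash "$workdir/middle.sh" "$workdir/sample.txt" 15 5)
echo "$middle_output"
```

Because each line of the sample file holds its own line number, the output makes it easy to see which range was selected.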

  14. SHELL SCRIPTS What if we wanted to run this script on everything in a directory? We would make another script with a loop in it, give it the directory name, and have it run middle.sh on each file. This is much like what we just did with loops. Scripts can do everything you have learned so far; they just make it easier to run many repetitive commands at once. Imagine processing hundreds of thousands of sequences. You certainly do not want to type commands that many times. Computers do that very efficiently.
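
One way such a wrapper might look is sketched below. The name middle_all.sh, its argument order, and the sample data are all hypothetical; the slides describe the idea but do not give this script.

```bash
# Scratch directory with two sample files (assumptions for the demo).
workdir=$(mktemp -d)
mkdir "$workdir/data"
seq 1 20    > "$workdir/data/a.txt"
seq 101 120 > "$workdir/data/b.txt"

# The three-argument middle.sh from the previous slides, with quoting added.
cat > "$workdir/middle.sh" <<'EOF'
head -n "$2" "$1" | tail -n "$3"
EOF

# Hypothetical wrapper. Usage: bash middle_all.sh DIRECTORY END_LINE NUM_LINES
cat > "$workdir/middle_all.sh" <<'EOF'
# Run middle.sh (expected to sit next to this script) on every file in DIRECTORY.
script_dir=$(dirname "$0")
for file in "$1"/*
do
    echo "== $file"
    bash "$script_dir/middle.sh" "$file" "$2" "$3"
done
EOF

all_output=$(bash "$workdir/middle_all.sh" "$workdir/data" 15 2)
echo "$all_output"
```

The header line printed before each file's output is a design choice that keeps the results attributable when many files are processed in one run.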

  15. BEFORE WE MOVE ON Any questions?

  16. FINDING THINGS
       • grep: finds matching text within a file; it's very powerful when combined with regular expressions
       • find: finds files and directories
       Now return to the main directory by typing cd and pressing the Return key. Type cd writing, then type cat haiku.txt.

  17. FINDING THINGS Output:
         The Tao that is seen
         Is not the true Tao, until
         You bring fresh toner.

         With searching comes loss
         and the presence of absence:
         My Thesis not found.

         Yesterday it worked
         Today it is not working
         Software is like that.

  18. FINDING THINGS Type grep not haiku.txt. This will output every line that contains "not". Now type grep day haiku.txt. You should get two lines, but in both of them "day" is part of a larger word (Yesterday, Today). If we want to match "day" as a whole word, we need to add the -w flag. Try it with grep -w day haiku.txt. You should get no output.

  19. FINDING THINGS If you want to look for a phrase, you have to put it in double quotes (" "). For instance, type grep -w "is not" haiku.txt. Other options are -n, which numbers the lines that match; -i, to ignore case; and -v, to find the lines where the word or phrase doesn't appear. Type grep -n -w the haiku.txt. You should get lines 2 and 6. If you type grep -n -w -i the haiku.txt, you will also get line 1. If you add -v, by typing grep -n -w -v the haiku.txt, the output should be every line but 2 and 6.
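
The grep options above can be tried together in one short session. This is a sketch under one assumption: haiku.txt is recreated here with a blank line between stanzas, which is the layout that makes the line numbers quoted on the slides (2 and 6) come out right.

```bash
# Recreate haiku.txt in a scratch directory (layout is an assumption, see above).
workdir=$(mktemp -d)
haiku="$workdir/haiku.txt"
cat > "$haiku" <<'EOF'
The Tao that is seen
Is not the true Tao, until
You bring fresh toner.

With searching comes loss
and the presence of absence:
My Thesis not found.

Yesterday it worked
Today it is not working
Software is like that.
EOF

# -c counts matching lines instead of printing them.
substring_hits=$(grep -c day "$haiku")             # "day" anywhere in the line
word_hits=$(grep -c -w day "$haiku" || true)       # "day" as a whole word only
phrase_lines=$(grep -n -w "is not" "$haiku")       # quoted phrase, with line numbers
the_lines=$(grep -n -w the "$haiku")               # case-sensitive whole word
the_lines_nocase=$(grep -n -w -i the "$haiku")     # -i also matches "The"
not_the_count=$(grep -c -w -v the "$haiku")        # -v inverts the match

echo "day as substring: $substring_hits, day as word: $word_hits"
echo "$phrase_lines"
echo "$the_lines"
```

The `|| true` is only there because grep exits with a non-zero status when nothing matches, which would stop a script running under `set -e`.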

  20. FINDING THINGS Now navigate to the top-level directory. Type find . -type d. This command finds all directories in and below the current directory (.). If we change the d to an f (find . -type f), we get a listing of all of the files. find automatically goes into folders and finds every file there.

  21. FINDING THINGS A more useful approach is to find only the files in the current folder. This can be done with -maxdepth: find . -maxdepth 1 -type f. With -mindepth you can have it return only things that are at or below a certain depth: find . -mindepth 2 -type f. To search for things by name you can use -name: find . -name "*.txt". Notice that we use the wildcard character here to return all files in the current directory and all subdirectories that end in the .txt extension. The pattern is quoted so that find, rather than the shell, expands the wildcard.
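
The find options from the last two slides can be compared side by side on a small tree. The directory layout and filenames below are assumptions created only for this sketch.

```bash
# Build a small scratch tree (assumed layout, for the demo only).
workdir=$(mktemp -d)
mkdir -p "$workdir/writing/data" "$workdir/molecules"
touch "$workdir/notes.txt" "$workdir/writing/haiku.txt" \
      "$workdir/writing/data/one.txt" "$workdir/molecules/octane.pdb"

cd "$workdir" || exit 1

dirs=$(find . -type d | sort)                    # every directory, including "."
top_files=$(find . -maxdepth 1 -type f)          # files directly in "." only
deep_files=$(find . -mindepth 2 -type f | sort)  # files at depth 2 or deeper
txt_files=$(find . -name "*.txt" | sort)         # quoted so find does the matching

echo "$dirs"
echo "$top_files"
echo "$deep_files"
echo "$txt_files"
```

Without the quotes around "*.txt", the shell would replace the pattern with notes.txt before find ever saw it, and the files in subdirectories would be missed.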

  22. BEFORE WE MOVE ON Any questions?

  23. LET'S WRITE A SHELL SCRIPT Now we are going to write our own shell script that combines a few of the things we've learned. It will count the number of lines in every file in the current directory and its subdirectories and write the counts to a summary file. First I will walk you through the steps of what we need to do, then I will have you type the script, and then we will go through it line by line so you know what each part is doing.

  24. LET'S WRITE A SHELL SCRIPT First we are going to use find to get a list of all files in the directory and all subdirectories. Then we are going to have it call the wc -l command on each file in that list.

  25. LET'S WRITE A SHELL SCRIPT
         FILES=$(find . -type f)
         for FILE in $FILES
         do
             wc -l $FILE >> line_number_summary.txt
         done

  26. LET'S WRITE A SHELL SCRIPT The first line (FILES=$(find . -type f)) puts a list of all the files into a variable called FILES. The next line (for FILE in $FILES) starts a loop that steps through each file in the list; the do and done lines mark the beginning and end of the loop. Finally, the line inside the loop (wc -l $FILE >> line_number_summary.txt) runs the word count command on each file and appends the output to the summary file. We are appending, or adding to the end of the file, with the >> operator rather than overwriting with the > operator we used earlier to create files.
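
A variation on the line-counting script is sketched below; it is not the version from the slides. It swaps the $FILES variable for a find ... -print0 pipeline read by a while loop, so filenames containing spaces stay intact, and it keeps the summary file outside the searched directory so a second run does not count the summary itself. The sample directory and files are assumptions.

```bash
# Scratch tree, including a directory and a file with spaces in their names.
workdir=$(mktemp -d)
mkdir "$workdir/project" "$workdir/project/sub dir"
printf 'a\nb\nc\n' > "$workdir/project/three.txt"
printf 'x\n'       > "$workdir/project/sub dir/one line.txt"

# Summary file lives outside the searched directory and starts out empty.
summary="$workdir/line_number_summary.txt"
: > "$summary"

# -print0 separates names with a NUL byte; read -d '' (a bash feature)
# reads them back one whole filename at a time.
find "$workdir/project" -type f -print0 |
while IFS= read -r -d '' file
do
    wc -l "$file" >> "$summary"
done

cat "$summary"
```

With the loop from the slides, a name such as "one line.txt" would be split into two words and wc would report that neither file exists.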

  27. LET'S WRITE A SHELL SCRIPT Now let's write a script to set up a file structure for research. Let's say you have a standard file structure that you like to work with. We will write this script so that you can set up that structure under a top-level name of your choice.

  28. LET'S WRITE A SHELL SCRIPT
         # this will create a top-level directory named after the first argument
         # then it will create as many folders as you give it arguments in that directory
         # the first argument is the top level of the file structure
         mkdir $1

  29. LET'S WRITE A SHELL SCRIPT
         # use a for loop to go through the rest of the arguments and create folders for them
         for FOLDER in ${@:2}
         do
             mkdir $1/$FOLDER
         done

  30. LET'S WRITE A SHELL SCRIPT All the lines starting with # are comments, which are there to explain what you are doing. First we make the initial directory with mkdir $1. Then we use a for loop to go over all of the arguments from the second one on (for FOLDER in ${@:2}). Finally, inside the loop we run mkdir $1/$FOLDER to create the rest of the file structure.
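
Put together as one file, the script might look like the sketch below. The filename make_structure.sh and the example folder names are hypothetical, since the slides do not name them. The double quotes and mkdir -p are additions: the quotes keep names with spaces in one piece, and -p avoids an error when a directory already exists.

```bash
workdir=$(mktemp -d)

# Hypothetical filename. Usage: bash make_structure.sh TOP_LEVEL FOLDER...
cat > "$workdir/make_structure.sh" <<'EOF'
# Create the top-level directory named by the first argument.
mkdir -p "$1"
# Create one subfolder for every remaining argument.
for FOLDER in "${@:2}"
do
    mkdir -p "$1/$FOLDER"
done
EOF

bash "$workdir/make_structure.sh" "$workdir/my_project" data results "raw reads"
ls "$workdir/my_project"
```

If only the top-level name is given, "${@:2}" expands to nothing and the loop body never runs, so the script simply creates the one directory.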

  31. PRACTICAL EXAMPLE For this example, we want to use a find statement to collect all the molecule pdb files that hold our data. We then want to grab only the lines marked with an H using a grep statement. We will use a loop to process all our files without having to write repeating commands. Finally, as part of our grep statement, we will write the output to a new file for us to read.

  32. PRACTICAL EXAMPLE
         # grab pdb files from molecules folder
         FILES=$(find ~/data-shell/molecules -name "*.pdb")
         # going file by file
         for FILE in $FILES
         do
             # grab all the lines that have H in them, write them to an output file
             grep -w "H" $FILE > "$FILE.out"
         done

  33. PRACTICAL EXAMPLE Our first command, FILES=$(find ~/data-shell/molecules -name "*.pdb"), finds all files in the molecules directory whose names end in the .pdb extension and stores the list in the FILES variable. Our loop is the same as the one we used in the previous example. Our second command is the heart of the script: grep -w "H" $FILE > "$FILE.out". This command searches the file for all the lines that contain H as a whole word, then writes them to an output file named after the file we searched, with .out appended to the end.
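
To try the practical example without the workshop data, the sketch below builds a tiny stand-in. The methane.pdb file and its contents are made up for illustration, and the scratch directory replaces ~/data-shell/molecules. The find ... -print0 pipeline is a swapped-in alternative to the $FILES variable that also copes with spaces in filenames.

```bash
# Stand-in data: one small pdb-like file (made up for the demo).
workdir=$(mktemp -d)
mkdir "$workdir/molecules"
cat > "$workdir/molecules/methane.pdb" <<'EOF'
COMPND      METHANE
ATOM      1  C           1       0.257  -0.363   0.000
ATOM      2  H           1       0.257   0.727   0.000
ATOM      3  H           1       0.771  -0.727   0.890
TER       4              1
END
EOF

# For each .pdb file, keep only lines where H appears as a whole word.
find "$workdir/molecules" -name "*.pdb" -print0 |
while IFS= read -r -d '' FILE
do
    grep -w "H" "$FILE" > "$FILE.out"
done

cat "$workdir/molecules/methane.pdb.out"
```

The -w flag matters here: without it, the COMPND line would also match because METHANE contains the letter H.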

  34. THIS IS IT FOR TODAY Please see your email for information about the next meeting. Questions? Comments?
