=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
*        UNIX in "N" Lessons           *
*              Part 2                  *
*          by Charles Lin              *
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Last edited: October 14, 1999

[Please send email to: clin@umd5.umd.edu if there are errors, corrections, or any suggestions for improvement]

Copyright 1999 by Charles Lin. All rights reserved.

----------------------------------
Lesson 2.1: Background Processes
----------------------------------

Most of the time, when you run a program, you cannot enter any UNIX commands. For example, after typing a.out, you must enter inputs until the program ends. As long as the program is reading your input, you can't type ls or more or other UNIX commands.

Normally, this isn't a problem. It's probably difficult to imagine how one might be entering input into a program, and then asking to list a file at the same time. It would lead to messy looking output.

However, there are occasions where you might want to run a program, and yet still enter UNIX commands. The most common occasion is running emacs on a Sun workstation (don't do this on a PC or a Mac). When you run emacs on a Sun, it normally pops up an additional emacs window, separate from the window you used to type in commands. However, while the emacs window is active, your other window can't be used, since it is waiting for you to finish emacs.

This seems like a silly predicament since there are two windows. It seems like you should be able to use both, and, of course, you can.

    % emacs test.c &
    [1] 23320
    %

By putting & at the end, you run the program in the "background". This means that the program runs and your shell keeps running as well. You will see some funny numbers. For the time being, don't worry about what they mean (if you're curious, the number in brackets is called the "job id" and the other is called the "process id"). The prompt will be available to you again, and emacs should hopefully pop up in a separate window.
This allows you to edit the program and compile without having to close emacs, which is normally a tedious process.

Using Control-Z to suspend a process
------------------------------------
You can get almost the same effect using Ctrl-Z. Suppose you are running emacs, this time on a Mac or PC. emacs now appears in the same window as your telnet session. Using & wouldn't help since you only have one window to work with. (The & is NOT the same as in scanf.)

Type Ctrl-Z (you can use a lowercase z to type this). This will put the current process (say, emacs) in the background. At this point, emacs is not running (unlike with &, where it's still running). emacs has been "suspended" in the background. The prompt should reappear. At this point, you can, say, compile the code.

    % cc -std1 foo.c -o foo.x

Then, to get emacs back in the "foreground" (it's in the background now), type "fg", which means foreground.

    % fg

emacs will reappear, at exactly the point where you suspended it with Ctrl-Z.

UNIX Command: fg
What it does: Puts a background process (program) into the foreground
Note: Only one program can be in the foreground.

Ctrl-Z is a better way
-----------------------
In the past, when computers were much slower, and emacs was a huge program (it's still huge, but computers have gotten faster), starting emacs could mean waiting a few seconds. Bearable, but slow.

In order to start up emacs, a lot of computing resources have to be given to emacs, which takes time. When you quit emacs, those computing resources are returned to the operating system. By using Ctrl-Z to suspend emacs (so you can compile or do other things) and then fg to restore emacs, you neither give up the resources nor reclaim them, and so emacs usually comes back quite quickly, and at the last spot you were working.

Don't forget to quit
--------------------
Sometimes when using Ctrl-Z, you may forget to quit out of emacs.
You will use emacs, put it in the background, and then try to log out. The shell will often say you have "suspended processes", which is a reminder to put emacs in the foreground, and then quit emacs using C-x C-c.

You could try to log out twice in a row (the second time, the shell will simply "kill" emacs, and exit). The only problem with this approach is that your work might not be saved. It is usually better to put the process in the foreground, and exit as usual.

Using Ctrl-Z elsewhere
-----------------------
Ctrl-Z will usually suspend any running program, not just emacs (though it's better to use Ctrl-C to quit out of a program with an infinite loop). So, you can give it a try (it can't suspend telnet, though). Some programs will prevent you from using Ctrl-Z, but it works pretty well for many programs (including a.out).

----------------------------------
Lesson 2.2: Repeating commands
----------------------------------

Programming and using computers, especially in UNIX, is often aimed at the lazy. Saving a few letters of typing is seen as a great victory. Fortunately, there are ways to save some typing:

!!: repeating the last command
------------------------------
Suppose the last command you typed compiled your program, you have since fixed the code in an emacs window running in the background, and you wish to recompile. Instead of retyping cc -std1 proj3.c -o proj3.x, you can use !!.

    % !!
    cc -std1 proj3.c -o proj3.x
    %

The shell echoes (repeats) the last command typed in, then runs it. This saves you typing.

!cc: repeating the last command starting with cc
------------------------------------------------
A second way to repeat the compiling command is to do:

    % !cc
    cc -std1 proj3.c -o proj3.x
    %

An exclamation mark followed by the first one or more letters of a command repeats the most recent command that began with those letters (here, cc). Most likely, this will be the last time you tried to compile your code.
You could also do

    % !c
    cd ~jm106001
    %

However, !c repeats the most recent command starting with c, whatever it was. If that command was a cd, as above, you would change directories a second time instead of compiling. So, it's often a good idea to use two letters.

The only problem with this approach arises if you had compiled two different programs, say proj2a.c, then proj2b.c, and the latest command was compiling proj2b.c. You couldn't use !cc to recompile proj2a.c, and there isn't a convenient way around it using this technique.

History
-------

    % history
       133  13:14   cc -std1 proj2a.c -o proj2a.x
       134  14:35   cc -std1 proj2b.c -o proj2b.x
       135  14:35   cd ~qq106001/Tutorials
       136  14:35   more emacs.tutorial
       137  14:38   cd
       138  14:56   history

UNIX command: history
What it does: Lists the most recent commands you have typed.

When you type history, you will see three columns. The first column is the command number. For example, 133 means it's the 133rd command you've typed since using that window (the command number resets each time you log in). Once you see this history, you know you are on the 139th command (the 138th command was the history command, which will almost always be the last command listed, since you need to type history to see the history).

The second column is the time you typed the command, and the third column is the command you typed in.

There are two ways to repeat a command from this list. The easier way is to type ! followed by the history number. For example, to recompile proj2a.c, you could type in:

    % !133

Notice you wouldn't have been able to recompile with !cc, since that would run command 134. You could also do:

    % !-6

which says rerun the sixth previous command (if the current command is N, then it reruns command N - 6; here that is 139 - 6 = 133). However, most people don't want to compute this number, and it's not used all that often. Note that !-1 is the same as the shorter (and more commonly used) !!.
-------------------------------
Lesson 2.3: Wildcards
-------------------------------

Suppose you are in a directory that contains the following files:

    % ls
    unix.tutorial    emacs.tutorial    emacs.quick
    test.1    test.2    test.3    test.10    test.11

You would like to copy all the test files to your home directory. Normally, you would type:

    % cp test.1 ~
    % cp test.2 ~
    % cp test.3 ~
    % cp test.10 ~
    % cp test.11 ~

You can do a little better by listing every file, then the directory all the files go to:

    % cp test.1 test.2 test.3 test.10 test.11 ~

This will copy all 5 test files to ~.

Careful: If you wish to copy several files to a directory, it can only be ONE directory, and the directory must be listed last.

Of course, this could get tedious very fast. If you had one hundred test files, you might be typing forever.

The * wildcard operator
-----------------------
Wildcards are special symbols that let you name many files at once, which makes many UNIX operations quicker. The first wildcard is called "*" (star). * is a special symbol when used in a shell. It does not mean "*" (a plain asterisk). It means: match any sequence of characters in a file name.

Let's see how this works. Suppose you want to copy all files that begin with test. (including the .). You can write:

    % cp test.* ~

The shell sees the * and finds all files in the current directory that match the pattern. It translates the above command to:

    cp test.1 test.10 test.11 test.2 test.3 ~

In fact, you can use the "echo" command (another UNIX command) to see how it translates your command:

    % echo cp test.* ~

Suppose your username was qq106003. Then you might see something like:

    cp test.1 test.10 test.11 test.2 test.3 /home/fcsmc106qq/qq106003

Notice that the * is gone, and for that matter, so is ~ (it has become a full path). These have been expanded by the shell, and this is the actual UNIX command that is run.

The process of converting a pattern containing * to the list of files that match it is called "globbing". It sounds icky, and I'm not even sure where the name came from.

Let's try a few more examples:

1.
Copying all files in the current directory to your home directory.
-----------------------------------------------------------------

    % cp * ~

Note: this doesn't actually copy ALL files. Files that begin with a period (such as .cshrc or .login) are called hidden files. Those files aren't matched with *.

2. Copying all hidden files in the current working directory to your home directory.
-----------------------------------------------------------------

    % cp .* ~

Note: .* will match all files beginning with ., meaning all hidden files. However, sometimes this matching will include . (the current working directory) and .. (the parent directory). We will discuss how to deal with that in a moment.

3. Copying files that begin with emacs. to your home directory
-----------------------------------------------------------

    % cp emacs.* ~

4. Copying files that end in .tutorial to your home directory
----------------------------------------------------------

    % cp *.tutorial ~

5. Copying all files that start with test. in your home directory to a subdirectory in your home directory named Test (assuming you aren't in your home directory)
----------------------------------------------------------------

    % cp ~/test.* ~/Test

This one's a bit tricky. * will match files that are in other directories, but you need to give a path. This is why you need to write ~/test.*

Using the ? operator
--------------------
The * operator matches 0 or more characters. For example, suppose you typed the following to copy files to your home directory.

    % cp test.1* ~

This will copy all files that begin with test.1. In the previous example, this would match THREE files: test.1, test.10, and test.11. You might wonder why the pattern matches test.1. Doesn't * need to match at least one character? No, it can match 0 characters, just as it said in the first sentence of this paragraph.

Most of the time, * will be what you need to copy all the files you want. However, you may match more files than you need with *.
For example, suppose you had 200 test files from test.1 to test.200. You might want to copy only the files from test.10 to test.19. If you typed:

    % cp test.1* ~

you will not only get the files you want (test.10 to test.19) but also test.1 and test.100 through test.199. That's about one hundred files too many. * simply matched too many files.

Instead, you can use ?. ? matches exactly one character of any sort. In this case, you would write:

    % cp test.1? ~

This would match any file that began with test.1 and had exactly one more character beyond that. Thus, test.100 wouldn't match since it has TWO characters after the "test.1".

Suppose you wanted to match all files from test.100 to test.199, but not test.10 through test.19. In that case, you would type:

    % cp test.1?? ~

You can even combine ? with *. Suppose, for some unusual reason, you name all your files with the name of the file, plus the date you wrote it. Hence, you might have diary.09.19.98 or todo-list.09.12.98. Suppose you wanted to copy all files from the month of September, 1998 to a subdirectory.

    % cp *.09.??.98 Sept/

This copies any file whose name ends with .09 (September), then any two-character day, then .98 (the year 1998).

-------------------------------
Lesson 2.4: Using grep
-------------------------------

grep is a useful command when you are trying to look for a specific phrase in one of several different files, but aren't sure where it is located. It can even be useful when searching through a single file if that file is long.

grep stands for "get regular expression pronto" (or maybe it doesn't, but it's an easy mnemonic to remember; the name actually comes from g/re/p, an editor command meaning "globally search for a regular expression and print"). A regular expression is basically a pattern, or more accurately, a language for describing patterns (just as C is a language, there are other "languages"; the language of regular expressions describes patterns).

While regular expressions are a powerful way of writing patterns, they are something of an advanced topic.
This topic will be treated indirectly in the next programming course (CMSC 114), and more fully in CMSC 330, a course on programming languages. In any case, you will not use the full features of grep, at least not until you learn more about regular expressions.

Syntax: grep PATTERN FILE

This is the simplest version of the syntax. You write grep, then the desired pattern, then the name of the file where you want to match the pattern. Let's see how this might be used in practice.

Writing comments to yourself
----------------------------
Writing comments in a program isn't only for other people who might read your program at some time in the future. It might be useful to you right now.

You might comment your code for two additional reasons beyond merely annotating your code to make it easy for you and others to understand. You might want to do the following:

1. Put a comment that reminds you to fill in missing code later on
2. Put a comment that reminds you to come back later and check code you aren't sure about.

One way to do this is to use repeated marks. For example:

    Use       To indicate
    ----      ------------
    !!!       Need to deal with this
    ???       Wondering if this is correct---may need to deal
              with it after giving some thought.

Here's how it would look:

    int main()
    {
       int restaurant;
       int numOmelettes, numCoffees;
       float costOmelettes, costCoffees;

       if ( restautant == 1 )
       {
          if ( numOmelettes >= 2 || numCoffees >= 1 ) /* ??? should it be && */
          {
             totalCostOmelettes = numOmelettes * costOmelettes;
             totalCostCoffees = numCoffees * costCoffees;
             totalCost = totalCostOmelettes + totalCostCoffees;
          }
          /* !!! Missing code for numOmelettes < 2 case!  Add soon */
       }
       return 0;
    }

Let's start with something simple to search for. Suppose you're interested in finding which lines contain the variable "restautant" (which happens to be misspelled) because your compiler says that word is incorrect.
You can do a grep for that word:

    % grep restautant test.c

grep responds with:

       if ( restautant == 1 )

It finds all lines in the file that match the pattern "restautant", and prints them out.

What is grep doing?
--------------------
When you give grep a pattern, it searches the file, line by line. For each line, it checks if the pattern is in that line. For example, "restautant" is in the line that was printed. If the pattern isn't in the current line, grep doesn't print it.

Note: grep searches all of the lines of the file, and prints ALL the lines that match the pattern. It just happened that it only matched one line. If it matches no lines, grep will print nothing, and you will see the shell prompt once again.

Typing something shorter
------------------------
"restautant" is a long word to type, and so you might want to type something shorter. Since the misspelled part of the word is at the end, you might just search for "autant".

    % grep autant test.c
       if ( restautant == 1 )
    %

Again, it prints the right line, because it's unlikely that "autant" would appear anywhere else in the file.

Tip: if the word is long, type part of the word, the part that will most likely match the problem. If you had typed "rest" instead of "autant", you would have found not only the incorrectly spelled variable names, but the correctly spelled names, too! Picking the right pattern means matching the correct lines, and avoiding lines you don't want to match.

Showing line numbers
--------------------
Notice that grep only prints the matching lines. It doesn't tell you where each line is located. If your file is long enough, you might have many similar lines, and then have to search through the file to find the correct one.

A very useful option is -n.

UNIX command: grep -n PATTERN FILE
What it does: Looks for lines in the file, FILE, that match the pattern. The line number is printed out, followed by a colon, followed by the line from the file.
For example, you might write

    % grep -n autant test.c
    10:   if ( restautant == 1 )

The 10: in front of the line means grep found the pattern "autant" in line 10. You can then go into emacs and jump to line 10 (of course, you can also use emacs' own search capabilities, which are nearly as fast).

From now on, we will use the -n option to print line numbers.

Why grep doesn't automatically print line numbers
------------------------------------------------
If you're wondering why grep doesn't automatically print line numbers all the time, it's because there are many times where you want to save the result of grep to a file, and want to process that file again.

For example, suppose you had a list of employee names and the department they work in. You need to contact someone named Kim in the Marketing department. But the person who called you didn't leave a first name, so you will need to contact everyone named Kim in Marketing to find who you are supposed to talk to.

If you do a grep on Kim, you will get Kim in every department. If you do a grep on "Marketing", you will get everyone in the Marketing department. So, you can do a grep on Kim, and save that to a file using output redirection.

    % grep Kim namelist > kimlist

Then do a grep for Marketing on that list, to get only the Kims from Marketing

    % grep Marketing kimlist > finallist

Then, contact the people in the finallist.

Leaving the line numbers in might make the output kind of messy (you might have to further process the file). By avoiding line numbers, you only grab lines that were in the original file, and not the line numbers as well (try using grep with -n and saving the results to a file using output redirection).

However, since you won't be doing this kind of processing much, the -n option will be more useful to you.

Problems with spaces
--------------------
Suppose you wanted to look for all assignment statements in test.c. It may be a silly task, but let's see what happens.

    % grep -n = test.c
    10:   if ( restautant == 1 )
    12:      if ( numOmelettes >= 2 || numCoffees >= 1 ) /* ???
should it be && */
    14:         totalCostOmelettes = numOmelettes * costOmelettes;
    15:         totalCostCoffees = numCoffees * costCoffees;
    16:         totalCost = totalCostOmelettes + totalCostCoffees;

This time, it matched 5 lines. Of those lines, 10 and 12 are incorrect. In line 10, it matched the == in "restautant == 1". In line 12, it matched the = in "numOmelettes >= 2". Neither of these is an assignment statement.

So you need to adjust the pattern. Perhaps you know that you always put one blank space before the assignment operator and one after it. So you wish to search for the pattern space, followed by =, followed by space. You write it with extra spaces around the =:

    % grep -n  =  test.c
    10:   if ( restautant == 1 )
    12:      if ( numOmelettes >= 2 || numCoffees >= 1 ) /* ??? should it be && */
    14:         totalCostOmelettes = numOmelettes * costOmelettes;
    15:         totalCostCoffees = numCoffees * costCoffees;
    16:         totalCost = totalCostOmelettes + totalCostCoffees;

Again, the same lines return. Why didn't it work? Because the shell only uses spaces to separate the words of a command, and throws them away before grep ever sees the pattern. It's just like C: when you put additional spaces in a program, the compiler will still compile your program in the same way. The only place spaces matter in a C program (usually) is in strings.

It turns out that in the UNIX shell, you must also use strings to indicate that spaces are important. You need to search for the pattern " = ". The shell uses the double quotes (and often single quotes) to indicate that spaces are important. The correct way to write the pattern for a space, followed by =, followed by another space is: " = ".

    % grep -n " = " test.c
    14:         totalCostOmelettes = numOmelettes * costOmelettes;
    15:         totalCostCoffees = numCoffees * costCoffees;
    16:         totalCost = totalCostOmelettes + totalCostCoffees;

It now prints the correct lines, since == and >= won't match that pattern.

Other problems
--------------
Suppose you now wish to find comments that include ???. If you write:

    % grep -n ??? test.c
    grep: No match.

you won't be successful. Why not? The reason is...wildcards! ???
is the wildcard pattern that matches any file with exactly three characters in its name (except . as the first character, since wildcards often don't match hidden files). To solve the problem, you need to put double quotes around the ???, so "globbing" (trying to match file names) doesn't take effect.

    % grep -n "???" test.c
    12:      if ( numOmelettes >= 2 || numCoffees >= 1 ) /* ??? should it be && */

This time it works.

More problems
-------------
So far, so good. Now, you wish to find all occurrences of !!! within your files. You ask yourself whether !!! has any special meaning in UNIX. Indeed it does. The !! is used to repeat the last command. A shell will often just replace all occurrences of !! with the last command typed in. This could create a problem.

Your solution might be to put it in quotes:

    % grep -n "!!!" test.c

However, even with quotes, the shell attempts to replace the !! that appears with the last command. To get out of this mess, you must type a backslash in front of each !, so the shell interprets !!! literally.

    % grep -n "\!\!\!" test.c

In fact, once you have the backslashes, you don't need the double quotes anymore.

A lesson?
---------
One lesson might be that ??? and !!! aren't such good ways to put reminder comments in your code. However, that's only if you intend to use grep. If you use the search facilities in emacs (C-s and C-r), your search text never passes through the shell, so such problems won't occur.

You might choose to use other notes to yourself that make grep work correctly, or simply remember to add quotes and backslashes.

Using grep with multiple files
------------------------------
grep is most often used with several files at once. For example, suppose you had several project descriptions, and wanted to look up the phrase "Academic Honesty". You could do so using the wildcard operator.

Assume that the project descriptions are named proj1.descr, proj2.descr, proj3.descr. Assume they are in the same directory, and you are currently in that directory.
Since there's no reason to write out "Academic Honesty" (because it's rather long, and would need quotes so that the space is treated as a space in the pattern), you can just search for "Acad", which is close enough. It might match more than you want, but probably not much more.

    % grep Acad proj?.descr
    proj1.descr:   Academic Honesty Statement
    proj2.descr:   Academic Honesty Statement
    proj3.descr:   Academic Honesty Statement

If you use grep without -n on multiple files, you will see the name of each file, followed by a colon, followed by the line(s) in the file that match the pattern.

If you use grep WITH the -n option, you will see:

    % grep -n Acad proj?.descr
    proj1.descr:589:   Academic Honesty Statement
    proj2.descr:432:   Academic Honesty Statement
    proj3.descr:487:   Academic Honesty Statement

This time it prints the name of the file, then a colon, then the line number, then another colon, then the matching line (there can be more than one matching line per file, although in this example, there is only one).

Case sensitivity
----------------
You will notice that the pattern had to be spelled with a capital A. If you had done a search on "acad", you would only have matched lowercase "acad". That's because grep does a "case sensitive" search, meaning it cares whether the pattern has uppercase or lowercase letters.

Suppose you wanted to find all words relating to academic honesty in the projects, regardless of how they were capitalized. Then you would use the -i option.

UNIX command: grep -i PATTERN FILE
What it does: Does a case INSENSITIVE search of the pattern in the file.

I'll use a complicated example which uses -i (case insensitive) with -n (print line numbers) and write it as -in (case insensitive with line numbers). UNIX commands will often let you combine switches by writing a single dash, followed by the letters of all the switches.
Here's how it might work:

    % grep -in acad proj?.descr
    proj1.descr:589:   Academic Honesty Statement
    proj1.descr:597:   Projects are to be written INDIVIDUALLY. For academic honesty
    proj1.descr:603:   VIOLATIONS OF ACADEMIC HONESTY INCLUDE:
    proj1.descr:633:   ANY STUDENT WHO LEARNS OF AN INCIDENT OF ACADEMIC DISHONESTY TO
    proj2.descr:432:   Academic Honesty Statement
    proj2.descr:440:   Projects are to be written INDIVIDUALLY. For academic honesty
    proj2.descr:446:   VIOLATIONS OF ACADEMIC HONESTY INCLUDE:
    proj2.descr:478:   ANY STUDENT WHO LEARNS OF AN INCIDENT OF ACADEMIC DISHONESTY TO
    proj3.descr:487:   Academic Honesty Statement
    proj3.descr:493:   Projects are to be written INDIVIDUALLY. For academic honesty
    proj3.descr:498:   VIOLATIONS OF ACADEMIC HONESTY INCLUDE:
    proj3.descr:527:   ANY STUDENT WHO LEARNS OF AN INCIDENT OF ACADEMIC DISHONESTY TO

Notice that each line contains the letters "acad" in some mix of uppercase and lowercase. In some lines, it's "Academic". In others, it's "academic". In still others, it's "ACADEMIC" (editor's note: we have no affiliation with the TV show, It's Academic).

-------------------------------
Lesson 2.5: Input redirection
-------------------------------

In UNIX in N Lessons, Part 1, there was a discussion of output redirection. Perhaps it's a good idea to refresh your memory on how output redirection works.

Most programs, C programs or otherwise, have no idea they are printing to the screen. For example, when running a printf statement, all the program is "aware" of (admittedly, I'm anthropomorphizing the computer, but hopefully you will permit me that indiscretion) is that it prints to a place called "standard output".

    +--------------+    Standard Output Stream
    | Your program |------------------------->  (usually, to screen)
    +--------------+
    printf(), putchar()

When your program prints, the output is sent along an "electronic" stream called standard output. The program has no idea where this output goes. Normally, the output stream is "directed" to the screen.
However, using output redirection, you can redirect this output to a file:

    % a.out > results

The output stream is no longer sent to the screen: it has been "redirected" into a file. The file, results, now contains anything you would have printed to the screen (however, since scanf is not output, the program will still pause to wait for any input).

How input works
---------------
When your program reads input (with scanf() or getchar()), it has no idea where this input comes from. All it knows is that it gets it from an input stream called "standard input".

    +--------------+    Standard Input Stream
    | Your program |<<<-------------------------  (usually, from keyboard)
    +--------------+
    scanf(), getchar()

This input can be thought of as being stored in an "input buffer".

    +---------+---+---+---+---+---+---+---+---+
    | Index   | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    +---------+---+---+---+---+---+---+---+---+
    | Input   | 1 | 0 |\n | p | i | e |\n |   |
    +---------+---+---+---+---+---+---+---+---+

The stream puts the characters it reads into this input buffer.

Just as you can redirect output to a file, you can redirect input from a file.

    % a.out < input_file

    +--------------+    Standard Input Stream
    | Your program |<<<-------------------------  input_file (redirected)
    +--------------+
    scanf(), getchar()

Instead of the input being read in from the keyboard, the input will now be read in from a file. In this case, the file is input_file. The contents of this file will be placed into the input buffer, and scanf will process the input buffer as it always did, oblivious to the fact that the input came from a file instead of a keyboard.

How Input Redirection Is Different
----------------------------------
There are two key differences between using input redirection and typing the input from the keyboard.

The input isn't printed with input redirection
----------------------------------------------
When you use input redirection from a file, you do NOT see the contents of the file on the screen.
ALL you see is what your program prints. This can be annoying, but there doesn't seem to be any easy fix. That's why, on your project, you see the list of inputs first, and then your outputs. We use input redirection to test your programs, and input redirection won't print what's in the file.

Let's see how this might work. Suppose this is your program.

    #include <stdio.h>

    #define SIZE 20

    int main()
    {
       int arr[ SIZE ];
       int count = 0;

       /* Read in first input */
       printf( "Enter: " );
       scanf( "%d", &arr[ count ] );

       while ( arr[ count ] != -1 )  /* Stop when -1 is read in */
       {
          /* Echo back the input */
          printf( "Just read in: %d\n", arr[ count ] );
          count++;  /* Increment, for next array element */

          /* Prompt for, then read in, the next input */
          printf( "Enter: " );
          scanf( "%d", &arr[ count ] );
       }
       return 0;
    }

Suppose you had a file called test.1, and it contained:

    10
    3
    9
    -1

You want to use input redirection:

    % a.out < test.1

You will see output like:

    Enter: Just read in: 10
    Enter: Just read in: 3
    Enter: Just read in: 9
    Enter:

Notice that none of the input values from the file appear. The 10, 3, and 9 that you see are from the printf in the loop. If you had typed the input by hand, you would see your input as well as the output.

    Enter: 10
    Just read in: 10
    Enter: 3
    Just read in: 3
    Enter: 9
    Just read in: 9
    Enter: -1

It's annoying, but you can usually deal with it (and by using test files, you don't have to keep typing things over and over).

If you were wondering why you had to print all those blank lines in the project description, there were two reasons. First, because input redirection ONLY prints out what the program prints (nothing from the input file is printed). This means the newlines you would have typed to enter the input DON'T show up. Second, to make the output look a little nicer.

The input file has an end-of-file
---------------------------------
When you are typing input from a keyboard, it's assumed that you can keep typing input forever (or at least, until the program stops). In general, you can't run out of input.
The program simply waits for more input, if it needs it.

However, when you use input redirection, the file is always finite, and can run out of input. Your program, even if it expects more input, will not get any more. The program will simply conclude (possibly with garbage data being "read" for the remaining inputs).

What happened? You reached the end of the file.

A common misconception: Thinking the end of file is a character
---------------------------------------------------------------
There is NO end of file character in the ASCII code. (There is an end of line character, but that's really just \n). Why isn't there an end of file character? It would simply lead to too many errors. What would happen if you accidentally typed in this end of file character? How would it know the true end of file?

Instead, UNIX keeps track of how many characters the file has, and when it reaches the end of that file, it can alert the scanf (now this can be done by sending a special character, or some other means).

Be careful: No end of file characters exist in files!

How to check for the end of file
--------------------------------
There are two ways to check for the end of file.

Method 1: using scanf
---------------------
scanf is a function which actually does two things:

(1) reads in the value from standard input and places it into variables.

    Example: scanf( "%d%d", &num1, &num2 );

    values are stored into num1 and num2.

(2) Determines how many inputs were successfully read in, and "returns" that value.

    numRead = scanf( "%d%d", &num1, &num2 );
    printf( "%d\n", num1 );     /* prints 10 */
    printf( "%d\n", num2 );     /* prints -1 */
    printf( "%d\n", numRead );  /* prints 2 */

Suppose the user typed in 10 and -1. The 10 is stored in num1, and -1 is stored in num2. numRead, which is the result of the scanf, is the number of items read in by scanf. In this case, there were two numbers read in, so numRead gets 2.
If you were reading an input file (through input redirection) that contained just one number, say 10, then num1 would get 10, while num2 would get nothing (since there was nothing left to read) and would retain its previous value. numRead would then be 1.

If scanf tries to read in two numbers, and there are no more numbers to be read BECAUSE you've already read all the input in the file, i.e., you are at the end of the file, then scanf will return a special code called EOF. Again, EOF is NOT a character, simply a special number that you can test to determine if you are at the end of file.

   numRead = scanf( "%d%d", &num1, &num2 );
   if ( numRead == EOF )
      printf( "I am at the end of the file\n" );

Method 2: using getchar()
-------------------------
Normally,

   ch = getchar();

is the same as:

   scanf( "%c", &ch );

The only time they are different is when the end of file occurs.

Suppose you are reading a file a character at a time, and there are no more characters left (meaning, you have reached the end of file). At that point, getchar() will return back EOF as a special code (not as a character read in from the file).

   ch = getchar();
   if ( ch == EOF )
      printf( "Reached end of file\n" );

Technically, this only works if you have signed characters (see Chapter 7). getchar() really returns back an int value, which should be cast as a character when needed.

   int ch;  /* an int, not a char */

   ch = getchar();   /* fine, since getchar() returns an int */
   if ( ch == EOF )  /* fine again, since EOF is an int */
      printf( "reached EOF\n" );
   else
      printf( "read in '%c'\n", (char) ch );  /* should cast ch to char */

The above code is the "safer" way to do things (will work on more compilers/computers), but many computers will also allow you to declare ch as a char, and test ch against EOF, as in the previous example.

Even though getchar() can return EOF as its value (when it reaches the end of file), scanf never stores EOF into the variable it is reading into.
As many times as you do the following:

   do {
      scanf( "%c", &ch );
   } while ( ch != EOF );

ch will never be assigned EOF (well, that's not always true--sometimes it is, but it's nothing you should rely on). The proper way to test for EOF with scanf is to use the return value, as in:

   do {
      numRead = scanf( "%c", &ch );  /* use numRead */
   } while ( numRead != EOF );       /* test with numRead */

--------------------------------
Lesson 2.6: Changing your shell
--------------------------------
When you enter in commands, a program called the "shell" reads your commands, does some processing with the commands (such as expanding * and ?), and passes that command onto UNIX to handle. The "shell" is kind of a "middleman" allowing you to type commands easily, before passing that command to UNIX.

In many UNIX systems, csh (C shell) is the default shell. That is, it's the shell you normally use. csh gives you some nice capabilities. However, tcsh (pronounced T shell or T-C shell, or even Tom's C shell) is a more versatile shell.

To change the shell on your OIT class account, you need to find where tcsh is located:

   % which tcsh
   /usr/local/bin/tcsh

   UNIX Command: which
   What it does: Tells you the absolute path of the command. Many UNIX commands are really programs (just like a.out, proj1.x) which are stored in some directory (exceptions are commands built into the shell itself, such as cd).

You can change your shell by typing chsh, which means change shell (similar to chfn, which means change finger information).

   UNIX Command: chsh
   What it does: Allows you to change your login shell

   % chsh
   Modifying DCE registry
   Old DCE shell: /usr/bin/csh
   New DCE shell:

It will ask you for your new shell. Just enter the path you saw before and hit return.

   Modifying DCE registry
   Old DCE shell: /usr/bin/csh
   New DCE shell: /usr/local/bin/tcsh

If you log out and log back in, you should see a new prompt (each shell often has a different prompt).
In tcsh, the default prompt is the greater-than sign (">") rather than the percent sign in csh.

   >

New features of tcsh
--------------------
All the features of csh are still available in tcsh. tcsh has a few other features though. There are two very nice features:

Going back to previous commands and editing
--------------------------------------------
   C-p   Go to previous command
   C-n   Go to next command
   C-a   Go to beginning of line
   C-e   Go to end of line
   C-d   Delete character underneath cursor
   C-b   Go back one character
   C-f   Go forward one character

If you think these commands look a lot like emacs commands, you're absolutely right. And except for C-n and C-p, the commands mean the same thing. Since shells only process one line at a time, C-n and C-p allow you to go back to earlier commands in your history of commands.

Of course, you can still use !! and !cc as before, but now you have the flexibility of changing the command a little, without retyping everything. The arrow keys should also still work.

Tab completion
--------------
Suppose you have two files, named emacs.quick and emacs.tutorial, in your current working directory. You want to look at emacs.tutorial. However, you feel lazy because there are so many characters to type. You can use TAB completion.

   % more e

TAB completion attempts to complete both commands and files. After typing e, hit TAB. Since both files start with emacs., it will fill this in for you:

   % more emacs.

Then, type in the t

   % more emacs.t

and hit TAB once again. Since there is only one file that starts with emacs.t, it will complete the rest of the filename.

   % more emacs.tutorial

Then, hit return to view.

It turns out that csh has some limited forms of completion too (not nearly as versatile as tcsh, but it works fine). Type ESC to do ESC completion (which is basically the same as TAB completion).
Problems with tcsh
------------------
tcsh has some problems running scripts that are protected in certain fashions, and has some problems with path completions. For this course, you should run:

   % ~jm106001/tcsh_patch

once, and programs such as submit should work again.

-----------------
Lesson 2.7: Pipes
-----------------
Understanding pipes isn't so hard once you understand how output and input redirection work.

Recall the picture of output redirection

   +--------------+     Standard Output Stream
   | Your program |--------------------------->>> output_file (redirected)
   +--------------+
   printf(), putchar(), puts()

Your program generates output using some function, such as printf. This output is sent to the output stream. This output stream is redirected to a file.

Recall the picture of input redirection

   +--------------+     Standard Input Stream
   | Your program |<<<---------------------------- input_file (redirected)
   +--------------+
   scanf(), getchar(), gets()

Your program reads input using some function, such as scanf. This input is read from an input stream, which gets its characters from an input file.

Scenario
--------
Suppose you wanted the output of some other program fed into your program. For example, suppose you have two files that you want to make into one file. One way to do this is to use "cat". Recall "cat" not only displays the contents of one file to the screen, it also concatenates the contents of two or more files to the screen (leaving the original files untouched).

   % cat input.1 input.2

This will print the contents of input.1, followed by the contents of input.2.

Now, suppose you wanted that combined output to be the input of your program, runit.x. There's a "slow way" to do this.

   1. Save "cat input.1 input.2" to a file using output redirection.
   2. Use input redirection to read in the file to your program.

For example,

   % cat input.1 input.2 > combined
   % runit.x < combined

However, it seems like a rather long process.
You must create a new file and THEN feed that file into your program. Isn't there some way to redirect the output of "cat" into your program?

However, if you write the following, it won't work.

   % cat input.1 input.2 > runit.x

Why not? Output redirection expects the name of a file to write to after the > sign. Here it is given the name of an executable, and the shell does not run it. Instead, it assumes you are attempting to save the output of the cat operation to a file called runit.x.

There is a solution: pipes
--------------------------
The solution is to use a pipe. The pipe operator is written using a single vertical bar, as in |. Here's how to use it:

   % cat input.1 input.2 | runit.x

Semantics of <prog1> | <prog2>
------------------------------
Suppose you have two programs, <prog1> and <prog2>. The pipe operator connects the standard output of <prog1> to the standard input of <prog2>.

This means that <prog1>'s output, which was originally printing to the screen, is now being fed directly to <prog2>'s standard input. Also, <prog2>, which had been receiving input from standard input (the keyboard), is now receiving its input from <prog1>'s output.

<prog2>'s output is still being sent to its own standard output, and since _its_ standard output hasn't been redirected, its output is sent to the screen. So, what you see on the screen is <prog2>'s output.

So,

   % cat input.1 input.2 | runit.x

takes the output that would have been created by:

   cat input.1 input.2

and feeds it as input to:

   runit.x

When the command to the left of the pipe operator (in this case, cat input.1 input.2) completes, the "file" sent to runit.x is terminated, and thus runit.x can check for end of file, just as it could if its input had been redirected from a file.

Summary
-------
Assume you just ran:

   <prog1> | <prog2>

All the output generated by <prog1> is put into an input buffer. <prog2> will read all its input from the input buffer, which has characters placed in there by <prog1>.
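Because a pipe simply takes the place of standard input, a program on the right side of a pipe can use the same end-of-file test it would use with input redirection. Here is a minimal sketch, assuming a hypothetical helper named count_lines (it is not part of the lesson). It takes a FILE * so that it can be handed stdin, no matter whether stdin is fed by the keyboard, a redirected file, or a pipe.

```c
#include <stdio.h>

/* Hypothetical helper, not from the lesson: reads the stream `in`
   one character at a time until end of file, and returns how many
   newline characters were seen along the way. */
long count_lines( FILE *in )
{
   int ch;          /* an int, not a char, so EOF can be told apart */
   long lines = 0;

   while ( ( ch = getc( in ) ) != EOF )  /* stop at end of file */
   {
      if ( ch == '\n' )
         lines++;
   }
   return lines;
}
```

A main() that prints count_lines( stdin ), compiled to runit.x, could then be run as cat input.1 input.2 | runit.x. The loop ends once cat has finished and everything it wrote has been read, at which point getc returns EOF.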