|
|
c m s c 214
s p r i n g 2 0 0 2 |
© 2001 by Charles Lin. All rights reserved. You must receive explicit written permission to copy information on this webpage.
Note: this was written for the g++ compiler. You should make appropriate changes so your code compiles with the cxx compiler.
With few exceptions, most .cpp files have a corresponding .h file. A .cpp usually consists of:
The corresponding .h file contains
For example, suppose you wrote a Vector class. You would have a .h file which included the class declaration. Suppose you needed a vector in a MovieTheater class. Then, you would #include "Vector.h.
The preprocessor will "insert" the Vector.h file into a copy of the MovieTheater.cpp source code. After the preprocessing phase has completed, the modified copy is sent to the compiler.
What information does the compiler have about Vector when compiling MovieTheater.cpp? Itn only knows the data members of Vector.h and the methods of Vector.h. It does NOT know how the methods are implemented, since this information is not stored in the .h file.
Fortunately, that's all it needs to know to create an object file. To create an object, you need to know how big it is, and what the data members are. You get that information from Vector.h.
The compiler also need to know what are valid methods of Vector, what the parameters are and the return types. Again, this information is also in Vector.h. This is necessary to make sure that methods called on Vector objects are called in a valid manner.
Surprisingly enough, the compiler doesn't need to know how any of Vector's methods are implemented. When compiling MovieTheatern.cpp, the compiler never looks at Vector.cpp.
However, eventually, if the MovieTheater class is to run, there must be a time when the object code (i.e. the .o file) for Vector is integrated with the object code for MovieTheater (and whatever object file contains main()). That process occurs later on, during the linking phase. We'll talk about linking momentarily.
When you compile MovieTheater.cpp, you only create a MovieTheater.o file. This is called an object file, and is also called a .o file. The object file doesn't have enough information to run by itself. It is not an executable.
Typically, when you have many .cpp files, you will end up
creating a corresponding .o file. In g++ and cxx (and most
UNIX C++ compilers), this is done with the -c option.
g++ -Wall -c MovieTheater.cpp
cxx -w0 -std strict_ansi -c MovieTheater.cpp
The -c option can appear anywhere in the command.
When the compiler sees this option, it assumes you will create a .o file from the .cpp file. When compiling a .cpp file, the compiler does not require a main() function to appear in the file, although you can have a main(). Also, any implementations of classes that are NOT part of the .cpp file, but appear in the #include files are not compiled. For example, if Movie.cpp includes Vector.h, the code for Vector.cpp is not compiled when Movie.cpp is being compiled.
Instead, the compiler merely checks that the Vector methods called correctly. It can check that the calls are correct, even without the implementation. All this information that is needed by the compiler appears in the .h files.
"When you compile a .cpp file using -c option, it compiles all other .cpp files mentioned in the #include .h files"
Untrue. It only compiles the single .cpp file being referred to, and only refers to the .h for purposes of creating objects (making sure it's the correct size, etc) and checking that methods are being called correctly.
Even if you remove the -c option, it does not look for all other .cpp files. In fact, if you remove the option, it assumes that the .cpp file is a stand alone file, with everything defined, and includes a main().
"If you write something like:That's not quite true. The compiler compiles each .cpp file separately. If you don't use the -c option, then effectively, it creates internal versions of .o files (they aren't made into real files), and then linked.g++ -Wall foo.cpp bar.cpp baz.cppthe compiler will look at "bar.cpp" and "baz.cpp" when compiling "foo.cpp".
However, this process is basically the same as if you had compiled each .cpp file individually and then linked together at the end.
Back in the old days, a makefile was very convenient. Compiling was slow, and therefore, having to avoid recompiling every single file meant saving a lot of time.
Although it's much faster to compile now, it's still not very fast. If you begin to work on projects with hundreds of files, where recompiling the entire code can take many hours, you will still want a makefile to avoid having to recompile everything.
A makefile is based on a very simple concept. A makefile typically consists of many entries. Each entry has:
Movie.o: Movie.cpp Movie.h Vector.h g++ -Wall -c Movie.cppThe basic syntax of an entry looks like:
<target>: [ <dependency > ]* [ <TAB> <command> <endl> ]+A makefile typically consists of many entries. A target appears before the left of a colon. A target is usually a file, but not always. The entry tells you how to construct a target, and more importantly, when to construct the target.
After the target and the colon, you list out the dependencies. The dependencies are (usually) a list of files which the target file depends on. The dependencies are used by the makefile to detemine when when the target needs to be reconstructed (usually by recompiling code). The make utility decides when to reconstruct the target based on something very simple: a timestamp.
Each file has a timestamp that indicates when the file was last modified. make looks at the timestamp of the file, and then the timestamp of the dependencies (which are also files). The idea is this. If the dependencies have changed, then perhaps the target needs to be updated. More precisely, if the dependent files have a more recent timestamp than the target, then the command lines are run. If written correctly, the commands should update the target.
The updates are done recursively. When checking the dependencies, the makefiles checks if any of the dependent files are targets, and if so, it looks at those dependencies, and, based on the timestamp, it may run commands to update the dependent files, and then update the target.
Here's an example:
Movie.o: Movie.cpp Movie.h Vector.h
g++ -Wall -c Movie.cpp
The target is Movie.o. The dependencies of a .o
file are almost always:
If you recall, when you compile a .cpp file to an object file, the compiler never needs to refer to other .cpp files. That's the benefit of separate compilation. You only need to consider the .cpp file and other .h files. That way, if the .h files that a .cpp depend on don't change, but the corresponding .cpp files do, there's no need to recompile the original .cpp file.
Let's illustrate with an example. Suppose Movie.cpp depends on Vector.h. Perhaps it contains a Vector object in the class. Now, if Vector.cpp changes but Vector.h does not, then there's no need to recompile Movie.cpp. Movie.cpp doesn't need to see the code for Vector.cpp. That's handled later at the linking phase.
That should be easy. Look in the corresponding .h and .cpp files and look for any #include files in double quotes (the ones in angle brackets ought to be files like "iostream", etc., which basically never change), and put those down. Many people forget to put these dependencies down, and therefore, they don't recompile the object file when they should.
However, it's a little more complicated than that. Suppose the Vector.h depends on other .h files. You ought to add the dependencies there too. Why? Suppose Vector has a Foo object as a data member. Now, Movie might include a Vector object as a data member. Even if Vector.h doesn't change, Foo might change (let's say a new data member was added). This makes Foo bigger (since it has a new data member), which makes Vector bigger, which makes Movie bigger. Thus, one needs to recompile Movie to reflect that change.
It can be tough to track all the .h dependencies, but you
can write the dependencies in the makefile.
Vector.h: Foo.h
In this case, you don't need any targets. It just allows the
compiler to determine that a dependency exists (dependencies are
transitive, that is, if X is dependent on Y, and Y is dependent on
Z, then X is dependent on Z).
As it turns out, the answer is that this can be done automatically (i.e., by a program). That program is called makedepend. However, we won't discuss makedepend in this tutorial.
This is, alas, one of the more common errors in making a makefile. The problem, unfortunately, lies not with the user of the makefile, but the person who originally programmed it. That person must have thought that it would be convenient to have each command start with a TAB, because the command would be indented in.
However, the person failed to realize that several spaces look like a TAB, and even a few spaces followed by a TAB, look like a TAB. It's just too hard to see any mistakes, and the person should never have depended on TABs.
Alas, like many UNIX utilities, the original version has managed to hang around forever, despite its flaws. The one advantage of Microsoft programs is that they are always being updated, and that it's hard to find programs that are nearly the same as the original. On the other hand, tools like make and tar with obvious ways to make it better still exists in pretty much the old format.
As a user, you must be careful with TABs.
Ah, back to business. The commands are supposed to generate
a new version of the target. For example,
Movie.o: Movie.cpp Movie.h Vector.h
g++ -Wall -c Movie.cpp
The g++ command does indeed create a Movie.o when done.
Keep that in mind. The commands you list should generate the target.
You don't want to have a command like:
Movie.o: Movie.cpp Movie.h Vector.h
g++ -Wall -c Vector.cpp
This command does not produce Movie.o as a result. This
isn't a good command.
p1 : MovieList.o Movie.o NameList.o Name.o Iterator.o
g++ -Wall MoveList.o Movie.o NameList.o Name.o Iterator.o -o p1
If p1 is the first target in your makefile, then when you
type "make", it will run the commands for p1. "make" always
runs the commands from the first target in the makefile. It
won't run any of the commands from other targets.
Notice that the command to compile will create an executable called p1. It uses the -o option to create an executable with a name other than a.out.
Use MACROS!!!! (We mean it).
Macros allow you to define "variables" which are substituted in.
Here's an example of how to use them:
OBJS = MovieList.o Movie.o NameList.o Name.o Iterator.o
CC = g++
DEBUG = -g
CFLAGS = -Wall -c $(DEBUG)
LFLAGS = -Wall $(DEBUG)
p1 : $(OBJS)
$(CC) $(LFLAGS) $(OBJS) -o p1
Macros are usually put at the beginning of a makefile.
A macro has the following syntax
<macro_name> = <macro_string>
On the left hand side of the equal sign is the macro name.
Basically, you should follow the convention of only using upper case
letters and underscores.
On the right hand side is some string, which will be substituted. A macro is very similar to a #define in C/C++. It allows for substitution. It looks throughout the file for anything starting with a dollar sign and surrounded by either parentheses or braces.
In particular, it looks for things taht look like $(CC) or ${CC}--- both versions are acceptable, although I will only use the one with parentheses.
If the macro string is too long for one line, you can put a backslash, followed immediately by a return (newline). "make" will assume that the next line is a continuation of the line that has the backslash followed by the newline.
In fact, you can always use the backslash followed by a newline to indicate a continuing line. It doesn't just appear in macros. It can also apply to dependencies, commands, etc.
You use macros in makefiles for the same reason you define constants in programs. It's easier to update the files and make it more flexible.
There are three popular dummy targets used in makefiles.
By removing all such files, you force the makefile to recompile all .o files, thus guaranteeing the most recent rebuild. Of course, this is not something that should be necessary if you've created a makefile correctly. It defeats the purpose of creating a makefile in the first place!
Nevertheless, if you find that the code isn't working the way
it should, and you suspect it's due to a buggy makefile, you can
run make clean. This is how it usually looks:
clean:
\rm *.o *~ p1
The backslash prevents "rm" from complaining if you remove
the files (especially, if it's been aliased with the -i
option). Normally, you remove all .o files, all
files ending in ~ (which are emacs backup files), and
the name of the executable (or executables, as the case may
be). In the example above, the executable is called p1.
Notice that "clean" does not cause any files to be created. In particular, clean is not created. There are also no dependencies. Basically, make notices that there is no such file as clean and begins to run the commands in the command lines to create the file (which won't be created).
The effect is that running make clean causes the commands to always run. That's fine, because it's used to remove files, we don't want. Usually, you run make after running make clean to rebuild the executable.
Here's an example:
tar:
tar cfv p1.tar Movie.h Movie.cpp Name.h Name.cpp NameList.h \
NameList.cpp Iterator.cpp Iterator.h
Again, there are no dependencies. However, this time a file
does get created (p1.tar). You could make p1.tar
the target if you want, but this way, it always recreates
p1.tar since tar should not be a file that
exists in your directory.
You can write
all: p1 p2 p3
p1: Foo.o main1.o
g++ -Wall Foo.o main1.o -o p1
p2: Bar.o main2.o
g++ -Wall Bar.o main2.o -o p2
p3: Baz.o main3.o
g++ -Wall Baz.o main3.o -o p3
Notice that the all target has no command lines. However,
it lists the dependencies as p1, p2, and p3. It attempts to build
the most up-to-date versions of those files (if they are up-to-date,
it will let you know). Having an all target is convenient for
making a bunch of executables, without having to type make
on each target separately.
Unfortuantely, it's hard to see TABs. You need to go into the editor, and go to the beginning of command lines, and move the cursor forward. If it jumps several spaces, you know there is a TAB at the beginning. If it doesn't, you know there isn't.
If you have a command that continues on the next line with a backslash (followed immediately by a newline), then the continuation line does not need to start with a TAB (it makes sense, after all, it is treated as a continuation, not as a new command).
Another common error is not hitting return just after the backslash, should you choose to use it.
Of course, another error is not getting the dependencies correct, but that's not exactly a makefile error, per se.
OBJS = MovieList.o Movie.o NameList.o Name.o Iterator.o
CC = g++
DEBUG = -g
CFLAGS = -Wall -c $(DEBUG)
LFLAGS = -Wall $(DEBUG)
p1 : $(OBJS)
$(CC) $(LFLAGS) $(OBJS) -o p1
MovieList.o : MovieList.h MovieList.cpp Movie.h NameList.h Name.h Iterator.h
$(CC) $(CFLAGS) MovieList.cpp
Movie.o : Movie.h Movie.cpp NameList.h Name.h
$(CC) $(CFLAGS) Movie.cpp
NameList.o : NameList.h NameList.cpp Name.h
$(CC) $(CFLAGS) NameList.cpp
Name.o : Name.h Name.cpp
$(CC) $(CFLAGS) Name.cpp
Iterator.o : Iterator.h Iterator.cpp MovieList.h
$(CC) $(CFLAGS) Iterator.cpp
clean:
\rm *.o *~ p1
tar:
tar cfv p1.tar Movie.h Movie.cpp Name.h Name.cpp NameList.h \
NameList.cpp Iterator.cpp Iterator.h