Home arrow Virus Trojans Worms arrow The C in C++

Language Translator

Hacking Zone

Hacking Tools
Attacking

Configure Windows

Windows Configuration

Novels

Mix Novels

Human Personality

Body Language
The C in C++ PDF Print E-mail
Written by Hemanshu Patel   
Saturday, 20 October 2007
Article Index
The C in C++
Page 2
Page 3
Page 4
Page 5
Page 6
Page 7
Page 8


Introduction to pointers

Whenever you run a program, it is first loaded (typically from disk) into the computer’s memory. Thus, all elements of your program are located somewhere in memory. Memory is typically laid out as a sequential series of memory locations; we usually refer to these locations as eight-bit bytes but actually the size of each space depends on the architecture of the particular machine and is usually called that machine’s word size. Each space can be uniquely distinguished from all other spaces by its address. For the purposes of this discussion, we’ll just say that all machines use bytes that have sequential addresses starting at zero and going up to however much memory you have in your computer.

Since your program lives in memory while it’s being run, every element of your program has an address. Suppose we start with a simple program:

//: C03:YourPets1.cpp
#include <iostream>
using namespace std;

int dog, cat, bird, fish;

void f(int pet) {
cout << "pet id number: " << pet << endl;
}

int main() {
int i, j, k;
} ///:~

Each of the elements in this program has a location in storage when the program is running. Even the function occupies storage. As you’ll see, it turns out that what an element is and the way you define it usually determines the area of memory where that element is placed.

There is an operator in C and C++ that will tell you the address of an element. This is the  ‘&’ operator. All you do is precede the identifier name with ‘&’ and it will produce the address of that identifier. YourPets1.cpp can be modified to print out the addresses of all its elements, like this:

//: C03:YourPets2.cpp
#include <iostream>
using namespace std;

int dog, cat, bird, fish;

void f(int pet) {
cout << "pet id number: " << pet << endl;
}

int main() {
int i, j, k;
cout << "f(): " << (long)&f << endl;
cout << "dog: " << (long)&dog << endl;
cout << "cat: " << (long)&cat << endl;
cout << "bird: " << (long)&bird << endl;
cout << "fish: " << (long)&fish << endl;
cout << "i: " << (long)&i << endl;
cout << "j: " << (long)&j << endl;
cout << "k: " << (long)&k << endl;
} ///:~

The (long) is a cast. It says “Don’t treat this as if it’s normal type, instead treat it as a long.” The cast isn’t essential, but if it wasn’t there, the addresses would have been printed out in hexadecimal instead, so casting to a long makes things a little more readable.

The results of this program will vary depending on your computer, OS, and all sorts of other factors, but it will always give you some interesting insights. For a single run on my computer, the results looked like this:

f(): 4198736
dog: 4323632
cat: 4323636
bird: 4323640
fish: 4323644
i: 6684160
j: 6684156
k: 6684152

You can see how the variables that are defined inside main( ) are in a different area than the variables defined outside of main( ); you’ll understand why as you learn more about the language. Also, f( ) appears to be in its own area; code is typically separated from data in memory.

Another interesting thing to note is that variables defined one right after the other appear to be placed contiguously in memory. They are separated by the number of bytes that are required by their data type. Here, the only data type used is int, and cat is four bytes away from dog, bird is four bytes away from cat, etc. So it would appear that, on this machine, an int is four bytes long.

Other than this interesting experiment showing how memory is mapped out, what can you do with an address? The most important thing you can do is store it inside another variable for later use. C and C++ have a special type of variable that holds an address. This variable is called a pointer.

The operator that defines a pointer is the same as the one used for multiplication: ‘*’. The compiler knows that it isn’t multiplication because of the context in which it is used, as you will see.

When you define a pointer, you must specify the type of variable it points to. You start out by giving the type name, then instead of immediately giving an identifier for the variable, you say “Wait, it’s a pointer” by inserting a star between the type and the identifier. So a pointer to an int looks like this:

int* ip; // ip points to an int variable

The association of the ‘*’ with the type looks sensible and reads easily, but it can actually be a bit deceiving. Your inclination might be to say “intpointer” as if it is a single discrete type. However, with an int or other basic data type, it’s possible to say:

int a, b, c;

whereas with a pointer, you’d like to say:

int* ipa, ipb, ipc;

C syntax (and by inheritance, C++ syntax) does not allow such sensible expressions. In the definitions above, only ipa is a pointer, but ipb and ipc are ordinary ints (you can say that “* binds more tightly to the identifier”). Consequently, the best results can be achieved by using only one definition per line; you still get the sensible syntax without the confusion:

int* ipa;
int* ipb;
int* ipc;

Since a general guideline for C++ programming is that you should always initialize a variable at the point of definition, this form actually works better. For example, the variables above are not initialized to any particular value; they hold garbage. It’s much better to say something like:

int a = 47;
int* ipa = &a;

Now both a and ipa have been initialized, and ipa holds the address of a.

Once you have an initialized pointer, the most basic thing you can do with it is to use it to modify the value it points to. To access a variable through a pointer, you dereference the pointer using the same operator that you used to define it, like this:

*ipa = 100;

Now a contains the value 100 instead of 47.

These are the basics of pointers: you can hold an address, and you can use that address to modify the original variable. But the question still remains: why do you want to modify one variable using another variable as a proxy?

For this introductory view of pointers, we can put the answer into two broad categories:

  1. To change “outside objects” from within a function. This is perhaps the most basic use of pointers, and it will be examined here.
  2. To achieve many other clever programming techniques, which you’ll learn about in portions of the rest of the book.

 Modifying the outside object

Ordinarily, when you pass an argument to a function, a copy of that argument is made inside the function. This is referred to as pass-by-value. You can see the effect of pass-by-value in the following program:

//: C03:PassByValue.cpp
#include <iostream>
using namespace std;

void f(int a) {
cout << "a = " << a << endl;
a = 5;
cout << "a = " << a << endl;
}

int main() {
int x = 47;
cout << "x = " << x << endl;
f(x);
cout << "x = " << x << endl;
} ///:~

In f( ), a is a local variable, so it exists only for the duration of the function call to f( ). Because it’s a function argument, the value of a is initialized by the arguments that are passed when the function is called; in main( ) the argument is x, which has a value of 47, so this value is copied into a when f( ) is called.

When you run this program you’ll see:

x = 47
a = 47
a = 5
x = 47

Initially, of course, x is 47. When f( ) is called, temporary space is created to hold the variable a for the duration of the function call, and a is initialized by copying the value of x, which is verified by printing it out. Of course, you can change the value of a and show that it is changed. But when f( ) is completed, the temporary space that was created for a disappears, and we see that the only connection that ever existed between a and x happened when the value of x was copied into a.

When you’re inside f( ), x is the outside object (my terminology), and changing the local variable does not affect the outside object, naturally enough, since they are two separate locations in storage. But what if you do want to modify the outside object? This is where pointers come in handy. In a sense, a pointer is an alias for another variable. So if we pass a pointer into a function instead of an ordinary value, we are actually passing an alias to the outside object, enabling the function to modify that outside object, like this:

//: C03:PassAddress.cpp
#include <iostream>
using namespace std;

void f(int* p) {
cout << "p = " << p << endl;
cout << "*p = " << *p << endl;
*p = 5;
cout << "p = " << p << endl;
}

int main() {
int x = 47;
cout << "x = " << x << endl;
cout << "&x = " << &x << endl;
f(&x);
cout << "x = " << x << endl;
} ///:~

Now f( ) takes a pointer as an argument and dereferences the pointer during assignment, and this causes the outside object x to be modified. The output is:

x = 47
&x = 0065FE00
p = 0065FE00
*p = 47
p = 0065FE00
x = 5

Notice that the value contained in p is the same as the address of x – the pointer p does indeed point to x. If that isn’t convincing enough, when p is dereferenced to assign the value 5, we see that the value of x is now changed to 5 as well.

Thus, passing a pointer into a function will allow that function to modify the outside object. You’ll see plenty of other uses for pointers later, but this is arguably the most basic and possibly the most common use.

Introduction to C++ references

Pointers work roughly the same in C and in C++, but C++ adds an additional way to pass an address into a function. This is pass-by-reference and it exists in several other programming languages so it was not a C++ invention.

Your initial perception of references may be that they are unnecessary, that you could write all your programs without references. In general, this is true, with the exception of a few important places that you’ll learn about later in the book. You’ll also learn more about references later, but the basic idea is the same as the demonstration of pointer use above: you can pass the address of an argument using a reference. The difference between references and pointers is that calling a function that takes references is cleaner, syntactically, than calling a function that takes pointers (and it is exactly this syntactic difference that makes references essential in certain situations). If PassAddress.cpp is modified to use references, you can see the difference in the function call in main( ):

//: C03:PassReference.cpp
#include <iostream>
using namespace std;

void f(int& r) {
cout << "r = " << r << endl;
cout << "&r = " << &r << endl;
r = 5;
cout << "r = " << r << endl;
}

int main() {
int x = 47;
cout << "x = " << x << endl;
cout << "&x = " << &x << endl;
f(x); // Looks like pass-by-value,
// is actually pass by reference
cout << "x = " << x << endl;
} ///:~

In f( )’s argument list, instead of saying int* to pass a pointer, you say int& to pass a reference. Inside f( ), if you just say ‘r’ (which would produce the address if r were a pointer) you get the value in the variable that r references. If you assign to r, you actually assign to the variable that r references. In fact, the only way to get the address that’s held inside r is with the ‘&’ operator.

In main( ), you can see the key effect of references in the syntax of the call to f( ), which is just f(x). Even though this looks like an ordinary pass-by-value, the effect of the reference is that it actually takes the address and passes it in, rather than making a copy of the value. The output is:

x = 47
&x = 0065FE00
r = 47
&r = 0065FE00
r = 5
x = 5

So you can see that pass-by-reference allows a function to modify the outside object, just like passing a pointer does (you can also observe that the reference obscures the fact that an address is being passed; this will be examined later in the book). Thus, for this simple introduction you can assume that references are just a syntactically different way (sometimes referred to as “syntactic sugar”) to accomplish the same thing that pointers do: allow functions to change outside objects.

Pointers and references as modifiers

So far, you’ve seen the basic data types char, int, float, and double, along with the specifiers signed, unsigned, short, and long, which can be used with the basic data types in almost any combination. Now we’ve added pointers and references that are orthogonal to the basic data types and specifiers, so the possible combinations have just tripled:

//: C03:AllDefinitions.cpp
// All possible combinations of basic data types,
// specifiers, pointers and references
#include <iostream>
using namespace std;

void f1(char c, int i, float f, double d);
void f2(short int si, long int li, long double ld);
void f3(unsigned char uc, unsigned int ui,
unsigned short int usi, unsigned long int uli);
void f4(char* cp, int* ip, float* fp, double* dp);
void f5(short int* sip, long int* lip,
long double* ldp);
void f6(unsigned char* ucp, unsigned int* uip,
unsigned short int* usip,
unsigned long int* ulip);
void f7(char& cr, int& ir, float& fr, double& dr);
void f8(short int& sir, long int& lir,
long double& ldr);
void f9(unsigned char& ucr, unsigned int& uir,
unsigned short int& usir,
unsigned long int& ulir);

int main() {} ///:~

Pointers and references also work when passing objects into and out of functions; you’ll learn about this in a later chapter.

There’s one other type that works with pointers: void. If you state that a pointer is a void*, it means that any type of address at all can be assigned to that pointer (whereas if you have an int*, you can assign only the address of an int variable to that pointer). For example:

//: C03:VoidPointer.cpp
int main() {
void* vp;
char c;
int i;
float f;
double d;
// The address of ANY type can be
// assigned to a void pointer:
vp = &c;
vp = &i;
vp = &f;
vp = &d;
} ///:~

Once you assign to a void* you lose any information about what type it is. This means that before you can use the pointer, you must cast it to the correct type:

//: C03:CastFromVoidPointer.cpp
int main() {
int i = 99;
void* vp = &i;
// Can't dereference a void pointer:
// *vp = 3; // Compile-time error
// Must cast back to int before dereferencing:
*((int*)vp) = 3;
} ///:~

The cast (int*)vp takes the void* and tells the compiler to treat it as an int*, and thus it can be successfully dereferenced. You might observe that this syntax is ugly, and it is, but it’s worse than that – the void* introduces a hole in the language’s type system. That is, it allows, or even promotes, the treatment of one type as another type. In the example above, I treat an int as an int by casting vp to an int*, but there’s nothing that says I can’t cast it to a char* or double*, which would modify a different amount of storage that had been allocated for the int, possibly crashing the program. In general, void pointers should be avoided, and used only in rare special cases, the likes of which you won’t be ready to consider until significantly later in the book.

You cannot have a void reference, for reasons that will be explained in Chapter 11.

 

 
< Prev   Next >
Your Ad Here

Donate us!!

Enter Amount:

RSS socialnet

Add to MyYahoo!
Subscribe in NewsGator Online
Add to Newsburst
Add to Google
Add to My AOL
Add to Pluck
Subscribe in FeedLounge
Add to Windows Live
Add to NetVibes
Subscribe in Rojo
Subscribe in Bloglines
Add to MyMSN
Add to Plusmo for your cellphone
Add to PageFlakes
Add to Technorati
Add to BlinkBits