Data and Memory Management in C

Developed by Matt Gingerich for CPSC 259 at UBC. Last modified March 1, 2013.

This page was tested with Chrome 24, Internet Explorer 9, and Firefox 18. Other browsers are not guaranteed to work.

This tutorial is designed to give you a solid understanding of how data is manipulated by computers. In particular, I hope that it will clarify what is actually happening in hardware when you use complex data structures. There are many interactive demos embedded throughout the tutorial; feel free to experiment with these! Several quizzes are provided to help you check your own understanding (don't worry, we're not recording your results). Finally, though the code snippets in this tutorial use C syntax, the concepts discussed here are not tied to a single programming language.

Contents

Note: even if you're only interested in the later questions, it's helpful to have a deep understanding of data, data types, and addresses before trying to understand the answers to the more advanced questions. For that reason, I recommend reading the sections in this tutorial in order.

What is data?

Before discussing data structures, we should clarify what is meant by data. A single piece of data is called a datum. A datum is a piece of information that represents a value, a selection from a set, or some other state. In a computer, information is stored by setting the state of electrical switches (for more info, see the Wikipedia pages on flip-flops, registers, and RAM). The number of switches (bits) available determines how many unique patterns can be formed.

Demo: Bits are represented by blue and orange rectangles. Blues are "off" (0) and oranges are "on" (1). The initial pair of bits are both off, but you can click them to toggle their values. Use the "Add to Set" button (or press enter) to record the pattern of bits shown. How many different patterns can you make? Try increasing or decreasing the number of bits available (doing so will reset your set of patterns).

Each time a bit is added, the number of unique patterns that can be formed with the bits doubles. If you try to form all the possible 8-bit patterns, you'll need to find 256 different combinations (this is obviously extremely tedious; don't try it unless you can write a script to do it for you). In general, n bits can form 2n patterns.

Each unique pattern of bits can be used to store a unique value. What can we conclude from this? The size of a type of data can be quantified by the number of bits needed to store all of its possible values. In the next section, we'll see how bit patterns can be mapped to different values for different types of data.

Quick Quiz:

How many bits are needed to represent the set {0, 1, 2, 3, 4, 5, 6}?

2     3     6     7    

If data is the plural of datum, shouldn't you ask "what are data?" instead of "what is data?"

This a little off topic, but you certainly could write "what are data?" instead of "what is data?" The reason I wrote "what is data?" is that data can also refer to the collection of data pieces. Gramatically, this sense of the word data is a mass noun (also known as an uncountable noun) and it's synonymous with information. However, as we've just seen, data (and information) is countable in a way: based on the number of bits required to represent it!

What is a data type?

The previous section demonstrated that a number of switches can create a much larger number of patterns. The problem is that all these patterns are in the form of binary numbers while we're usually interested in non-binary values. To deal with this issue, we need to be able to associate different types of values with our binary patterns.

Let's say we wanted to map a data type to binary. One way to do that would be to enumerate all the possible values the data could take and match up each possibility with a list of binary patterns (the answer to the quiz question in the previous section gives an example of that sort of enumeration). Unfortunately, enumeration is usually not practical because there are far too many possible values for data. Instead, we need systematic rules for matching binary patterns to the possible data values. A data type is formed when a rule set can match each possible binary permutation to exactly one possible interpretation.

Demo: Bits are again represented by blue and orange rectangles and can be toggled with clicks. You can switch between five different interpretations of the same set of bits with the green buttons. Changing the data type of a variable without changing the underlying set of bits is known as type punning or a reinterpret cast.

There are many different ways in which numbers can be represented in binary. The signed integer type used in the demo above is called two's complement. Two's complement is identical to an unsigned binary number except that the most significant bit subtracts instead of adds. The float type used in the demo is based on the binary32 format.

Did you try setting specific values in the demo? It can be difficult to figure out what bit patterns correspond to values. Float values are particularly tricky: the closest you can get to the value 0.8, for example, is 0.7999999523162842 or 0.800000011920929. with the patterns 00111111 01001100 11001100 11001100 and 00111111 01001100 11001100 11001101 respectively. You might notice that these values aren't as close to 0.8 as you would expect. That's because we usually use double precision floats (type double) which use twice the number of bits for more precision. Even with more bits, values like 0.8 will still be approximated; however, the approximation will be closer.

The number 10 can be represented exactly with a float. Can you figure out the corresponding pattern? [Hint: 1.25 * (2^3) = 10, so the mantissa is 1.25 and the exponent is 3.]

What is an address?

Computers have a lot of memory (a huge block of switches), but we want to keep track of many small pieces of data (so we need many little blocks of switches). In addition to specifying how bit patterns map to values, data types also specify how many bits (switches) are needed for one variable. Data types don't specify which bits are used to represent data and that's where addresses come into play.

Bits are grouped into groups of eight and each group of eight bits forms one byte. Addresses are simply a way of numbering bytes so we can refer to a particular group of eight bits by its number. By convention, addresses are written in hexadecimal (base 16), although you may occasionally see addresses interpreted in base 10.

Demo: Use the increase and decrease buttons to change the address. A light blue rectangle highlights a set of four bytes starting at the specified address. These four bytes are interpreted as an unsigned decimal integer. Notice how the value of the integer changes as different starting addresses are chosen.

Note: In C, the address of any variable can be obtained by prepending the operator "&" in front of any variable name.

Let's review a few points:

All variables have a size (measured in bytes), an address (specifying the location of a byte), and a value (determined by the data type).

What is a pointer?

A pointer is a variable that stores an address as its value. Like any other type of data, when addresses are stored in memory they are stored as a bit pattern of a certain size with a certain address. The address that the pointer stores as its value and the address that the pointer is stored at do not need to be the same. If this is confusing to you, consider the following statement: the number "325" only has three digits, despite representing a number larger than one hundred. Notice the difference between quantities used to represent a value and the represented value itself.

Much of the confusion surrounding pointers comes from the fact that when we refer to the "pointer's address" we could either be talking about the address that the pointer is storing or the address that the pointer is stored at. For a pointer named "ptr" the address of the pointer itself can be obtained in C by writing "&ptr". As was noted previously, the "&" operator simply returns the address of the variable it precedes. The value of the pointer, which is itself another address, is obtained using the pointer's name, "ptr", without any additional operators.

Being able to store hexadecimal numbers is nice, but pointers would be rather useless for tracking variables if we couldn't use a hexadecimal address to get or set the value of a variable. To put it another way, without a method to get the value stored at an address, pointers could only tell us where data is stored but not what the data is. Fortunately, there is a way to move from an address to a value and that is to use the dereferencing operator (*). For a pointer named "ptr", typing "*ptr" will yield the value stored at the address specified by the value of "ptr". That's a lot of values/addresses to keep track of, so if you're totally confused at this point, try the demo below.

Demo: A pointer is stored at the address 0x1000. You can change the value of the pointer and dereference the value of the pointer to change other values. The values listed under the heading "Watch" show the value of the pointer, the address of the pointer, and the dereferenced value of the pointer. Watch how these values change as you run commands. Try assigning the pointer its own address then dereferencing the pointer and assigning it a new value. Can you explain what happens?

   Value  Address

0x10140x1000

2520x1004

-13210x1008

917320x100C

5120x1010

89260x1014

43310x1018

90x101C

00x1020

00x1024

2684354590x1028

8915610x102C

45640x1030

-789420x1034

655360x1038

1021030x103C

Watch
ptr: 1014
&ptr: 0x1000
*ptr: 8926

Run Command (see below for sample commands)


ERROR!


Available Commands
ptr = [Hex Address]
*ptr = [Number Value]

0x10 is not the same value as 10. 0x10 is the hexadecimal value for seventeen, while 10 is the decimal value for ten. 0x1000 is the same number as 4096. Arithmetic and other operators are not supported in this demo (it is a very rudimentary parser).

Sample Commands
ptr = 0x1008
*ptr = 46
ptr = 0x1000
*ptr = 4108

There are a few things you may have noticed while playing with this demo. First, the address of the pointer (&ptr) never changed. When variables are declared, they are allocated memory on the stack by the compiler and we can't move data on the stack around (we can use dynamic memory allocation to handle heap memory, but we'll get to that in a few sections). You also should have noticed that you could select and change values by assigning addresses to the pointer and assigning new values to the dereferenced pointer.

If you tried double-dereferencing the pointer or using the pointer on the right side of the equation, you would have received an error message. This just reflects the fact that my parser is really weak; to test more complicated scenarios, I'd recommend using a real C debugger.

What is an array?

An array is a special data type that puts smaller blocks of data of the same type back to back. In C, square brackets get the value of the n-th datum in the array while not putting square brackets returns the address of the first datum. The char [4] data type in the second demo of this tutorial was an array of four characters. We can declare and initialize an array as follows:

int demoArray[] = {10, 12, 15, 18};

 ValueAddress

100x003DFD3C

120x003DFD40

150x003DFD44

180x003DFD48

The table listed above shows how this array might be stored in memory. In this case, "demoArray" would return the address 0x003DFD3C, "demoArray[0]" would return 10, "demoArray[1]" would return 12, "&demoArray[1]" would return 0x003DFD40, and so on.

How do pointers relate to arrays?

Pointers can be used for more than finding the value at the single address that they store. Pointer arithmetic allows programmers to manipulate addresses to access other related addresses. This University of Maryland page does a great job of explaining pointer arithmetic and the relationship between pointers and arrays.

Pointer arithmetic really only supports two operations: addition and subtraction. These operations increment and decrement the address as you would expect, with one caveat: they are influenced by the size of the pointer type. Let's look at some examples.

Demo: Three pointers are represented here using coloured rectangles under the row of addresses. The rectangles are colour-coded to different types of pointers. Each rectangle reflects the region of memory that could be accessed by dereferencing the corresponding pointer. Use the green buttons to increment and decrement pointers. Note how the values of the pointers change by different amounts for different pointer types, even though we're incrementing/decrementing by one unit in all cases.

Using pointer arithmetic, we can access any element of an array using the index of the element in the array and a pointer to the first element in the array.

How can we access element i from an array of type int if we have a pointer of type int* named address_of_first that points to the first element in the array?

element_i = *address_of_first + i
element_i = &address_of_first + i
element_i = *(address_of_first + i)
element_i = &(address_of_first + i)
element_i = *address_of_first + i*sizeof(int)
element_i = &address_of_first + i*sizeof(int)

We can also use the notation element_i = address_of_first[i] to achieve the same effect. The square brackets are a syntactic shortcut for performing pointer arithmetic and dereferencing.

The simple answer to the question "how do pointers relate to arrays" is that an array is a pointer to the first of several data blocks that all have the same type.

A more nuanced answer to the question "how do pointers relate to arrays" is that a statically created array almost always behaves like a pointer to its first element, except when the sizeof and address-of operators are applied. Applying sizeof to a statically created array will return the size of the full array (the sum of all the element sizes) while obtaining the address of the array will again yield the address of the first element of the array (in contrast, the address of a pointer to a dynamically allocated array will not be the same address as the first element of that array).

How do double pointers work?

A double pointer is a pointer whose value is equal to the address of another pointer. When the double pointer is dereferenced the value of the other pointer can be modified or accessed.

The dereferencing operator allows us to move from an address to a value. If the value happens to be an address, we can go to that address and thus build a chain that alternates between addresses and values. In the figure below, rectangles represent variables and each variable has both a value and an address. Black arrows indicate places where we move from a value to an address because the value is set to the address. Orange arrows indicate places where we move from an address to a value using the dereference operator.

Figure: A pointer to a pointer to an integer.

We can build arbitrarily long chains of pointers in this manner. Why would we ever want pointers to pointers? There are a couple of reasons. We may want to pass a pointer by reference to a function so that local changes to the pointer within the function are propagated to the calling function. If we pass the value of a pointer to a function, the function only obtains a copy of the pointer's value and can't change the value of the pointer in the calling function.

We can also use double pointers to form multidimensional arrays. In a multidimensional array, value-address links are used to tie together the different dimensions of the array, while pointer arithmetic is used to index into the array at each dimension. In the figure below, we start with a pointer at address 0x018 which points to an array that begins at address 0x02C. The array contains three pointers, which each point to a different array. Selecting one of the three indices of the first table brings us to one of the second-level arrays where we can choose one of two values.

An illustration of a tree or a multidimensional array formed with pointers.

In this example, the first-level array has length 3 and all the second-level arrays have length 2, so our data is equivalent to a static array of type int[3][2].

What is dynamic memory allocation?

When C code is compiled, memory for variables is allocated on the program's stack. You can freely use these variables without needing to know where they're located in memory. The problem is that sometimes programs don't realize how many variables they need until they're run.

If it's not known at compilation time that a variable of a certain type needs to be created, the compiler won't automatically create room for it on the stack. Instead, you will need to allocate memory for the variable while the program is running by calling a memory allocation function (malloc or calloc, for example). Dynamic memory allocation functions try to find room for data on the heap. If the search was successful, the address of the beginning of the allocated region is returned. Otherwise, the function can't allocate enough memory and is forced to return a null pointer.

Stepping back for a minute, what does it mean for an allocation function to "find room" and "allocate" memory? In general, allocation is not the same thing as initialization (although calloc both allocates and initializes memory). Allocation doesn't change the value in memory; it just provides a guarantee that no part of the allocated region of memory has been previously allocated. This guarantee lets us use heap memory safely, without worrying that we might accidentally store several variables at the same address (which would lead to them overwriting each other).

What is a memory leak?

There's a finite amount of memory in a computer, so if all memory could only be allocated once we would eventually run out of memory. To prevent that, we need to free memory that we have finished using. The parameter of the free(address) function must be an address that was returned by a previous call to an allocation function. Free reverses the effect of the original allocation and makes the entire block that was allocated available for future allocation.

It is not possible to "partially" free memory by passing an address to free that is somewhere in the middle of an allocated region. The address to be freed must be the start of an allocation.

Because all calls to free must use an address that was returned from an allocation function, it is very important that all the valid addresses that are returned from allocation must be stored until free is called. If one of these addresses is not stored, or if free is never called on that address, the address will never available for re-use. Failure to ever free memory is a memory leak.

Memory leaks are particularly difficult errors to find, because they don't cause immediate errors. The program may run correctly for hours before running out of available memory and crashing. Finding the error can require searching through all the memory handling in the program to find where there is a mismatch between allocation and freeing.

Hold on! Where is the data for memory allocation being stored? How does free know how much to free? (Advanced)

The short answer is that this is handled by the operating system and you do not need to be too concerned about it for CPSC 259. However, trying to understand how the operating system handles memory management is a great way to test your knowledge of all the material in this tutorial.

Each time memory is allocated, the system will need to store information about the allocation. Storing this information requires memory. It might strike you as a circular dependency that we need memory in order to track our memory usage, but this is not the case. You can think of the operating system as a giant bureaucracy controlling access to memory. Every time a process requires memory, the operating system finds an available slot of memory, then "taxes" the transaction by allocating itself some memory to store a record about the allocation.

The operating system can keep track of where its own records are by carefully structuring the records themselves. A linked list is a common structure used for this purpose: each record will contain a link to the next record in a list, enabling the operating system to track all dynamic memory allocations by statically storing a pointer to the first element in the linked list.

What is a structure?

A structure is a compound data type consisting of several other data types stitched together. Structures have member variables or components. Components can be any existing data type and, unlike arrays, they do not need to be of the same type. Each component is given a name within the structure.

Here are a couple of examples of struct declarations with a variety of different variable types.

struct example {
    int a;
    char b;
    double c[2];
    int *d;
    struct example *e;
};

struct example2 {
    struct example myExampleStruct;
    struct example2 *child;
};

In the preceding examples, notice that structures can contain pointers to their own data type. A pointer is already a defined data type (as we've seen, it stores a hexadecimal address) so this is not a recursive definition. The definition for struct example2 contains a variable of type struct example; this is not a pointer, it's the whole data type.

Here are some examples of these structs being used.

// Declarations and initialization
struct example demoStruct1 = {1, 'B', {3.0, 3.5}, NULL, NULL};
struct example2 demoStruct2;
struct example2 demoStruct3 = {{6, 'G', {8.0, 8.5}, NULL, NULL}, NULL};
demoStruct2.myExampleStruct = demoStruct1;
demoStruct2.child = NULL;

// Printing data
printf("demoStruct1.a = %d\n", demoStruct1.a);
printf("demoStruct1.b = %c\n", demoStruct1.b);
printf("demoStruct1.c[0] = %f\n", demoStruct1.c[0]);

// The values in demoStruct1 should match those in demoStruct2.myExampleStruct
printf("demoStruct2.myExampleStruct.a = %d\n", demoStruct1.a);

// demoStruct2.myExampleStruct is a copy of demoStruct1, not a link: changing the values in one doesn't change the values in the other
demoStruct1.a = 16;
printf("demoStruct1.a = %d\n", demoStruct1.a);
printf("demoStruct2.myExampleStruct.a = %d\n", demoStruct2.myExampleStruct.a);

// Assigning demoStruct3 to the child pointer of demoStruct2
demoStruct2.child = &demoStruct3;

// The members of demoStruct3 can now be accessed through demoStruct2
printf("(*demoStruct2.child).myExampleStruct.a = %d\n", (*demoStruct2.child).myExampleStruct.a);

// The arrow operator is a shortcut for accessing the members of a struct from a pointer to it.
printf("demoStruct2.child->myExampleStruct.a = %d\n", demoStruct2.child->myExampleStruct.a);

You can see the output of this code (and try modifying it yourself) at: codepad.org.

Quiz!

Can a type named "struct signal" contain a member of type "struct oscillation"?

Yes No

Can a type named "struct signal" contain a member of type "struct signal *"?

Yes No

Can a type named "struct oscillation" contain a member of type "struct oscillation"?

Yes No

How do structures relate to pointers?

Structures should not be confused with pointers. A structure stores a collection of values, while a pointer stores an address as a hexadecimal value. Pointers can be part of structures (since a hexadecimal value can be part of a collection of values). We also often use pointers to structures (since passing around the address of a large block of data is easier than copying the entire block of data). The dot and arrow operators used with structures return the value of the named component, not its address. The only exception to this are cases where the value of a component is also its address (as in an array, or a pointer that happens to point to its own location).

The End