Simple Data Types

In this section we examine the four primitive data types in the C programming language and their variants. The concepts of size and precision are introduced.

Overview

The four basic types available to the C programmer are:

  1. char (character)
  2. int (integer)
  3. float and double (floating point)
  4. void (valueless)

A fifth type, the enumerated type, is a special case of the integer type.

Each data type occupies a number of bytes in memory, dependent on the type of machine. In C, variables are allocated sufficient memory to hold the type the programmer specifies when they are declared. Declaration can happen at the beginning of a file, or in a compound statement before any executable code.

It is up to the programmer to choose the correct type appropriate to the problem in hand. With modern computers, memory is not usually a limiting factor, nor in most cases is the speed of the calculations crucial. The compiler is built with the concept of a "natural size" of variable appropriate to the machine: for most modern machines, integers ("int") occupy 32 bits or 4 bytes, and floating point numbers ("double") occupy 64 bits or 8 bytes.

There are further distinctions between signed and unsigned, and short and long data types which are included for completeness. One should be aware of these in case it is necessary to represent very large values, or to reduce memory usage at the expense of execution efficiency, but they will not be necessary for the implementation of the minefield game. The keywords which change the size or signedness of a type are referred to as qualifiers.

Character type

Declaring a variable of type char indicates that it can store one ASCII character.

Type            Storage   Range of Values
char            1 byte    -128 to 127
signed char     1 byte    -128 to 127
unsigned char   1 byte    0 to 255

Most platforms adopt the American Standard Code for Information Interchange (ASCII) to represent characters. Characters are stored NOT as letters, but according to the codes in the ASCII character set. For example, 'A' = 65, 'Z' = 90, 'a'= 97 and 'z' =122. All character operations are then performed using this value.

ASCII codes 32-126 are the printable character set; codes 0-31 and 127 are control characters such as tab and line feed. Codes 128 and above are not defined in the ASCII standard, and what they represent depends on the extended character set in use on the machine.

char Declaration with Initialisation

char c = 'a';       /* The VALUE of c is 97 */
c = c + 1;          /* ... now it is 98, corresponding to 'b' */

In fact, character values are usually read from/written to the console or a file directly rather than being assigned individually and arithmetically manipulated in this fashion.


Arithmetic manipulations on characters are widely used, even by experienced programmers, but can be a source of problems. A common trick used to convert an integer in the range 0-9 to a printable character is to add '0' (the code for the printable character 0) to it. This particular trick is safe, because the C standard guarantees that the codes for the digits are consecutive and in order. The same is not true of letters: a program which assumes that 'a' to 'z' are consecutive works on ASCII systems, which is almost all of them these days, but would fail if the computer system used a non-ASCII character representation such as the obsolescent EBCDIC originally developed by IBM. C just considers the codes as numbers: it doesn't "understand" what they mean.
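
The digit conversion described above can be sketched as a pair of small functions. The names digit_to_char and char_to_digit are hypothetical, chosen for this example; they are not library functions, and the sketch relies only on the digit codes being consecutive.

```c
#include <assert.h>

/* Convert an integer in the range 0-9 to the matching printable digit.
   digit_to_char is a hypothetical helper, not a library function. */
char digit_to_char(int digit)
{
    assert(digit >= 0 && digit <= 9);   /* the trick only works for one digit */
    return (char)('0' + digit);
}

/* The reverse conversion: the character '7' becomes the integer 7. */
int char_to_digit(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}
```

With these definitions, digit_to_char(7) gives the character '7' and char_to_digit('7') gives the integer 7, whatever numeric code the machine uses for '0'.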

Integers

A simple int is the most common numerical declaration. It indicates that integers (whole numbers) will be stored in the variable. The exact size of an integer is not defined in the C standard: it is guaranteed to be at least 16 bits, no shorter than a short int, and no longer than a long int! On many modern computers an int occupies 4 bytes. A long int is also 4 bytes on some platforms (for example 32-bit systems and 64-bit Windows) but 8 bytes on others (for example 64-bit Linux). The table below assumes a platform where both occupy 4 bytes.

Type                          Storage   Range of Values
long, signed long,
int, signed int               4 bytes   -2147483648 to 2147483647
unsigned long, unsigned int   4 bytes   0 to 4294967295
short, signed short           2 bytes   -32768 to 32767
unsigned short                2 bytes   0 to 65535

The C99 standard added an extra type called long long int, occupying at least 8 bytes and with a range of roughly +/- 9223372036854775807, along with its unsigned variant; some older compilers, including gcc, already provided it as an extension. It is unlikely you will need integers that big -- even counting upwards from zero one hundred million times a second, it would still take almost three thousand years to get to a number that large -- and the type is not available on strict pre-C99 compilers, so it is best avoided if the code must be portable to them.
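
Because the sizes depend on the machine, it can be worth asking the compiler directly. The following is a minimal sketch using sizeof and the limits defined in the standard header limits.h; the function name print_integer_sizes is an assumption made for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Print the storage size and range of the main integer types on the
   machine this is compiled for. print_integer_sizes is a hypothetical
   name chosen for this example. Returns the size of an int in bytes. */
int print_integer_sizes(void)
{
    /* The ordering of sizes described in the text */
    assert(sizeof(short) <= sizeof(int));
    assert(sizeof(int) <= sizeof(long));

    printf("short: %lu bytes, %d to %d\n",
           (unsigned long)sizeof(short), SHRT_MIN, SHRT_MAX);
    printf("int:   %lu bytes, %d to %d\n",
           (unsigned long)sizeof(int), INT_MIN, INT_MAX);
    printf("long:  %lu bytes, %ld to %ld\n",
           (unsigned long)sizeof(long), LONG_MIN, LONG_MAX);
    printf("unsigned int: 0 to %u\n", UINT_MAX);

    return (int)sizeof(int);
}
```

The numbers printed will differ between machines, which is exactly the point: the values in the table are typical rather than guaranteed.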

Floating Point

The default type for storing floating point data is double: "double precision floating point". Floating point constants such as 1.5 have type double, and the original (pre-ANSI) C language carried out all floating point calculations in double precision. ANSI C allows arithmetic on float values to be done in single precision, but mixing float and double values still involves additional type conversions, which can slow the program down rather than speed it up. Using the single precision float type is therefore ill-advised unless memory space is at an absolute premium.

Double precision floating point is accurate to roughly 15 digits and single precision to roughly six. An additional type, long double, is available, with precision at least that of double and, on many platforms, higher.

Type          Storage    Range of Values   Precision
float         4 bytes    3.4E ±38          c. 6 sig figs
double        8 bytes    1.7E ±308         c. 15 sig figs
long double   10 bytes   1.2E ±4932        c. 19 sig figs
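
The precision figures can be checked on a particular machine using the standard header float.h. The sketch below is illustrative only: both function names are hypothetical, and the second function assumes the usual IEEE 754 single precision format, in which a float cannot hold nine significant digits.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Print the precision and largest value of each floating point type on
   this machine. print_float_precision is a hypothetical name. */
void print_float_precision(void)
{
    /* Each wider type is at least as precise as the narrower one */
    assert(FLT_DIG <= DBL_DIG && DBL_DIG <= LDBL_DIG);

    printf("float:       %d digits, up to %e\n", FLT_DIG, FLT_MAX);
    printf("double:      %d digits, up to %e\n", DBL_DIG, DBL_MAX);
    printf("long double: %d digits, up to %Le\n", LDBL_DIG, LDBL_MAX);
}

/* Returns 1 if a nine-digit value survives being stored in a float.
   Assumes IEEE 754 single precision, where it does not survive. */
int float_keeps_nine_digits(void)
{
    volatile float f = 123456789.0f;   /* volatile forces a genuine float store */
    return (double)f == 123456789.0;
}
```

On a machine using IEEE 754 arithmetic, float_keeps_nine_digits() returns 0, because the stored value has been rounded to the nearest number a float can represent.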

Void

The keyword void has three uses:

  1. To specify a function with no return value
  2. To specify a function that has no formal parameters
  3. To specify a pointer to an unspecified data type.

Example of a void function

#include <stdio.h>

void printHello(void) /* Takes no arguments; returns no value */
{
    printf("Hello\n");
    /* We don't strictly need this line, but it's good practice */
    return;
}

For the advanced programmer, a significant use of the void keyword is to declare a pointer to an unspecified type. For example, a function might be written which can operate on more than one type by passing a pointer to void. Otherwise, one would end up providing a huge number of functions for all conceivable combinations of types.

Example using void *

void copySomeStuff(void *destination, void *source, int howMany, int howBig)
{
    /* Copy some "things" from one place to another */
    /* We don't know what a "thing" is, but we do know
       how big it is */
    ...
}

int main()
{
    /* We could use the standard library function memcpy()
       but that wouldn't be a good demonstration! */
    struct myStruct d, s;
    /* Suppose s gets initialised somehow... */
    ...
    /* Copy the contents of s into d */
    copySomeStuff((void *)&d, (void *)&s, 1, sizeof(struct myStruct));
    ...
}
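
The body of copySomeStuff is left open above. One possible way to fill it in, offered as an assumption rather than as the original author's version, is to treat both blocks of memory as arrays of bytes and copy them one at a time. The sketch keeps the parameter list from the example and assumes the two regions do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* A hypothetical body for copySomeStuff: copy howMany things, each
   howBig bytes long, from source to destination, one byte at a time.
   The two regions are assumed not to overlap. */
void copySomeStuff(void *destination, void *source, int howMany, int howBig)
{
    /* A void * cannot be dereferenced, so view the memory as bytes */
    unsigned char *to = destination;
    const unsigned char *from = source;
    size_t remaining;

    assert(destination != NULL && source != NULL);
    assert(howMany >= 0 && howBig >= 0);

    remaining = (size_t)howMany * (size_t)howBig;
    while (remaining-- > 0) {
        *to++ = *from++;
    }
}
```

The function never needs to know what a "thing" is: the same code copies an array of int, a single struct, or anything else whose size is known.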

Type void expressions are evaluated for side effects. You cannot use the (non-existent) value of an expression that has type void in any way, nor can you convert a void expression (by implicit or explicit conversion) to any type except void. If you do use an expression of any other type in a context where a void expression is required, its value is discarded. To conform to the ANSI specification, void ** cannot be used as int ** . Only void * can be used as a pointer to an unspecified type.


The "const" Modifier

Any of these types can be prefixed by the modifier const, which "fixes" the value of a variable and makes it illegal for C code to modify it. For example, one could declare const int one = 1; and it would be illegal for the program to change it later with the code one = 2.

One might be tempted to use a constant wherever the C preprocessor's #define directive was used previously. However, they work in very different ways. If the same const variable were defined at the top of two different source files making up the same program, this would be an error even if both had the same value, because the linker wouldn't know which one was "the right one" when the object files were linked together. Preprocessor macros on the other hand (e.g. #define one 1) work by simply substituting "1" every time "one" appears in the source code. Use the one appropriate for the circumstances.

The modifier appears just before the type when a variable is declared.
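
The two approaches can be seen side by side in the following sketch. All of the names (PI_MACRO, pi, circle_area, circle_circumference) are assumptions made for this example.

```c
#include <assert.h>

#define PI_MACRO 3.14159265358979     /* text substituted by the preprocessor */
const double pi = 3.14159265358979;   /* a real variable which cannot be changed */

/* Uses the const variable */
double circle_area(double radius)
{
    assert(radius >= 0.0);
    /* pi = 3.0;    uncommenting this assignment makes compilation fail */
    return pi * radius * radius;
}

/* Uses the macro: the compiler sees the number, never the name */
double circle_circumference(double radius)
{
    assert(radius >= 0.0);
    return 2.0 * PI_MACRO * radius;
}
```

Both functions give the expected results; the difference is that pi exists as a variable in the compiled program, while PI_MACRO has vanished by the time the compiler proper sees the code.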


Enumerated types

It is possible to cause the compiler to associate values with words to form a type, referred to as an enumerated type. For example, consider the international colour code used to label resistors, where a particular colour represents a particular component value.

An Enumerated Type

enum {black=0, brown, red, orange,
      yellow, green, blue, purple,
      grey, white} r;

r = red; /* sets r to 2 */

The compiler normally allocates 0 to the first word, 1 to the second and so on, so the explicit black=0 here is not strictly necessary, but it makes clear that black indicates a zero in resistor colour codes. It is possible to allocate numbers for each of the words arbitrarily; any word without an explicit value takes the value of the previous word plus one. Nothing prevents the same value being given to more than one word: the compiler will silently accept this, so take care not to do it by accident.

It is important to realise that using an enumerated type does not tell the computer how to print out a value of that type. If asked to print the value of r after executing the above code fragment, the computer will simply print out 2, not red.
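
If the name is wanted rather than the number, the program has to supply it. A common approach, sketched here under the assumption that the enumeration values run from 0 upwards without gaps, is a table of strings in the same order. The tag colour, the table colour_names and the function colour_name are hypothetical names added for this example.

```c
#include <assert.h>

enum colour {black = 0, brown, red, orange,
             yellow, green, blue, purple,
             grey, white};

/* One string per enumeration value, in the same order */
static const char *const colour_names[] = {
    "black", "brown", "red", "orange",
    "yellow", "green", "blue", "purple",
    "grey", "white"
};

/* Return the printable name for a colour code */
const char *colour_name(enum colour c)
{
    assert((int)c >= black && (int)c <= white);
    return colour_names[c];
}
```

Printing colour_name(red) with the %s format of printf then shows red, whereas printing red itself with %d shows 2.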


volatile and register Modifiers

Volatile Variables

As the section heading suggests, volatile isn't a type, it's a modifier. It can precede any type declaration, and may even be combined with const (for a hardware register which the program may read but must never write). It instructs the compiler to make no assumptions about the variable having changed or not since the last time it was examined. This is very useful when the variable being examined isn't a "normal" one: perhaps it is a counter inside a real-time clock chip which is onboard a microcontroller, for example. One might see the following code:

volatile TIMEREG timer;
int ticks, startTime;
...
startTime = timer;
...
ticks = timer - startTime;

Now, if the keyword volatile were omitted from the declaration of timer, an optimising compiler would be within its rights to read the value from the hardware only once, assuming that it was unchanged the next time it was used. The program would always appear to run instantaneously!
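
The same idea can be written as a small function which can be tried on an ordinary computer. This is only a sketch: TIMEREG is given a stand-in definition here because the real type depends on the hardware, and elapsed_ticks is a hypothetical name.

```c
#include <assert.h>

/* Stand-in definition: on real hardware this would describe the
   timer register of the particular chip. */
typedef unsigned int TIMEREG;

/* Return the number of ticks since startTime. Because the pointer is
   declared volatile, the compiler must read the timer afresh on every
   call instead of reusing a value it fetched earlier. */
TIMEREG elapsed_ticks(volatile TIMEREG *timer, TIMEREG startTime)
{
    assert(timer != 0);
    return *timer - startTime;
}
```

On a microcontroller, timer would point at the address of the clock register; for testing, it can point at an ordinary variable whose value is changed between calls.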

Register variables

Volatile is a command which the compiler always obeys, but register is a hint which the compiler is free to ignore. Almost all microprocessors have registers inside them, capable of holding values of C's primitive data types. Because the number of memory fetches and stores is reduced, it is faster to perform operations on variables held in registers rather than in memory. Preceding a variable declaration with the word register hints to the compiler that it would be a good idea to use an on-chip register to hold a particular value; perhaps it is foreseen that this variable will be used more intensively than others.

In fact, it is a bad idea to declare variables as register when using a modern compiler like gcc. The compiler is surprisingly intelligent, and makes use of the on-chip registers for storing intermediate results in a way which is designed to speed up execution. It is actually far better than most humans at deciding how the microprocessor's resources should be used. By hinting that a register is used to store a particular variable, the code might actually run slower because the compiler now has fewer on-chip registers available for other uses. If storing a variable in a register would have speeded up execution, the chances are the compiler would have done that anyway without the hint. register is therefore best thought of as a throw-back to the days of inefficient C compilers, which has been left in for the sake of compatibility and the ability to compile old C programs without having to modify them.

Because register variables might not be stored in memory, the language does not allow you to take their address with the & operator, even if the compiler ends up ignoring the hint.

Don't bother with register variables in your programs: it's just more to type when entering the code, and rarely results in any improvement in performance.


Type Casting

Conversion between numeric types in C is usually automatic, so if x is declared as a double, writing an assignment like x = 2 is fine because 2 is implicitly converted from an integer to a double precision floating point number before it is assigned. Occasionally, it is useful to force the issue, and instruct the compiler to consider a value as a particular type. This is achieved with a type cast, by preceding the term to be cast with the target type enclosed in (parentheses).

How to Use Type Casting

double x = 7.4;
double y = 2.5;
int i = 8;

(int)x/y; /* Value is 2.8 (only x is cast: 7/2.5) */
(int)(x/y); /* Value is 2 */
(int)x/(int)y; /* Value is 3 (7/2 = 3) */
(double)i; /* Value is 8.0 */
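
A frequent practical use of a cast is to avoid integer division. The two functions below are a sketch with hypothetical names: they differ only in whether the numerator is cast to double before dividing.

```c
#include <assert.h>

/* Both operands are int, so the division discards the remainder */
int whole_ratio(int numerator, int denominator)
{
    assert(denominator != 0);
    return numerator / denominator;
}

/* Casting one operand to double forces a floating point division;
   the other operand is then converted automatically */
double exact_ratio(int numerator, int denominator)
{
    assert(denominator != 0);
    return (double)numerator / denominator;
}
```

With these definitions, whole_ratio(7, 2) is 3 while exact_ratio(7, 2) is 3.5.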

Summary:

The simple types of the C language permit the storage and manipulation of floating point and integer numbers. In addition to the default format for such data, qualifiers permit storage at greater or lesser precision, or as signed or unsigned values. In general, qualifiers should be omitted, and the compiler will "do the right thing" according to the machine for which the code is being produced.

Any variable can be declared const. Once initialised, such a variable can then no longer be changed.

The void type is a special valueless type which can be used to declare, for example, functions which return and/or accept no values. A pointer-to-void can be used to pass the address of a variable of any type.

Copyright and Acknowledgements

The tools upon which this course relies are Copyright the Free Software Foundation, where they are made available under the GPL (GNU General Public Licence).

The content of this course was derived from that generated by many ex-colleagues at the University of Leeds, Department of Electronics and Electrical Engineering. Much of the content has been reworked, and substantially augmented, by Dr N J Bailey, Centre for Music Technology, The University of Glasgow. This manifestation is Copyright N J Bailey; some of the content is Copyright The University of Leeds.

Diagrams on this resource are drawn in XFig and are rendered by the browser using The University of Hamburg's Simple FIG viewer applet which is Copyright (C) 1996-2002 F.N.Hendrich, hendrich@informatik.uni-hamburg.de.

The source code, programming examples and exercises are all specific to this course, and are Copyright, Dr N J Bailey.

The applet for viewing and demonstrating C programs is Copyright Dr N J Bailey, and is to be found documented and with its source code on the Centre for Music Technology website under Software