An object in C++ is a region of storage with a type, a value, and possibly a name. In traditional object-oriented programming, "object" means an instance of a class, but in C++ the definition is slightly broader to include instances of any data type.
An object (variable or constant) declaration has two parts: a series of specifiers and a list of comma-separated declarators. Each declarator has a name and an optional initializer.
Each declaration begins with a series of specifiers. The series can contain a storage class, const and volatile qualifiers, and the object's type, in any order.
A storage class specifier can specify linkage and lifetime. The storage class is optional. For function parameters and local variables in a function, the default storage class specifier is auto. For declarations at namespace scope, the default is usually an object with static lifetime and external linkage. C++ has no explicit storage class for such a declaration. (See Section 2.6.4 later in this chapter and Section 2.4 earlier in this chapter for more information.) If you use a storage class specifier, you must choose only one of the following:
auto
Denotes an automatic variable, that is, a variable with a lifetime limited to the block in which the variable is declared. The auto specifier is the default for function parameters and local variables, which are the only kinds of declarations for which it can be used, so it is rarely used explicitly.
extern
Denotes an object with external linkage, which might be defined in a different source file. Function parameters cannot be extern.
mutable
Denotes a data member that can be modified even if the containing object is const. See Chapter 6 for more information.
register
Denotes an automatic variable with a hint to the compiler that the variable should be stored in a fast register. Many modern compilers ignore the register storage class because the compilers are better than people at determining which variables belong in registers.
static
Denotes a variable with a static lifetime and internal linkage. Function parameters cannot be static.
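As an illustration of two of these specifiers, the following minimal sketch uses a static local variable and a mutable data member. The names next_id and cached_text are hypothetical and were invented for this example.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns 1, 2, 3, ... on successive calls.
// The static local variable is initialized once and keeps its value
// between calls, even though its name is visible only inside the function.
int next_id()
{
    static int counter = 0;
    return ++counter;
}

// Hypothetical class: caches the length of a C string on first use.
// The mutable members can be modified even through a const object.
class cached_text {
public:
    explicit cached_text(const char* s)
        : text_(s), length_(0), valid_(false) {}

    std::size_t length() const
    {
        if (!valid_) {
            length_ = std::strlen(text_);  // allowed: length_ is mutable
            valid_ = true;                 // allowed: valid_ is mutable
        }
        return length_;
    }

private:
    const char* text_;
    mutable std::size_t length_;
    mutable bool valid_;
};
```

Calling length() on a const cached_text object compiles only because the cached members are declared mutable.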
The const and volatile specifiers are optional. You can use either one, neither, or both in any order. The const and volatile keywords can be used in other parts of a declaration, so they are often referred to by the more general term qualifiers; for brevity, they are often referred to as cv-qualifiers.
const
Denotes an object that cannot be modified. A const object cannot ordinarily be the target of an assignment. You cannot call a non-const member function of a const object.
volatile
Denotes an object whose value might change unexpectedly. The compiler is prevented from performing optimizations that depend on the value not changing. For example, a variable that is tied to a hardware register should be volatile.
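The effect of the two qualifiers can be sketched as follows. The class counter and the functions read_twice and use_counters are hypothetical names chosen for this illustration, and an ordinary int stands in for a real hardware register.

```cpp
// Hypothetical class with one const and one non-const member function.
class counter {
public:
    counter() : n_(0) {}
    void increment() { ++n_; }        // non-const: needs a modifiable object
    int value() const { return n_; }  // const: callable on any object
private:
    int n_;
};

// Reads a volatile object twice. Because reg is volatile, the compiler
// must perform both reads instead of reusing the first value.
int read_twice(volatile int& reg)
{
    int first = reg;
    int second = reg;
    return first + second;
}

int use_counters()
{
    counter c;
    c.increment();
    const counter fixed;  // const object, initialized by the default constructor
    // fixed.increment(); // Error: non-const member function on a const object
    return c.value() + fixed.value();
}
```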
Every object must have a type in the form of one or more type specifiers. The type specifiers can be any of the following:
The name of a class, enumeration, or typedef
An elaborated name
A series of fundamental type specifiers
A class definition
An enumerated type definition
Enumerated and fundamental types are described earlier in this chapter in Section 2.5. Class types are covered in Chapter 6. The typename keyword is covered in Chapter 7.
Specifiers can appear in any order, but the convention is to list the storage class first, followed by the type specifiers, followed by cv-qualifiers.
extern long int const mask; // Conventional
int extern const long mask; // Valid, but strange
Many programmers prefer a different order: storage class, cv-qualifiers, type specifiers. More and more are learning to put the cv-qualifiers last, though. See the examples under Section 2.6.2.2 later in this chapter to find out why.
The convention for types that require multiple keywords is to place the base type last and the modifiers first:
unsigned long int x; // Conventional
int unsigned long y; // Valid, but strange
long double a;       // Conventional
double long b;       // Valid, but strange
You can define a class or enumerated type in the same declaration as an object declaration:
enum color { red, black } node_color;
However, the custom is to define the type separately, then use the type name in a separate object declaration:
enum color { red, black };
color node_color;
A declarator declares a single name within a declaration. In a declaration, the initial specifiers apply to all the declarators in the declaration, but each declarator's modifiers apply only to that declarator. (See Section 2.6.2.2 in this section for examples of where this distinction is crucial.) A declarator contains the name being declared, additional type information (for pointers, references, and arrays), and an optional initializer. Use commas to separate multiple declarators in a declaration. For example:
int plain_int, array_of_int[42], *pointer_to_int;
An array is declared with a constant size specified in square brackets. The array size is fixed for the lifetime of the object and cannot change. (For an array-like container whose size can change at runtime, see <vector> in Chapter 13.) To declare a multidimensional array, use a separate set of square brackets for each dimension:
int point[2];
double matrix[3][4]; // A 3 × 4 matrix
You can omit the array size if there is an initializer; the number of initial values determines the size. In a multidimensional array, you can omit only the first (leftmost) size:
int data[] = { 42, 10, 3, 4 };                            // data[4]
int identity[][3] = { { 1,0,0 }, {0,1,0}, {0,0,1} };      // identity[3][3]
char str[] = "hello";                                     // str[6], with trailing \0
In a multidimensional array, all elements are stored contiguously, with the rightmost index varying the fastest (usually called row major order).
When a function parameter is an array, the array's size is ignored, and the type is actually a pointer type, which is the subject of the next section. For a multidimensional array used as a function parameter, the first dimension is ignored, so the type is a pointer to an array. Because the first dimension is ignored in a function parameter, it is usually omitted, leaving empty square brackets:
long sum(long data[], size_t n);
double chi_sq(double stat[][2]);
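The following minimal sketch shows array parameters in use and checks the row major layout described earlier. The body of sum is one possible implementation of the declaration above; sum_rows and is_row_major are hypothetical helpers added for illustration.

```cpp
#include <cstddef>

// The parameter type is really long*; the brackets only document intent,
// so the element count must be passed separately.
long sum(long data[], std::size_t n)
{
    long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// The parameter type is really double (*)[2], a pointer to an array of
// two doubles. Only the first dimension is dropped.
double sum_rows(double stat[][2], std::size_t rows)
{
    double total = 0;
    for (std::size_t i = 0; i < rows; ++i)
        total += stat[i][0] + stat[i][1];
    return total;
}

// Row major layout: element [1][2] of a 3 x 4 matrix lies (1*4 + 2)
// elements past the start of the matrix's storage.
bool is_row_major()
{
    double matrix[3][4] = { { 0 } };
    const char* base = reinterpret_cast<const char*>(&matrix[0][0]);
    const char* elem = reinterpret_cast<const char*>(&matrix[1][2]);
    return elem - base ==
           static_cast<std::ptrdiff_t>((1 * 4 + 2) * sizeof(double));
}
```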
A pointer object stores the address of another object. A pointer is declared with a leading asterisk (*), optionally followed by cv-qualifiers, then the object name, and finally an optional initializer.
When reading and writing pointer declarations, be sure to keep track of the cv-qualifiers. The cv-qualifiers in the declarator apply to the pointer object, and the cv-qualifiers in the declaration's specifiers apply to the type of the pointer's target. For example, in the following declaration, the const is in the specifier, so the pointer p is a pointer to a const int. The pointer object is modifiable, but you cannot change the int that it points to. This kind of pointer is usually called a pointer to const.
int i, j;
int const *p = &i;
p = &j;  // OK
*p = 42; // Error
When the cv-qualifier is part of the declarator, it modifies the pointer object. Thus, in the following example, the pointer p is const and hence not modifiable, but it points to a plain int, which can be modified. This kind of pointer is usually called a const pointer.
int i, j;
int * const p = &i;
p = &j;  // Error
*p = 42; // OK
You can have pointers to pointers, as deep as you want, in which each level of pointer has its own cv-qualifiers. The easiest way to read a complicated pointer declaration is to find the declarator, work your way from the inside to the outside, and then from right to left. In this situation, it is best to put cv-qualifiers after the type specifiers. For example:
int x;
int *p;                     // Pointer to int
int * const cp = &x;        // const pointer to int
int const * pc;             // Pointer to const int
int const * const cpc = &x; // const pointer to const int
int *pa[10];                // Array of 10 pointers to int
int **pp;                   // Pointer to pointer to int
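The same distinction matters for function parameters. The sketch below is illustrative only; count_zeros and fill are hypothetical functions, one taking pointers to const and the other a const pointer.

```cpp
#include <cstddef>

// Pointer to const: the function can advance the pointers but cannot
// modify the ints they point to.
std::size_t count_zeros(int const* first, int const* last)
{
    std::size_t n = 0;
    for ( ; first != last; ++first) {  // OK: the pointer is modifiable
        // *first = 1;                 // Error: the target is const
        if (*first == 0)
            ++n;
    }
    return n;
}

// const pointer: begin always refers to the same int, but the ints
// reached through it can be modified.
void fill(int* const begin, std::size_t n, int value)
{
    for (std::size_t i = 0; i < n; ++i)
        begin[i] = value;              // OK: the target is a plain int
    // begin = 0;                      // Error: the pointer is const
}
```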
When a function parameter is declared with an array type, the actual type is a pointer, and at runtime the address of the first element of the array is passed to the function. You can use array syntax, but the size is ignored. For example, the following two declarations mean the same thing:
int sum(int data[], size_t n);
int sum(int *data, size_t n);
When using array notation for a function parameter, you can omit only the first dimension. For example, the following is valid:
void transpose(double matrix[][3]);
but the following is not valid. If the compiler does not know the number of columns, it does not know how to lay out the memory for matrix or compute array indices.
void transpose(double matrix[][]);
A useful convention is to use array syntax when declaring parameters that are used in an array-like fashion, that is, the parameter itself does not change, or it is dereferenced with the [] operator. Use pointer syntax for parameters that are used in pointer-like fashion, that is, the parameter value changes, or it is dereferenced with the unary * operator.
A function pointer is declared with an asterisk (*) and the function signature (parameter types and optional names). The declaration's specifiers form the function's return type. The name and asterisk must be enclosed in parentheses, so the asterisk is not interpreted as part of the return type. An optional exception specification can follow the signature. See Chapter 5 for more information about function signatures and exception specifications.
void (*fp)(int); // fp is pointer to function that takes an int parameter
                 // and returns void.
void print(int);
fp = print;
A declaration of an object with a function pointer type can be hard to read, so typically you declare the type separately with a typedef declaration, and then declare the object using the typedef name:
typedef void (*Function)(int);
Function fp;
fp = print;
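One common use of a function pointer typedef is a table of functions selected at runtime. The following is a minimal sketch; the functions add, subtract, and multiply, the typedef binary_op, and the operation codes are all hypothetical.

```cpp
// Hypothetical functions that all have the signature int(int, int).
int add(int a, int b)      { return a + b; }
int subtract(int a, int b) { return a - b; }
int multiply(int a, int b) { return a * b; }

// The typedef names the function pointer type once.
typedef int (*binary_op)(int, int);

// An array of const function pointers, indexed by an operation code.
binary_op const operations[] = { add, subtract, multiply };

// Calls the function selected by code. The caller must pass 0, 1, or 2;
// this sketch does no range checking.
int apply(int code, int a, int b)
{
    return operations[code](a, b);
}
```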
Example 2-11 shows a declaration of an array of 10 function pointers, in which the functions return int* and take two parameters: a function pointer (taking an int* and returning an int*) and an integer. The declaration is almost unreadable without using typedef declarations for each part of the puzzle.
Example 2-11. Simplifying function pointer declarations with typedef
// Array of 10 function pointers
int* (*fp[10])(int*(*)(int*), int);

// Declare a type for pointer to int.
typedef int* int_ptr;

// Declare a function pointer type for a function that takes an int_ptr parameter
// and returns an int_ptr.
typedef int_ptr (*int_ptr_func)(int_ptr);

// Declare a function pointer type for a function that returns int_ptr and takes
// two parameters: the first of type int_ptr_func and the second of type int.
typedef int_ptr (*func_ptr)(int_ptr_func, int);

// Declare an array of 10 func_ptrs.
func_ptr fp[10];
Pointers to members (data and functions) work differently from other pointers. The syntax for declaring a pointer to a nonstatic data member or a nonstatic member function requires a class name and scope operator before the asterisk. Pointers to members can never be cast to ordinary pointers, and vice versa. You cannot declare a reference to a member. (See Chapter 3 for information about expressions that dereference pointers to members.) A pointer to a static member is an ordinary pointer, not a member pointer. The following are some simple examples of member pointers:
struct simple {
  int data;
  int func(int);
};
int simple::* p = &simple::data;
int (simple::*fp)(int) = &simple::func;
simple s;
s.*p = (s.*fp)(42);
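Pointers to members are useful when the caller should choose which member a function operates on. The sketch below assumes a hypothetical struct sample and two hypothetical helpers, total and call.

```cpp
#include <cstddef>

// Hypothetical aggregate with two data members and a member function.
struct sample {
    int count;
    int weight;
    int doubled(int x) const { return 2 * x; }
};

// Adds up one data member across an array. The caller chooses the member
// by passing a pointer to a data member.
int total(const sample* items, std::size_t n, int sample::* member)
{
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += items[i].*member;
    return sum;
}

// Calls a member function through a pointer to member function.
int call(const sample& s, int (sample::*fn)(int) const, int arg)
{
    return (s.*fn)(arg);
}
```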
A reference is a synonym for an object or function. A reference is declared just like a pointer, but with an ampersand (&) instead of an asterisk (*). A local or global reference declaration must have an initializer that specifies the target of the reference. Data members and function parameters, however, do not have initializers. You cannot declare a reference of a reference, a reference to a class member, a pointer to a reference, an array of references, or a cv-qualified reference. For example:
int x;
int &r = x;        // Reference to int
int& const rc = x; // Error: no cv qualified references
int & &rr;         // Error: no reference of reference
int& ra[10];       // Error: no arrays of reference
int&* rp = &r;     // Error: no pointer to reference
int* p = &x;       // Pointer to int
int*& pr = p;      // OK: reference to pointer
A reference, unlike a pointer, cannot be made to refer to a different object at runtime. Assignments to a reference are just like assignments to the referenced object.
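A short sketch makes the point concrete. The functions swap_ints and assign_through_reference are hypothetical names used only for this illustration.

```cpp
// Call-by-reference: a and b are bound to the caller's arguments,
// so the assignments change the caller's variables.
void swap_ints(int& a, int& b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

// Shows that assigning to a reference assigns to its target and does
// not rebind the reference. Returns the final value of x.
int assign_through_reference()
{
    int x = 1;
    int y = 2;
    int& r = x;  // r is now a synonym for x
    r = y;       // copies y's value into x; r still refers to x
    y = 99;      // does not affect x or r
    return x;
}
```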
Because a reference cannot have cv-qualifiers, there is no such thing as a const reference. Instead, a reference to const is often called a const reference for the sake of brevity.
References are often used to bind names to temporary objects, implement call-by-reference for function parameters, and optimize call-by-value for large function parameters. The divide function in Example 2-12 demonstrates the first two uses. The standard library has the div function, which divides two integers and returns the quotient and remainder in a struct. Instead of copying the structure to a local object, divide binds the return value to a reference to const, thereby avoiding an unnecessary copy of the return value. Furthermore, suppose that you would rather have divide return the results as arguments. The function parameters quo and rem are references; when the divide function is called, they are bound to the function arguments, q and r, in main. When divide assigns to quo, it actually stores the value in q, so when divide returns, main has the quotient and remainder.
Example 2-12. Returning results in function arguments
#include <cstdlib>
#include <iostream>
#include <ostream>

void divide(long num, long den, long& quo, long& rem)
{
  const std::ldiv_t& result = std::div(num, den);
  quo = result.quot;
  rem = result.rem;
}

int main( )
{
  long q, r;
  divide(42, 5, q, r);
  std::cout << q << " remainder " << r << '\n';
}
The other common use of references is to use a const reference for function parameters, especially for large objects. Function arguments are passed by value in C++, which requires copying the arguments. The copy operation can be costly for a large object, so passing a reference to a large object yields better performance than passing the large object itself. The reference parameter is bound to the actual argument, avoiding the unnecessary copy. If the function modified the object, it would violate the call-by-value convention, so you should declare the reference const, which prevents the function from modifying the object. In this way, you preserve call-by-value semantics while gaining the performance of call-by-reference. The standard library often uses this idiom. For example, operator<< for std::string uses a const reference to the string to avoid making unnecessary copies of the string. (See <string> in Chapter 13 for details.)
If a function parameter is a non-const reference, the argument must be an lvalue. A const reference, however, can bind to an rvalue, which permits temporary objects to be passed to the function, which is another characteristic of call-by-value. (See Chapter 3 for the definitions of "lvalue" and "rvalue.")
A reference must be initialized so it refers to an object. If a data member is a reference, it must be initialized in the constructor's initializer list (see Chapter 6). Function parameters that are references are initialized in the function call, binding each reference parameter to its corresponding actual argument. All other reference definitions must have an initializer. (An extern declaration is not a definition, so it doesn't take an initializer.)
A const reference can be initialized to refer to a temporary object. For example, if a function takes a const reference to a float as a parameter, you can pass an integer as an argument. The compiler converts the integer to float, saves the float value as an unnamed temporary object, and passes that temporary object as the function argument. The const reference is initialized to refer to the temporary object. After the function returns, the temporary object is destroyed:
void do_stuff(const float& f);
do_stuff(42);
// Equivalent to:
{
  const float unnamed = 42;
  do_stuff(unnamed);
}
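The difference between const and non-const reference parameters can be sketched as follows. The functions half, halve, and demo are hypothetical and exist only to show which arguments each kind of parameter accepts.

```cpp
// Reference to const: accepts lvalues and temporaries alike.
double half(const double& x)
{
    return x / 2;
}

// Non-const reference: the argument must be a modifiable lvalue.
void halve(double& x)
{
    x /= 2;
}

double demo()
{
    double d = 10;
    halve(d);             // OK: d is an lvalue; d becomes 5
    // halve(42);         // Error: a temporary cannot bind to double&
    return half(42) + d;  // 42 is converted to a temporary double
}
```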
The restrictions on references, especially the prohibition of a reference of a reference, pose an additional challenge for template authors. For example, you cannot store references in a container because a number of container functions explicitly declare their parameters as references to the container's value type. (Try using std::vector<int&> with your compiler, and see what happens. You should see a lot of error messages.)
Instead, you can write a wrapper template, call it rvector<typename T>, and specialize the template (rvector<T&>) so references are stored as pointers, but all the access functions hide the differences. This approach requires you to duplicate the entire template, which is tedious. A less tedious alternative is to encapsulate the specialization in a traits template called Ref<> (refer to Chapter 7 for more information about templates and specializations, and to Chapter 8 for more information about traits), as shown in Example 2-13.
Example 2-13. Encapsulating reference traits
// Ref type trait encapsulates reference type, and mapping to and from the type
// for use in a container.
template<typename T>
struct Ref {
  typedef T value_type;
  typedef T& reference;
  typedef const T& const_reference;
  typedef T* pointer;
  typedef const T* const_pointer;
  typedef T container_type;
  static reference from_container(reference x) { return x; }
  static const_reference from_container(const_reference x) { return x; }
  static reference to_container(reference x) { return x; }
};

template<typename T>
struct Ref<T&> {
  typedef T value_type;
  typedef T& reference;
  typedef const T& const_reference;
  typedef T* pointer;
  typedef const T* const_pointer;
  typedef T* container_type;
  static reference from_container(pointer x) { return *x; }
  static const_reference from_container(const_pointer x) { return *x; }
  static pointer to_container(reference x) { return &x; }
};

// rvector<> is similar to vector<>, but allows references by storing references
// as pointers.
template<typename T, typename A=std::allocator<T> >
class rvector {
  typedef typename Ref<T>::container_type container_type;
  typedef typename std::vector<container_type> vector_type;
public:
  typedef typename Ref<T>::value_type value_type;
  typedef typename Ref<T>::reference reference;
  typedef typename Ref<T>::const_reference const_reference;
  typedef typename vector_type::size_type size_type;
  . . . // Other typedefs are similar.
  class iterator { ... }; // Wraps a vector<>::iterator
  class const_iterator { ... };
  . . . // Constructors pass arguments to v.
  iterator begin( ) { return iterator(v.begin( )); }
  iterator end( ) { return iterator(v.end( )); }
  void push_back(typename Ref<T>::reference x) {
    v.push_back(Ref<T>::to_container(x));
  }
  reference at(size_type n) { return Ref<T>::from_container(v.at(n)); }
  reference front( ) { return Ref<T>::from_container(v.front( )); }
  const_reference front( ) const {
    return Ref<T>::from_container(v.front( ));
  }
  . . . // Other members are similar.
private:
  vector_type v;
};
An initializer supplies an initial value for an object being declared. You must supply an initializer for the definition of a reference or const object. An initializer is optional for other object definitions. An initializer is not allowed for most data members within a class definition, but an exception is made for static const data members of integral or enumerated type. Initializers are also not allowed for extern declarations and function parameters. (Default arguments for function parameters can look like initializers. See Chapter 5 for details.)
The two forms of initializers are assignment-like and function-like. (In the C++ standard, assignment-like is called copy initialization, and function-like is called direct initialization.) An assignment-like initializer starts with an equal sign, which is followed by an expression or a list of comma-separated expressions in curly braces. A function-like initializer is a list of one or more comma-separated expressions in parentheses. Note that these initializers look like assignment statements or function calls, respectively, but they are not. They are initializers. The difference is particularly important for classes (see Chapter 6). The following are some examples of initializers:
int x = 42;                       // Initializes x with the value 42
int y(42);                        // Initializes y with the value 42
int z = { 42 };                   // Initializes z with the value 42
int w[4] = { 1, 2, 3, 4 };        // Initializes an array
std::complex<double> c(2.0, 3.0); // Calls complex constructor
When initializing a scalar value, the form is irrelevant. The initial value is converted to the desired type using the usual conversion rules (as described in Chapter 3).
Without an initializer, all non-POD class-type objects are initialized by calling their default constructors. (See Chapter 6 for more information about POD and non-POD classes.) All other objects with static lifetimes are initialized to 0; objects with automatic lifetimes are left uninitialized. (See Section 2.6.4 later in this chapter.) An uninitialized reference or const object is an error.
You must use a function-like initializer when constructing an object whose constructor takes two or more arguments, or when calling an explicit constructor. The usual rules for resolving overloaded functions apply to the choice of overloaded constructors. (See Chapter 5 for more information about overloading and Chapter 6 for more information about constructors.) For example:
struct point {
  point(int x, int y);
  explicit point(int x);
  point( );
  ...
};
point p1(42, 10); // Invokes point::point(int x, int y);
point p2(24);     // Invokes point::point(int x);
point p3;         // Invokes point::point( );
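To see where each form of initializer is accepted, consider the following minimal sketch. The classes length and ratio and the function total_millimeters are hypothetical; they are separate from the point class above so the sketch stands on its own.

```cpp
// Hypothetical class with an explicit one-argument constructor and a
// two-argument constructor.
class length {
public:
    explicit length(int meters) : mm_(meters * 1000L) {}
    length(int meters, int millimeters)
        : mm_(meters * 1000L + millimeters) {}
    long millimeters() const { return mm_; }
private:
    long mm_;
};

// Hypothetical class whose one-argument constructor is not explicit.
class ratio {
public:
    ratio(int numerator) : num_(numerator), den_(1) {}
    int numerator() const { return num_; }
    int denominator() const { return den_; }
private:
    int num_, den_;
};

long total_millimeters()
{
    length a(2);       // function-like: required for an explicit constructor
    length b(1, 500);  // function-like: constructor takes two arguments
    // length c = 2;   // Error: assignment-like form cannot use an
                       // explicit constructor
    ratio r = 3;       // OK: ratio's constructor is not explicit
    return a.millimeters() + b.millimeters() + r.numerator();
}
```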
Empty parentheses cannot be used as an initializer in an object's declaration, but can be used in other initialization contexts (namely, a constructor initializer list or as a value in an expression). If the type is a class type, the default constructor is called; otherwise, the object is initialized to 0. Example 2-14 shows an empty initializer. No matter what type T is, the wrapper<> template can rely on T( ) to be a meaningful default value.
Example 2-14. An empty initializer
template<typename T>
struct wrapper {
  wrapper( ) : value_(T( )) {}
  explicit wrapper(const T& v) : value_(v) {}
private:
  T value_;
};

wrapper<int> i;   // Initializes i with int( ), or zero
enum color { black, red, green, blue };
wrapper<color> c; // Initializes c with color( ), or black
wrapper<bool> b;  // Initializes b with bool( ), or false
wrapper<point> p; // Initializes p with point( )
In an assignment-like initializer, if the object is of class type, the value to the right of the equal sign is converted to a temporary object of the desired type, and the first object is constructed by calling its copy constructor.
The generic term for an array or simple class is aggregate because it aggregates multiple values into a single object. "Simple" in this case means the class does not have any of the following:
User-declared constructors
Private or protected nonstatic data members
Base classes
Virtual functions
To initialize an aggregate, you can supply multiple values in curly braces, as described in the following sections. A POD object is a special kind of an aggregate. (See Section 2.5.3 earlier in this chapter for more information about POD types; see also Chapter 6 for information about POD classes.)
To initialize an aggregate of class type, supply an initial value for each nonstatic data member separated by commas and enclosed in curly braces. For nested objects, use nested curly braces. Values are associated with members in the order of the members' declarations. More values than members results in an error. If there are fewer values than members, the members without values are initialized by calling each member's default constructor or initializing the members to 0.
An initializer list can be empty, which means all members are initialized to their defaults, which is different from omitting the initializer entirely. The latter causes all members to be left uninitialized. The following example shows several different initializers for class-type aggregates:
struct point { double x, y, z; };
point origin = { };        // All members initialized to 0.0
point unknown;             // Uninitialized, value is not known
point pt = { 1, 2, 3 };    // pt.x==1.0, pt.y==2.0, pt.z==3.0
struct line { point p1, p2; };
line vec = { { }, { 1 } }; // vec.p1 is all zero.
                           // vec.p2.x==1.0, vec.p2.y==0.0, vec.p2.z==0.0
Only the first member of a union can be initialized:
union u { int value; unsigned char bytes[sizeof(int)]; };
u x = { 42 }; // Initializes x.value
Initialize elements of an array with values separated by commas and enclosed in curly braces. Multidimensional arrays can be initialized by nesting sets of curly braces. An error results if there are more values than elements in the array; if an initializer has fewer values than elements in the array, the remaining elements in the array are initialized to zero values (default constructors or 0). If the declarator omits the array size, the size is determined by counting the number of values in the initializer.
An array initializer can be empty, which forces all elements to be initialized to 0. If the initializer is empty, the array size must be specified. Omitting the initializer entirely causes all elements of the array to be uninitialized (except non-POD types, which are initialized with their default constructors).
In the following example the size of vec is set to 3 because its initializer contains three elements, and the elements of zero are initialized to 0s because an empty initializer is used:
int vec[] = { 1, 2, 3 }; // Array of three elements
                         // vec[0]==1 ... vec[2]==3
int zero[4] = { };       // Initialize to all zeros.
When initializing a multidimensional array, you can flatten the curly braces and initialize elements of the array in row major order (last index varies the fastest). For example, both id1 and id2 end up having the same values in their corresponding elements:
// Initialize id1 and id2 to the identity matrix.
int id1[3][3] = { { 1 }, { 0, 1 }, { 0, 0, 1 } };
int id2[3][3] = { 1, 0, 0, 0, 1, 0, 0, 0, 1 };
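The array initialization rules above can be checked with a few small functions. This is a sketch for illustration; same_identity, element_count, and partial_sum are hypothetical names. Some compilers may warn about the flattened braces, which are nevertheless valid.

```cpp
// Returns true if the nested and flattened initializers both produce
// the 3 x 3 identity matrix.
bool same_identity()
{
    int id1[3][3] = { { 1 }, { 0, 1 }, { 0, 0, 1 } };
    int id2[3][3] = { 1, 0, 0, 0, 1, 0, 0, 0, 1 };
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            if (id1[i][j] != id2[i][j] || id1[i][j] != (i == j ? 1 : 0))
                return false;
    return true;
}

// Counts the elements of an array whose size comes from its initializer.
int element_count()
{
    int vec[] = { 1, 2, 3 };
    return static_cast<int>(sizeof(vec) / sizeof(vec[0]));
}

// Sums an array whose initializer has fewer values than elements;
// the elements without values are initialized to zero.
int partial_sum()
{
    int partial[5] = { 1, 2 };
    int sum = 0;
    for (int i = 0; i < 5; ++i)
        sum += partial[i];
    return sum;
}
```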
An array of char or wchar_t is special because you can initialize such arrays with a string literal. Remember that every string literal has an implicit null character at the end. For example, the following two char declarations are equivalent, as are the two wchar_t declarations:
// The following two declarations are equivalent.
char str1[] = "Hello";
char str2[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
wchar_t ws1[] = L"Hello";
wchar_t ws2[] = { L'H', L'e', L'l', L'l', L'o', L'\0' };
The last expression in an initializer list can be followed by a comma. This is convenient when you are maintaining software and find that you often need to change the order of items in an initializer list. You don't need to treat the last element differently from the other elements.
const std::string keywords[] = {
  "and",
  "asm",
  ...
  "while",
  "xor",
};
Because the last item has a trailing comma, you can easily select the entire line containing "xor" and move it to a different location in the list, and you don't need to worry about fixing up the commas afterward.
Every object has a lifetime, that is, the duration from when the memory for the object is allocated and the object is initialized to when the object is destroyed and the memory is released. Object lifetimes fall into three categories:
Automatic
Objects are local to a function body or a nested block within a function body. The object is created when execution reaches the object's declaration, and the object is destroyed when execution leaves the block.
Static
Objects can be local (with the static storage class specifier) or global (at namespace scope). Static objects are constructed at most once and destroyed only if they are successfully constructed. Local static objects are constructed when execution reaches the object's declaration. Global objects are constructed when the program starts but before main is entered. Static objects are destroyed in the opposite order of their construction. For more information, see "The main Function" in Chapter 5.
Dynamic
Objects created with new expressions are dynamic. Their lifetimes extend until the delete expression is invoked on the objects' addresses. See Chapter 3 for more information about the new and delete expressions.
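The three kinds of lifetime can be observed with a class that records when its objects are created and destroyed. The following is a minimal sketch; event_log, tracer, and run_lifetimes are hypothetical names, and the log format (an identifying letter followed by + or -) is an arbitrary choice for this example.

```cpp
#include <string>

// Hypothetical log of construction and destruction events.
std::string& event_log()
{
    static std::string log;  // static local: constructed on the first call
    return log;
}

// Hypothetical class that appends "x+" to the log when an object with
// id x is constructed and "x-" when it is destroyed.
class tracer {
public:
    explicit tracer(char id) : id_(id)
    {
        event_log() += id_;
        event_log() += '+';
    }
    ~tracer()
    {
        event_log() += id_;
        event_log() += '-';
    }
private:
    char id_;
};

std::string run_lifetimes()
{
    event_log().clear();
    tracer* d = new tracer('d');  // dynamic: lives until delete
    {
        tracer a('a');            // automatic: destroyed when the block ends
        tracer b('b');            // destroyed before a (reverse order)
    }
    delete d;                     // ends the dynamic object's lifetime
    return event_log();           // "d+a+b+b-a-d-"
}
```

The dynamic object outlives the block in which the automatic objects are declared, and the automatic objects are destroyed in the reverse order of their construction.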