I recently, and quite unintentionally, started a large debate about whether it is legal in C/C++ to write the expression &P->m_foo when P is a null pointer. The programming community split into two camps: one claimed with confidence that it isn't legal, while the other was just as sure that it is. Both sides offered various arguments and links, and it became clear to me that at some point I had to settle the matter. To that end, I contacted Microsoft MVP experts and the Microsoft Visual C++ development team through a closed mailing list. They helped me prepare this article, and now everyone interested is welcome to read it. For those who can't wait to learn the answer: that code is NOT correct.
It all started with an article about checking the Linux kernel with the PVS-Studio analyzer. But the issue doesn't have anything to do with the check itself. The point is that in that article I cited the following fragment from the Linux code:
static int podhd_try_init(struct usb_interface *interface,
                          struct usb_line6_podhd *podhd)
{
  int err;
  struct usb_line6 *line6 = &podhd->line6;
  if ((interface == NULL) || (podhd == NULL))
    return -ENODEV;
  ....
}
I called this code dangerous because I believed it causes undefined behavior.
After that, I got a pile of emails and comments from readers objecting to that idea, and I was even close to giving in to their convincing arguments. For instance, as proof that the code is correct, they pointed to the implementation of the offsetof macro, which typically looks like this:
#define offsetof(st, m) ((size_t)(&((st *)0)->m))
This looks like a null pointer dereference too, yet the code works fine. Other emails reasoned that since no memory is actually accessed through the null pointer, there is no problem.
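For reference, user code does not have to spell out such an expansion by hand: the standard header <stddef.h> already provides offsetof. The following is only a minimal sketch of that portable spelling. The struct record type and the two helper functions are hypothetical names invented for illustration, and the sketch does not by itself settle whether a hand-written null-pointer version is valid.

```c
#include <stddef.h>

/* Hypothetical structure, used only to demonstrate the macro. */
struct record {
    char   tag;
    double value;
    int    count;
};

/* Offset of the first member; always 0 for a struct. */
size_t record_tag_offset(void)
{
    return offsetof(struct record, tag);
}

/* Offset of a later member; the exact number depends on padding. */
size_t record_value_offset(void)
{
    return offsetof(struct record, value);
}
```

How the library implements the macro internally is the implementation's business; the caller only relies on the documented result.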
Although I tend to be gullible, I still try to double-check any information I may doubt. I started investigating the subject, and eventually wrote a small article: "Reflections on the Null Pointer Dereferencing Issue".
Everything suggested that I had been right: one cannot write code like that. But I didn't manage to provide convincing proof for my conclusions or cite the relevant excerpts from the standard.
After publishing that article, I was again bombarded with protesting emails, so I decided to figure it all out once and for all. I put the question to language experts to find out their opinions. This article is a summary of their answers.
The '&podhd->line6' expression is undefined behavior in the C language when 'podhd' is a null pointer.
The C99 standard says the following about the '&' address-of operator (6.5.3.2 "Address and indirection operators"):
The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
The expression 'podhd->line6' is clearly neither a function designator nor the result of a [] or * operator. It is an lvalue expression. However, when the 'podhd' pointer is NULL, the expression does not designate an object, since 6.3.2.3 "Pointers" says:
If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
And when an lvalue does not designate an object at the time it is evaluated, the behavior is undefined (C99 6.3.2.1 "Lvalues, arrays, and function designators"):
An lvalue is an expression with an object type or an incomplete type other than void; if an lvalue does not designate an object when it is evaluated, the behavior is undefined.
So, the same idea in brief:
When -> is applied to the null pointer, it yields an lvalue that designates no object, and as a result the behavior is undefined.
In the C++ language, the situation is exactly the same. The '&podhd->line6' expression is undefined behavior here when 'podhd' is a null pointer.
The discussion at WG21 (232. Is indirection through a null pointer undefined behavior?), which I referred to in the previous article, causes some confusion. The programmers participating in it insist that this expression is not undefined behavior. However, no one has found any clause in the C++ standard permitting the use of "podhd->line6" with "podhd" being a null pointer.
The "podhd" pointer fails the basic constraint (5.2.5/4, second bullet) that it must designate an object. No C++ object has nullptr as its address.
struct usb_line6 *line6 = &podhd->line6;
This code is incorrect in both C and C++ when the podhd pointer is null: in that case, undefined behavior occurs.
If the program runs fine, that is pure luck. Undefined behavior can take many forms, including the program executing exactly the way the programmer expected. That is just one of its possible outcomes, nothing more.
You cannot write code like that. The pointer must be checked before being dereferenced.
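A minimal sketch of that ordering is shown here. It is not the actual kernel fix: the struct definitions are hypothetical placeholders (the article does not show the real ones), and the function name podhd_try_init_checked and the out parameter are invented for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, which are not shown here. */
struct usb_interface   { int placeholder; };
struct usb_line6       { int placeholder; };
struct usb_line6_podhd { struct usb_line6 line6; };

/* Validate both pointers first; form the member address only afterwards. */
static int podhd_try_init_checked(struct usb_interface *interface,
                                  struct usb_line6_podhd *podhd,
                                  struct usb_line6 **out)
{
    struct usb_line6 *line6;

    if ((interface == NULL) || (podhd == NULL))
        return -ENODEV;

    line6 = &podhd->line6;  /* podhd is known to be non-null here */
    *out = line6;           /* hypothetical: hand the address back to the caller */
    return 0;
}
```

The only change from the original fragment that matters is the order: the expression &podhd->line6 is evaluated after the null check, never before it.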
This article was made possible thanks to the experts whose competence I can see no reason to doubt. I want to thank the following people for helping me in writing it: