Thursday, November 20, 2003 6:02 AM
pdbartlett
C++ string literals are NOT const
I guess I always subconsciously knew this was true, but had never really thought about the consequences.
First up is a variation of the infamous K&R "Hello world!" example program:
#include <stdio.h>
void DoIt(char* sz)
{
    puts(sz);
    *sz = 'J';  // writes into the storage of the string literal
}
int main(int argc, char** argv)
{
    DoIt("Hello world!");
    DoIt("Hello world!");
    return 0;
}
The output of this, under MS VC++ 6.0 at least, is dependent on the compiler's string pooling setting:
- if disabled, the output will be two instances of "Hello world!"
- if enabled, and using writable memory (/Gf), the output will be "Hello world!" followed by "Jello world!"
- if enabled, and using read-only memory (/GF), an access violation will occur
So far nothing should be a real surprise to anyone who knew the C++ standard a bit better than I did.
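For comparison, here is a minimal sketch of a variant whose output does not depend on the pooling setting. It takes the string as const and modifies a private copy instead of the literal's storage. The helper name WithFirstChar is my own invention for illustration, not part of the original program.

```cpp
#include <string>

// Hypothetical helper (not from the original program): returns a modified
// copy instead of writing through a pointer that may point at a literal.
std::string WithFirstChar(const char* sz, char first)
{
    std::string copy(sz);      // private, writable copy of the characters
    if (!copy.empty())
        copy[0] = first;       // modifies only the copy, never the literal
    return copy;
}
```

Calling puts(WithFirstChar("Hello world!", 'J').c_str()) leaves the literal untouched. Because the parameter is const char*, an accidental *sz = 'J' inside the function would be rejected at compile time rather than misbehaving at run time.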
However, the circumstances where I came across this non-constness were somewhat different, and illustrate a couple of unrelated coding issues. I had a function similar to this:
void MyFunc(LPTSTR tszMsg)
{
    CComVariant var = tszMsg;
    ...
}
which I was calling with a string literal.
The first error was mine, in that I should have made the function parameter const as the function did not change it.
However there is also, IMHO, a mistake on the part of the ATL developer who coded the CComVariant class's constructors. They include versions which take a BSTR, which is just a typedef for unsigned short*, and an LPCOLESTR, which is a typedef for const unsigned short*. They are therefore (ab)using the constness of the parameter to convey semantic information about the type (i.e. whether it has a length prefix), presumably because the types are syntactically indistinguishable.
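To make the overloading issue concrete, here is a small portable sketch. FakeBSTR, FakeLPCOLESTR and Describe are hypothetical stand-ins I made up for illustration. They are not the real Windows typedefs or any ATL function, and the sketch only shows how overload resolution picks between two pointer types that differ in constness alone.

```cpp
#include <string>

// Hypothetical stand-ins for the two typedefs discussed above; these are
// NOT the real Windows headers, just the same underlying pointer types.
typedef unsigned short*       FakeBSTR;       // non-const pointer
typedef const unsigned short* FakeLPCOLESTR;  // const pointer

// Two overloads that differ only in the constness of the pointee.
std::string Describe(FakeBSTR)      { return "treated as length-prefixed"; }
std::string Describe(FakeLPCOLESTR) { return "treated as NUL-terminated"; }
```

Passing an unsigned short* selects the first overload, while passing a const unsigned short* to the very same characters selects the second. The caller's constness alone decides how the data is interpreted, which is why a non-const parameter such as the one in MyFunc can end up on the "wrong" path.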
CComBSTR does the safe thing and assumes any such string is NUL-terminated unless a length is specifically passed. It also uses differently named methods, such as AppendBSTR, where overloading would not work.
As an aside, CComVariant jumps through lots of preprocessor hoops (UNICODE, OLE2ANSI, etc.) in order to provide distinct constructors and operators for "narrow" and "wide" strings. It would be much simpler, and less error-prone, to use LPCSTR and LPCWSTR directly, especially as the reasoning behind OLE2ANSI almost certainly dropped out of everyone's L2 cache a long time ago (except maybe for Raymond Chen).
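As a rough sketch of what overloading directly on the character type looks like, with no preprocessor switches involved: CountChars is a hypothetical helper of my own, written against plain const char* and const wchar_t* (the portable equivalents of LPCSTR and LPCWSTR) rather than against any real ATL class.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical helper, overloaded directly on the character type so the
// compiler picks the right version without any UNICODE-style macros.
std::size_t CountChars(const char* sz)    { return std::strlen(sz); }  // "narrow" strings
std::size_t CountChars(const wchar_t* sz) { return std::wcslen(sz); }  // "wide" strings
```

Both overloads take const pointers, so narrow and wide literals bind to them directly, and the choice between them depends only on the character type.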
Of course, none of this would be an issue if I were using .NET...
UPDATE: The throw-away line at the end was meant in the sense of "I wouldn't be doing stuff with CComBSTR if I was using .NET", but in fact it's true in another sense. MS have "fixed" this problem in a breaking change to the C++ compiler in VS.NET 2003. More details here.