I was reading a thread titled “strlen vs sizeof” on CodeGuru, and one of the replies states that “it’s anyways [sic] bad practice to initialie [sic] a char
array with a string literal.”
Is this true, or is that just his (albeit an “elite member”) opinion?
Here is the original question:
#include <stdio.h>
#include<string.h>
main()
{
char string[] = "october";
strcpy(string, "september");
printf("the size of %s is %d and the length is %d\n\n", string, sizeof(string), strlen(string));
return 0;
}
right. the size should be the length plus 1 yes?
this is the output
the size of september is 8 and the length is 9
size should be 10 surely. its like its calculating the sizeof string before it is changed by strcpy but the length after.
Is there something wrong with my syntax or what?
Here is the reply:
It’s anyways bad practice to initialie a char array with a string literal. So always do one of the following:
const char string1[] = "october";
char string2[20]; strcpy(string2, "september");
It’s anyways bad practice to initialie a char array with a string literal.
The author of that comment never really justifies it, and I find the statement puzzling.
In C (and you’ve tagged this as C), that’s pretty much the only way to initialize an array of char
with a string value (initialization is different from assignment). You can write either
char string[] = "october";
or
char string[8] = "october";
or
char string[MAX_MONTH_LENGTH] = "october";
In the first case, the size of the array is taken from the size of the initializer. String literals are stored as arrays of char
with a terminating 0 byte, so the size of the array is 8 (‘o’, ‘c’, ‘t’, ‘o’, ‘b’, ‘e’, ‘r’, 0). In the second two cases, the size of the array is specified as part of the declaration (8 and MAX_MONTH_LENGTH
, whatever that happens to be).
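To make the three cases concrete, here is a minimal sketch in which each helper reports what sizeof sees. The value 16 for MAX_MONTH_LENGTH and the function names are assumptions for illustration; the answer does not define them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: MAX_MONTH_LENGTH is not defined in the answer; 16 is illustrative. */
#define MAX_MONTH_LENGTH 16

/* Size taken from the initializer: 7 characters plus the terminating 0. */
size_t size_from_initializer(void)
{
    char string[] = "october";
    assert(sizeof string == strlen(string) + 1);
    return sizeof string;
}

/* Size given explicitly; 8 is exactly enough for "october" and its terminator. */
size_t size_explicit(void)
{
    char string[8] = "october";
    return sizeof string;
}

/* Size given by a constant; the bytes after the terminator are zero-filled. */
size_t size_from_constant(void)
{
    char string[MAX_MONTH_LENGTH] = "october";
    return sizeof string;
}
```

With these assumptions the first two helpers return 8 and the third returns MAX_MONTH_LENGTH, regardless of how long the stored string is.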
What you cannot do is write something like
char string[];
string = "october";
or
char string[8];
string = "october";
etc. In the first case, the declaration of string
is incomplete because no array size has been specified and there’s no initializer to take the size from. In both cases, the =
won’t work because a) an array expression such as string
may not be the target of an assignment and b) the =
operator isn’t defined to copy the contents of one array to another anyway.
By that same token, you can’t write
char string[] = foo;
where foo
is another array of char
. This form of initialization will only work with string literals.
EDIT
I should amend this to say that you can also initialize arrays to hold a string with an array-style initializer, like
char string[] = {'o', 'c', 't', 'o', 'b', 'e', 'r', 0};
or
char string[] = {111, 99, 116, 111, 98, 101, 114, 0}; // assumes ASCII
but it’s easier on the eyes to use string literals.
EDIT2
In order to assign the contents of an array outside of a declaration, you would need to use either strcpy/strncpy
(for 0-terminated strings) or memcpy
(for any other type of array):
if (sizeof string > strlen("october"))
strcpy(string, "october");
or
strncpy(string, "october", sizeof string); // only copies as many characters as will
// fit in the target buffer; 0 terminator
// may not be copied, but the buffer is
// uselessly completely zeroed if the
// string is shorter!
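The size check before strcpy can be wrapped in a small helper. This is only a sketch: copy_if_fits is a hypothetical name, and the size is passed in explicitly because sizeof applied to a pointer parameter would give the pointer size, not the array size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, terminator included.
   Returns 1 on success, 0 if dst is too small (dst is then left untouched). */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_size > strlen(src)) {
        strcpy(dst, src);
        return 1;
    }
    return 0;
}
```

A caller would write something like copy_if_fits(string, sizeof string, "september") at a point where string is still an array and not yet a pointer.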
The only problem I recall is assigning a string literal to char *
:
char var1[] = "september";
var1[0] = 'S'; // Ok - 10 element char array allocated on stack
char const *var2 = "september";
var2[0] = 'S'; // Compile time error - pointer to constant string
char *var3 = "september";
var3[0] = 'S'; // Modifying some memory - which may result in modifying... something or crash
For example take this program:
#include <stdio.h>
int main() {
char *var1 = "september";
char *var2 = "september";
var1[0] = 'S';
printf("%s\n", var2);
}
On my platform (Linux) this crashes, because it tries to write to a page marked as read-only. On other platforms it might print ‘September’ etc.
That said – initializing from a literal reserves exactly as much space as the literal needs, so this won’t work:
char buf[] = "May";
strncpy(buf, "September", sizeof(buf)); // buf is only 4 bytes: it now holds 'S','e','p','t' with no terminator
But this will
char buf[32] = "May";
strncpy(buf, "September", sizeof(buf));
As a last remark – I wouldn’t use strcpy
at all:
char buf[8];
strcpy(buf, "very long string very long string"); // Oops. We overwrite some random memory
While some compilers can turn it into a safe call, strncpy
is much safer:
char buf[1024];
strncpy(buf, something_else, sizeof(buf)); // Copies at most sizeof(buf) chars so there is no possibility of buffer overrun. Please note that sizeof(buf) works for arrays but NOT pointers.
buf[sizeof(buf) - 1] = '\0';
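The strncpy-plus-terminator pattern can be packaged so the terminator is never forgotten. A minimal sketch, assuming a hypothetical helper name copy_truncating:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies as much of src as fits and always terminates dst.
   Returns 1 if src had to be truncated, 0 if it fit completely. */
int copy_truncating(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);   /* leave room for the terminator */
    dst[dst_size - 1] = '\0';          /* strncpy does not add one on truncation */
    return strlen(src) >= dst_size;
}
```

The return value lets the caller notice silent truncation, which plain strncpy does not report.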
Primarily because you won’t have the size of the char[]
in a variable / construct that you can easily use within the program.
The code sample from the link:
char string[] = "october";
strcpy(string, "september");
string
is allocated on the stack as 7 or 8 characters long. I can’t recall if it’s null-terminated this way or not – the thread you linked to stated that it is.
Copying “september” over that string is an obvious memory overrun.
Another challenge comes about if you pass string
to another function so the other function can write into the array. You need to tell the other function how long the array is so it doesn’t create an overrun. You could pass string
along with the result of strlen()
but the thread explains how this can blow up if string
is not null-terminated.
You’re better off allocating a string with a fixed size (preferably defined as a constant) and then pass the array and fixed size to the other function. @John Bode’s comment(s) are correct, and there are ways to mitigate these risks. They also require more effort on your part to use them.
In my experience, the value I initialized the char[]
to is usually too small for the other values I need to place in there. Using a defined constant helps avoid that issue.
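As a rough sketch of the fixed-size approach: the buffer size comes from a named constant, and the callee receives both the array and its size. MONTH_BUFFER_SIZE and write_month are assumptions made up for this example; snprintf is used here because it always terminates the buffer and reports how much space was needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: an illustrative constant, not taken from the original thread. */
#define MONTH_BUFFER_SIZE 16

/* Hypothetical callee: it cannot recover the array size from the pointer,
   so the caller passes it. Returns 1 if name fit, 0 if it was truncated. */
int write_month(char *buffer, size_t buffer_size, const char *name)
{
    assert(buffer != NULL && name != NULL && buffer_size > 0);
    int needed = snprintf(buffer, buffer_size, "%s", name);
    return needed >= 0 && (size_t)needed < buffer_size;
}
```

A caller would declare char month[MONTH_BUFFER_SIZE]; and call write_month(month, sizeof month, "september").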
sizeof string
will give you the size of the buffer (8 bytes); use the result of that expression instead of strlen
when you’re concerned about memory.
Similarly, you can make a check before the call to strcpy
to see if your target buffer is large enough for the source string: if (sizeof target > strlen(src)) { strcpy (target, src); }
.
Yes, if you have to pass the array to a function, you’ll need to pass its physical size as well: foo (array, sizeof array / sizeof *array);
. – John Bode
One thing that neither thread brings up is this:
char whopping_great[8192] = "foo";
vs.
char whopping_great[8192];
memcpy(whopping_great, "foo", sizeof("foo"));
The former will do something like:
memcpy(whopping_great, "foo", sizeof("foo"));
memset(&whopping_great[sizeof("foo")], 0, sizeof(whopping_great)-sizeof("foo"));
The latter only does the memcpy. The C standard insists that if any part of an array is initialized, it all is. So in this case, it’s better to do it yourself. I think that may have been what treuss was getting at.
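The zero-fill rule is easy to observe. The sketch below (the function name is made up for illustration) scans everything after the literal's terminator in a partially initialized array and counts the nonzero bytes:

```c
#include <assert.h>
#include <stddef.h>

/* Counts nonzero bytes after the initializer's terminator. Because the rest
   of a partially initialized array is zero-filled, the count should be 0. */
size_t nonzero_tail_bytes(void)
{
    char whopping_great[8192] = "foo";
    size_t count = 0;
    for (size_t i = sizeof "foo"; i < sizeof whopping_great; i++) {
        if (whopping_great[i] != 0) {
            count++;
        }
    }
    assert(count == 0);
    return count;
}
```

Whether the compiler really emits a memcpy and a memset for this, as described above, depends on the implementation; the sketch only shows the observable result.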
For sure
char whopping_big[8192];
whopping_big[0] = 0;
is better than either:
char whopping_big[8192] = {0};
or
char whopping_big[8192] = "";
p.s. For bonus points, you can do:
memcpy(whopping_great, "foo", (1/(sizeof("foo") <= sizeof(whopping_great)))*sizeof("foo"));
to provoke a compile-time divide-by-zero diagnostic (often a warning rather than a hard error) if you’re about to overflow the array.
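If C11 is available, static_assert gives the same compile-time check with a readable message. Note that this swaps in a different technique from the divide-by-zero trick; COPY_LITERAL and first_after_copy are hypothetical names, and the macro only works when dst is a real array (not a pointer) and lit is a string literal.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>

/* Hypothetical macro: refuses to compile if the literal, terminator included,
   does not fit into the destination array. */
#define COPY_LITERAL(dst, lit)                                              \
    do {                                                                    \
        static_assert(sizeof(lit) <= sizeof(dst), "literal does not fit");  \
        memcpy((dst), (lit), sizeof(lit));                                  \
    } while (0)

/* Small demonstration: copies "foo" and returns the first byte written. */
char first_after_copy(void)
{
    char whopping_great[8192];
    COPY_LITERAL(whopping_great, "foo");
    return whopping_great[0];
}
```

Changing the array to char whopping_great[2] should make the build fail at the static_assert instead of overflowing at run time.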
I think the “bad practise” idea comes from the fact that this form:
char string[] = "october is a nice month";
implicitly performs a strcpy from the program’s read-only data onto the stack.
It is more efficient to keep only a pointer to that string, as with:
char *string = "october is a nice month";
or directly:
strcpy(output, "october is a nice month");
(but of course in most code it probably doesn’t matter)
Never is a really long time, but you should avoid initializing a char[] from a string, because a “string” is a const char*, and you are assigning it to a char*.
So if you pass this char[] to a method that changes the data, you can get interesting behavior.
As a comment pointed out, I mixed up char[] and char* a bit, which is not good, as they differ.
There’s nothing wrong with assigning data to a char array, but since the intention is to use the array as a ‘string’ (char *), it is easy to forget that you should not modify it.
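To illustrate how char[] and char * differ here: an array initialized from a literal holds its own modifiable copy of the characters, so writing to it does not touch the literal or any other array. A minimal sketch, with a made-up function name:

```c
#include <assert.h>
#include <string.h>

/* Each array gets its own copy of the literal's characters, so changing
   one array leaves the other one alone. Returns 1 if that holds. */
int arrays_are_independent_copies(void)
{
    char first[] = "september";
    char second[] = "september";
    first[0] = 'S';                              /* fine: first is a writable array */
    assert(strcmp(second, "september") == 0);    /* second is unaffected */
    return first[0] == 'S' && second[0] == 's';
}
```

The same write through a char * that points directly at the literal is the case that has undefined behavior.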