"Why can't I compare strings with ==?" (and other common C beginner pitfalls)

February 12, 2026

A common source of confusion for developers coming from Python is pointers and their implications. The purpose of this post is not to give you facts about C to memorize, but to explain the logic behind why things work the way they do.

Pointers

The memory space of a program can be seen as being split into bytes. Each byte has a location within that space, called its address, and data is stored at that location. A pointer is just a variable that stores a memory address, so pointers exist to point to the locations where data is stored.

    #include <stdio.h>

    int main() {
        int i = 10;
        // & gets the memory location of i to be stored in p
        int *p = &i;
        // The next line prints the memory location of i,
        // as expected (hopefully). %p expects a void pointer, hence the cast.
        printf("%p\n", (void *)p);
        return 0;
    }

So what about strings?

Unlike the higher-level languages you may be more familiar with, C has no built-in string type. There are a couple of ways to represent strings, and all of them are related to pointers in some way. With one caveat (look into the "struct hack" technique), a string is a sequence of chars with a null character at the end marking where the string stops.

One string representation is the char[] character array. One element should be the null character indicating the end of the string.

Another is the char* pointer to a char. The char that it points to is the first character in the string, and the string ends where the next null character shows up. As you may have noticed, this is quite similar to the character array. This is explained in more detail later.

A string is stored as individual elements. It is a compound type built from the primitive type char, which is why a null character is needed to mark the end, and why strings are accessed one char at a time when handled directly. An expression can name a single element or a pointer to a char, but never the whole string at once. When == is applied to two strings, the operands are pointers, so it compares the two addresses rather than the characters stored there. That is why an entire string cannot be compared with a single == comparison.

How to Actually Compare Strings

Strings must be compared character by character, which means comparing two strings is an O(n) operation. Luckily, this does not have to be done manually as the <string.h> header contains many useful string functions.

The one we are currently interested in has the prototype: int strcmp(const char* str1, const char* str2). A prototype is the skeleton of a function: its name, its parameter types, and its return type. strcmp takes two strings and returns 0 if they are equal. It returns a positive value if str1 compares greater than str2 and a negative value if str1 compares less than str2, where the order is decided by the first pair of characters that differ.

Using strcmp we can finally compare two strings:

    #include <stdio.h>
    #include <string.h>

    int main() {
        char str1[] = "hello";
        char str2[] = "hello";
        if (strcmp(str1, str2) == 0) {
            printf("The strings are equal.\n");
        }
        return 0;
    }

Arrays as Pointers (kind of)

Array names in many contexts can be seen as pointers, and this is true for all arrays, not only character arrays. Take the earlier case of defining a string using a pointer to the first character of the string. We can subscript this just like we would with an array.

    #include <stdio.h>

    int main() {
        // String literals should not be modified, so the pointer is const.
        const char* str1 = "hello";
        // Prints 'h'. Note: str1[0] is the char itself, &str1[0] would be its address.
        printf("%c\n", str1[0]);
        return 0;
    }

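The same holds true when using an array name (just the name, no subscripts) as if it were a pointer. This is referred to as the array "decaying" into a pointer to its first element. Note that an array is NOT a pointer. In most expressions an array name can be used as if it were a pointer, but there are exceptions: sizeof applied to an array gives the size of the whole array, & applied to an array gives a pointer to the whole array, and an array name cannot be assigned a new address.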

A language detail that may help clear up some confusion here is the way that subscripting is defined. Pointer arithmetic, as long as the pointer points to an array element, behaves in the most convenient way it could: it moves in units of elements rather than bytes. If p is a pointer to the first element of an array, then p+1 is a pointer to the second element of that array. Subscripting is defined in terms of this: a[0] is equivalent to *(a + 0), that is, take the pointer to the first element, add 0, and apply the dereference operator, *, to access the value stored at that location. That pattern does also mean 0[a] works since addition is commutative, but I would not recommend using that in any codebase that other people may have to see.

A lot of this explanation assumes a base level of knowledge in C which the reader may not have. Check the further reading section for recommended readings if any part of this post confuses you or you would like to know more.

restrict

If you clicked on this blog post, you probably do not have enough experience with C to care about this section or any upcoming sections. Nevertheless, if you are interested, feel free to continue. This section mentions the restrict keyword.

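restrict is used like: int * restrict p. It is a promise from the programmer to the compiler: while p is in scope, any object that is modified through p will be accessed only through p or pointers derived from it. The promise only covers the scope in which the restrict pointer is declared, and the compiler does not enforce it. It is still technically possible to update the memory through a different pointer variable, but doing so breaks the promise and results in undefined behavior.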

It is considered by most to be an advanced C feature, and it may seem pointless, but it allows the compiler to make performance optimizations.

Flexible Array Members and High-Performance Code

On the topic of performance, for readers familiar with manual memory allocation and structs, I have a technique to recommend: the briefly mentioned "struct hack", which C99 standardized as the flexible array member.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct String {
        int STR_LEN;
        char str1[];  // flexible array member: must be the last member
    };

    int main() {
        int n = 10;
        // One allocation: the struct itself plus n bytes for the chars.
        struct String *s = malloc(sizeof(struct String) + n * sizeof(char));
        if (s == NULL) {
            return 1;
        }
        s->STR_LEN = n;
        free(s);
        return 0;
    }

Rather than needing to define the character array length ahead of time and wasting space, we can dynamically allocate space for the string. It may seem like we could have just put a normal pointer inside the struct. That would also have allowed us to allocate the memory for the string dynamically, but we would have lost the benefit of keeping the memory contiguous.

A struct's members are stored in memory in declaration order (the compiler may insert padding between them). A block of memory allocated with malloc is also contiguous. However, in our example, if the flexible array member were instead a character pointer, only the integer member and the pointer itself would sit together. The data referenced by the pointer would live in a separate allocation somewhere else in memory. By using the struct hack, we keep the struct members and the string data in one contiguous block.

This is valuable for performance. CPU caches exploit two access patterns: temporal locality, where data that was used recently is likely to be used again soon, and spatial locality, where data located near recently used data is likely to be referenced next. In our case, we are letting the CPU take advantage of spatial locality by keeping the data in sequence. This decreases the chance of cache misses and saves CPU cycles.