ABSTRACT
Managing strings in C often requires dynamic allocation because the length of input is rarely known at compile time. By using a struct to wrap a heap-allocated char*, we can create persistent, flexible string objects.
1. Struct Alignment and Size
When defining a struct that contains a pointer, the compiler must ensure that members are correctly “aligned” in memory for efficient CPU access.
typedef struct String {
    int len;        // 4 bytes
    char* contents; // 8 bytes (on a 64-bit system)
} String;

- sizeof(String): Even though the two members add up to only 12 bytes (4 + 8), sizeof(String) will often return 16.
- Alignment: The char* contents member is 8 bytes long (since it is an address). On typical 64-bit platforms this member must start at an address that is a multiple of 8 (8-byte aligned). This results in 4 bytes of “padding” after the int len.
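The sizes and offsets discussed above depend on the compiler and target, so it can help to ask the compiler directly. The following is a minimal sketch using the standard offsetof macro; the helper names padding_after_len and print_string_layout are made up for illustration and are not part of the original material.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>   /* printf */

typedef struct String {
    int len;         /* typically 4 bytes */
    char *contents;  /* typically 8 bytes on a 64-bit system */
} String;

/* Hypothetical helper: number of padding bytes the compiler placed
   between the end of len and the start of contents.
   len is the first member, so it always sits at offset 0. */
size_t padding_after_len(void) {
    return offsetof(String, contents) - sizeof(int);
}

/* Hypothetical helper: print the layout chosen for this build. */
void print_string_layout(void) {
    printf("sizeof(int)        = %zu\n", sizeof(int));
    printf("sizeof(char *)     = %zu\n", sizeof(char *));
    printf("offset of contents = %zu\n", offsetof(String, contents));
    printf("padding after len  = %zu\n", padding_after_len());
    printf("sizeof(String)     = %zu\n", sizeof(String));
}
```

Calling print_string_layout() from main on a typical 64-bit build should report an offset of 8 for contents, 4 bytes of padding, and a total size of 16. A 32-bit build would usually show no padding at all, because pointers there are only 4 bytes wide.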
2. Implementing readline
In high-level languages like Python, reading input is handled automatically. In C, we use a temporary buffer on the Stack to capture input and then move it to a precisely sized block on the Heap.
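#include <stdio.h>
#include <stdlib.h>
#include <string.h>

String readline(void) {
    char input[1000]; // Temporary stack buffer
    if (fgets(input, sizeof(input), stdin) == NULL) {
        return (String){0, NULL}; // End of input or read error
    }
    size_t length = strlen(input); // Includes the trailing '\n', if one was read
    // Allocate exactly enough space on the heap (+1 for null terminator)
    char* contents = malloc(length + 1);
    if (contents == NULL) {
        return (String){0, NULL}; // Allocation failed
    }
    strcpy(contents, input);
    String s = { (int)length, contents };
    return s; // The struct is copied to the caller, but the 'contents' pointer still points to the heap.
}

- Why Malloc?: Using malloc ensures the string data persists after readline returns. The stack buffer input stops existing when the function returns, so a pointer to it would be invalid in the caller.
- Copying Behavior: When s is returned, the len and the contents address are copied to the caller’s stack, but the actual text remains in the same spot on the heap.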
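The copying behavior described above can be observed directly: copying a String copies only the length and the address, so both copies refer to the same heap block. The sketch below assumes a hypothetical constructor string_from (so that no keyboard input is needed) and a hypothetical check shares_buffer; neither appears in the original material.

```c
#include <stdlib.h>  /* malloc */
#include <string.h>  /* strlen, memcpy */

typedef struct String {
    int len;
    char *contents;
} String;

/* Hypothetical helper: build a String that owns a heap copy of text.
   Returns {0, NULL} if the allocation fails. */
String string_from(const char *text) {
    size_t length = strlen(text);
    char *contents = malloc(length + 1);   /* +1 for the null terminator */
    if (contents == NULL) {
        return (String){0, NULL};
    }
    memcpy(contents, text, length + 1);    /* copies the '\0' as well */
    return (String){(int)length, contents};
}

/* Hypothetical helper: returns 1 when both structs point at the
   same heap block, 0 otherwise. */
int shares_buffer(String a, String b) {
    return a.contents == b.contents;
}
```

For example, after String a = string_from("hello"); and String b = a;, the call shares_buffer(a, b) returns 1, and a change made through b.contents is visible through a.contents. Because there is only one heap block, it should be released with free exactly once, no matter how many struct copies exist.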
3. The Warning: Memory Leaks
Because memory obtained from malloc is never released automatically, a caller that uses readline in a loop like the following has a memory leak.
while (1) {
    String s = readline(); // A new heap block is created every loop
    if (s.contents == NULL) {
        break; // End of input: nothing was allocated
    }
    if (strstr(s.contents, to_find) != NULL) {
        printf("%s\n", s.contents);
    }
    // WARNING: 's.contents' is never freed!
}
WARNING
Every call to readline allocates new memory. Once the loop repeats and the variable s is overwritten, the previous pointer is lost. Those heap blocks become unreachable, wasting memory until the program terminates.
Key Takeaway
To fix this leak, the caller must manually call free(s.contents) once they are finished with the string. This ensures the heap space is reclaimed for future allocations.
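As a sketch of that fix, the loop below frees each line before reading the next one. To keep the example testable without keyboard input, it assumes a hypothetical variant readline_from that reads from any FILE* instead of stdin, and a hypothetical wrapper count_matching_lines that counts matches rather than printing them; both names are illustrative only.

```c
#include <stdio.h>   /* FILE, fgets */
#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* strlen, memcpy, strstr */

typedef struct String {
    int len;
    char *contents;
} String;

/* Hypothetical variant of readline that reads from any stream.
   Returns {0, NULL} at end of input, on a read error, or if malloc fails.
   Lines longer than 999 characters arrive in several pieces. */
String readline_from(FILE *in) {
    char input[1000];                      /* temporary stack buffer */
    if (fgets(input, sizeof(input), in) == NULL) {
        return (String){0, NULL};
    }
    size_t length = strlen(input);
    char *contents = malloc(length + 1);   /* +1 for the null terminator */
    if (contents == NULL) {
        return (String){0, NULL};
    }
    memcpy(contents, input, length + 1);
    return (String){(int)length, contents};
}

/* Counts the lines that contain to_find, releasing every line
   before the next one is read so no heap block is left behind. */
int count_matching_lines(FILE *in, const char *to_find) {
    int matches = 0;
    for (;;) {
        String s = readline_from(in);
        if (s.contents == NULL) {
            break;                         /* nothing was allocated */
        }
        if (strstr(s.contents, to_find) != NULL) {
            matches++;
        }
        free(s.contents);                  /* the fix: reclaim this line */
    }
    return matches;
}
```

The free call sits at the end of the loop body, after the last use of s.contents, so the pointer is never read again once the block has been released. Calling count_matching_lines(stdin, to_find) would apply the same pattern to the loop shown in section 3.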