When I first wrote a small parser in C to count some metadata operators, it worked perfectly fine on macOS. But as soon as I compiled the same code on Linux, it blew up with a scary error:
malloc(): memory corruption
Aborted (core dumped)
At first, I thought it was a compiler issue or some subtle platform difference. But after digging deeper, I realized it was my mistake all along. Let me walk you through the journey, what went wrong, and how I fixed it.
The Original Code
Here’s the function I started with:
int meta_counter(FILE *meta_file){
    int counter = 0;
    char *c;
    char line[1024];
    while ((c = fgets(line, sizeof(line), meta_file)) != NULL)
    {
        char *first = malloc(sizeof(c));
        strcpy(first,c);
        char *rest = strchr(first, ' ');
        *rest = 0;
        if (strcmp(first,"Start") != 0 && strcmp(first,"End") != 0) {
            //handle typos
            char *d = remove_white_spaces(c);
            replace_string(d,';',':');
            replace_string(d,'.',':');
            char *e = (char*)malloc(sizeof(d) + 1);
            remove_string(e, d, ' ');
            // put a ':' at the end of the line
            if (e[strlen(e)-1] != ':') e[strlen(e)] = ':';
            //count operators in line 'e'
            char *key = ":";
            char *ptr = e;
            while((ptr = strchr(ptr, ':')) != NULL) {
                counter++;
                ptr++;
            }
        }
    }
    rewind(meta_file);
    return counter;
}
Looks innocent, right? But it hides a very subtle C trap.
The Real Error: sizeof Misuse
The root bug is here:
char *first = malloc(sizeof(c));
- c is a pointer.
- sizeof(c) gives me the size of the pointer (8 bytes on 64-bit Linux).
- I then strcpy(first, c), which copies the entire string into an 8-byte buffer. Boom: heap corruption.
I repeated the same mistake later with:
malloc(sizeof(d) + 1);
which again only gave me space for a pointer, not for the string.
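To make the difference concrete, here is a minimal sketch. The helper names (buggy_request, correct_request, copy_string) are mine for illustration and are not part of the original parser.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* What the buggy call asked for: the size of a pointer,
   no matter how long the string behind it is. */
size_t buggy_request(const char *s) {
    return sizeof(s);              /* typically 8 on 64-bit systems */
}

/* What the string actually needs: its characters plus the '\0'. */
size_t correct_request(const char *s) {
    assert(s != NULL);
    return strlen(s) + 1;
}

/* Heap-allocate a copy of s, sized by its contents.
   Returns NULL if malloc fails; the caller frees the result. */
char *copy_string(const char *s) {
    size_t n = correct_request(s);
    char *p = malloc(n);
    if (p != NULL) memcpy(p, s, n);
    return p;
}
```

Note that sizeof does give the full size when applied to the array itself (sizeof(line) is 1024 in the original function). It is only the pointer c that collapses to pointer size.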
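Why didn’t this crash on macOS? Writing past the end of a heap buffer is undefined behavior, so neither platform is required to fail. The two allocators simply lay out and check the heap differently. glibc keeps its chunk bookkeeping next to the user data and validates it on later malloc and free calls, so an overrun that tramples it tends to be caught and the process aborted. On macOS the same overrun happened to go unnoticed in my runs. Same bug, different behavior.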
Other Hidden Bugs
While debugging, I found a few more issues:
- *rest = 0; without checking if rest is NULL.
- Appending ':' without making space for a '\0'.
- Memory leaks (first, d, e never freed).
- Treating the result of fgets as permanent storage (it just points to line).
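Two of these issues, the unchecked strchr result and the missing terminator after the appended ':', can be isolated into small helpers. This is a sketch under my own assumptions: cut_at_first_space and ensure_trailing_char are hypothetical names, and the explicit capacity parameter is one possible way to make the "is there room?" question visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut s at its first space, if it has one.
   Returns 1 if a cut was made, 0 if s contains no space. */
int cut_at_first_space(char *s) {
    assert(s != NULL);
    char *space = strchr(s, ' ');
    if (space == NULL) return 0;   /* nothing to cut: leave s untouched */
    *space = '\0';
    return 1;
}

/* Make sure the string in buf (capacity cap bytes) ends with ch.
   Returns 0 on success, -1 if there is no room for ch plus '\0'. */
int ensure_trailing_char(char *buf, size_t cap, char ch) {
    assert(buf != NULL);
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == ch) return 0;  /* already there */
    if (len + 2 > cap) return -1;                 /* ch and '\0' must both fit */
    buf[len] = ch;
    buf[len + 1] = '\0';                          /* re-terminate explicitly */
    return 0;
}
```

The original line e[strlen(e)] = ':' overwrote the terminator and never wrote a new one, so every later strlen or strchr call on e could run off the end of the buffer.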
The Fix and Safer Version
I rewrote the parser with more safety in mind:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

typedef struct {
    size_t colon_count;
    size_t lines_read;
    size_t lines_skipped; // "Start"/"End"
    size_t malformed;     // lines missing expected pieces
} MetaStats;

static char *xstrdup(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = (char *)malloc(n);
    if (!p) { perror("malloc"); exit(EXIT_FAILURE); }
    memcpy(p, s, n);
    return p;
}

static void rstrip(char *s) {
    size_t n = strlen(s);
    while (n && (unsigned char)s[n-1] <= ' ') s[--n] = '\0';
}

static char *lstrip_inplace(char *s) {
    while (*s && (unsigned char)*s <= ' ') s++;
    return s;
}

static void replace_chars(char *s, char from, char to) {
    for (; *s; ++s) if (*s == from) *s = to;
}

static void remove_char_copy(char *dst, const char *src, char ch) {
    while (*src) {
        if (*src != ch) *dst++ = *src;
        src++;
    }
    *dst = '\0';
}

static size_t count_char(const char *s, char ch) {
    size_t n = 0;
    for (; *s; ++s) if (*s == ch) ++n;
    return n;
}
MetaStats meta_counter2(FILE *meta_file) {
    MetaStats stats = {0,0,0,0};
    char line[1024];
    while (fgets(line, sizeof line, meta_file)) {
        stats.lines_read++;
        // normalize line
        rstrip(line);
        char *cur = lstrip_inplace(line);
        if (*cur == '\0') continue; // empty
        if (*cur == '#') continue;  // comment
        // copy for tokenizing
        char *first = xstrdup(cur);
        // isolate first token
        char *space = strchr(first, ' ');
        if (space) *space = '\0';
        if (strcmp(first, "Start") == 0 || strcmp(first, "End") == 0) {
            stats.lines_skipped++;
            free(first);
            continue;
        }
        // typo handling
        char *d = xstrdup(cur);
        replace_chars(d, ';', ':');
        replace_chars(d, '.', ':');
        // remove spaces
        size_t need = strlen(d) + 2;
        char *e = (char *)malloc(need);
        if (!e) { perror("malloc"); exit(EXIT_FAILURE); }
        remove_char_copy(e, d, ' ');
        // ensure trailing ':'
        size_t len = strlen(e);
        if (len == 0) {
            stats.malformed++;
        } else {
            if (e[len-1] != ':') {
                e[len] = ':';
                e[len+1] = '\0';
            }
            stats.colon_count += count_char(e, ':');
        }
        free(first);
        free(d);
        free(e);
    }
    rewind(meta_file);
    return stats;
}
int main(void) {
    FILE *fp = fopen("meta.txt", "r");
    if (!fp) { perror("meta.txt"); return 1; }
    MetaStats s = meta_counter2(fp);
    printf("Colons: %zu\nLines read: %zu\nSkipped: %zu\nMalformed: %zu\n",
           s.colon_count, s.lines_read, s.lines_skipped, s.malformed);
    fclose(fp);
    return 0;
}
Why This Works Better
- Allocates strlen(...) + 1 for strings (or uses strdup).
- Always checks pointers before dereferencing.
- Guarantees NUL-termination after modifications.
- Frees heap memory at the end of each loop iteration.
- Adds extra stats (lines_skipped, malformed) for practice.
Extra Practice Ideas
While fixing this, I added a couple of features for fun and learning (comment support and a malformed-line count). The rest of this list is ideas worth trying next:
- Support comments (# at start of line).
- Count malformed lines.
- Make “Start”/“End” comparisons case-insensitive.
- Replace the static buffer with getline for unlimited line size.
- Run under Valgrind to confirm zero leaks:
valgrind --leak-check=full ./meta
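The case-insensitive comparison and the getline idea can be sketched roughly as follows. Both strcasecmp and getline are POSIX rather than standard C, so this assumes a POSIX system such as Linux or macOS. The helpers is_block_keyword and longest_line are hypothetical names of mine, chosen only to show the calls in isolation.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations of getline/strcasecmp */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>    /* strcasecmp */
#include <sys/types.h>  /* ssize_t */

/* Returns 1 if token is "Start" or "End" in any letter case, else 0. */
int is_block_keyword(const char *token) {
    assert(token != NULL);
    return strcasecmp(token, "Start") == 0 || strcasecmp(token, "End") == 0;
}

/* Returns the length of the longest line in fp, not counting the newline.
   getline grows its buffer as needed, so there is no 1024-byte limit. */
size_t longest_line(FILE *fp) {
    char *line = NULL;   /* getline allocates and reallocates this */
    size_t cap = 0;
    size_t longest = 0;
    ssize_t n;
    while ((n = getline(&line, &cap, fp)) != -1) {
        if (n > 0 && line[n - 1] == '\n') n--;    /* drop the trailing newline */
        if ((size_t)n > longest) longest = (size_t)n;
    }
    free(line);          /* one free after the loop, not one per line */
    return longest;
}
```

With getline the buffer is reused across iterations, so the loop body in meta_counter2 could work on that buffer directly and free it once at the end.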
Final Thought
In the end, fixing this bug reminded me just how unforgiving C can be when it comes to memory management. A small misuse of sizeof was enough to crash my program on Linux, even though it seemed to “work” on macOS. By switching to strlen(...) + 1, carefully checking pointers, and always freeing what I allocate, I’ve built a safer and more portable parser.