C++ tokenizer

Poby’s Home
2 min read · Jul 3, 2022

Several ways to tokenize a string

1. strtok

The strtok() function breaks a string into a sequence of zero or more
nonempty tokens. On the first call to strtok(), the string to be
parsed should be specified in str. In each subsequent call that should
parse the same string, str must be NULL.

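strtok is not re-entrant: it keeps its parsing position in hidden static state, so two tokenizations that overlap (nested loops, or multiple threads) corrupt each other. It also modifies the input string by overwriting delimiters with '\0'. Prefer strtok_r, the re-entrant version, where it is available (it is a POSIX function).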

    // C/C++ program for splitting a string
    // using strtok()
    #include <stdio.h>
    #include <string.h>

    int main()
    {
        // strtok() writes '\0' over each delimiter, so the input must be
        // a modifiable array, not a string literal.
        char str[] = "Geeks-for-Geeks";

        // Returns the first token
        char *token = strtok(str, "-");

        // Keep printing tokens until strtok() finds no more of them
        while (token != NULL)
        {
            printf("%s\n", token);
            token = strtok(NULL, "-");
        }
        return 0;
    }

2. strtok_r

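The strtok_r() function is a reentrant version of strtok(). The saveptr argument is a pointer to a char * variable that strtok_r() uses internally to maintain context between successive calls that parse the same string. Because that context lives in the caller's variable rather than in hidden static state, different strings can be parsed at the same time as long as each uses its own saveptr.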

    // C/C++ program to demonstrate strtok_r()
    // by splitting a string on the space character.
    #include <stdio.h>
    #include <string.h>

    int main()
    {
        char str[] = "Geeks for Geeks";
        char *saveptr = NULL; // context kept between calls
        char *token = strtok_r(str, " ", &saveptr);

        // Pass NULL on subsequent calls; saveptr tracks the position
        while (token != NULL)
        {
            printf("%s\n", token);
            token = strtok_r(NULL, " ", &saveptr);
        }
        return 0;
    }

3. stringstream

    // C++ program for splitting a string
    // using stringstream and getline()
    #include <stdio.h>
    #include <string>
    #include <sstream> // stringstream
    using namespace std;

    int main()
    {
        string s = "hello world my world";
        stringstream input(s);
        string token;

        // getline() reads up to the next ' ' and discards the delimiter
        while (getline(input, token, ' ')) {
            printf("%s\n", token.c_str());
        }
        return 0;
    }
