When to use the terms “delimiter,” “terminator,” and “separator”

A delimiter denotes the limits of something: where it starts and where it ends. For example:

"this is a string"

has two delimiters, both of which happen to be the double-quote character. The delimiters indicate what’s part of the thing, and what is not.

A separator distinguishes two things in a sequence:

one, two
1\t2
code();  // comment

The role of a separator is to mark the boundary between two distinct entities so that they can be told apart. (I say “two” because in computer science we’re generally processing a linear sequence of characters, so a separator sits between one item and the next.)

A terminator indicates the end of a sequence. In a CSV, you could think of the newline as terminating the record on one line, or as separating one record from the next.
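The difference between the two readings shows up when the input ends with a newline. Here is a minimal Python sketch, assuming a two-record input made up for illustration:

```python
# Two records, each followed by a newline (sample data invented for this sketch).
text = "a,b\nc,d\n"

# Newline as a separator: something is expected after the final newline,
# so splitting yields a trailing empty record.
as_separator = text.split("\n")

# Newline as a terminator: each newline ends a record, so there is no empty tail.
as_terminator = text.splitlines()

print(as_separator)   # ['a,b', 'c,d', '']
print(as_terminator)  # ['a,b', 'c,d']
```

Which reading is right depends on the format you are parsing; neither is correct in general.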

Token boundaries are often marked by a change in character class rather than by an explicit character:

foo()

would likely be tokenized as word(foo), lparen, rparen. There aren’t any explicit delimiters between the tokens, but a tokenizer would recognize the change in character class between alphabetic and punctuation characters, and would treat each parenthesis as a token in its own right.
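As a rough illustration, a tokenizer of this kind can be sketched in Python with a regular expression. The token names (word, lparen, rparen) follow the example above; the pattern itself and the decision to skip whitespace are assumptions made for this sketch, not a description of any particular language’s lexer.

```python
import re

# One named group per token class. Only the classes needed for "foo()" are
# defined here; a real lexer would define many more.
TOKEN_RE = re.compile(
    r"(?P<word>[A-Za-z_]\w*)|(?P<lparen>\()|(?P<rparen>\))|(?P<space>\s+)"
)

def tokenize(source):
    """Split source into (kind, text) pairs with no explicit delimiters."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        kind = match.lastgroup  # name of the group that matched
        if kind != "space":     # whitespace is skipped, not emitted
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens

print(tokenize("foo()"))
# [('word', 'foo'), ('lparen', '('), ('rparen', ')')]
```

Nothing in the input marks where foo ends; the boundary falls where the word pattern stops matching.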

The categories aren’t completely distinct. For example:

[red, green, blue]

could (depending on your syntax) be a list of three items: the brackets delimit the list, the commas separate the items, and the right bracket both terminates the list and marks the end of the blue token.
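To make the overlapping roles concrete, here is a small Python sketch that reads such a list, treating the brackets as delimiters and the commas as separators. The function name parse_list and the exact rules (optional whitespace, no nesting, no quoting) are assumptions for illustration only.

```python
def parse_list(text):
    """Parse a flat, bracket-delimited, comma-separated list of bare words."""
    text = text.strip()
    # The brackets delimit the list: both must be present.
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("list must be delimited by [ and ]")
    inner = text[1:-1]
    if not inner.strip():
        return []  # "[]" is an empty list
    # The commas separate the items; surrounding whitespace is not significant.
    return [item.strip() for item in inner.split(",")]

print(parse_list("[red, green, blue]"))
# ['red', 'green', 'blue']
```

Note that the last item has no comma after it: its end is marked only by the closing bracket, which is the dual role described above.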

As for Stack Overflow’s use of those terms as tags, they’re just that: tags to indicate the topic of a question. There isn’t a single unified controlled vocabulary for tags; anyone with enough reputation can create a new tag. Terminology differs enough between fields that you could never have a single controlled tag vocabulary across all of the topics that SO covers.
