To explain a little about how it works: it reads the input script one character at a time and keeps track of the context – whether it is inside a string, inside a comment, and so on:
var CODE = 0; /* normal JS code */
var STRING_DBL = 1; /* double quoted string */
var STRING_SGL = 2; /* single quoted string */
var REGEXP = 3; /* regexp literal */
var ESCAPE = 4; /* some escape char (backslash) */
var MULTI_LINE_COMMENT = 5;
var SINGLE_LINE_COMMENT = 6;
For every character, it considers whether this character changes the context from one mode to another. For example, if we're in normal code and see a " character, we switch to "double quoted string" mode. Conversely, if we're already in double quoted mode, we switch back to "normal code" mode. If, however, we're in "escape" mode (typically after a backslash inside a string) and see a " character, the script knows that this double quote does not terminate the string.
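To make the mode switching concrete, here is a minimal sketch that handles only the string and escape modes. It is not the actual script: the function name `scanStrings` and the `returnTo` variable are made up for illustration, and the constants are repeated so the snippet runs on its own.

```javascript
// Illustrative sketch only: tracks string and escape context, nothing else.
var CODE = 0, STRING_DBL = 1, STRING_SGL = 2, ESCAPE = 4;

function scanStrings(src) {
  var mode = CODE;
  var returnTo = CODE; // mode to resume once the escaped character is consumed
  var modes = [];      // mode in effect after each character has been read
  for (var i = 0; i < src.length; i++) {
    var ch = src.charAt(i);
    if (mode === ESCAPE) {
      mode = returnTo; // an escaped character never terminates the string
    } else if (mode === CODE) {
      if (ch === '"') mode = STRING_DBL;
      else if (ch === "'") mode = STRING_SGL;
    } else if (ch === '\\') {
      returnTo = mode; // remember which kind of string we are in
      mode = ESCAPE;
    } else if ((mode === STRING_DBL && ch === '"') ||
               (mode === STRING_SGL && ch === "'")) {
      mode = CODE;     // matching quote closes the string
    }
    modes.push(mode);
  }
  return modes;
}
```

Calling `scanStrings('a="x\\"y";')` returns one mode per input character, so you can check that the escaped double quote leaves the scanner in `STRING_DBL` rather than dropping back to `CODE`.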
it is a division operator, while in the expression
it marks the boundary of a regular expression. Hence, if we are in code mode and see a forward slash, it takes some extra thinking to tell if we should enter "regular expression" mode or not.
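One common heuristic for that "extra thinking" is to look back at the previous non-whitespace character: after an operand (an identifier, a number, a closing parenthesis or bracket) a slash is most likely a division, otherwise it probably starts a regexp. The sketch below assumes that heuristic. It is not necessarily what the script does, and `slashStartsRegExp` is a made-up name.

```javascript
// Illustrative heuristic: guess whether the "/" at src[pos] starts a regexp literal.
// The caller is assumed to have ruled out "//" and "/*" comments already.
function slashStartsRegExp(src, pos) {
  var i = pos - 1;
  // Skip backwards over whitespace to the previous significant character.
  while (i >= 0 && /\s/.test(src.charAt(i))) i--;
  if (i < 0) return true; // nothing precedes the slash, so it cannot divide anything
  // After an identifier character, digit, ")" or "]" treat the slash as division.
  return !/[\w$)\]]/.test(src.charAt(i));
}
```

This is only a guess based on one character of context. It misjudges cases such as a regexp directly after the keyword `return` or after the closing parenthesis of an `if (...)` condition, so a more robust version would need to look at whole tokens.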
Last week, the forward slash was causing me trouble again. Looking at a problem here, I noticed that the formatting got all messed up after this statement:
Did you find the usage of an un-escaped forward slash in the regexp character class weird? Apparently,
is a valid regular expression. 😮
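The consequence for a character-by-character scanner is that, once in regexp mode, it must also track whether it is inside a character class, because a forward slash between `[` and `]` does not close the literal. Here is a small sketch of that idea, using `/[/]/` as an illustrative pattern (not necessarily the one from the problem report) and a made-up helper name, `findRegExpEnd`:

```javascript
// Illustrative helper: given the index of the opening "/" of a regexp literal,
// return the index of the closing "/", or -1 if the literal is not terminated.
function findRegExpEnd(src, start) {
  var inClass = false;
  for (var i = start + 1; i < src.length; i++) {
    var ch = src.charAt(i);
    if (ch === '\\') { i++; continue; } // skip the escaped character
    if (ch === '\n') return -1;         // a regexp literal cannot span lines
    if (inClass) {
      if (ch === ']') inClass = false;  // end of the character class
    } else if (ch === '[') {
      inClass = true;                   // slashes are literal until "]"
    } else if (ch === '/') {
      return i;                         // unescaped slash outside a class
    }
  }
  return -1;
}
```

For the input `x = /[/]/g;` the helper skips the slash inside the brackets and reports the one just before the `g` flag as the end of the literal.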