Validation of Constructs

When parsing a regular expression, all constructs are validated as far
as their form is concerned.

For example, the syntax of a subtraction character class differs from that
of a regular character class. So if the PCRE preset is selected and the
expression contains a subtraction class, errors may result.

Conditionals are another example. All conditional forms are validated against
the selected language. So if the Perl preset is selected and the condition is
an expression, an error will result. If the Dot-Net preset is selected and the
condition is a code form, an error will result.
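The two conditional cases above can be pictured as a lookup keyed on the selected preset. The following is only a minimal sketch in Python, not RegexFormat's actual implementation: the table, the form labels "expression" and "code", and the function name check_conditional are all hypothetical, and the table holds only the two rejections described above.

```python
# Hypothetical table: condition forms each preset rejects.
# Only the two cases named in the text are listed.
REJECTED_CONDITION_FORMS = {
    "Perl": {"expression"},
    "Dot-Net": {"code"},
}


def check_conditional(preset: str, condition_form: str) -> None:
    """Raise ValueError if the preset does not accept this condition form."""
    rejected = REJECTED_CONDITION_FORMS.get(preset, set())
    if condition_form in rejected:
        raise ValueError(
            f"a {condition_form} condition is not valid under the {preset} preset"
        )


check_conditional("Perl", "code")           # accepted, returns None
check_conditional("Dot-Net", "expression")  # accepted, returns None
```

Calling check_conditional("Perl", "expression") or check_conditional("Dot-Net", "code") raises ValueError in this sketch.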

So all of the syntax that tells a parser it has found a specific construct
in a particular language flavor is recognized by RegexFormat.
This includes naming conventions and allowable character syntax in variables,
backrefs, hex, Unicode, named Unicode, recursion constructs, etc.

Things that are NOT validated:

- Character ranges in classes

Example: in [z-a], there is no check that the range is in order (‘z’ comes after ‘a’).

- Property Names

Example: in \p{Vanderberg}, there is no lookup to see if ‘Vanderberg’ exists.

- Backtracking Control Verbs

Example: in (*FROTO), there is no lookup to see if ‘FROTO’ is a valid verb.

- Non-‘x’ modifier letters in a cluster form or modifier group

Modifier form syntax is validated (Perl or not), but only the modifier letter ‘x’
is resolved. Example: in (?xP) or (?-xP: \d), there is no check to see if ‘P’ is valid.
Indeed, it may become valid in the future.

The x-modifier is always processed to determine the current scope of expansion or
compression that the RegexFormat engine uses.
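
The modifier handling described above (check the form, resolve only ‘x’, ignore what the other letters mean) can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not RegexFormat's code: the pattern MODIFIER_FORM and the function resolve_x are hypothetical names, and the accepted form is simplified to letters, an optional dash followed by letters, then ‘)’ or ‘:’.

```python
import re

# Form only: "(?" + on-letters + optional "-" off-letters + ")" or ":".
# Which letters are legal modifiers is deliberately not checked.
MODIFIER_FORM = re.compile(r"\(\?([A-Za-z]*)(?:-([A-Za-z]+))?([):])")


def resolve_x(construct: str, x_active: bool = False) -> bool:
    """Return the x-modifier state after `construct`.

    Raises ValueError when the text is not a well-formed modifier construct.
    """
    m = MODIFIER_FORM.match(construct)
    if m is None or not (m.group(1) or m.group(2)):
        raise ValueError(f"not a well-formed modifier construct: {construct!r}")
    on_letters = m.group(1)
    off_letters = m.group(2) or ""
    if "x" in off_letters:
        return False      # x switched off
    if "x" in on_letters:
        return True       # x switched on
    return x_active       # x not mentioned: state unchanged


resolve_x("(?xP)")               # True: 'x' is on, 'P' is never looked up
resolve_x("(?-xP: \\d)", True)   # False: 'x' is switched off for the group
```

A construct with broken form, such as "(?x-)", raises ValueError in this sketch, while an unknown letter such as ‘P’ passes through unexamined.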

The aim of RegexFormat is to be as flexible as possible in its ability to continue
formatting an expression. This means moving past a particular language’s secondary
level of validation.
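
To see what a secondary level of validation looks like in practice, a regex engine can be asked to compile the out-of-order range from the list above. The sketch below uses Python's re module purely as a stand-in for such an engine (it is not one of the RegexFormat presets named here); the helper name engine_accepts is hypothetical.

```python
import re


def engine_accepts(pattern: str) -> bool:
    """Return True if Python's re engine compiles the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


engine_accepts("[a-z]")   # True: the range is in order
engine_accepts("[z-a]")   # False: the engine rejects the reversed range
```

Both patterns have the same form, a class containing one range, so a form-only check treats them alike. Only the engine's deeper check tells them apart, and that is the step a formatter can skip in order to keep formatting.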