RegexFormat 8  -  Unicode Special Edition



Major Version 8    Released 2-8-2017

Explore Unicode 9 with super controls.


This version is built with VS2015 and requires the VS2015  MFC / CRT  runtime libraries.

They are included in the setup as Merge Modules.

If you have problems with installation, download and install the redistributable from Microsoft.

Alternatively, they are available from the distributable directory:

vc_redist.x64.exe   or   vc_redist.x86.exe



Available in 32 and 64 bit versions.



Quick Download

A zipped install of the latest version can be downloaded here:

32-bit : Version 8   and   64-bit : Version 8,   or from the Download directory.



            Quick Links:

Download  or  Samples  directory,     v7  or  v6   history



   Important Note(s):

RegexFormat 8 uses crypto services from CryptoAPI.dll. This is usually located in the

Windows\System32 directory. Please ensure that it is installed.



Version History   ( Latest build:  8.4 - 2 )



Version 8.4 – 2        1/11/2018          Added the Regex engine type currently selected in the Flags pane

                                                       to the Formatted Output tab label (MDI document).

                                                       When the engine format type changes, the engine  type is included in the formatted tab label.

                                                       Also, when changed, an arrow indicator is set in the State button text as a reminder

                                                       that the regex source needs to be reformatted. The reminder arrow disappears upon

                                                       subsequent formatting. This extends the properties to a more easily visible location.


Version 8.4 – 0      12/1/2017          Release minor sub-version 8.4.



                                                       A Cumulative update ( which includes the previous regex engine modifications ),

                                                       new changes and some bug fixes.



                                                       - Within the Hex Reader  dialog.

                                                          Replaced “CRLF” metrics and highlighting to encompass all Unicode line breaks.

                                                          Modified “Whitespace” metrics and highlighting to encompass all Unicode whitespace.

                                                       - A Save All Modified menu item added in the “File” menu. Includes a Yes to all button within the dialog.



                                                       - Within the Format regex code, fixed a bug where some whitespace was not getting

                                                         escaped when in X-Mode (eXpanded).

                                                       - Within the Strings To Regex(Ternary Tree)  dialog, fixed a bug in the Simple Factoring algorithm.
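The Hex Reader change above (counting all Unicode line breaks and whitespace rather than only CRLF) can be illustrated with a small Python sketch. The exact character sets the application uses are not stated here, so the line-break set below is an assumption, and `str.isspace()` is Python's own Unicode-aware whitespace test, which may differ slightly from the application's.

```python
import re

# Assumed set of line terminators: CRLF (counted once), LF, VT, FF, CR,
# NEL (U+0085), LINE SEPARATOR (U+2028), PARAGRAPH SEPARATOR (U+2029).
LINE_BREAK = re.compile("\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")


def count_line_breaks(text: str) -> int:
    """Count line breaks, treating a CR LF pair as a single break."""
    return len(LINE_BREAK.findall(text))


def count_whitespace(text: str) -> int:
    """Count characters that Python considers Unicode whitespace."""
    return sum(1 for ch in text if ch.isspace())


if __name__ == "__main__":
    sample = "a\r\nb\u2028c\x85d\n"
    print(count_line_breaks(sample))
    print(count_whitespace("a\u00a0b\u3000c d"))
```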



Version 8.3 – 8        11/16/2017        Regex engine modifications:

                                                       - Allow Back References  to undefined groups (not yet parsed).

                                                       - Allow Nested Back References.

                                                       Note – these are significant changes to the boost regex engine. Together with the other mods,

                                                       they bring it on par, in features and performance, with Perl’s regex engine.


Version 8.3 – 7        11/6/2017          Added another new option to the Strings to Regex Ternary Tree Tool:

                                                       Do group factoring. Screenshot:  Strings to Regex – Ternary Tree       


Version 8.3 – 6        10/27/2017        Extended the year range of the  Leap Year Range to Regex  tool to 0 – 9999.




Version 8.3 – 0      10/4/2017          Release minor sub-version 8.3.

                                                       Completed the system wide conversion started in Version 8.2 – 18.

                                                       Changes apply to ICU  mode  (UTF-32) only !

                                                       The non-ICU mode regex operations remain unchanged.



                                                       Removed the facet overhead (UTF-16 to UTF-32) of searching a target string.

                                                       Now uses u32string iterators directly for regex search / replace  operations.

                                                       This includes using u32string when constructing regex,  meaning

                                                       surrogate pairs and stand alone surrogates are resolved to UTF-32 codepoints.

                                                       Results are correctly mapped / highlighted back to the wide string displays.


                                                       Affected code:  All places in the application that use ICU mode (UTF-32).
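As an illustration of what resolving surrogate pairs and standalone surrogates to UTF-32 code points means, here is a minimal Python sketch of the standard UTF-16 decoding arithmetic. It is not the application's code; the function name is hypothetical, and passing lone surrogates through unchanged is an assumption made to mirror the description above.

```python
def utf16_to_code_points(units):
    """Convert UTF-16 code units to code points.

    A high surrogate followed by a low surrogate is combined into one
    supplementary code point; a lone surrogate is passed through unchanged.
    """
    out = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Standard formula: 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00)
            out.append(0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            out.append(unit)
            i += 1
    return out


if __name__ == "__main__":
    # 'A', the pair for U+1F600, then a lone high surrogate
    print([hex(cp) for cp in utf16_to_code_points([0x41, 0xD83D, 0xDE00, 0xD800])])
```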



Version 8.2 – 25      9/28/2017          Upgraded the regex engine to version 1.65.1

                                                       All modifications are carried forward.


Version 8.2 – 23      9/25/2017          Fixed a minor bug where, in certain circumstances, the floating close button

                                                       failed to display (when enabled) when hovering the mouse over the MDI tab.


                                                       Added / modified features in the Layout->Document & MDI Tabs menu:

                                                       - Enable Active Tab Bold Font    ( default = false )

                                                       - Tab Border Width     ( 0 - 5 pixels,  default = 2 )

                                                       - Text Shading - Inactive View     ( None, 20% - 50%,  default = 20% )


Version 8.2 – 21      9/22/2017          Introducing a new regex generate tool:    Leap Year Range to Regex

                                                       A truly accurate tool that lets you generate a custom Leap Year regex given a range of years.

                                                       Multiple compression levels are selectable to suit any project and performance preference.

                                                       This is the first installment of a Date/Time regex generation suite soon to be available.

                                                       Screen shots:     Ly1     Ly2     Ly3     Ly4     Ly5     Ly6     Ly7     Ly8     Ly9
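To give a feel for what a leap year regex looks like, here is a hand-written Python sketch for four-digit years 0000 to 9999 under the proleptic Gregorian rule (divisible by 4, except centuries not divisible by 400). It is not output from the Leap Year Range to Regex tool, which supports custom ranges and compression levels; the names below are hypothetical.

```python
import calendar
import re

# First branch: last two digits are a non-zero multiple of 4.
# Second branch: century years whose first two digits are a multiple of 4.
LEAP_YEAR = re.compile(
    r"(?:\d\d(?:0[48]|[2468][048]|[13579][26])"
    r"|(?:[02468][048]|[13579][26])00)"
)


def is_leap_by_regex(year_text: str) -> bool:
    """Return True if the four-digit year string is a leap year."""
    return LEAP_YEAR.fullmatch(year_text) is not None


def mismatches():
    """List years where the regex disagrees with calendar.isleap."""
    return [
        year
        for year in range(10000)
        if is_leap_by_regex(f"{year:04d}") != calendar.isleap(year)
    ]


if __name__ == "__main__":
    print(is_leap_by_regex("2000"), is_leap_by_regex("1900"), mismatches())
```

Checking the pattern against `calendar.isleap` over the whole range is a cheap way to confirm a generated regex before using it.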


Version 8.2 – 20      9/13/2017          Added a new option to the Strings to Regex Ternary Tree Tool:

                                                       Convert alternations  (?: x | y | z )  to class  [ x y z ]
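The idea behind this option can be sketched in Python: an alternation of single characters matches the same input as a character class of those characters. This is only an illustration, not the tool's algorithm, and the function name is hypothetical.

```python
import re


def alternation_to_class(alternatives):
    """Turn single-character alternatives into an equivalent character class.

    re.escape() is used so characters that are special inside a class
    (such as ], ^, - and the backslash) stay literal.
    """
    if any(len(alt) != 1 for alt in alternatives):
        raise ValueError("only single-character alternatives can become a class")
    return "[" + "".join(re.escape(alt) for alt in alternatives) + "]"


if __name__ == "__main__":
    print(alternation_to_class(["x", "y", "z"]))
```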


Version 8.2 – 19      9/11/2017          Fixed an issue on the 32-bit version where using MemDC for virtual list control with more than

                                                       500,000 items significantly slowed performance.

                                                       The 64-bit version is unaffected.  These virtual lists are used to display Unicode names.



Version 8.2 – 18   9/8/2017            General Modifications:  Removed the facet overhead (utf16 to utf32) of searching a target string

                                                       when  in  ICU  mode. Now uses UString32 iterators directly for regex search operations.


                                                       Affected code: 

                                                       - Benchmark suite,  100% speed increase in ICU  flagged regex.

                                                       - UCD Interface,  100% speed increase in Custom Rx and CodePoints pages.


                                                       Note that the UCD Interface pages now have the full Code Point range available for query.

                                                       This includes leading/trailing surrogates and non-characters as well.



Version 8.2 – 14      8/30/2017          Modified Benchmark suite – Added a custom control vertical bar with thumb indicating

                                                       current top slot. This is a subtle visual indicator when scrolling slots.


                                                       UCD – Custom Rx page, expand the regex input box.

                                                       Fixed a minor startup issue on this page.



Version 8.2 – 11      8/22/2017          Regex engine modification:  Corrected  Non-word boundary construct \B.

                                                       Previously, it did not correctly match at the beginning or end of the string if the adjacent

                                                       character was a non-word character.
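The expected behavior can be checked with a short Python sketch. It uses Python's own `re` module, not the modified Boost engine, purely to show where `\B` should match when the first and last characters are non-word characters. The empty-string case is deliberately left out, since engines and versions differ on it.

```python
import re


def non_boundary_positions(text: str):
    """Return every offset at which the non-word boundary \\B matches."""
    return [m.start() for m in re.finditer(r"\B", text)]


if __name__ == "__main__":
    # '-' is a non-word character, so the start and end of "-a-" are non-boundaries.
    print(non_boundary_positions("-a-"))
```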


                                                       Modified the Match Results title to display the regex options used to obtain the last match.

                                                       This is an important visual aid to help quickly diagnose possible wrong, invalid or non-matches.


                                                       Expanded the Benchmark suite to eight slots available per run.

                                                       The suite has been renamed to Mega-Bench 8 to reflect the increase in slots.

                                                       Screenshots:   Bench1   Bench2   Bench3   Bench Report Generator 



Version 8.2 – 6        7/27/2017          Modified Benchmark suite to update an item’s run display result immediately when

                                                       its run finishes. Previously, item display results were updated upon completion of the last run.

                                                       In the next update we will be adding more item slots (currently there are 2 available for runs).


Version 8.2 – 5        6/21/2017          Added a  Mark Location  debug option to the Mega-String control.

                                                       This option is only enabled for the Parsing function. It adds  = text = marks at

                                                       the location where start and end string quote delimiters were parsed and removed.

                                                       This option helps diagnose errant string quoting.


                                                       Additionally, if the Un-escape delimiters box is checked, it adds a mark where the opening or closing

                                                       delimiter was removed,  or a mark indicating that no delimiter was found but should be at this location.

                                                       Note that un-escaping escaped delimiters does not involve marking.

                                                       This option helps diagnose errant delimited regex.

                                                       Marking is available for parsing functions: Single, Double, and No Quoting.


                                                       Screenshot:     Mega-String : Mark Location 


Version 8.2 – 2        5/23/2017          Added Python’s Raw String syntax generation to the  Mega-String control.

                                                       Options include double r"  " or single r'  ' quote constructs, as well as optional intelligent

                                                       padding already built into the Mega-String control. Optional lines continued + "\n for multi-line.

                                                       Safeguards against an odd number of escapes anywhere in the target as well as at the end of the string,

                                                       and provides proper escaping of delimiters.

                                                       Screenshot:     Mega-String : Python Raw Strings 
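The odd-number-of-escapes safeguard exists because a Python raw string literal cannot end in an odd number of backslashes. The sketch below shows one possible way to generate a safe literal; it is an assumption for illustration, not the Mega-String control's actual output format, and `to_raw_literal` is a hypothetical name.

```python
import ast


def to_raw_literal(text: str) -> str:
    """Build Python source for a string literal equal to `text`.

    Prefers a raw literal; falls back to repr() when a raw literal
    cannot represent the text directly.
    """
    if "\n" in text or "\r" in text:
        return repr(text)  # a single-quoted raw literal cannot span lines
    if '"' not in text:
        quote = '"'
    elif "'" not in text:
        quote = "'"
    else:
        return repr(text)  # both quote kinds present; keep the sketch simple
    trailing = len(text) - len(text.rstrip("\\"))
    if trailing % 2 == 1:
        # Move the final backslash into an adjacent normal literal;
        # Python concatenates adjacent string literals.
        return f"r{quote}{text[:-1]}{quote} {quote}\\\\{quote}"
    return f"r{quote}{text}{quote}"


def round_trips(text: str) -> bool:
    """Check that evaluating the generated literal gives back `text`."""
    return ast.literal_eval(to_raw_literal(text)) == text


if __name__ == "__main__":
    print(to_raw_literal("C:\\dir\\"))
```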


Version 8.2 – 1        5/14/2017          Added Regex Replace Format String Syntax to include Perl, Sed, Literal, and Boost-Extended.

                                                       Formerly, by default, the Perl format string was used in replacements with no other options.

                                                       This can be set within the Macro Manager dialog just above the replace edit box.


 (n/a)                       5/4/2017           Updated IIS7 web.config to allow .rxf mime type sample files to be downloaded.

                                                       These sample files can now be downloaded from the Download directory.


Version 8.2 – 0      4/24/2017          Release minor sub-version 8.2.

                                                       Updated to Regex engine 1.64. All modifications are carried forward.


Version 8.1 – 1        4/19/2017          Regex engine modification to fix a bug in class intersection.

                                                       Update to this if version 8.1-0 was installed.


Version 8.1 – 0      4/12/2017          Release minor sub-version 8.1.

                                                       Regex engine modifications to correctly handle class intersection.

                                                       Example [^\W\D] matches only digits.
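The example works because a negated class containing negated shorthands is an intersection: "not (non-word or non-digit)" leaves only characters that are both word characters and digits. A quick Python check, using Python's `re` module rather than the modified Boost engine (the function name is hypothetical):

```python
import re


def word_and_digit(text: str):
    """Return the characters matched by [^\\W\\D], i.e. word AND digit."""
    return re.findall(r"[^\W\D]", text)


if __name__ == "__main__":
    # Letters and '_' are word characters but not digits, so they are excluded.
    print(word_and_digit("a1_b2-Z9"))
```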


Version 8.0 – 14      4/1/2017            Modified UCD Property Search to trim whitespace and added an automatic tokenize feature.

                                                       If the initial string is not found, the tokenized parts will be searched for instead.

                                                       The token delimiters can consist of any of these characters   <space> _ - , . ' * " ; \t
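The trim-then-tokenize fallback can be sketched in Python as follows. The property names, the case-insensitive substring matching, and the choice to accept a name when any token matches are all assumptions for illustration; only the delimiter set comes from the note above.

```python
import re

# Delimiters listed above: space _ - , . ' * " ; and tab
TOKEN_DELIMITERS = re.compile("[" + re.escape(" _-,.'*\";\t") + "]+")


def search_properties(query: str, names):
    """Search for the trimmed query; if nothing matches, search for its tokens."""
    query = query.strip()
    if not query:
        return []
    whole = [name for name in names if query.lower() in name.lower()]
    if whole:
        return whole
    tokens = [tok.lower() for tok in TOKEN_DELIMITERS.split(query) if tok]
    return [name for name in names if any(tok in name.lower() for tok in tokens)]


if __name__ == "__main__":
    sample_names = ["Script=Latin", "General_Category=Letter", "White_Space"]
    print(search_properties("  White_Space ", sample_names))
    print(search_properties("latin;letter", sample_names))
```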


Version 8.0 – 13      3/23/2017          Fixed a Benchmark issue when advancing position on a zero-length match.

                                                       In a rare case, this resulted in incorrectly reporting the number of matches on a run.
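Zero-length matches are a classic source of miscounts: after an empty match the search position has to be advanced by hand, or the same spot is matched again. The Python sketch below shows one common way to handle it. It is not the Benchmark suite's code, and the function name is hypothetical.

```python
import re


def count_matches(pattern: "re.Pattern", text: str) -> int:
    """Count successive matches, stepping past zero-length matches."""
    count = 0
    pos = 0
    while pos <= len(text):
        match = pattern.search(text, pos)
        if match is None:
            break
        count += 1
        if match.end() > match.start():
            pos = match.end()
        else:
            # Empty match: move one position forward to avoid looping forever.
            pos = match.end() + 1
    return count


if __name__ == "__main__":
    print(count_matches(re.compile(r"x*"), "abc"))
```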


Version 8.0 – 9        3/14/2017          Added a   Unique    page to the UCD Interface dialog.

                                                       This has the same functionality as the Codepoints and Custom-Rx pages, except the regex

                                                       object is removed.  It is instead replaced by an input edit box to paste or type any string.

                                                       The string is analyzed for unique codepoints which are displayed in the result.

                                                       The result can then be processed using the same features as in the Codepoints and Custom-Rx pages.
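The analysis performed by the Unique page can be approximated in a few lines of Python: collect the distinct code points of a string and look up their names. The output format and the sort order are assumptions, and the names come from the Unicode version bundled with the Python interpreter rather than from the application's UCD.

```python
import unicodedata


def unique_code_points(text: str):
    """Return (U+XXXX, name) pairs for each distinct code point, in code point order."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp), "<unnamed>"))
        for cp in sorted(set(map(ord, text)))
    ]


if __name__ == "__main__":
    for code, name in unique_code_points("hello"):
        print(code, name)
```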


Version 8.0 – 8        3/10/2017          Added a   Custom-Rx    page to the UCD Interface dialog.

                                                       This has the same functionality as the Codepoints page, except the regex

                                                       object is editable.  Thus, any regex construct can be used to obtain a codepoint set.

                                                       Properties from the UCD regex cache can be easily added, mixed, and matched within the regex.


Version 8.0 – 6        2/21/2017          Some UCD navigation improvements; the tab control is now prevented from getting focus.


Version 8.0 – 5        2/20/2017          Post-release:  Fixed an issue that caused a crash

                                                       when trying to drag dockable panes after accessing the UCD names page.

                                                       If using a version between 8.0.0 - 8.0.4 it is recommended that it be

                                                       upgraded to  version 8.0.5.






New Unicode features:


A few ‘Super Controls’ are new - UCD (Unicode Character Database) Interface

using ICU4C 58.2. Overhaul of regex engine with full Unicode 9 support, Properties

(over 1200) and Names (0x10FFFF). Includes all scripts and script extensions.


UCD Info Page :   UCD Interface Usage


UCD Tab Screenshots :   Usage    Properties    Codepoints    Names    Unique    Custom-Rx


New viewer available from all editors :   Uni-Name Viewer




Included features:


This application parses,  dynamically formats/expands/compresses Regular Expressions.

Includes a built-in testing regex engine derived and modified from Boost Regex 1.65.1.

Includes a regex benchmarking suite.

Uses and includes the  ICU4C 58.2 Library.

Includes a UCD (Unicode Character Database) Interface, a ‘Super Control’ suite.

Many new controls, including a Unicode Name Viewer to go with the existing Hex Viewer.

View anything from anywhere, it’s integrated into all editors.


See Online Manual (Deprecated)


The core:


Its many strong features include formatting, expanding, compressing expressions,

advanced comment handling, auto-generated capture group comments, analysis

tools, padding, Raw/Single/Double quoted String construction of finished expressions

that can be pasted into development code.


Includes independent property views of the current regular expression providing a quick

look at its state and comprehensive construct metrics and error analysis information.

Errors can be selected in different views. For example, when an error is selected from

the view list, it is instantly selected in both the input and output views; when selected

from the output, it is selected in the input and error list, and so on. This makes

debugging quite easy.


Also included is a selectable, completely customizable analysis overlay of  conditionals

and capture group counting (including named groups last), as well as annotated error

reporting of the entire expression embedded in the formatted output.

Formatting continues to the end of the expression regardless of errors, thus providing

a single pass, downstream look after possibly trivial errors.


A Flags pane is provided to easily turn on/off options and settings.

Over 400 internal flag bits control the parsing/formatting engine giving maximum

flexibility to precisely control how the expression is parsed, how it is expanded or

compressed, and the look and shape of the formatted output.

Its parsing foundation is solid: nearly all individual constructs available in

Regular Expressions are provided for and are individually selectable. There are built-in

presets for the major flavors, but everything can be customized, giving the ability to

define custom language presets.


Included Presets:

·         User-Defined

·         Default

·         Custom

·         Perl

·         PCRE

·         Dot-Net

·         Java 6

·         Java 7

·         JavaScript


Expressions with embedded ‘expanded’ or ‘compressed’ modes are handled seamlessly

by the engine.


Easily unveil the most complex packed expressions in existence with the click of a button.

Debug, refactor, make changes, then pack it back up for production.

Save the document (.rxf) with all of its views and Flags state, open it later when the

time comes for modification or maintenance or for quick recollection.


Whether a novice or expert, if you use Regular Expressions, this application will save

you hours of work.  See it, change it, and maintain it as real code.



Supported Platforms:

Windows XP, Vista, 7, 8, 10


Download RegexFormat

A zipped install of the latest version can be downloaded here

->       32-bit : Version 8   and   64-bit : Version 8


Manual/Help File:


Version 4.2 manual is included in the installation (or available online – see above link),

but can also be downloaded here ->  Manual/Help File



Unzip the files to a temporary directory then run the  Setup.exe  program.

The installed  Samples  directory contains data files with which to evaluate the application.

Miscellaneous samples can be obtained and are added to the Samples directory.



To Purchase:

Single and Multi-Site License(s) are offered and are now available for purchase.

Accepted payment methods include Major Credit Card or PayPal account.

Questions can be directed to support@regexformat.com


Choose a RegexFormat license purchase option:


Ø  Single License -   Price  $29 (USD)




Ø  Multi-Site License -   Price  $25 (USD) / ea. , quantity 2-100

(Requires an organization name/address)




A  registration key will be emailed to you after the purchase process completes.




RDNC Software

RegexFormat – Copyright  ©  2013 – 2018  RDNC Software