Introduction

Hello wordfile creator!

Here is what you have been waiting for for a long time: a macro set that sorts all words of all color groups of a syntax highlighting language definition.

It correctly handles case sensitivity according to Nocase, words beginning with /, substrings defined with **, and the special language settings HTML_LANG, XML_LANG and LATEX_LANG. FORTRAN_LANG and the other language markers have no effect on the sort order of the words in the color groups.

It does not matter whether the language definition with the words to sort is stored in a file together with other language definitions, for example wordfile.txt or wordfile.uew, or in a file containing only one language definition. Blank lines anywhere within the language definition are also allowed; the macro removes them during execution. Set the caret anywhere within the language definition you want to sort and start the macro SortLanguage. That's all; lean back and watch what happens.

ATTENTION!

Do not use the macro SortLanguage with UltraEdit v15.00.0.1033 to v15.00.0.1047. With
these versions of UltraEdit this macro does not work because of two bugs in UltraEdit.

The macro does not work correctly if the setting Automatically copy to clipboard when selection is made is enabled at Configuration - Editor - Miscellaneous. Uncheck this setting before running the macro. Also disable word-wrap mode if word-wrap is enabled for the wordfile or by default for new (temporary) files.

For wordfiles containing only a single syntax highlighting language, see also the command line tool SortLanguage and the Windows GUI UE Companion Utility.


General sorting requirements

Here is some general information about the sorting requirements of words in a syntax highlighting wordfile for UltraEdit and UEStudio.

The first line of a syntax highlighting language block in a wordfile is the language definition line. It starts with uppercase /Lx, where x is a number in the range 1 to 20. Normally the name of the syntax highlighting language in double quotes immediately follows the language number. The language definition line must end with either File Extensions = or File Names = followed by the list of file extensions or file names of the files whose content should be highlighted with this syntax highlighting language.

All keywords and key strings supported by UltraEdit and UEStudio to define how a syntax highlighting language should highlight the content of a file are case-sensitive, and those key strings with an equal sign require exactly 1 space before and 1 space after the equal sign. The following example shows a language definition line with incorrectly written keywords and key strings, followed by the corrected line:

/L20"Example" NoCase String Chars =" Line Comment  = // File Extensions= TXT

/L20"Example" Nocase String Chars = " Line Comment = // File Extensions = TXT

Compare the first line with the errors with the correct second line. What is wrong in the first line?

In keyword "Nocase" the character c is written in the wrong case.
In key string "String Chars = " the space after the equal sign is missing.
In key string "Line Comment = " there are 2 spaces before the equal sign.
In key string "File Extensions = " the space before the equal sign is missing.

The keyword Nocase in the language definition line is important for the sort order of the words in the color groups because, among other things, it controls the case sensitivity of the words. Therefore the macro SortLanguage searches the language definition line for the word nocase in any case and always replaces it with the correct keyword Nocase before it starts to sort the words.
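
The two checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the macro's actual implementation: the names fix_nocase, spacing_errors and KEY_STRINGS are hypothetical, and only the key strings mentioned in this section are checked.

```python
import re

# Key strings named in this section (assumption: the list is not complete).
KEY_STRINGS = ("String Chars", "Line Comment", "File Extensions", "File Names")


def fix_nocase(definition_line):
    """Replace the word nocase written in any case with the keyword Nocase."""
    return re.sub(r"\bnocase\b", "Nocase", definition_line, flags=re.IGNORECASE)


def spacing_errors(definition_line):
    """Return the key strings not written with 1 space around the equal sign."""
    bad = []
    for key in KEY_STRINGS:
        match = re.search(re.escape(key) + r"( *)=( ?)", definition_line)
        # Exactly one space is required before and one after the equal sign.
        if match and (match.group(1) != " " or match.group(2) != " "):
            bad.append(key)
    return bad


WRONG = '/L20"Example" NoCase String Chars =" Line Comment  = // File Extensions= TXT'
RIGHT = '/L20"Example" Nocase String Chars = " Line Comment = // File Extensions = TXT'

print(spacing_errors(WRONG))
print(spacing_errors(RIGHT))
print(fix_nocase(WRONG))
```

The sketch reports the three key strings with wrong spacing for the first example line and none for the second one.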


All words in the color groups starting with the same character may be on the same line or spread across multiple lines. However, if they are spread across multiple lines, the lines must follow one another with no empty lines or other lines between them.

If the language is case-sensitive, the letter A is different from a and so words starting with A must be on a different line from words starting with a.
Words starting with the letter A must be on the same line as words starting with the letter a if the language is not case-sensitive.

First, an example for a case-sensitive language with several sorting errors:

Collection
Checkbox case
Anchor Applet Dictionary Area Arguments Array abstract
Boolean
Button
Crypto
Date Document Drive Drives
break
byte
default delete do double
class const catch char continue

What is wrong in the example above and why?

The word case starts with a lowercase c and the language is case-sensitive. So this word must be on a different line than the word Checkbox which starts with an uppercase C. The same mistake was made here for the word abstract.

The word Dictionary starts with D and therefore must be on a different line than the words starting with A.

The word Crypto starts with C and therefore must be on the same line with Collection or Checkbox or on a separate line, but with no other lines between the lines with words starting with C. In the example there are lines with words starting with A and B between the line with Checkbox and the line with Crypto and therefore this word is ignored.

It is no problem for UltraEdit/UEStudio that the words class const catch char continue are not sorted alphabetically within the line. It is also no problem that, for example, the line with the words default delete do double is above the line with the words starting with c. Nor does it matter if some lines contain multiple words starting with the same character while other words starting with the same character are spread over multiple lines, as long as the lines with words starting with the same character form one contiguous block within a color group. But with such a weird grouping and ordering of the words, mistakes can happen very easily when inserting additional words. Therefore the SortLanguage macro also sorts the words within a line and the entire lines alphabetically. Here is the corrected word list as produced by the macro:

Anchor Applet Area Arguments Array
Boolean Button
Checkbox Collection Crypto
Date Dictionary Document Drive Drives
abstract
break byte
case catch char class const continue
default delete do double

Now let us assume the keyword Nocase exists on the language definition line and therefore the case of the letters of the words in the color groups is not important. In this case all the words starting with a lowercase character in the list above would not be highlighted correctly. The correct word order for a language ignoring the case of the letters A to Z would be:

abstract Anchor Applet Area Arguments Array
Boolean break Button byte
case catch char Checkbox class Collection const continue Crypto
Date default delete Dictionary do Document double Drive Drives
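
The grouping rule behind both corrected lists can be sketched in Python. This is a minimal illustration of the rule, not a translation of the macro: the function name sort_color_group and the nocase parameter are hypothetical, and only plain ASCII words without special prefixes are assumed.

```python
from itertools import groupby


def sort_color_group(words, nocase=False):
    """Sort words and put all words with the same first character on one line."""
    if nocase:
        # Case is ignored: A and a belong to the same line.
        sort_key = lambda word: (word.lower(), word)
        first_char = lambda word: word[0].lower()
    else:
        # Case-sensitive: uppercase letters sort before lowercase letters.
        sort_key = lambda word: word
        first_char = lambda word: word[0]
    ordered = sorted(words, key=sort_key)
    return [" ".join(group) for _, group in groupby(ordered, key=first_char)]


WORDS = ("Collection Checkbox case Anchor Applet Dictionary Area Arguments "
         "Array abstract Boolean Button Crypto Date Document Drive Drives "
         "break byte default delete do double class const catch char "
         "continue").split()

for line in sort_color_group(WORDS):
    print(line)
print()
for line in sort_color_group(WORDS, nocase=True):
    print(line)
```

Called with the words of the faulty example, the sketch prints the case-sensitive list first and the list for a Nocase language second.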

Language-specific letters with a character value greater than 127 are always interpreted case-sensitively by the syntax highlighting engine, independent of the presence of the keyword Nocase in the wordfile. But wordfiles usually do not contain such letters, and therefore the macro set for sorting the keywords does not treat words with such letters differently, although the syntax highlighting engine would require it.


Lines starting with / are interpreted by UltraEdit/UEStudio as a line with a special syntax highlighting keyword. Therefore all lines in the color groups containing one or more "words" starting with / must start with // to be interpreted correctly. An example with a wrong and a correct line:

/word1 /word2

// /word1 /word2


A line starting with ** defines a line with 1 or more substrings. The strings on this line can start with different characters. The lines with substrings must only build a block within a color group, best at top of the color group. Normally only 1 line is required for the definition of substrings in a color group. All words starting with those substrings are completely highlighted with the color of the color group. For more details on substrings see the documentation of TestForDuplicate.
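
How the special line prefixes // and ** could be read is sketched below in Python. This is a rough model of the rules in this section, not the behavior of the real highlighting engine: the functions parse_color_group and is_highlighted are hypothetical, matching is assumed to be case-sensitive, and word delimiters are ignored.

```python
def parse_color_group(lines):
    """Split the lines of a color group into whole words and substrings."""
    words, substrings = set(), []
    for line in lines:
        if line.startswith("** "):
            # Substring definition line: every string is a word prefix.
            substrings.extend(line[3:].split())
        elif line.startswith("// "):
            # Line with words that themselves start with a slash.
            words.update(line[3:].split())
        else:
            words.update(line.split())
    return words, substrings


def is_highlighted(word, words, substrings):
    """A word matches if it is listed or starts with one of the substrings."""
    return word in words or any(word.startswith(s) for s in substrings)


group = ["** str mem", "// /word1 /word2", "if else"]
words, substrings = parse_color_group(group)
print(is_highlighted("strcpy", words, substrings))
print(is_highlighted("/word1", words, substrings))
print(is_highlighted("while", words, substrings))
```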


A language marked with HTML_LANG or XML_LANG in the language definition line enables the HTML/XML specific interpretation of the words in the color groups. If one of these keywords is present, < or </ may be placed in front of any word (tag) to highlight it as desired without all keywords starting with < needing to be on the same line. Instead, tags starting with the same letter must be on the same or contiguous lines, as normally required for words, as if the tags did not begin with < or </.


A language marked with LATEX_LANG in the language definition line enables the LaTeX/TeX specific interpretation of the words in the color groups. If a word begins with \ then the second character is used to determine which line the word should be on. All words beginning with \a should be on the same line as other words beginning with \a or just a. In the same way, all words beginning with \b should be on the same line as other words beginning with \b or just b, but on a different line from those starting with \a, and so on.
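
The HTML/XML and LaTeX rules boil down to choosing a different character for grouping. The following Python sketch shows one way to express this; the function group_char and its parameters html, latex and nocase are hypothetical names chosen for illustration, and the macro itself works differently (it temporarily moves the prefixes to the end of the words before sorting).

```python
def group_char(word, html=False, latex=False, nocase=False):
    """Return the character that decides on which line a word belongs."""
    core = word
    if html:
        # With HTML_LANG or XML_LANG a leading "</" or "<" is ignored.
        for prefix in ("</", "<"):
            if core.startswith(prefix) and len(core) > len(prefix):
                core = core[len(prefix):]
                break
    if latex and core.startswith("\\") and len(core) > 1:
        # With LATEX_LANG the character after the backslash counts.
        core = core[1:]
    char = core[0]
    return char.lower() if nocase else char


print(group_char("</table", html=True))
print(group_char("<td", html=True))
print(group_char("\\alpha", latex=True))
print(group_char("<td"))
```

Without the html flag the last call returns < itself, which is why all tags would have to share one line in a language that is not marked with HTML_LANG or XML_LANG.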


For more details and help about syntax highlighting wordfiles, see the page Syntax Highlighting in the help of UltraEdit or UEStudio and the forum topic Readme for the Syntax Highlighting forum.


General macro information

Some general information about the macros used to sort the words in all color groups of a language.

The macros are ready for use in the macro file SyntaxTools.mac. They were developed with compatibility with many versions of UltraEdit and UEStudio in mind and were tested with many versions of UltraEdit. Nevertheless, always take a quick look at the result of the sorting operation. It is always possible that a version of UE/UES released after the last update of the macro set has a bug in its program code that results in wrong macro execution.

To use this macro set you need at least v8.20 of UltraEdit or any version of UEStudio. The macros were developed and tested with UE v10.10c and later versions of UltraEdit.

If you find any bugs or have other related questions, post them at http://www.ultraedit.com/forums/viewtopic.php?t=443.

You can see the source code of the macros below, with lots of comments, in case you are interested in how the macros work. If you want to make changes to better fit your requirements, feel free to do so, but take the following into consideration:

All macros should have the following properties:

Show Cancel Dialog for this macro ... disabled
Continue if a Find with Replace not found ... enabled (UE versions before v13.10a+2)
Continue if search string not found ... enabled (UE v13.10a+2 and later)
Hotkey ... none

You can assign a hotkey to macro SortLanguage if it is used frequently. Never run the submacros manually!

Remove the comment lines (the lines beginning with //) together with the blank lines before copying the instructions to the macro edit window. The comments are only for experts who want to know how the macros work.

The submacro WrapLines sets the maximum number of characters per line to 106, which is the best value for printing with Courier New 8 with 1.5 cm left and right borders on a European A4 sheet. This line length also works well at lower resolutions (1024x768) with at least one additional view open on the left or right side and a normal font size used for displaying the text. A wordfile for the UE/UES community should not use longer lines, so that most users can read it without scrolling horizontally. But if you don't want this line length limit, remove in macro SortLanguage the command

PlayMacro 1 "WrapLines"
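
The effect of the line length limit can be illustrated with a small Python sketch. It is a simplified greedy model and not a translation of the submacro: the function wrap_word_line is a hypothetical name, and unlike the macro the sketch simply leaves a single word longer than the limit on a line of its own. Continuation lines repeat the prefix "// " or "** " so that they are still interpreted correctly.

```python
MAX_LEN = 106  # line length limit used by the submacro WrapLines


def wrap_word_line(line, max_len=MAX_LEN):
    """Split one line of words into lines of at most max_len characters."""
    prefix = ""
    for marker in ("// ", "** "):
        if line.startswith(marker):
            prefix = marker
            break
    words = line[len(prefix):].split()
    wrapped, current = [], []
    for word in words:
        candidate = prefix + " ".join(current + [word])
        if current and len(candidate) > max_len:
            # The line is full: close it and start a new one with the word.
            wrapped.append(prefix + " ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        wrapped.append(prefix + " ".join(current))
    return wrapped


long_line = "// " + " ".join("/w%03d" % i for i in range(40))
for part in wrap_word_line(long_line):
    print(len(part), part[:20])
```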

Disclaimer

THIS MACRO SET IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, STATUTORY OR OTHERWISE, INCLUDING WITHOUT LIMITATION ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO USE, RESULTS AND PERFORMANCE OF THE MACRO SET IS ASSUMED BY YOU AND IF THE MACRO SET SHOULD PROVE TO BE DEFECTIVE, YOU ASSUME THE ENTIRE COST OF ALL NECESSARY SERVICING, REPAIR OR OTHER REMEDIATION. UNDER NO CIRCUMSTANCES, CAN THE AUTHOR BE HELD RESPONSIBLE FOR ANY DAMAGE CAUSED IN ANY USUAL, SPECIAL, OR ACCIDENTAL WAY OR BY THE MACRO SET.


History

2005-04-27:

Fixed problem with more than one space between the words resulting in additional empty lines for each additional space on sorted language.

Fixed missing command Key HOME in the last line of this offline documentation.

2006-02-19:

A workaround for a bug of UltraEdit v10.xx was not done well: if the caret was already on the language definition line whose words should be sorted, and this language definition was the last one in a wordfile with more than 1 language definition, the previous language definition was sorted. I found this problem myself and fixed it with a rewritten code block for finding the start of the current language definition. This new solution is even simpler than the previous one.

Second, I rewrote the code block for selecting the whole language definition. The command Find RegExp "%*^p^p" was used in previous versions to select all lines of the language definition. This worked only because of a bug of UltraEdit. The new language selecting code block is much more complicated but with additional code it now allows also blank lines within the language definition. Note: Such blank lines are removed during macro execution.

2006-04-02:

Because of a bug in UltraEdit/UEStudio the Cancel dialog can cause crashes when calling a submacro. To avoid those crashes the macro property Show Cancel Dialog for this macro is not set any more in all macros.

Many users creating a syntax highlighting language definition for the first time write the keyword Nocase wrong. The keywords are case-sensitive and so nocase and NoCase are ignored by UE/UES. The macro SortLanguage detects now also a wrongly written Nocase keyword and corrects it automatically before sorting is executed.

With UltraEdit v11 and with UEStudio the keyword XML_LANG was introduced which has the same special meaning for words starting with < or </ as HTML_LANG. The macro SortLanguage recognizes now this special keyword too.

Before macro SortLanguage is executed the caret must be set anywhere within a language definition. If the caret is set on a blank line above a language definition and the file contains more than one language definition, the language definition above the caret was sorted by previous versions of this macro. If no other language definition is in the file and the caret is set on a blank line above the only language definition, the previous versions of the macro SortLanguage have done nothing. The macro SortLanguage was modified to first set the caret on a line which does contain any character before selecting the whole language definition. Now always the language below the current caret position is sorted if the caret is set on a blank line (= line which contains no or only whitespace characters).

Lastly, some spelling mistakes were corrected in the documentation and the style of the documentation was also changed a little.

2006-04-17:

The macros were designed for being executed on files with DOS line terminations because the syntax highlighting wordfile must be also a DOS file. The SortLanguage macro creates twice a new file. If the user has specified in the configuration dialog that the Default file type for new files is UNIX or MAC and not DOS and additionally has not selected the option Automatically convert to DOS format, new files were created not with CR/LF as line terminations and the macros failed. To solve this problem the command UnixMacToDos was inserted immediately after the 2 NewFile commands to make sure that the new file is always a DOS file.

Added to this documentation where to insert the macro command UnixReOn or PerlReOn if the user prefers the UNIX or Perl compatible regular expression engine instead of the UltraEdit regular expression engine which is used for these macros. Search for UnixReOn to find the 2 exit positions.

The order of the macros within the macro file has changed. The main macro SortLanguage is now the first macro in the file. That allows the user to run the macro SortLanguage also with Play Again from the Macro menu immediately after loading the macro file. So there is no need any more to select the macro SortLanguage from the macro list before execution after loading the macro file.

The macro file SyntaxTools.mac now contains also 3 additional macros to test a language definition for duplicate words. See TestForDuplicate for details about this additional macro set.

Lastly, some small mistakes were corrected in the documentation.

2006-11-19:

There were 2 small errors in the macro codes for SortLanguage and ExpandSubstring. In both macros there was 1 Else which should be EndIf. These errors did not have an effect on the function of the macros.

The UltraEdit versions v12.10+3 to v12.10b and the UEStudio versions 5.50 and 5.50a always move the focus to the nearest left tab in the file tab order instead of the last used file according to the window history when closing a file. With the release of UltraEdit v12.20 and UEStudio v6.00 the focus handling after closing a file tab can be customized with the option Move to nearest left tab after current tab is closed at Configuration - Application Layout - File Tabs. If this option is set, or if one of the UE/UES versions is used which always sets the focus to the nearest left file tab after a file is closed, the wordfile with the language definition to sort had to be the rightmost tab. Otherwise the macro pasted the sorted language definition into the wrong file after closing the temporary files at the end of the macro SortLanguage.

Now the macro SortLanguage has been improved and works independent of which file gets the focus after closing the 2 temporary file tabs. The macro now searches for the still existing selection of the whole language definition in all open windows before it pastes the sorted language definition over the unsorted definition.

2006-12-05:

In the source file the selected part of the language definition line does not start with a slash. Also in the temp file right before copying the sorted language back the language definition line has no slash at start. But for the loop to find correct file after closing last temp file a regular expression search was inserted which should find the start of the language definition line and should copy the language name with its language number to clipboard 8. This search was never successful and so the "find correct file" loop was executed with last content of clipboard 8 which could successfully find the correct file, but could also lead to an endless window switching loop. Fixed this bug by deleting the slash character in the regular expression search for the language definition number and name.

Lastly, some small spelling mistakes were corrected in the documentation.

2007-05-01:

The macro file was renamed from SyntaxSort.mac to SyntaxTools.mac and this file from SyntaxSort.htm to SortLanguage.htm. The macro file contains now also an additional macro to test a language definition for invalid words. See TestForInvalid for details about this additional macro. The macro source code is also available as UEM file - see top of this text file. The macros for sorting the words were not modified.

2007-08-02:

Near the end of macro SortLanguage UltraEdit does not find under certain conditions (depending on PC hardware, version, source file) the language number and language name at top of the first temporary file and so does not copy those data to user clipboard 8. This could result in an endless loop because the correct window is never found because of wrong content in clipboard 8. A workaround was added for this very special UltraEdit problem.

2009-04-30:

Changed the regular expression for finding the language definition line to find also such lines which have no language name, only the language number. And made small modifications in comments and code, but without any effect on execution or result of the macro and therefore not really worth to document them in detail. Most changes were made on this description with lots of new information for interested readers.

2009-06-11:

In macro ExpandSubstring changed the method used to delete the remaining space at start of lines with substrings because of a bug detected with UltraEdit v15.00.0.1048.

2009-11-13:

Added the attention at top of the file.

2011-06-13:

Updated the macro to support also languages with up to 20 color groups.

2012-05-29:

Modified the macros SortLanguage and ReconvertWords to get in HTML and XML wordfiles the strings <? and ?> listed as <? ?> instead of ?> <? on a line after sort. This modification has no effect on syntax highlighting. <? ?> is just the better order for these 2 strings.

2013-01-06:

Rewrote the submacros CollectCase and CollectNocase and added submacro WrapLines for faster and better collecting words starting with same character on lines wrapped after column 106.

Modified main macro SortLanguage once more for better sorting HTML/XHTML and XML tags. The sort order is now

<tag <tag> </tag>

instead of

<tag> <tag </tag>

These 2 changes resulting in a better output after using macro SortLanguage have no effect on syntax highlighting based on wordfiles sorted already before with this macro.

Space characters at beginning of lines with words are now removed too. In previous versions such a space character at beginning of lines with words resulted in a blank line within a color group.


Source code of the macros

MACRO CollectCase

// This macro is a submacro for macro SortLanguage. It collects the
// sorted word definitions of a color group. All words starting with
// the same case-sensitive character are collected on a single line.
// 2013-01-06: This submacro was rewritten for faster speed and does not
// wrap the lines after column 106 any more, see submacro WrapLines.
InsertMode
ColumnModeOff
HexOff
UnixReOff
Loop
IfEof
ExitLoop
EndIf
IfCharIs "$%*+?[]^"
"^"
Key LEFT ARROW
StartSelect
Key RIGHT ARROW
Key RIGHT ARROW
EndSelect
Copy
Key LEFT ARROW
Key BACKSPACE
Else
StartSelect
Key RIGHT ARROW
EndSelect
Copy
Key LEFT ARROW
EndIf

// Inserting a space before running the UltraEdit regular expression
// Replace All was necessary as some versions of UltraEdit removed also the
// line break above the line at caret position like if option "Replace All
// is from top of file" would be enabled. The inserted space is removed
// after the Replace All.
" "
Find MatchCase RegExp "^p^(^c^)"
Replace All " ^1"
Key HOME
IfColNum 1
Delete
Else
Key BACKSPACE
EndIf
Key DOWN ARROW
EndLoop

MACRO CollectNocase

// This macro is a submacro for macro SortLanguage. It collects the
// sorted word definitions of a color group. All words starting with
// the same not case-sensitive character are collected on a single line.
// 2013-01-06: This submacro was rewritten for faster speed and does not
// wrap the lines after column 106 any more, see submacro WrapLines.
InsertMode
ColumnModeOff
HexOff
UnixReOff
Loop
IfEof
ExitLoop
EndIf
IfCharIs "$%*+?[]^"
"^"
Key LEFT ARROW
StartSelect
Key RIGHT ARROW
Key RIGHT ARROW
EndSelect
Copy
Key LEFT ARROW
Key BACKSPACE
Else
StartSelect
Key RIGHT ARROW
EndSelect
Copy
Key LEFT ARROW
EndIf
" "
Find RegExp "^p^(^c^)"
Replace All " ^1"
Key HOME
IfColNum 1
Delete
Else
Key BACKSPACE
EndIf
Key DOWN ARROW
EndLoop

MACRO ExpandSubstring

// This macro is a submacro for macro SortLanguage. It expands every
// substring definition which are marked with "** " at beginning of
// a line with "**__" and removes the substring definition string "** ".
InsertMode
ColumnModeOff
HexOff
UnixReOff
Loop
Find RegExp "%^*^* "
IfNotFound
ExitLoop
// 2006-11-19: Fixed wrong command Else to EndIf.
EndIf
StartSelect
Key LEFT ARROW
EndSelect
Delete
SelectLine
StartSelect
Find " "
Replace All SelectText " **__"
EndSelect
EndLoop
Top
// 2009-06-11: Changed the method used to delete the remaining space at start
// of the line with substrings because of a bug in UltraEdit v15.00.0.1048.
Find RegExp "% ^*^*__"
Replace All "**__"

MACRO ReconvertWords

// This macro is a submacro for macro SortLanguage. After collecting the
// words, some specials must be reconverted. First all lines with words
// beginning with a slash must begin with "// ". This can be easily done
// with 1 search and replace.
InsertMode
ColumnModeOff
HexOff
UnixReOff
Find RegExp "%/"
Replace All "// /"

// Substrings are marked with "**__" at beginning of each word. The problem
// here at converting it back is, that also words starting with "*" can be
// defined in the same color group and now these words are mixed with the
// substrings. So the substring words must be extracted from the words
// starting with "*".
Find RegExp "^*^*__"
IfFound
Key LEFT ARROW
Key LEFT ARROW
Key LEFT ARROW
Key LEFT ARROW
"
"
Key UP ARROW
Key END
IfColNum 1
Key DEL
Else
Key BACKSPACE
EndIf
Loop
Find RegExp "^*^*__[~ ^p]+"
IfNotFound
ExitLoop
EndIf
EndLoop
Key RIGHT ARROW
Key LEFT ARROW
"
"
Key DEL
Top
Find RegExp "%^*^*__"
Replace All "^*^* "
Find RegExp "^*^*__"
Replace All ""
EndIf

// Also HTML (starting with "<" or "</") and LATEX specific words
// (starting with "\") must be reconverted back to their original
// definition. Whether the language is HTML_LANG (XML_LANG) or LATEX_LANG
// is stored at column 1 of lines 2 and 3 ("0" = FALSE, "1" = TRUE).
Top
Key DOWN ARROW
IfCharIs "1"
Find RegExp "^([~ ^p]+^)_</"
Replace All "</^1"
Find RegExp "^([~ ^p]+^)<=<"
Replace All "<^1"
Find RegExp "^([~ ^p]+^)<_<"
Replace All "<^1>"
EndIf
Key DOWN ARROW
IfCharIs "1"
Find RegExp "^([~ ^p]+^)__\"
Replace All "\^1"
EndIf

MACRO WrapLines

// 2013-01-06: This macro is a submacro for macro SortLanguage. After
// sorting all words of a color group and reconverting back the special
// words, this submacro wraps lines longer than 106 characters. A smart
// hard wrap must be done because of substring definitions and lines
// starting with a slash. Please note that the loop below becomes an
// endless loop if there is a single word with 107 or more characters.
InsertMode
ColumnModeOff
HexOff
UnixReOff
Loop

// Find a line with at least 107 characters. If none found up to
// bottom of file, exit the loop for wrapping lines at column 107.
Find RegExp "%???????????????????????????????????????????????????????????????????????????????????????????????????????????"
IfNotFound
ExitLoop
EndIf

// Cancel the selection.
EndSelect
Key LEFT ARROW
Key RIGHT ARROW

// Replace next space in upwards direction by a DOS line termination.
Find MatchCase Up " "
Replace "^p"

// If the now wrapped line starts with a slash, the new line must start
// also with "// " as all lines containing words starting with a slash.
Key UP ARROW
IfCharIs "/"
Key DOWN ARROW
"// "
Key HOME
Else

// If the now wrapped line starts with the substring definition "** ",
// the new line must start also with "** " so that the words on this
// line are also interpreted as substrings.
IfCharIs "*"
Key RIGHT ARROW
IfCharIs "*"
Key RIGHT ARROW
IfCharIs 32
Key HOME
Key DOWN ARROW
"** "
Key HOME
EndIf
EndIf
EndIf
EndIf
EndLoop

MACRO SortLanguage

InsertMode
ColumnModeOff
HexOff
UnixReOff

// 2006-04-02: Set caret to the end of the current line and check if it
// is a blank line or a line which only contains spaces, tabs and form feeds
// which are removed with a loop solution because the macro code sequence:
//   Key RIGHT ARROW
//   Find RegExp Up "[ ^t^b]+"
//   Delete
// does not work (bug of find command). If the current line is a blank line,
// the caret is set in a loop to the end of the next line until either a
// non-blank line is found or the end of the file is reached.
Loop
Key END
IfEof
ExitLoop
EndIf
IfColNum 1
Key DOWN ARROW
Else
Key LEFT ARROW
IfCharIs 32
Key DEL
Else
IfCharIs 9
Key DEL
Else
IfCharIs 12
Key DEL
Else
ExitLoop
EndIf
EndIf
EndIf
EndIf
EndLoop

// 2006-02-19: The following code section is rewritten to fix the problem
// with selecting the wrong language definition if the words of the last
// language definition in the current file with more than 1 language
// definitions should be sorted.
// Find start of current language definition and exit macro if not found.
// The caret is set to the end of the current line before searching down
// to next language definition. This makes sure that if caret is already
// at start of the language definition, not the wrong language definition
// is later selected. If next language definition is found, set caret
// back at start of this language definition line. Now search up for the
// language definition and if found set the caret to start of this line.
// 2006-04-17: Insert macro command UnixReOn or PerlReOn before command
// ExitMacro if you prefer UNIX or Perl compatible regular expressions.
Find MatchCase RegExp "%/L[1-9][0-9]++*File "
IfFound
Key HOME
EndIf
Find MatchCase RegExp Up "%/L[1-9][0-9]++*File "
IfNotFound
ExitMacro
EndIf
Key HOME

// 2006-02-19: This language selecting code block with removing the
// blank lines after copying to the new temp file is completely new.
// The caret is at start of the line of the current language definition.
// The macro should select everything from here to next language definition
// or end of the file. But this would only select the language definition
// of the current language if the caret is at the start of the line. So
// the caret must be moved right once. The '/' of /Ln is not selected
// by this new method which must be taken into consideration twice at
// the following macro code.
Key RIGHT ARROW
StartSelect
Find MatchCase RegExp Select "%/L[1-9][0-9]++*File "
IfSel
Key HOME
Else
SelectToBottom
EndIf
Clipboard 9
Copy
EndSelect

// All lines of the language definition including blank lines except first
// character ('/') is now copied into clipboard 9. The macro now creates
// a new temp file, pastes the lines, removes all trailing spaces, deletes
// all blank lines (in a loop because of a bug in v10.xx of UltraEdit) and
// adds a single empty line at end of the file if the last line is already
// terminated with a line break or adds 2 line breaks, if the last line
// is not terminated with a line break.
// 2006-04-17: Inserted UnixMacToDos to create always a new DOS file.
NewFile
UnixMacToDos
Paste
Top
TrimTrailingSpaces
Loop
Find "^p^p"
Replace All "^p"
IfNotFound
ExitLoop
EndIf
EndLoop
Bottom
IfColNum 1
"
"
Else
"

"
EndIf
Top

// Check case sensitivity, HTML and LATEX language definitions for
// later correct sort according to these settings. Because the macro
// language does not support variables, the first three lines are
// used temporary for information storage and cut to clipboard 9.
// 2006-02-19: Add missing '/' at start of language definition line
// and set caret back to top of file before searching for the keywords.
// 2006-04-02: Search also for wrong written Nocase and fix it. And
// search also for XML_LANG which is handled by UE/UES like HTML_LANG.
"0...Nocase
0...HTML_LANG
0...LATEX_LANG
/"
Top
Find RegExp "%/L[1-9][0-9]++*Nocase"
IfFound
Key END
Find Up "Nocase"
Replace "Nocase"
OverStrikeMode
Top
"1"
EndIf
OverStrikeMode
Top
Find MatchCase RegExp "%/L[1-9][0-9]++*HTML_LANG"
IfFound
Top
Key DOWN ARROW
"1"
EndIf
Top
Find MatchCase RegExp "%/L[1-9][0-9]++*XML_LANG"
IfFound
Top
Key DOWN ARROW
"1"
EndIf
Top
Find MatchCase RegExp "%/L[1-9][0-9]++*LATEX_LANG"
IfFound
Top
Key DOWN ARROW
Key DOWN ARROW
"1"
EndIf
InsertMode
Top
StartSelect
Key DOWN ARROW
Key DOWN ARROW
Key DOWN ARROW
EndSelect
Cut

// A second temp file is required for sorting the words which is opened here
// once outside the following color group sorting loop. Clipboard 8 is used
// now for copying data.
// 2006-04-17: Inserted UnixMacToDos to create always a new DOS file.
// 2011-06-13: Loop count increased from 8 to 20 for up to 20 color groups.
NewFile
UnixMacToDos
NextWindow
Clipboard 8
Loop 20

// Select all words of the current group and copy them to the second temp
// file. It is possible that a color group is defined but does not contain
// any words. In this case the group is ignored. The selecting search
// should also be possible with a single "^{%/C[1-9][0-9]++^}^{^p$^}"
// search, but this does not work for an empty color group (bug?).
// 2011-06-13: Regular expressions changed for up to 20 color groups.
Find MatchCase RegExp "%/C[1-9][0-9]++"
IfNotFound
ExitLoop
// 2006-11-19: Fixed wrong command Else to EndIf.
EndIf
Key HOME
Key DOWN ARROW
Find MatchCase RegExp Select "%/C[1-9][0-9]++"
IfNotFound
Find MatchCase RegExp Select "^p$"
EndIf
StartSelect
Key HOME
EndSelect
IfSel
Copy
PreviousWindow
Paste
Top

// Remove trailing spaces and remove "// " which marks lines with words
// beginning with '/'. Then expand all substring definitions with "**__"
// instead of a single "** " at start of the line. This is done with a
// submacro because loop nesting is not possible in the UE macro language.
// 2013-01-06: Whitespace at the beginning of a line is now removed too.
TrimTrailingSpaces
Find RegExp "%[ ^t]+"
Replace All ""
Find RegExp "%// "
Replace All ""
PlayMacro 1 "ExpandSubstring"

// Now convert all spaces to line breaks to get only 1 word per line.
// 2005-04-27: Find " " was replaced by the following command to fix a
// problem with more than one space between the words.
Find RegExp "[ ^t]+"
Replace All "^p"

// Now the LATEX and HTML specific reformatting is done. For LATEX words
// starting with a backslash, the backslash is removed at the beginning
// and appended at the end of each word. For HTML the same is done with
// "<" and "</".
// 2013-01-06: Changed sort order for HTML/XML tags from  <tag> <tag </tag>
// to  <tag <tag> </tag> to get the same order for tags as the tools written
// by Daniel W. Moore. The order of HTML and XML tags is even better when
// the lines are split up before this reformatting and a space is inserted
// which is removed immediately after the sort.
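// Example (illustration only, traced by hand from the Find/Replace
// pairs below, not taken from a test run; "tag" and "word" are just
// placeholder names):
//   \word    becomes  word __\       (LATEX_LANG)
//   <tag     becomes  tag <=<        (HTML_LANG / XML_LANG)
//   <tag>    becomes  tag <_<
//   <tag/>   becomes  tag @/<_<
//   </tag>   becomes  tag >_</
// With the tag name first and '=' sorting before '_' and '<' before
// '>', the lines should sort in the order  <tag <tag> </tag>.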
Top
Clipboard 9
Paste
Key UP ARROW
IfCharIs "1"
Find RegExp "\^([~ ^p]+^)"
Replace All "^1 __\"
EndIf
Top
Key DOWN ARROW
IfCharIs "1"
Find RegExp "</^([~ ^p]+^)"
Replace All "^1 _</"
Find RegExp "<^([~ ^p]+^)>"
Replace All "^1 <_<"
Find RegExp "<^([~_/ ^p][~ ^p]++^)"
Replace All "^1 <=<"
Find MatchCase RegExp "/ <_<$"
Replace All " @/<_<"
Find MatchCase RegExp "> _</$"
Replace All " >_</"
EndIf
Clipboard 8

// Sort all lines except the first 3 lines with the variables according
// to the case sensitivity specified for the language and reconvert
// special words back to original definition.
Top
IfCharIs "0"
Key DOWN ARROW
Key DOWN ARROW
Key DOWN ARROW
SelectToBottom
SortAsc RemoveDup 1 -1 0 0 0 0 0 0
EndSelect
Top
Key DOWN ARROW
Key DOWN ARROW
Key DOWN ARROW
Find MatchCase RegExp " @/<_<$"
Replace All "/<_<"
Find MatchCase " "
Replace All ""
PlayMacro 1 "CollectCase"
Else
Key DOWN ARROW
Key DOWN ARROW
Key DOWN ARROW
SelectToBottom
SortAsc IgnoreCase RemoveDup 1 -1 0 0 0 0 0 0
EndSelect
Top
Key DOWN ARROW
Key DOWN ARROW
Key DOWN ARROW
Find MatchCase RegExp " @/<_<$"
Replace All "/<_<"
Find MatchCase " "
Replace All ""
PlayMacro 1 "CollectNocase"
EndIf
Top
PlayMacro 1 "ReconvertWords"

// Delete the 3 variable lines.
Top
StartSelect
Key DOWN ARROW
Key DOWN ARROW
Key DOWN ARROW
EndSelect
Delete

// 2013-01-06: Wrap all lines after column 106.
// Remove the next command to not wrap the lines.
PlayMacro 1 "WrapLines"

// Copy the now sorted words of the current color group back to the
// first temp file, replacing the possibly unsorted words.
SelectAll
Cut
NextWindow
Paste
EndIf
EndLoop

// Discard second temp file. 2006-11-19: Clear clipboard 8 is done later.
PreviousWindow
CloseFile NoSave

// Now copy the whole sorted language definition with clipboard 9 back
// to the source file, replacing the unsorted language definition, and
// discard the first temp file.
// 2006-02-19: The '/' at start of the language definition is not selected
// in the source wordfile, but it was inserted in the temp file. So it must
// be removed here before copying and pasting the sorted language definition
// over the still existing selection in the source wordfile.
// 2006-11-19: Before copying whole sorted language definition copy
// the language definition only to clipboard 8 - see below why.
// 2006-12-05: In the temp file the language definition line does
// not start with a slash any more after deleting first character.
// 2007-08-02: It's possible under certain conditions that UltraEdit does not
// recognize that the caret is already at top of the file after deleting the
// slash and before executing the regular expression find. As workaround a
// space is inserted at top of the file, then the slash is deleted and last
// the inserted space too. That helps UltraEdit to synchronize and the Find
// then really always finds the language number and language name.

Top
" "
Key DEL
Key BACKSPACE
Find MatchCase RegExp "%L[1-9][0-9]++"*""
Copy
Clipboard 9
SelectAll
Copy
CloseFile NoSave

// 2006-11-19: Depending on the version of UltraEdit/UEStudio, the config
// option "Move to nearest left tab after current tab is closed" and the
// file tab order, either the wordfile with the still selected unsorted
// language definition or a different file has the focus now. So search
// in a loop for a file which has a selection that contains the language
// definition /Lxx"name" which was sorted before. Only then can the
// sorted language definition be pasted over the unsorted version.
Clipboard 8
Loop
IfSel
Find MatchCase "^c"
Replace All SelectText "^c"
IfFound
ExitLoop
EndIf
EndIf
NextWindow
EndLoop
ClearClipboard
Clipboard 9
Paste

// Clean up line breaks at end of file if necessary, save wordfile
// and set caret to the beginning of the sorted language definition.
// Inserting a space and deleting it immediately is necessary here
// because of a bug of UE (never finds a non-whitespace character here).
" "
Key BACKSPACE
Find RegExp "[~ ^t^p]"
IfNotFound
Find RegExp Up "[~ ^t^p]"
EndSelect
Key RIGHT ARROW
SelectToBottom
Delete
EndIf
ClearClipboard
Clipboard 0

// The following Save command enables auto-saving of the modified
// wordfile after sorting. By default the SortLanguage macro does not
// contain this Save command. Remove it if auto-saving is not wanted.
Save
Find MatchCase RegExp Up "%/L[1-9][0-9]++*File "

// 2006-02-19: For v10.xx of UltraEdit the current find selection
// must be ended here or the '/' will be selected after Key HOME.
EndSelect
Key HOME

// 2006-04-17: Insert macro command UnixReOn or PerlReOn here at the end
// of the macro if you prefer UNIX or Perl compatible regular expressions.