Parsing Commented Equates

by Guga/René.



Hello contributor,

First of all, we want to thank you for kindly helping us improve Spasm.
 
 

I - Brief Introduction
 

The objective here is to build the most complete document about equates. As you know, Spasm uses the Win32 equates to assemble an executable. Our goal is to make a complete, commented list of equates to be used not only by the assembler, but also by the disassembler and by a dialog showing all the possible relationships between equates and API functions.

When you are creating an application, you may sometimes find that you don't have all the necessary information about a specific equate. What do you usually have to do then? Check win32.hlp, or search the MSDN site or other sites that contain information on the equate you intend to use.

But why lose your time searching everywhere for information (which, in many cases, is not complete) and having to guess what the equate is used for, and how?

Well, that's why we need your help to build a complete, referenced list.

We have already found about 52,000 equates, listed with their values in hexadecimal (you can check them at http://betov.free.fr/BigEquates.zip ).

For now, we are parsing all the information about these equates.
 
 

II - General Overview of the Final Plan
 

Currently, when you are assembling a project and want some information about an equate, you just right-click on its name and a message box shows up, displaying its hexadecimal value.

For the assembly:

Creating a commented equates list will give you access to all the information about an equate. So, once this huge work is done, when you right-click on an equate or select an item through the menu (something like an ApiViewer; the final visual design and procedure will be settled when the list is complete), a dialog box will pop up displaying all the necessary information, such as: comments; remarks; return values; parameters; equivalent names; other related equates/functions; hexadecimal/binary/decimal values; the groups it belongs to; etc.

For the disassembly:

The functionality and objective are the same as for the assembler, but with improvements, in order to rebuild a full source from the file you are targeting.

When you disassemble a file, it shows something like:

Code0401070: L2:
    push lpdwDisposition_040B2E4
    push phkResult_040A7BD
    push 00
    push 020007
    push 00
    push 00
    push 00
    push lpSubKey_040A33A
    push 080000002
    call 'ADVAPI32.RegCreateKeyExA'

With the use of the commented equates, Spasm will be able to translate this code to:

Code0401070: L2:
    push lpdwDisposition_040B2E4
    push phkResult_040A7BD
    push &NULL
    push &READ_CONTROL__&KEY_QUERY_VALUE__&KEY_SET_VALUE__&KEY_CREATE_SUB_KEY
    push &REG_OPTION_NON_VOLATILE
    push &NULL
    push 00
    push lpSubKey_040A33A
    push &HKEY_LOCAL_MACHINE
    call 'ADVAPI32.RegCreateKeyExA'

or even, with the future HLL parsers:

Code0401070: L2:
    call 'ADVAPI32.RegCreateKeyExA' &HKEY_LOCAL_MACHINE,
                                    lpSubKey_040A33A, &NULL,
                                    &NULL,
                                    &REG_OPTION_NON_VOLATILE,
                                    &READ_CONTROL__&KEY_QUERY_VALUE__&KEY_SET_VALUE__&KEY_CREATE_SUB_KEY,
                                    &NULL,
                                    phkResult_040A7BD,
                                    lpdwDisposition_040B2E4

In order to make that possible, we are also building a complete list of API functions, with all the necessary data, in the same way as for the equates list.
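As a rough illustration of the translation shown above, the following Python sketch splits a pushed constant into named equates. It is not Spasm code: the table `ACCESS_EQUATES` and the helpers `decompose_flags` and `to_spasm` are hypothetical names, the table holds only four equates with their standard Win32 values, and the `__` joiner simply copies the notation of the listings above. Note that 020007 (hexadecimal 20007) contains the READ_CONTROL bit (hexadecimal 20000) in addition to the three KEY_ bits.

```python
# Hypothetical mini-table: a real list would hold every equate that is
# valid for the parameter being translated.
ACCESS_EQUATES = {
    "READ_CONTROL": 0x20000,
    "KEY_QUERY_VALUE": 0x1,
    "KEY_SET_VALUE": 0x2,
    "KEY_CREATE_SUB_KEY": 0x4,
}


def decompose_flags(value, equates):
    """Split a bit mask into equate names.

    Returns (names, leftover), where leftover holds the bits that no
    equate in the table accounts for (0 when the translation is complete).
    """
    names = []
    remaining = value
    for name, bits in equates.items():
        if bits and (remaining & bits) == bits:
            names.append(name)
            remaining &= ~bits
    return names, remaining


def to_spasm(names):
    """Join equate names the way the listings above combine them."""
    return "__".join("&" + name for name in names)


names, leftover = decompose_flags(0x20007, ACCESS_EQUATES)
print(to_spasm(names), hex(leftover))
```

A non-zero leftover would mean that the table is missing an equate for that parameter, so the disassembler should probably keep the raw number in that case rather than show a partial translation.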

So, in the disassembly, if you want to know what "&REG_OPTION_NON_VOLATILE" is related to, you will just right-click on it (or select the menu item that will be added).

And if you want to modify the style, since you already have the equate name, all you will have to do is choose another equate (one referenced by that function, of course).

Many improvements will be made to the Spasm disassembler, in order to try to completely recreate an 'original' source from any application.
 
 

III - Licensing problems
 

In some countries, copying and distributing M$ documents is illegal, especially if you do it as if they were yours.

Searching for or storing that kind of information is not forbidden, since M$ distributes it for free on the internet. What is illegal is failing to make any reference to the origin of those documents, and distributing them as if you had built them yourself.

So, to avoid future problems, particularly when we have to provide a SpAsm package for ReactOS distributions, it is absolutely necessary that we make a reference to M$ or, much better, that we 'translate' the comments of the equates to be parsed.

What we call 'translating' here is simply rewriting an M$ sentence in another formulation, so that nobody can say that this or that sentence is a copy/paste from an M$ document. This may seem ridiculous to you at present, as M$ is not going to take legal action against the poor little guys we are for such an innocent act. But as soon as ReactOS gains some public visibility, M$ will have no choice other than legal attacks, even knowing that these attacks have no chance of success: they will not die without fighting. So, putting our neck under the blade is not the way to go.

As you will see below, when you parse a comment for a given equate, you can do one of two things:

1 - Insert a "$" sign before each comment that is available, which means that it is a literal copy of a Microsoft document.

Or (better to do it now, while you have it on your desk),

2 - Simply 'translate' the comment, which will then no longer require the "$" mark, as it will not be a literal copy but a document created by yourself, based on an M$ document (or any other source of information you may have).

For example, if you just want to copy and paste a comment from M$, all you have to do is the following (the table format used here is explained in the next section):
 
 
BN_CLICKED $The BN_CLICKED notification message is sent when the user clicks a button. The parent window of the button receives this notification message through the WM_COMMAND message. Unlike the other button notification messages, this message is intended for applications written for any version of Windows.

BN_CLICKED

idButton = (int) LOWORD(wParam); // identifier of button

hwndButton = (HWND) lParam; // handle of button

$ $ $Remarks

A disabled button does not send a BN_CLICKED notification message to its parent window.

$See Also

WM_COMMAND

Or, if you want to make your own translation, it can be done like:
 
 
BN_CLICKED This notification message is sent when the user clicks a button. The parent window of the button receives it through the WM_COMMAND message.

This equate is different from the other button notification messages, because it is intended for applications written for any version of Windows.

This equate has the following organisation:

idButton = (int) LOWORD(wParam); // identifier of button

hwndButton = (HWND) lParam; // handle of button

    Remarks

If the button is disabled, it does not send a BN_CLICKED notification message to its parent window.

See Also

WM_COMMAND


 

IV - Parsing Instructions
 

The job is quite easy to do. (I assume you have WinWord and know how to build tables. If not, I can tell you how; it's easy. Just copy and paste the tables that are already in the .doc file, then empty the data you just pasted, as you need to insert the new data in its place.)

I will send you a file (for example, 01.doc) with several equates and their data. All the data needs to be inserted into separate tables (rows).

All you have to do is compare the file I sent you with another one that is on the Spasm site (BigComments.zip >>> BigComments.txt).
 

So you will have two files:

01.doc - The file you have to change the data inside.

BigComments.txt - The source file where you will search for the data.
 

The file "01.doc" is already divided into tables for you. The first three rows are filled with correct and/or incorrect data, and the rest are blank.

All you need to do is check the data in 01.doc, and fix or complete it with the data inside BigComments.txt.
 

Here comes an example:

In 01.doc, you may have an equate:
 
 
ASN_INTEGER Indicates an integer variable. %ld(long)      

Now we need to fix it, and check in BigComments.txt whether there is more information to insert in the rows.

The table rows are displayed in this order:

EQUATE NAME | Comments | Parameters (Comments) | Return Values | Remarks | See Also

For this example, after checking the data in BigComments.txt, you will see that the needed information is shown as:
 

AsnAny

The AsnAny structure contains an SNMP variable type and value. This structure is a member of the RFC1157VarBind structure that is used as a parameter in many of the SNMP functions.

(...)

Value

Meaning

Printed as

ASN_INTEGER ----------------> Our Equate

Indicates an integer variable. ----------------> The comments needed

%ld(long) ----------------> Not used, it is only a parameter for this structure

ASN_OCTETSTRING

Indicates an octet string variable.

putchar <oct>

(...)
 

You can see that in this source file (BigComments.txt), the only information about the equate is a simple comment. There are no remarks, return values, or any other comments about that equate.

So, the table on 01.doc must be changed to:
 
ASN_INTEGER $Indicates an integer variable. $ $ $ $
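To make the row layout and the "$" convention concrete, here is a small Python sketch. It assumes a row has already been split into its six cells, in the order listed earlier; the class `EquateRow` and the helpers `is_literal_copy` and `cells_to_translate` are hypothetical names used only for illustration, not part of Spasm.

```python
from dataclasses import dataclass, fields


@dataclass
class EquateRow:
    """One table row, with the cells in the order used by 01.doc."""
    name: str
    comments: str = ""
    parameters: str = ""
    return_values: str = ""
    remarks: str = ""
    see_also: str = ""


def is_literal_copy(cell):
    """A cell starting with "$" is a literal copy of an M$ document."""
    return cell.lstrip().startswith("$")


def cells_to_translate(row):
    """Names of the cells that hold copied text and still need rewording."""
    pending = []
    for field in fields(row):
        if field.name == "name":
            continue
        cell = getattr(row, field.name)
        # A lone "$" marks an empty cell: there is nothing to translate.
        if is_literal_copy(cell) and cell.strip() != "$":
            pending.append(field.name)
    return pending


row = EquateRow("ASN_INTEGER", "$Indicates an integer variable.", "$", "$", "$", "$")
print(cells_to_translate(row))
```

For the ASN_INTEGER row, only the comments cell carries copied text, so it is the only one reported as still needing a 'translation'.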

 

The next example shows what to do when you find complete information about an equate.

In 01.doc you are looking for the equates "BN_CLICKED" and "BN_DBLCLK". In "BigComments.txt" you will see the information displayed like this:

BN_CLICKED

The BN_CLICKED notification message is sent when the user clicks a button. The parent window of the button receives this notification message through the WM_COMMAND message. Unlike the other button notification messages, this message is intended for applications written for any version of Windows.

BN_CLICKED

idButton = (int) LOWORD(wParam); // identifier of button

hwndButton = (HWND) lParam; // handle of button

Remarks

A disabled button does not send a BN_CLICKED notification message to its parent window.

See Also

WM_COMMAND

BN_DBLCLK

The BN_DBLCLK notification message is sent when the user double-clicks a button that has the BS_OWNERDRAW or BS_RADIOBUTTON style. The parent window of the button receives this notification message through a WM_COMMAND message.

BN_DBLCLK

idButton = (int) LOWORD(wParam); // identifier of button

hwndButton = (HWND) lParam; // handle of button

BN_DBLCLK is the same as the BN_DOUBLECLICKED notification message.

This notification is provided for compatibility with applications written for versions of Windows earlier than version 3.0. New applications should use the BS_OWNERDRAW button style and the DRAWITEMSTRUCT structure for this task.

See Also

BN_CLICKED, BN_DOUBLECLICKED, WM_COMMAND
 
 

Every time you can't find a specific piece of data (parameter comments, for example), you just leave the row empty.

So it will become ("$" means copied from M$, remember?):
 
 
BN_CLICKED $The BN_CLICKED notification message is sent when the user clicks a button. The parent window of the button receives this notification message through the WM_COMMAND message. Unlike the other button notification messages, this message is intended for applications written for any version of Windows.

BN_CLICKED

idButton = (int) LOWORD(wParam); // identifier of button

hwndButton = (HWND) lParam; // handle of button

$ $ $Remarks

A disabled button does not send a BN_CLICKED notification message to its parent window.

$See Also

WM_COMMAND

BN_DBLCLK $The BN_DBLCLK notification message is sent when the user double-clicks a button that has the BS_OWNERDRAW or BS_RADIOBUTTON style. The parent window of the button receives this notification message through a WM_COMMAND message.

BN_DBLCLK

idButton = (int) LOWORD(wParam); // identifier of button

hwndButton = (HWND) lParam; // handle of button

BN_DBLCLK is the same as the BN_DOUBLECLICKED notification message.

This notification is provided for compatibility with applications written for versions of Windows earlier than version 3.0. New applications should use the BS_OWNERDRAW button style and the DRAWITEMSTRUCT structure for this task.

$ $ $ $See Also

BN_CLICKED, BN_DOUBLECLICKED, WM_COMMAND


 

Or, if you have, for example:

CB_FINDSTRING

An application sends a CB_FINDSTRING message to search the list box of a combo box for an item beginning with the characters in a specified string.

CB_FINDSTRING

wParam = (WPARAM) indexStart; // item before start of search

lParam = (LPARAM) (LPCSTR) lpszFind // prefix string address

Parameters

indexStart

Value of wParam. Specifies the zero-based index of the item preceding the first item to be searched. When the search reaches the bottom of the list box, it continues from the top of the list box back to the item specified by the indexStart parameter. If indexStart is -1, the entire list box is searched from the beginning.

lpszFind

Value of lParam. Points to the null-terminated string that contains the prefix to search for. The search is not case sensitive, so this string can contain any combination of uppercase and lowercase letters.

Return Values

The return value is the zero-based index of the matching item. If the search is unsuccessful, it is CB_ERR.

Remarks

If you create the combo box with an owner-drawn style but without the CBS_HASSTRINGS style, what the CB_FINDSTRING message does depends on whether your application uses the CBS_SORT style. If you use the CBS_SORT style, WM_COMPAREITEM messages are sent to the owner of the combo box to determine which item matches the specified string. If you do not use the CBS_SORT style, the CB_FINDSTRING message searches for a list item that matches the value of the lpszFind parameter.

See Also

CB_FINDSTRINGEXACT, CB_SELECTSTRING, CB_SETCURSEL, WM_COMPAREITEM
 

It will become:
 
CB_FINDSTRING $An application sends a CB_FINDSTRING message to search the list box of a combo box for an item beginning with the characters in a specified string.

CB_FINDSTRING

wParam = (WPARAM) indexStart; // item before start of search

lParam = (LPARAM) (LPCSTR) lpszFind // prefix string address

$Parameters

indexStart

Value of wParam. Specifies the zero-based index of the item preceding the first item to be searched. When the search reaches the bottom of the list box, it continues from the top of the list box back to the item specified by the indexStart parameter. If indexStart is -1, the entire list box is searched from the beginning.

lpszFind

Value of lParam. Points to the null-terminated string that contains the prefix to search for. The search is not case sensitive, so this string can contain any combination of uppercase and lowercase letters.

$Return Values

The return value is the zero-based index of the matching item. If the search is unsuccessful, it is CB_ERR.

$Remarks

If you create the combo box with an owner-drawn style but without the CBS_HASSTRINGS style, what the CB_FINDSTRING message does depends on whether your application uses the CBS_SORT style. If you use the CBS_SORT style, WM_COMPAREITEM messages are sent to the owner of the combo box to determine which item matches the specified string. If you do not use the CBS_SORT style, the CB_FINDSTRING message searches for a list item that matches the value of the lpszFind parameter.

$See Also

CB_FINDSTRINGEXACT, CB_SELECTSTRING, CB_SETCURSEL, WM_COMPAREITEM
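The transformation shown in the examples above can be sketched in Python as follows. This is only an illustration of the manual procedure, under several assumptions: the function name `entry_to_row` is hypothetical, the entry is given as the list of text lines that follow the equate's title line in BigComments.txt, section headings appear alone on their line, and empty cells are marked with a lone "$" as in the sample rows.

```python
SECTION_HEADINGS = ("Parameters", "Return Values", "Remarks", "See Also")


def entry_to_row(name, lines):
    """Turn one BigComments.txt entry into the six cells of a table row.

    Text found before the first heading goes to the Comments cell. Each
    heading starts a new cell and, as in the sample rows, the heading
    itself is kept as the first line of its cell.
    """
    cells = {"Comments": []}
    for heading in SECTION_HEADINGS:
        cells[heading] = []

    current = "Comments"
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines only separate paragraphs
        if text in SECTION_HEADINGS:
            current = text
        cells[current].append(text)

    row = [name]
    for key in ("Comments",) + SECTION_HEADINGS:
        # "$" marks the cell as a literal copy of the source document.
        row.append("$" + "\n".join(cells[key]))
    return row


row = entry_to_row("BN_CLICKED", [
    "The BN_CLICKED notification message is sent when the user clicks a button.",
    "",
    "Remarks",
    "",
    "A disabled button does not send a BN_CLICKED notification message to its parent window.",
    "",
    "See Also",
    "",
    "WM_COMMAND",
])
for cell in row:
    print(repr(cell))
```

The sketch only automates the literal-copy case; a 'translated' row still has to be reworded by hand, after which the "$" is removed from the cells concerned.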

Another thing.

You may find that certain data (such as remarks, parameters, or return values) contains comments on other equates. In that case, in order to be more accurate about the number of equates to be parsed, we need to list them as if they were new equates to parse. For example:
 

CB_DIR

An application sends a CB_DIR message to add a list of filenames to the list box of a combo box.

CB_DIR

wParam = (WPARAM) (UINT) uAttrs; // file attributes

lParam = (LPARAM) (LPCTSTR) lpszFileSpec; // address of filename

Parameters

uAttrs

Value of wParam. Specifies the attributes of the files to be added to the list box. It can be any combination of the following values:

Value

Meaning

DDL_ARCHIVE ----------------> See it? This shows an equate, with its comments, inside this equate.

Includes archived files.

DDL_DIRECTORY

Includes subdirectories. Subdirectory names are enclosed in square brackets ([ ]).

DDL_DRIVES

Includes drives. Drives are listed in the form [-x-], where x is the drive letter.

DDL_EXCLUSIVE

Includes only files with the specified attributes. By default, read-write files are listed even if DDL_READWRITE is not specified.

DDL_HIDDEN

Includes hidden files.

DDL_READONLY

Includes read-only files.

DDL_READWRITE

Includes read-write files with no additional attributes.

DDL_SYSTEM

Includes system files.

lpszFileSpec ----------------> This is not an equate, it's just a parameter. So you will not parse it as a new one.

Value of lParam. Points to the null-terminated string that specifies the filename to add to the list. If the filename contains any wildcards (for example, *.*), all files that match and have the attributes specified by the uAttrs parameter are added to the list.

Return Values

The return value is the zero-based index of the last filename added to the list. If an error occurs, the return value is CB_ERR. If insufficient space is available to store the new strings, it is CB_ERRSPACE.

See Also

CB_ADDSTRING, CB_INSERTSTRING, DlgDirList, DlgDirListComboBox
 
 

See? DDL_ARCHIVE appears inside this equate (CB_DIR), which gives some necessary information about it. So we can parse it with just those comments, and if we find it again in some other part of the file, please parse it as many times as you find it, with all the information you see (even if it sometimes looks different).

This is because, at the end, we will sort the equates in alphabetical order, check for duplicated equates, and see whether some comments, parameters, etc. are different. If the duplicated equates carry the same information, we just need to delete one of them. If they carry different information, we need to check it in order to merge them into a single entry. (This checking will be done later.)
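The later duplicate check described above might look like the following Python sketch. It is an assumption about how that step could be done, not the actual tool: `merge_duplicates` is a hypothetical name, and each parsed row is reduced here to a (name, information) pair.

```python
from collections import defaultdict


def merge_duplicates(rows):
    """Sort parsed equates and separate clean entries from conflicts.

    rows is a list of (name, info) pairs, possibly with repeated names.
    Returns (merged, conflicts):
      merged    maps each name with a single distinct info to that info,
                in alphabetical order;
      conflicts maps each name with differing infos to the list of
                variants, to be reconciled by hand.
    """
    variants_by_name = defaultdict(list)
    for name, info in rows:
        # An exact duplicate adds nothing, so only one copy is kept.
        if info not in variants_by_name[name]:
            variants_by_name[name].append(info)

    merged = {}
    conflicts = {}
    for name in sorted(variants_by_name):
        variants = variants_by_name[name]
        if len(variants) == 1:
            merged[name] = variants[0]
        else:
            conflicts[name] = variants
    return merged, conflicts


rows = [
    ("DDL_HIDDEN", "Includes hidden files."),
    ("DDL_ARCHIVE", "Includes archived files."),
    ("DDL_ARCHIVE", "Includes archived files."),
    ("DDL_HIDDEN", "Hidden files are included too."),
]
merged, conflicts = merge_duplicates(rows)
print(merged)
print(conflicts)
```

Here DDL_ARCHIVE collapses to a single entry, while DDL_HIDDEN is reported as a conflict because its two descriptions differ. This is why it is safe to parse the same equate every time you meet it.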
 
 

So, the CB_DIR example above will become:
 
CB_DIR $An application sends a CB_DIR message to add a list of filenames to the list box of a combo box.

CB_DIR

wParam = (WPARAM) (UINT) uAttrs; // file attributes

lParam = (LPARAM) (LPCTSTR) lpszFileSpec; // address of filename

$Parameters

uAttrs

Value of wParam. Specifies the attributes of the files to be added to the list box. It can be any combination of the following values:

Value

Meaning

DDL_ARCHIVE

Includes archived files.

DDL_DIRECTORY

Includes subdirectories. Subdirectory names are enclosed in square brackets ([ ]).

DDL_DRIVES

Includes drives. Drives are listed in the form [-x-], where x is the drive letter.

DDL_EXCLUSIVE

Includes only files with the specified attributes. By default, read-write files are listed even if DDL_READWRITE is not specified.

DDL_HIDDEN

Includes hidden files.

DDL_READONLY

Includes read-only files.

DDL_READWRITE

Includes read-write files with no additional attributes.

DDL_SYSTEM

Includes system files.

lpszFileSpec

Value of lParam. Points to the null-terminated string that specifies the filename to add to the list. If the filename contains any wildcards (for example, *.*), all files that match and have the attributes specified by the uAttrs parameter are added to the list.

$Return Values

The return value is the zero-based index of the last filename added to the list. If an error occurs, the return value is CB_ERR. If insufficient space is available to store the new strings, it is CB_ERRSPACE.

$ $See Also

CB_ADDSTRING, CB_INSERTSTRING, DlgDirList, DlgDirListComboBox

DDL_ARCHIVE $Includes archived files.
DDL_DIRECTORY $Includes subdirectories. Subdirectory names are enclosed in square brackets ([ ]).
DDL_DRIVES $Includes drives. Drives are listed in the form [-x-], where x is the drive letter.
DDL_EXCLUSIVE $Includes only files with the specified attributes. By default, read-write files are listed even if DDL_READWRITE is not specified.
DDL_HIDDEN $Includes hidden files.
DDL_READONLY $Includes read-only files.
DDL_READWRITE $Includes read-write files with no additional attributes.
DDL_SYSTEM $Includes system files.
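Picking the nested equates out of a Value/Meaning list, as done for the DDL_ rows above, can be sketched in Python like this. The helper `nested_equate_rows` is hypothetical, and the test for "looks like an equate" (upper-case words joined by underscores) is only a heuristic assumed for this illustration; a parameter such as lpszFileSpec fails it and is skipped.

```python
import re

# Heuristic, assumed for this sketch: equate names are written in
# upper case with at least one underscore (DDL_ARCHIVE, CB_ERR, ...).
EQUATE_NAME = re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+")


def nested_equate_rows(pairs):
    """Build new (name, comments) rows from (value, meaning) pairs.

    Only names that look like equates are kept, and each copied comment
    gets the "$" mark because it is taken literally from the source.
    """
    rows = []
    for name, meaning in pairs:
        if EQUATE_NAME.fullmatch(name):
            rows.append((name, "$" + meaning))
    return rows


pairs = [
    ("DDL_ARCHIVE", "Includes archived files."),
    ("DDL_HIDDEN", "Includes hidden files."),
    ("lpszFileSpec", "Value of lParam."),
]
for name, comments in nested_equate_rows(pairs):
    print(name, comments)
```

A real pass would still need a human check, since some equates may not follow this naming pattern.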

Last Thing (Finally).

You saw in the examples that the rows for comments, remarks, return values, see also, etc., may contain references to other items such as equates, functions, structures, etc. (Those items will be displayed highlighted.)

To display them highlighted, we must insert a special marker before each item concerned.

So, for:

Equates = <$E>

Functions = <$F>

Structures = <$S>

Macros = <$M>

So, for example, they will be:
 
BN_DBLCLK $The BN_DBLCLK notification message is sent when the user double-clicks a button that has the <$E>BS_OWNERDRAW or <$E>BS_RADIOBUTTON style. The parent window of the button receives this notification message through a <$E>WM_COMMAND message.

BN_DBLCLK

idButton = (int) LOWORD(wParam); // identifier of button

hwndButton = (HWND) lParam; // handle of button

BN_DBLCLK is the same as the <$E>BN_DOUBLECLICKED notification message.

This notification is provided for compatibility with applications written for versions of Windows earlier than version 3.0. New applications should use the <$E>BS_OWNERDRAW button style and the <$S>DRAWITEMSTRUCT structure for this task.

$ $ $ $See Also

<$E>BN_CLICKED, <$E>BN_DOUBLECLICKED, <$E>WM_COMMAND
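As an illustration of how a viewer might read these markers back, here is a short Python sketch. The names `find_references` and `strip_markers` are hypothetical; only the four markers listed above are recognised, and each marker is assumed to sit directly before the name it applies to.

```python
import re

KINDS = {"E": "equate", "F": "function", "S": "structure", "M": "macro"}

# A marker such as <$E>, optionally followed by spaces, then the name.
MARKER = re.compile(r"<\$([EFSM])>\s*([A-Za-z_][A-Za-z0-9_]*)")


def find_references(text):
    """Return (kind, name) for every marked item, in order of appearance."""
    return [(KINDS[kind], name) for kind, name in MARKER.findall(text)]


def strip_markers(text):
    """Return the plain text, as it would be shown without highlighting."""
    return re.sub(r"<\$[EFSM]>\s*", "", text)


cell = ("New applications should use the <$E>BS_OWNERDRAW button style "
        "and the <$S>DRAWITEMSTRUCT structure for this task.")
print(find_references(cell))
print(strip_markers(cell))
```

The list returned by `find_references` is what a dialog would need in order to highlight each item and link it to its own entry.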

If you have any doubts, you can email me or post your question on the board (in the appropriate thread).

< mauroteste@hotmail.com >
 

Thanks a lot for any help. Regards,  Guga / René.