Overview
This document describes the ANTLR4 grammar specification for Visual Basic 6.0 used by the ProLeap VB6 Parser. The grammar is derived from the official Visual Basic 6.0 language reference and has been tested with MSDN VB6 statements and several Visual Basic 6.0 code repositories.
Grammar File Location
The complete grammar specification can be found at:
VisualBasic6.g4
Grammar Statistics
- Total Lines: 2,225
- License: MIT License
- Author: Ulrich Wolffgang (proleap.io)
- Source: github.com/uwol/proleap-vb6-parser
Grammar Structure
The ANTLR4 grammar is organized into several major sections that correspond to the structure of Visual Basic 6.0 source files.
Module Rules
The top-level grammar rule defines the structure of a VB6 module (class, form, or standard module):
startRule
: module EOF
;
module
: WS? NEWLINE* (moduleHeader NEWLINE+)?
moduleReferences? NEWLINE*
controlProperties? NEWLINE*
moduleConfig? NEWLINE*
moduleAttributes? NEWLINE*
moduleOptions? NEWLINE*
moduleBody? NEWLINE* WS?
;
Module Components
- moduleHeader: VERSION line (e.g., VERSION 1.0 CLASS)
- moduleReferences: Object/library references
- controlProperties: Form control definitions (for .frm files)
- moduleConfig: BEGIN/END configuration blocks
- moduleAttributes: Attribute statements
- moduleOptions: Option Base, Option Explicit, Option Compare, etc.
- moduleBody: The actual code (functions, subs, declarations)
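To make the ordering of these components concrete, the following minimal Python sketch sorts the lines of a class module into the buckets listed above. It is not part of the grammar or of ProLeap: the `split_module` helper and the sample source are illustrative assumptions, and the sketch deliberately ignores moduleReferences and controlProperties.

```python
# Hypothetical helper (not from the grammar): groups the lines of a .cls
# file by the module component they would fall under. It assumes the
# sections appear in the order given by the module rule.
SAMPLE_CLASS = '''VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END
Attribute VB_Name = "Customer"
Option Explicit

Private mName As String
'''

def split_module(source):
    sections = {name: [] for name in (
        "moduleHeader", "moduleConfig", "moduleAttributes",
        "moduleOptions", "moduleBody")}
    in_config = False  # True while inside a BEGIN ... END block
    in_body = False    # True once the first body line has been seen
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        upper = line.upper()
        if in_body:
            sections["moduleBody"].append(line)
        elif in_config:
            sections["moduleConfig"].append(line)
            in_config = upper != "END"
        elif upper.startswith("VERSION "):
            sections["moduleHeader"].append(line)
        elif upper == "BEGIN":
            sections["moduleConfig"].append(line)
            in_config = True
        elif upper.startswith("ATTRIBUTE "):
            sections["moduleAttributes"].append(line)
        elif upper.startswith("OPTION "):
            sections["moduleOptions"].append(line)
        else:
            # Anything else starts the body; later lines stay there.
            in_body = True
            sections["moduleBody"].append(line)
    return sections

if __name__ == "__main__":
    for name, lines in split_module(SAMPLE_CLASS).items():
        print(name, lines)
```

A real parser would reject out-of-order sections rather than silently treating them as body lines; the sketch only mirrors the happy path of the module rule.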
Control Properties
Form files (.frm) contain control property definitions that are parsed using specialized rules:
controlProperties
: WS? BEGIN WS cp_ControlType WS cp_ControlIdentifier WS? NEWLINE+
cp_Properties+
END NEWLINE*
;
cp_SingleProperty
: WS? implicitCallStmt_InStmt WS? EQ WS? '$'?
cp_PropertyValue FRX_OFFSET? NEWLINE+
;
cp_NestedProperty
: WS? BEGINPROPERTY WS ambiguousIdentifier
(LPAREN INTEGERLITERAL RPAREN)? (WS GUID)? NEWLINE+
(cp_Properties+)?
ENDPROPERTY NEWLINE+
;
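The nesting that controlProperties describes can be illustrated with a small stack-based reader. This is a sketch under stated assumptions, not the grammar's behavior: `parse_controls` and the sample form are hypothetical, property values are kept as raw strings, and BeginProperty/EndProperty blocks are not modeled.

```python
# Hypothetical sketch: turns the Begin/End nesting of a .frm file into a
# tree of dictionaries. BeginProperty blocks are not handled.
SAMPLE_FORM = '''Begin VB.Form Form1
   Caption         =   "Demo"
   Begin VB.CommandButton Command1
      Caption         =   "OK"
      Height          =   375
   End
End
'''

def parse_controls(lines):
    root = {"type": None, "name": None, "properties": {}, "children": []}
    stack = [root]  # innermost open control is stack[-1]
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        parts = line.split()
        if parts[0].upper() == "BEGIN" and len(parts) == 3:
            # "Begin <control type> <control identifier>"
            node = {"type": parts[1], "name": parts[2],
                    "properties": {}, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif line.upper() == "END":
            stack.pop()
        elif "=" in line:
            key, _, value = line.partition("=")
            stack[-1]["properties"][key.strip()] = value.strip()
    return root["children"]

if __name__ == "__main__":
    form = parse_controls(SAMPLE_FORM.splitlines())[0]
    print(form["type"], form["name"], form["properties"])
    print(form["children"][0]["name"], form["children"][0]["properties"])
```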
Block Statements
The blockStmt rule enumerates the VB6 statements that can appear inside code blocks. Representative alternatives, grouped by purpose:
Control Flow
- doLoopStmt
- forEachStmt
- forNextStmt
- ifThenElseStmt
- selectCaseStmt
- whileWendStmt
- withStmt
File I/O
- closeStmt
- getStmt
- inputStmt
- lineInputStmt
- openStmt
- printStmt
- putStmt
- writeStmt
File System
- chDirStmt
- chDriveStmt
- filecopyStmt
- killStmt
- mkdirStmt
- nameStmt
- rmdirStmt
Error Handling
- errorStmt
- onErrorStmt
- resumeStmt
Declarations
The moduleBodyElement rule lists the declarations and definitions that VB6 allows at the module level:
moduleBodyElement
: moduleBlock
| moduleOption
| declareStmt // External API declarations
| enumerationStmt // Enum definitions
| eventStmt // Event declarations
| functionStmt // Function definitions
| propertyGetStmt // Property Get
| propertySetStmt // Property Set
| propertyLetStmt // Property Let
| subStmt // Subroutine definitions
| typeStmt // User-defined types
| macroIfThenElseStmt // Conditional compilation
;
Statement Types
Control Flow Statements
If-Then-Else
ifThenElseStmt
: IF WS ifConditionStmt WS THEN WS blockStmt
(WS ELSE WS blockStmt)? // Single-line form
| ifBlockStmt ifElseIfBlockStmt* ifElseBlockStmt?
END_IF // Block form
;
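The two alternatives differ in what follows THEN: a statement on the same line selects the single-line form, while a line break selects the block form that must be closed by End If. A rough Python sketch of that distinction follows. The `if_form` helper is hypothetical, and it assumes the condition contains no string literal with the word "Then" in it.

```python
import re

# Matches "If <condition> Then <rest of line>"; the rest decides the form.
IF_RE = re.compile(r"\s*If\b.*?\bThen\b(.*)$", re.IGNORECASE)

def if_form(line):
    """Return "single-line", "block", or None if the line is not an If."""
    m = IF_RE.match(line)
    if m is None:
        return None
    rest = m.group(1).strip()
    # Nothing (or only a comment) after Then means the block form.
    if rest == "" or rest.startswith("'"):
        return "block"
    return "single-line"

if __name__ == "__main__":
    for sample in ("If x > 0 Then y = 1 Else y = 2",
                   "If x > 0 Then",
                   "If x > 0 Then ' positive"):
        print(repr(sample), "->", if_form(sample))
```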
Select Case
selectCaseStmt
: SELECT WS CASE WS valueStmt NEWLINE+
sC_Case*
END_SELECT
;
sC_Case
: CASE WS sC_Cond NEWLINE+ (block NEWLINE+)?
;
sC_Cond
: ELSE // Case Else
| sC_CondExpr (WS? COMMA WS? sC_CondExpr)*
;
Loops
// Do...Loop variants
doLoopStmt
: DO NEWLINE+ (block NEWLINE+)? LOOP
| DO WS (WHILE | UNTIL) WS valueStmt NEWLINE+
(block NEWLINE+)? LOOP
| DO NEWLINE+ (block NEWLINE+)
LOOP WS (WHILE | UNTIL) WS valueStmt
;
// For...Next
forNextStmt
: FOR WS iCS_S_VariableOrProcedureCall typeHint?
(WS asTypeClause)? WS? EQ WS? valueStmt
WS TO WS valueStmt (WS STEP WS valueStmt)? NEWLINE+
(block NEWLINE+)?
NEXT (WS ambiguousIdentifier typeHint?)?
;
// For Each...Next
forEachStmt
: FOR WS EACH WS ambiguousIdentifier typeHint?
WS IN WS valueStmt NEWLINE+
(block NEWLINE+)?
NEXT (WS ambiguousIdentifier)?
;
File Operations
Open Statement
openStmt
: OPEN WS valueStmt WS FOR WS
(APPEND | BINARY | INPUT | OUTPUT | RANDOM)
(WS ACCESS WS (READ | WRITE | READ_WRITE))?
(WS (SHARED | LOCK_READ | LOCK_WRITE | LOCK_READ_WRITE))?
WS AS WS valueStmt
(WS LEN WS? EQ WS? valueStmt)?
;
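As an illustration of the clause order in openStmt, here is a Python regular expression that mirrors the rule for simple statements. It is only a sketch: `OPEN_RE` and `parse_open` are hypothetical names, the pattern accepts plain identifiers and numbers where the grammar accepts any valueStmt, and a path containing the text " For " would confuse it.

```python
import re

# Clause order follows openStmt: path, mode, optional access,
# optional lock, file number, optional record length.
OPEN_RE = re.compile(
    r"""^\s*Open\s+(?P<path>.+?)\s+For\s+
        (?P<mode>Append|Binary|Input|Output|Random)
        (?:\s+Access\s+(?P<access>Read\s+Write|Read|Write))?
        (?:\s+(?P<lock>Shared|Lock\s+Read\s+Write|Lock\s+Read|Lock\s+Write))?
        \s+As\s+(?P<filenumber>\#?\w+)
        (?:\s+Len\s*=\s*(?P<reclen>\w+))?\s*$""",
    re.IGNORECASE | re.VERBOSE,
)

def parse_open(statement):
    """Return the clauses of an Open statement as a dict, or None."""
    m = OPEN_RE.match(statement)
    if m is None:
        return None
    # Collapse runs of whitespace so "Lock   Write" becomes "Lock Write".
    return {key: (" ".join(value.split()) if value else None)
            for key, value in m.groupdict().items()}

if __name__ == "__main__":
    print(parse_open(
        'Open "data.txt" For Input Access Read Lock Write As #1 Len = 128'))
    print(parse_open("Open logPath For Append As fileNum"))
```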
Variable Operations
Variable Declaration
variableStmt
: (DIM | STATIC | visibility) WS
(WITHEVENTS WS)? variableListStmt
;
variableSubStmt
: ambiguousIdentifier typeHint?
(WS? LPAREN WS? (subscripts WS?)? RPAREN WS?)?
(WS asTypeClause)?
;
Let/Set Statements
letStmt
: (LET WS)? implicitCallStmt_InStmt WS?
(EQ | PLUS_EQ | MINUS_EQ) WS? valueStmt
;
setStmt
: SET WS implicitCallStmt_InStmt WS? EQ WS? valueStmt
;
Error Handling
onErrorStmt
: (ON_ERROR | ON_LOCAL_ERROR) WS
(GOTO WS valueStmt COLON? | RESUME WS NEXT)
;
errorStmt
: ERROR WS valueStmt
;
resumeStmt
: RESUME (WS (NEXT | ambiguousIdentifier))?
;
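The onErrorStmt alternatives correspond to three behaviors in VB6: jump to a handler label, continue with the next statement, or (with GoTo 0) switch the active handler off. The Python sketch below classifies a statement accordingly. `classify_on_error` is a hypothetical helper, and it only accepts a bare label or number as the GoTo target, whereas the grammar accepts any valueStmt.

```python
import re

ON_ERROR_RE = re.compile(
    r"^\s*On\s+(?:Local\s+)?Error\s+"
    r"(?:GoTo\s+(?P<target>\w+)|(?P<resume>Resume\s+Next))\s*$",
    re.IGNORECASE,
)

def classify_on_error(statement):
    """Return (kind, target) for an On Error statement, or None."""
    m = ON_ERROR_RE.match(statement)
    if m is None:
        return None
    if m.group("resume"):
        return ("resume-next", None)
    target = m.group("target")
    if target == "0":
        # "On Error GoTo 0" disables the current error handler.
        return ("disable", None)
    return ("goto", target)

if __name__ == "__main__":
    for sample in ("On Error GoTo ErrHandler",
                   "On Error Resume Next",
                   "On Error GoTo 0"):
        print(repr(sample), "->", classify_on_error(sample))
```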
Expressions
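The valueStmt rule covers literals, calls, and unary and binary operators. The rule is left-recursive, and ANTLR4 resolves ambiguity between left-recursive alternatives in favor of the one listed first, so the operator alternatives run from tightest-binding to loosest: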
valueStmt
: literal // Literals
| implicitCallStmt_InStmt // Function/variable references
| LPAREN WS? valueStmt WS? RPAREN // Parenthesized expressions
| NEW WS valueStmt // Object instantiation
| valueStmt WS? POW WS? valueStmt // Exponentiation
| MINUS WS? valueStmt // Unary minus
| PLUS WS? valueStmt // Unary plus
| valueStmt WS? MULT WS? valueStmt // Multiplication
| valueStmt WS? DIV WS? valueStmt // Division
| valueStmt WS? INTDIV WS? valueStmt // Integer division
| valueStmt WS? MOD WS? valueStmt // Modulo
| valueStmt WS? PLUS WS? valueStmt // Addition
| valueStmt WS? MINUS WS? valueStmt // Subtraction
| valueStmt WS? AMPERSAND WS? valueStmt // String concatenation
| valueStmt WS? EQ WS? valueStmt // Equality
| valueStmt WS? NEQ WS? valueStmt // Inequality
| valueStmt WS? LT WS? valueStmt // Less than
| valueStmt WS? GT WS? valueStmt // Greater than
| valueStmt WS? LEQ WS? valueStmt // Less than or equal
| valueStmt WS? GEQ WS? valueStmt // Greater than or equal
| valueStmt WS? LIKE WS? valueStmt // Pattern matching
| valueStmt WS? IS WS? valueStmt // Object comparison
| NOT WS? valueStmt // Logical NOT
| valueStmt WS? AND WS? valueStmt // Logical AND
| valueStmt WS? OR WS? valueStmt // Logical OR
| valueStmt WS? XOR WS? valueStmt // Logical XOR
| valueStmt WS? EQV WS? valueStmt // Logical equivalence
| valueStmt WS? IMP WS? valueStmt // Logical implication
;
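To show how such a precedence ordering shapes the resulting tree, here is a small precedence-climbing parser in Python that prints an expression with explicit parentheses. It is a sketch, not a port of the grammar: the names are hypothetical, the binding strengths follow the VB6 language reference's operator precedence table (where, for example, * and / share one level), and only numbers, identifiers, and parentheses are supported as operands.

```python
import re

# Binding strength per operator; higher binds tighter. Assumption: taken
# from the VB6 operator precedence table, not read off the grammar file.
BINARY = {
    "^": 14, "*": 12, "/": 12, "\\": 11, "MOD": 10, "+": 9, "-": 9, "&": 8,
    "=": 7, "<>": 7, "<": 7, ">": 7, "<=": 7, ">=": 7, "LIKE": 7, "IS": 7,
    "AND": 5, "OR": 4, "XOR": 3, "EQV": 2, "IMP": 1,
}
PREFIX = {"-": 13, "+": 13, "NOT": 6}

TOKEN = re.compile(
    r"\s*(<>|<=|>=|[-+*/\\^&=<>()]|\d+(?:\.\d+)?|[A-Za-z_]\w*)")

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at column {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(tokens, min_prec=0):
    """Consume tokens and return a fully parenthesized string."""
    tok = tokens.pop(0)
    key = tok.upper()
    if tok == "(":
        left = parse(tokens, 0)
        tokens.pop(0)  # discard the closing ")"
    elif key in PREFIX:
        # The operand may only contain operators that bind tighter.
        operand = parse(tokens, PREFIX[key])
        left = f"({tok}{' ' if key == 'NOT' else ''}{operand})"
    else:
        left = tok
    while (tokens and tokens[0].upper() in BINARY
           and BINARY[tokens[0].upper()] >= min_prec):
        op = tokens.pop(0)
        # The +1 makes every binary operator left-associative.
        right = parse(tokens, BINARY[op.upper()] + 1)
        left = f"({left} {op} {right})"
    return left

def group(expression):
    return parse(tokenize(expression))

if __name__ == "__main__":
    for sample in ("-2 ^ 2", "1 + 2 * 3", "Not a = b And c"):
        print(sample, "=>", group(sample))
```

Because unary minus sits below exponentiation, `-2 ^ 2` groups as `-(2 ^ 2)`, which matches the position of the unary alternatives after POW in the rule.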
Lexer Rules
The grammar defines lexer rules for VB6 tokens including keywords, operators, and literals.
Keywords
The grammar recognizes all VB6 keywords including data types, control flow keywords, file operation keywords, and visibility modifiers:
DIM, PUBLIC, PRIVATE, STATIC, CONST, IF, THEN, ELSE, ELSEIF, END, FOR, NEXT, DO, LOOP, WHILE, UNTIL, SELECT, CASE, FUNCTION, SUB, PROPERTY, GET, SET, LET
Literals
literal
: COLORLITERAL // Color literals (&H00FF00&)
| DATELITERAL // Date literals (#1/1/2000#)
| DOUBLELITERAL // Double-precision floats
| FILENUMBER // File numbers (#1)
| INTEGERLITERAL // Integers
| STRINGLITERAL // String literals
| TRUE // Boolean True
| FALSE // Boolean False
| NOTHING // Nothing keyword
| NULL // Null keyword
;
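The literal forms can be told apart by their leading characters, which the following Python sketch demonstrates. The regular expressions are simplified assumptions written for this example, not the lexer rules from VisualBasic6.g4, and `classify_literal` is a hypothetical helper.

```python
import re

# Simplified, assumed patterns; order matters because a date literal and
# a file number both start with "#".
LITERAL_PATTERNS = [
    ("COLORLITERAL", r"&H[0-9A-Fa-f]+&?"),
    ("DATELITERAL", r"#[^#\r\n]+#"),
    ("FILENUMBER", r"#\w+"),
    ("DOUBLELITERAL", r"\d*\.\d+(?:[eE][+-]?\d+)?"),
    ("INTEGERLITERAL", r"\d+"),
    ("STRINGLITERAL", r'"(?:[^"]|"")*"'),  # "" is an escaped quote
]
KEYWORD_LITERALS = {"TRUE", "FALSE", "NOTHING", "NULL"}

def classify_literal(text):
    """Return the name of the literal alternative that text matches."""
    if text.upper() in KEYWORD_LITERALS:
        return text.upper()
    for name, pattern in LITERAL_PATTERNS:
        if re.fullmatch(pattern, text):
            return name
    return None

if __name__ == "__main__":
    for sample in ("&H00FF00&", "#1/1/2000#", "#1", "3.14", "42",
                   '"say ""hi"""', "Nothing"):
        print(sample, "->", classify_literal(sample))
```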
Type Hints
VB6 supports single-character type declaration suffixes:
- %: Integer
- &: Long
- !: Single
- #: Double
- @: Currency
- $: String
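A consumer of the parse tree typically needs to separate the suffix from the identifier and map it to a type name. A minimal Python sketch of that lookup follows; `TYPE_HINTS` and `split_type_hint` are illustrative names, not part of the grammar.

```python
# Suffix-to-type table, as listed above.
TYPE_HINTS = {
    "%": "Integer",
    "&": "Long",
    "!": "Single",
    "#": "Double",
    "@": "Currency",
    "$": "String",
}

def split_type_hint(identifier):
    """Split "count%" into ("count", "Integer"); no suffix gives None."""
    if identifier and identifier[-1] in TYPE_HINTS:
        return identifier[:-1], TYPE_HINTS[identifier[-1]]
    return identifier, None

if __name__ == "__main__":
    for name in ("count%", "name$", "total"):
        print(name, "->", split_type_hint(name))
```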
Usage Notes
Differences from VB6Parse Implementation
| Aspect | ANTLR4 Grammar | VB6Parse |
|---|---|---|
| Parser Generator | ANTLR4 (Java-based) | Custom hand-written parser (Rust) |
| Parse Tree | AST (Abstract Syntax Tree) | CST (Concrete Syntax Tree) via rowan |
| Whitespace | Explicit WS tokens in grammar | Preserved in CST automatically |
| Error Recovery | ANTLR4 built-in recovery | Custom error handling with ParseResult |
| Performance | JVM overhead | Native Rust performance |
Why Not Use ANTLR4?
VB6Parse uses a custom parser implementation for several reasons:
- Rust: Native Rust API instead of a Java API
- Memory Efficiency: Fine-grained control over allocations
- CST Preservation: Full source fidelity including whitespace and comments
- Error Recovery: Custom error handling tailored to VB6 parsing needs
- Integration: Seamless integration with Rust ecosystem
- Incremental Parsing: Potential for future incremental reparsing optimizations
Grammar Reference Value
Although VB6Parse does not use it directly, this ANTLR4 grammar specification is valuable for:
- Understanding the complete VB6 language syntax
- Cross-referencing VB6Parse implementation against a formal specification
- Identifying edge cases and language features that need testing
- Serving as documentation for VB6 language constructs
- Comparative analysis between different parsing approaches