I’m writing a parser for VBScript/VBA and I’m running into a problem where the parser seems to select the wrong rule and I can’t figure out why. My grammar is way bigger than the one posted here, but I have simplified it to a minimal reproducer.
The grammar is as follows:
module.exports = grammar({
  name: 'vbscript',
  extras: $ => [
    $.horizontal_whitespace
  ],
  rules: {
    source_file: $ => repeat(
      seq($._statement, $._whitespace),
    ),
    _statement: $ => choice(
      prec(2,$.variable_assignment),
      prec(1,$.invocation_statement),
    ),
    variable_assignment: $ => seq(
      $.identifier,
      '=',
      $._expression
    ),
    invocation_statement: $ => seq(
      $.identifier,
      optional(seq($.horizontal_whitespace, $.argument_list))
    ),
    _expression: $ => choice(
      $.identifier,
      $.literal
    ),
    argument_list: $ => seq(
      $.argument,
      repeat(seq(
        ',',
        $.argument
      ))
    ),
    argument: $ => choice(
      $._expression,
    ),
    literal: $ => choice(
      $.string,
    ),
    string: $ => seq('"', /[^"]*/, '"'),
    identifier: $ => /[a-zA-Z_]\w*/,
    _whitespace: $ => repeat1(/[\n\r]/),
    horizontal_whitespace: $ => /[ \t]+/,
  }
});
And the test code is as follows:
test = ""
SomeFunction arg1, arg2
The first line should be parsed as variable_assignment and the second line as invocation_statement. However, when I run the parser, I get the following output:
$ tree-sitter parse test.vba
(source_file [0, 0] - [2, 0]
  (invocation_statement [0, 0] - [0, 9]
    (identifier [0, 0] - [0, 4])
    (horizontal_whitespace [0, 4] - [0, 5])
    (ERROR [0, 5] - [0, 6])
    (horizontal_whitespace [0, 6] - [0, 7])
    (argument_list [0, 7] - [0, 9]
      (argument [0, 7] - [0, 9]
        (literal [0, 7] - [0, 9]
          (string [0, 7] - [0, 9])))))
  (invocation_statement [1, 0] - [1, 23]
    (identifier [1, 0] - [1, 12])
    (horizontal_whitespace [1, 12] - [1, 13])
    (argument_list [1, 13] - [1, 23]
      (argument [1, 13] - [1, 17]
        (identifier [1, 13] - [1, 17]))
      (horizontal_whitespace [1, 18] - [1, 19])
      (argument [1, 19] - [1, 23]
        (identifier [1, 19] - [1, 23])))))
test.vba 0.10 ms 330 bytes/ms (ERROR [0, 5] - [0, 6])
Apparently, the first line is parsed as invocation_statement, which doesn’t work, because the = sign is not included in that rule. I cannot figure out why it doesn’t try parsing it as a variable_assignment, which has higher precedence. If I comment out line 15 (prec(1,$.invocation_statement)) or line 26 (optional(seq($.horizontal_whitespace, $.argument_list))), I get the following output:
$ tree-sitter parse test.vba
(source_file [0, 0] - [2, 0]
  (variable_assignment [0, 0] - [0, 9]
    (identifier [0, 0] - [0, 4])
    (horizontal_whitespace [0, 4] - [0, 5])
    (horizontal_whitespace [0, 6] - [0, 7])
    (literal [0, 7] - [0, 9]
      (string [0, 7] - [0, 9])))
  (ERROR [1, 0] - [1, 18]
    (identifier [1, 0] - [1, 12])
    (horizontal_whitespace [1, 12] - [1, 13])
    (identifier [1, 13] - [1, 17]))
  (horizontal_whitespace [1, 18] - [1, 19])
  (invocation_statement [1, 19] - [1, 23]
    (identifier [1, 19] - [1, 23])))
test.vba 0.07 ms 460 bytes/ms (ERROR [1, 0] - [1, 18])
This shows that the variable_assignment rule correctly matches the first line. Obviously, now the second line no longer parses. I hope someone can see what is going on here.
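To double-check the column ranges in the outputs above independently of tree-sitter, the test lines can be split with the same token patterns the grammar uses. This is only a rough sketch: the `tokenize` helper and the `TOKENS` table are hypothetical names of mine and not part of tree-sitter, the string rule is collapsed into a single pattern for brevity, and plain regex matching does not model how tree-sitter's lexer picks tokens based on the parse state.

```javascript
// Hypothetical helper, not part of tree-sitter: splits one line using the
// token patterns from the grammar. The sticky flag (y) anchors each match
// at the current position.
const TOKENS = [
  ['identifier', /[a-zA-Z_]\w*/y],
  ['horizontal_whitespace', /[ \t]+/y],
  // The grammar defines string as seq('"', /[^"]*/, '"'); collapsed here.
  ['string', /"[^"]*"/y],
  ['=', /=/y],
  [',', /,/y],
];

function tokenize(line) {
  const out = [];
  let pos = 0;
  while (pos < line.length) {
    let matched = false;
    for (const [name, re] of TOKENS) {
      re.lastIndex = pos; // sticky regexes only match exactly at lastIndex
      const m = re.exec(line);
      if (m) {
        out.push({ name, start: pos, end: pos + m[0].length });
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) throw new Error(`no token matches at column ${pos}`);
  }
  return out;
}

// Print the tokens of the first test line in the same [row, col] style.
for (const t of tokenize('test = ""')) {
  console.log(`${t.name} [0, ${t.start}] - [0, ${t.end}]`);
}
```

For the first line this puts the = token at [0, 5] - [0, 6], which is exactly the range reported as ERROR in the first output, so the token patterns themselves do not look like the problem.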