The SystemVerilog concrete syntax tree (CST) uses the language-agnostic syntax tree structure with its own set of (int) enumerations for tree nodes and leaves. The CST includes all syntactically relevant tokens, but no comments and no attributes (at this time), and it offers only limited support for preprocessing constructs.
The exact node enumerations should be considered fragile until stated otherwise: existing enumerations may change, new ones may be added, and obsolete ones may be removed. Code that depends on direct use of these enumerations should be well-tested so that breakages are easy to diagnose and fix.
CST leaves contain tokens, which bear the token enumerations generated from the parser implementation. These token enumerations are relatively stable. However, where practical, we encourage use of token classification functions.
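As a rough illustration of why classification functions are preferable to comparing against token enumerations directly, consider the sketch below. The enumeration TokenEnumSketch and the function IsArithmeticOperatorSketch are hypothetical names invented for this example; they are not part of the actual parser or its generated token set.

```cpp
// Hypothetical token enumeration. The real values are generated from the
// parser implementation and are not reproduced here.
enum TokenEnumSketch : int {
  kTokPlus = 1,
  kTokMinus,
  kTokStar,
  kTokIdentifier,
};

// Hypothetical classification function. Callers ask a question about the
// token ("is this an arithmetic operator?") instead of listing enumeration
// values themselves, so only this function needs updating when the set of
// tokens changes.
inline bool IsArithmeticOperatorSketch(int token_enum) {
  switch (token_enum) {
    case kTokPlus:
    case kTokMinus:
    case kTokStar:
      return true;
    default:
      return false;
  }
}
```

With this shape, a call site such as IsArithmeticOperatorSketch(token) keeps working if another arithmetic operator token is later added to the switch.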
The node enumerations are used directly in the semantic actions of the SystemVerilog parser, with functions like MakeTaggedNode.
Both node and token enumerations are used in syntax tree analyzers and also drive formatting decisions in the formatter.
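The following is a minimal sketch (C++17) of how a node enumeration can be attached at construction time and later consumed by an analyzer. All names here (TreeNodeTag, TreeSymbol, MakeTaggedNodeSketch, CountNodesWithTag) are simplified stand-ins assumed for illustration; they do not reproduce the real tree classes or the actual signature of MakeTaggedNode.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical node tags; the real enumerations belong to the CST itself.
enum class TreeNodeTag : int { kLeaf, kExpression, kModule };

// Simplified stand-in for a syntax tree symbol: a tag plus owned children.
struct TreeSymbol {
  TreeNodeTag tag;
  std::vector<std::unique_ptr<TreeSymbol>> children;
};
using TreeSymbolPtr = std::unique_ptr<TreeSymbol>;

// Stand-in for a tagged-node builder of the kind a parser semantic action
// might call: it records the tag and adopts the children in order.
template <typename... Children>
TreeSymbolPtr MakeTaggedNodeSketch(TreeNodeTag tag, Children&&... children) {
  auto node = std::make_unique<TreeSymbol>();
  node->tag = tag;
  (node->children.push_back(std::forward<Children>(children)), ...);
  return node;
}

// A tiny analyzer: counts the nodes in a tree that carry a given tag.
inline std::size_t CountNodesWithTag(const TreeSymbol& root, TreeNodeTag tag) {
  std::size_t count = (root.tag == tag) ? 1 : 0;
  for (const auto& child : root.children) {
    if (child != nullptr) count += CountNodesWithTag(*child, tag);
  }
  return count;
}
```

Because the analyzer depends on the same enumeration that the builder wrote into the tree, any renumbering or removal of an enumeration affects both sides at once, which is why well-tested code matters here.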
Ideal properties of CST nodes:
- For every node enumeration kFoo, there should be a MakeFoo function that constructs a CST node from its arguments.
- Accessor functions should be short and composable: GetFooFromBar-style functions that hide the structural details of a node, while remaining consistent with construction.

This is not the case today because of the haste in which initial development took place, but help is wanted toward achieving these ideals. See also https://github.com/chipsalliance/verible/issues/159.
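To make the MakeFoo and GetFooFromBar ideals concrete, here is a minimal sketch in which each constructor is the only place that decides a node's child layout and each accessor reads that same layout back. The types NodeTag and Symbol, the child positions, and the helper functions are all hypothetical, chosen only to mirror the placeholder names used in this document.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tags and a simplified symbol type, for illustration only.
enum class NodeTag : int { kLeaf, kFoo, kBar };

struct Symbol {
  NodeTag tag;
  std::string text;  // only meaningful for leaves
  std::vector<std::unique_ptr<Symbol>> children;
};
using SymbolPtr = std::unique_ptr<Symbol>;

inline SymbolPtr MakeLeaf(std::string text) {
  auto leaf = std::make_unique<Symbol>();
  leaf->tag = NodeTag::kLeaf;
  leaf->text = std::move(text);
  return leaf;
}

// Constructor for kFoo. Assumed layout: child 0 is the name leaf.
inline SymbolPtr MakeFoo(SymbolPtr name) {
  auto node = std::make_unique<Symbol>();
  node->tag = NodeTag::kFoo;
  node->children.push_back(std::move(name));
  return node;
}

// Constructor for kBar. Assumed layout: child 0 is a keyword, child 1 a kFoo.
inline SymbolPtr MakeBar(SymbolPtr keyword, SymbolPtr foo) {
  auto node = std::make_unique<Symbol>();
  node->tag = NodeTag::kBar;
  node->children.push_back(std::move(keyword));
  node->children.push_back(std::move(foo));
  return node;
}

// Accessors mirror the constructors, so callers never index children directly.
inline const Symbol& GetFooFromBar(const Symbol& bar) {
  assert(bar.tag == NodeTag::kBar);
  return *bar.children[1];
}

inline const Symbol& GetNameFromFoo(const Symbol& foo) {
  assert(foo.tag == NodeTag::kFoo);
  return *foo.children[0];
}

// Short accessors compose into longer queries.
inline const std::string& GetNameTextFromBar(const Symbol& bar) {
  return GetNameFromFoo(GetFooFromBar(bar)).text;
}
```

If the layout of kBar changes, only MakeBar and GetFooFromBar need to change together; callers of GetNameTextFromBar are unaffected.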
Wouldn't an abstract syntax tree (AST) satisfy the above ideals? Yes, but it would take time to write, and we would need help.
An AST may not be a great representation for unpreprocessed code, which is the focus of the first developer tool applications. Having a standard-compliant SV preprocessor would pave the way to making an AST more useful.
Most CST accessor function tests should follow this outline: