This example project demonstrates how to build a Chevrotain-based language (lexer, compiler, etc.) with nested scopes.
This involves building a scope stack with a symbol stack for each level in the stack. The result can then be used for IDE/editor content assist, displaying a list of valid variable references at a given point in the document.
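The idea of a scope stack with one symbol list per level can be sketched as follows. This is a minimal illustration, not the project's implementation: the `ScopeNode` shapes, the `SymbolScopeStack` class and the `collectVisible` function are hypothetical names chosen for this example.

```typescript
// Hypothetical AST node shapes for illustration; the project's real AST may differ.
type ScopeNode =
  | { type: "ASSIGNMENT"; variableName: string }
  | { type: "SCOPE"; statements: ScopeNode[] };

class SymbolScopeStack {
  // One list of symbols per nesting level; the last entry is the innermost scope.
  private levels: string[][] = [[]];

  enter(): void {
    this.levels.push([]);
  }

  exit(): void {
    // Never pop the outermost (global) level.
    if (this.levels.length > 1) this.levels.pop();
  }

  declare(name: string): void {
    const innermost = this.levels[this.levels.length - 1];
    if (innermost.indexOf(name) === -1) innermost.push(name);
  }

  // All names reachable from the innermost scope, outermost first.
  visible(): string[] {
    const names: string[] = [];
    for (const level of this.levels) {
      for (const name of level) {
        if (names.indexOf(name) === -1) names.push(name);
      }
    }
    return names;
  }
}

// Walk the tree and record which variables are visible at each assignment.
function collectVisible(
  nodes: ScopeNode[],
  stack: SymbolScopeStack = new SymbolScopeStack(),
  out: Array<{ name: string; varsAvailable: string[] }> = []
): Array<{ name: string; varsAvailable: string[] }> {
  for (const node of nodes) {
    if (node.type === "ASSIGNMENT") {
      stack.declare(node.variableName);
      out.push({ name: node.variableName, varsAvailable: stack.visible() });
    } else {
      stack.enter();
      collectVisible(node.statements, stack, out);
      stack.exit();
    }
  }
  return out;
}

// Roughly: a = 1 { b = 2 { c = 3 } } d = 4
const scopeSample: ScopeNode[] = [
  { type: "ASSIGNMENT", variableName: "a" },
  {
    type: "SCOPE",
    statements: [
      { type: "ASSIGNMENT", variableName: "b" },
      { type: "SCOPE", statements: [{ type: "ASSIGNMENT", variableName: "c" }] }
    ]
  },
  { type: "ASSIGNMENT", variableName: "d" }
];
const scopeSampleResult = collectVisible(scopeSample);
```

Popping a level on scope exit is what keeps `b` and `c` from leaking into the suggestions offered at `d`.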
Please note that the error-recovery and syntax folders are currently for reference only, until proper error recovery and syntax are developed for this nested scope language example.
$ npm install
This project uses Jest with jest-extended for additional convenience expectations.
$ npx jest
- lexer
- parser
- actions and AST
- scope builder
- indexed assignment nodes
- content assist using lookup in indexed assignment map (by position)
- error handling and recovery (TODO)
let inputText = "{ b = 2 }";
let lexingResult = lex(inputText);
const { tokens } = lexingResult;
const inputText = "{ b = 2 }";
const result = parse(inputText);
Invalid input
let inputText = "{ %b = 2 ";
parse(inputText); // throws
Note that the parser must have error recovery enabled in order to function with an invalid document in the editor:
class JsonParser extends CstParser {
constructor() {
super(allTokens, {
// by default the error recovery / fault tolerance capabilities are disabled
recoveryEnabled: true
});
// ... rule definitions and the this.performSelfAnalysis() call go here
}
}
Given an input of: a=1
The AST generated for both embedded and visitor actions looks like this:
{
type: "ASSIGNMENT",
variableName: "a",
valueAssigned: "1",
position: {
startOffset: 1,
endOffset: 1,
startLine: 1,
endLine: 1,
startColumn: 2,
endColumn: 2
}
}
The nested scope language example can be found in src/scope-lang.
It is intended as an example for how to work with nested scopes and provide content assist over LSP for an editor/IDE such as VS Code.
- chevrotain content assist example project with specs.
- Chevrotain Editor/LSP discussion
- Visual Studio Language Server for dot language
- Quick Start to VSCode Plug-Ins: Language Server Protocol (LSP)
- Quick Start to VSCode Plug-ins: Write LSP Project from Scratch
- Quick Start to VSCode Plug-Ins: Code Completion
- Quick Start to VSCode Plug-Ins: LSP protocol initialization parameters
- Quick Start to VSCode Plug-Ins: Programming Language Extensions
- Quick Start to VSCode Plug-Ins: Diagnostic Information
- Quick Start to VSCode Plug-Ins: Running Commands
To add a completion provider (aka "content assist") to a VS Code extension:
connection.onInitialize((params): InitializeResult => {
return {
capabilities: {
// ...
completionProvider: {
resolveProvider: true,
triggerCharacters: ["="]
},
hoverProvider: true
}
};
});
Note: Much of the following code can be found in scope/lang/lsp/advanced
Sample onCompletion handler:
connection.onCompletion((textDocumentPosition: TextDocumentPositionParams): CompletionItem[] => {
let text = documents.get(textDocumentPosition.textDocument.uri).getText();
let position = textDocumentPosition.position;
const lines = text.split(/\r?\n/g);
const currentLine = lines[position.line]
// use the parsed model to look up suggestions via position
// and return a list of auto-complete suggestions (for = assignment)
// must be an array of CompletionItem (see below)
const results: CompletionItem[] = [];
return results;
});
const assignmentIndex = {
3: { varsAvailable: ["a"] },
9: { varsAvailable: ["a", "b"] },
17: { varsAvailable: ["b", "c"] }
};
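A position lookup against an index like the one above can be sketched as follows. This assumes the numeric keys are positions at which an assignment was recorded, and that the relevant entry for a cursor is the one with the greatest key at or before it. `findNearestEntry` is a hypothetical helper for illustration and is not the project's `createIndexMatcher`.

```typescript
interface IndexEntry {
  varsAvailable: string[];
}
type AssignmentIndex = Record<number, IndexEntry>;

// Return the entry with the greatest key that is <= position, if any.
function findNearestEntry(
  index: AssignmentIndex,
  position: number
): IndexEntry | undefined {
  let bestKey: number | undefined;
  for (const key of Object.keys(index)) {
    const offset = Number(key);
    if (offset <= position && (bestKey === undefined || offset > bestKey)) {
      bestKey = offset;
    }
  }
  return bestKey === undefined ? undefined : index[bestKey];
}

// Illustrative data only.
const sampleAssignmentIndex: AssignmentIndex = {
  3: { varsAvailable: ["a"] },
  9: { varsAvailable: ["a", "b"] },
  17: { varsAvailable: ["b", "c"] }
};

// A cursor at position 12 falls after the entry at 9 and before the one at 17.
const entryAtTwelve = findNearestEntry(sampleAssignmentIndex, 12);
```

A linear scan is enough for small documents; a sorted key array with binary search would be a natural next step for larger ones.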
const toAst = (inputText: string, opts = {}) => {
const lexResult = lex(inputText);
const toAstVisitorInstance: any = new AstVisitor(opts);
// ".input" is a setter which will reset the parser's internal state.
parserInstance.input = lexResult.tokens;
// Automatic CST created when parsing
const cst = parserInstance.statements();
if (parserInstance.errors.length > 0) {
throw Error(
"Sad sad panda, parsing errors detected!\n" +
parserInstance.errors[0].message
);
}
const ast = toAstVisitorInstance.visit(cst);
// console.log("AST - visitor", ast);
return ast;
}
const onChange = (textDocumentPosition: TextDocumentPositionParams) => {
let text = documents.get(textDocumentPosition.textDocument.uri).getText();
const scopeTree = toAst(text, { positioned: true });
// run scope builder
const builder = new ScopeStackBuilder();
builder.build(scopeTree);
const { lineMap } = builder;
// index the assignments by position so that onCompletion can look them up
this.find = {
assignment: createIndexMatcher(lineMap, "assignment")
};
};
const onCompletion = (textDocumentPosition: TextDocumentPositionParams): CompletionItem[] => {
// position has character and line position
let text = documents.get(textDocumentPosition.textDocument.uri).getText();
let position = textDocumentPosition.position;
const lines = text.split(/\r?\n/g);
const line = lines[position.line]
// determine char just before position
const lastCharLinePos = Math.max(0, position.character - 1)
const lastTypedChar = line.charAt(lastCharLinePos)
// map different completion functions for = and _
const completionFn = getCompletionFnFor(lastTypedChar)
// TODO: execute completion function for last char typed that triggered it
const pos = {
line: position.line,
column: position.character
};
let assignmentValue = 'xyz...' // see solution below
// return a list of auto complete suggestions (for = assignment)
const { data, column } = this.find.assignment(pos, assignmentValue);
const varsWithinScope = data.varsAvailable;
let completionItems = new Array<CompletionItem>();
// build completion items list
varsWithinScope.forEach(varName => completionItems.push({
label: varName,
kind: CompletionItemKind.Reference,
data: varName
}))
return completionItems;
};
See the CompletionItemKind enum (and more VS Code API documentation).
Imagine we add a trigger on _ for a language using snake case for variable names.
completionProvider: {
resolveProvider: true,
triggerCharacters: ["=", "_"]
},
We could then use an AST lookup to detect that we are in the middle of an assignment and are typing a name for the RHS (the value or reference being assigned).
var c = abc_
We could use logic like the following to propose only var names that start with what we have typed so far.
const { data, column } = this.find.assignment(pos);
const varsWithinScope = data.varsAvailable;
const line = lines[position.line]
const wordBeingTypedAfterAssignToken = line.slice(column+1, position.character).trim()
const filterVars = (varsWithinScope) => varsWithinScope.filter(varName => varName.startsWith(wordBeingTypedAfterAssignToken))
const isTypingVarName = wordBeingTypedAfterAssignToken.length > 0
// if we are typing a (variable) name ref
// - display var names that start with typed name
// - otherwise display all var names in scope
const relevantVarsWithinScope = isTypingVarName ? filterVars(varsWithinScope) : varsWithinScope
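The filtering logic above can be pulled into two small pure functions, which makes it easy to test without a running language server. This is a sketch under the assumption that `column` is the zero-based index of the `=` token on the line; `wordAfterAssign` and `relevantVars` are hypothetical names.

```typescript
// Extract what has been typed between the "=" token and the cursor.
function wordAfterAssign(
  line: string,
  assignColumn: number,
  cursorColumn: number
): string {
  return line.slice(assignColumn + 1, cursorColumn).trim();
}

// Keep only names that begin with the typed prefix; with no prefix, keep all.
function relevantVars(varsWithinScope: string[], typed: string): string[] {
  if (typed.length === 0) return varsWithinScope;
  // Same check as startsWith, written so it also compiles for older targets.
  return varsWithinScope.filter(varName => varName.indexOf(typed) === 0);
}

const sampleLine = "var c = abc_";
const assignColumn = sampleLine.indexOf("=");
const typedSoFar = wordAfterAssign(sampleLine, assignColumn, sampleLine.length);
const filteredVars = relevantVars(["abc_one", "abc_two", "xyz"], typedSoFar);
```

Here `typedSoFar` is `abc_`, so only the two names with that prefix remain.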
We could also use named scopes, similar to namespaces or modules/classes etc.
$.RULE("scope", () => {
$.CONSUME(Identifier); // named scopes
$.CONSUME(BeginScope);
$.AT_LEAST_ONE({
DEF: () => {
$.SUBRULE($.statement);
}
});
$.CONSUME(EndScope);
});
alpha {
a = 2
}
Then we could generate the varsAvailable as a map instead, to indicate for which scope a particular variable is made available (reachable).
We could have the parser wrap the source code in a global namespace by convention:
global {
alpha {
a = 2
}
}
Then the simple solution would return varsAvailable as follows:
varsAvailable = {
a: {
scope: 'alpha'
},
b: {
scope: 'global'
}
}
Scope names are hierarchical, so it would be better to reference the full nested scope name.
varsAvailable = {
a: {
scope: 'global.alpha'
},
b: {
scope: 'global'
}
}
This could easily be achieved by maintaining another scope stack, namedScopes, in the same fashion that varsAvailable is built, then joining the scope names with . to create the full scope name.
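A minimal sketch of that idea, assuming a hypothetical AST in which each scope node carries a `name`. The `NamedNode` type and `buildScopedVars` function are illustrative names and not part of the project.

```typescript
// Hypothetical AST node shapes with named scopes.
type NamedNode =
  | { type: "ASSIGNMENT"; variableName: string }
  | { type: "SCOPE"; name: string; statements: NamedNode[] };

// Record, for each variable, the full dotted name of the scope declaring it.
function buildScopedVars(
  nodes: NamedNode[],
  namedScopes: string[] = ["global"],
  out: Record<string, { scope: string }> = {}
): Record<string, { scope: string }> {
  for (const node of nodes) {
    if (node.type === "ASSIGNMENT") {
      out[node.variableName] = { scope: namedScopes.join(".") };
    } else {
      namedScopes.push(node.name); // entering a named scope
      buildScopedVars(node.statements, namedScopes, out);
      namedScopes.pop(); // leaving it again
    }
  }
  return out;
}

// Roughly: b = 1  alpha { a = 2 }, wrapped in the implicit global scope.
const scopedVars = buildScopedVars([
  { type: "ASSIGNMENT", variableName: "b" },
  {
    type: "SCOPE",
    name: "alpha",
    statements: [{ type: "ASSIGNMENT", variableName: "a" }]
  }
]);
```

Note that this sketch records where each variable is declared across the whole tree; combining it with a per-position index would still be needed to know which of these names are reachable at the cursor.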
We can also add an onCompletionResolve handler as follows. This can be used to provide additional context and documentation for each option available for selection.
connection.onCompletionResolve(
(item: CompletionItem): CompletionItem => {
item.detail = item.data;
item.documentation = `${item.data} reference`;
return item;
}
)