Interface: LanguageTokenizerComponents

libs/cursorless-engine/tokenizer/tokenizer.types.LanguageTokenizerComponents

Represents the components of a custom tokenizer for a language.

Properties

fixedTokens

fixedTokens: string[]

Defined in

libs/cursorless-engine/tokenizer/tokenizer.types.ts:4


identifierWordDelimiters

identifierWordDelimiters: string[]

These strings are allowed inside identifiers, where they separate the words of the identifier. They are given as raw strings and are regex-escaped before use.

Defined in

libs/cursorless-engine/tokenizer/tokenizer.types.ts:17
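
To illustrate what "raw strings that are regex-escaped" means for identifierWordDelimiters, here is a minimal sketch. The helpers `escapeRegex` and `splitIdentifier` are hypothetical names introduced for this example and are not part of the Cursorless API; the actual engine may combine the delimiters differently.

```typescript
// Hypothetical helper (not a Cursorless function): escape regex
// metacharacters so that a raw delimiter string matches itself literally.
function escapeRegex(raw: string): string {
  return raw.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: split an identifier into words on any of the
// raw delimiters, after escaping each one.
function splitIdentifier(identifier: string, delimiters: string[]): string[] {
  if (delimiters.length === 0) {
    return [identifier];
  }
  const pattern = new RegExp(delimiters.map(escapeRegex).join("|"), "g");
  // Drop empty strings produced by leading, trailing, or doubled delimiters.
  return identifier.split(pattern).filter((word) => word.length > 0);
}

// Illustrative delimiter lists, assumed for this example.
const snakeAndKebab = splitIdentifier("foo_bar-baz", ["_", "-"]);
const dotted = splitIdentifier("a.b", ["."]);
```

Because "." is escaped, it splits only on a literal dot in this sketch; unescaped, it would match every character.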


identifierWordRegexes

identifierWordRegexes: string[]

Each element of this list is a regex that can match inside a token, and whatever it matches is considered part of a subword. There is no need to add a * to these regexes, because each one is already allowed to repeat.

Defined in

libs/cursorless-engine/tokenizer/tokenizer.types.ts:11
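
The note about not adding a * can be sketched as follows. The helper `buildWordPattern` is a hypothetical name, and the way it joins the regexes is an assumption made for illustration, not the Cursorless implementation. Each entry describes a single unit, and the repetition is applied once around the whole group.

```typescript
// Hypothetical helper (not a Cursorless function): combine per-unit regexes
// into one pattern and apply the repetition around the group, so individual
// entries do not need their own * or + quantifier.
function buildWordPattern(wordRegexes: string[]): RegExp {
  const alternation = wordRegexes.map((regex) => `(?:${regex})`).join("|");
  // "u" enables Unicode property escapes such as \p{L}; "g" finds all matches.
  return new RegExp(`(?:${alternation})+`, "gu");
}

// Illustrative entries, assumed for this example: letters and digits.
const wordPattern = buildWordPattern(["\\p{L}", "\\p{N}"]);
const words = "hello world42 !".match(wordPattern);
```

In this sketch a run of letters and digits such as "world42" comes back as one match, even though neither entry carries a quantifier.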


numbersRegex

numbersRegex: string

Defined in

libs/cursorless-engine/tokenizer/tokenizer.types.ts:19


repeatableSymbols

repeatableSymbols: string[]

Defined in

libs/cursorless-engine/tokenizer/tokenizer.types.ts:20


singleSymbolsRegex

singleSymbolsRegex: string

Defined in

libs/cursorless-engine/tokenizer/tokenizer.types.ts:21
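
For orientation, the sketch below redeclares the interface locally, using the property names and types listed above, and fills it with placeholder values. Every value is an assumption chosen for illustration and is not a Cursorless default. Several properties are undocumented on this page, so the values for them only follow what their names suggest.

```typescript
// Local copy of the shape documented above, so the example is self-contained.
interface LanguageTokenizerComponents {
  fixedTokens: string[];
  identifierWordDelimiters: string[];
  identifierWordRegexes: string[];
  numbersRegex: string;
  repeatableSymbols: string[];
  singleSymbolsRegex: string;
}

// Placeholder values only; none of these are taken from Cursorless.
const exampleComponents: LanguageTokenizerComponents = {
  fixedTokens: ["!==", "=>"],
  // Raw strings; per the docs above they are regex-escaped before use.
  identifierWordDelimiters: ["_", "-"],
  // No trailing * needed; per the docs above each regex may repeat.
  identifierWordRegexes: ["\\p{L}", "\\p{N}"],
  numbersRegex: "[0-9]+(?:\\.[0-9]+)?",
  repeatableSymbols: ["-", "+"],
  singleSymbolsRegex: "[^\\s\\w]",
};

// Sanity check that the placeholder number regex compiles and behaves as intended.
const matchesDecimal = new RegExp(`^${exampleComponents.numbersRegex}$`).test("3.14");
```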