Interface: LanguageTokenizerComponents
libs/cursorless-engine/tokenizer/tokenizer.types.LanguageTokenizerComponents
Represents a custom tokenizer for a language.
Properties
fixedTokens
• fixedTokens: string[]
Defined in
libs/cursorless-engine/tokenizer/tokenizer.types.ts:4
identifierWordDelimiters
• identifierWordDelimiters: string[]
Strings that are allowed inside identifiers and that separate the words within an identifier. They are given as raw strings and will be regex-escaped.
Defined in
libs/cursorless-engine/tokenizer/tokenizer.types.ts:17
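The note about regex-escaping can be illustrated with a minimal sketch. The `escapeRegExp` helper, the delimiter values, and the way the escaped strings are joined are all assumptions made for illustration; the reference does not show how the engine actually consumes this property.

```typescript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(raw: string): string {
  return raw.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Illustrative delimiter values, not taken from any real language config.
const identifierWordDelimiters: string[] = ["_", "-", "."];

// Because the entries are raw strings, "." must be escaped before use,
// otherwise it would match any character.
const delimiterPattern = identifierWordDelimiters.map(escapeRegExp).join("|");

// Split an identifier into words at any delimiter (assumed usage).
const words = "my-variable.name_here".split(new RegExp(delimiterPattern));
```

Supplying raw strings keeps a language's configuration readable, since the escaping is handled in one place rather than by every author of a delimiter list.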
identifierWordRegexes
• identifierWordRegexes: string[]
Each element of this list is a regex that can appear inside a token, and
will be considered part of a subword. Note that there is no need to add a
`*` here, as the regex will be allowed to repeat.
Defined in
libs/cursorless-engine/tokenizer/tokenizer.types.ts:11
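As a rough sketch of why no `*` is needed, the entries can be written to match a single character each and then wrapped in one repeated group. The entry values and the composition shown here are assumptions for illustration, not the engine's actual implementation.

```typescript
// Illustrative values only: each entry matches one character of a subword
// and carries no quantifier of its own.
const identifierWordRegexes: string[] = ["[A-Za-z]", "[0-9]"];

// Assumed composition: the entries are joined as alternatives and the
// repetition is applied once to the whole group.
const wordPattern = `(?:${identifierWordRegexes.join("|")})+`;
const wordRegex = new RegExp(wordPattern, "g");

// Characters not covered by any entry (space, underscore) end the match.
const matches: string[] = "foo123 bar_baz".match(wordRegex) ?? [];
```

Under this assumption, adding a quantifier inside an entry would be redundant, because the outer group already repeats.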
numbersRegex
• numbersRegex: string
Defined in
libs/cursorless-engine/tokenizer/tokenizer.types.ts:19
repeatableSymbols
• repeatableSymbols: string[]
Defined in
libs/cursorless-engine/tokenizer/tokenizer.types.ts:20
singleSymbolsRegex
• singleSymbolsRegex: string
Defined in
libs/cursorless-engine/tokenizer/tokenizer.types.ts:21
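To show how the properties fit together, here is a hypothetical object with the shape documented above. The interface is redeclared locally so the example is self-contained, and every value is an illustrative guess. In particular, the reference does not describe fixedTokens, numbersRegex, repeatableSymbols, or singleSymbolsRegex, so their meanings here are inferred from the names alone.

```typescript
// Local mirror of the documented property names and types.
interface LanguageTokenizerComponents {
  fixedTokens: string[];
  identifierWordDelimiters: string[];
  identifierWordRegexes: string[];
  numbersRegex: string;
  repeatableSymbols: string[];
  singleSymbolsRegex: string;
}

// Hypothetical configuration; none of these values come from the source.
const exampleComponents: LanguageTokenizerComponents = {
  fixedTokens: ["!==", "=>", "..."],
  identifierWordDelimiters: ["_"],
  identifierWordRegexes: ["[A-Za-z]", "[0-9]"],
  numbersRegex: "[0-9]+(?:\\.[0-9]+)?",
  repeatableSymbols: ["-", "+", "="],
  singleSymbolsRegex: "[^\\s\\w]",
};

// Sanity check: every regex-typed string should compile.
const compiled: RegExp[] = [
  exampleComponents.numbersRegex,
  exampleComponents.singleSymbolsRegex,
]
  .concat(exampleComponents.identifierWordRegexes)
  .map((source) => new RegExp(source));

// The sample numbers regex accepts a simple decimal literal.
const acceptsDecimal = new RegExp(
  `^(?:${exampleComponents.numbersRegex})$`,
).test("3.14");
```

The regex-valued properties are plain strings, not RegExp objects, so compiling them up front is a cheap way to catch a malformed pattern before it reaches a tokenizer.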