- split stdlib.tolk into multiple files (tolk-stdlib/ folder)
(the "core" common.tolk is auto-imported, the rest are
needed to be explicitly imported like "@stdlib/tvm-dicts.tolk")
- all functions were renamed to long and clear names
- new naming is camelCase
Lots of changes, actually. Most noticeable are:
- traditional //comments
- #include -> import
- a rule "import what you use"
- ~ found -> !found (for -1/0)
- null() -> null
- is_null?(v) -> v == null
- throw is a keyword
- catch with swapped arguments
- throw_if, throw_unless -> assert
- do until -> do while
- elseif -> else if
- drop ifnot, elseifnot
- drop rarely used operators
A testing framework also appears here. All tests existed earlier,
but due to significant syntax changes, their history is useless.
Since I've implemented AST, now I can drop forward declarations.
Instead, I traverse AST of all files and register global symbols
(functions, constants, global vars) as a separate step, in advance.
That's why, while converting AST to Expr/Op, all available symbols are
already registered.
This greatly simplifies "intermediate state" of yet unknown functions
and checking them afterward.
Redeclaration of local variables (inside the same scope)
is now also prohibited.
Several related changes:
- stdlib.tolk is embedded into a distribution (deb package or tolk-js),
the user won't have to download it and store as a project file;
it's an important step to maintain correct language versioning
- stdlib.tolk is auto-included, that's why all its functions are
available out of the box
- strict includes: you can't use symbol `f` from another file
unless you've #include'd this file
- drop all C++ global variables holding compilation state,
merge them into a single struct CompilerState located at
compiler-state.h; for instance, stdlib filename is also there
A new lexer is noticeably faster and memory efficient
(although splitting a file to tokens is negligible in a whole pipeline).
But the purpose of rewriting lexer was not just to speed up,
but to allow writing code without spaces:
`2+2` is now 4, not a valid identifier as earlier.
The variety of symbols allowed in identifier has greatly reduced
and is now similar to other languages.
SrcLocation became 8 bytes on stack everywhere.
Command-line flags were also reworked:
- the input for Tolk compiler is only a single file now, it's parsed, and parsing continues while new #include are resolved
- flags like -A -P and so on are no more needed, actually