- split stdlib.tolk into multiple files (tolk-stdlib/ folder)
(the "core" common.tolk is auto-imported; the rest must be
imported explicitly, e.g. "@stdlib/tvm-dicts.tolk")
- all functions were renamed to long and clear names
- new naming is camelCase
Lots of changes, actually. The most noticeable are:
- traditional //comments
- #include -> import
- a rule "import what you use"
- ~ found -> !found (for -1/0)
- null() -> null
- is_null?(v) -> v == null
- throw is a keyword
- catch with swapped arguments
- throw_if, throw_unless -> assert
- do until -> do while
- elseif -> else if
- drop ifnot, elseifnot
- drop rarely used operators
A testing framework also appears here. All tests existed earlier,
but due to significant syntax changes, their history is useless.
Since I've implemented AST, now I can drop forward declarations.
Instead, I traverse AST of all files and register global symbols
(functions, constants, global vars) as a separate step, in advance.
That's why, while converting AST to Expr/Op, all available symbols are
already registered.
This greatly simplifies handling the "intermediate state" of
not-yet-known functions and checking them afterward.
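
The two-step approach above can be sketched roughly as follows. This is a minimal Python illustration, not the compiler's C++ code: `FunctionDecl`, `register_globals` and `check_bodies` are hypothetical names, and a function body is reduced to the list of names it calls.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionDecl:
    # Hypothetical stand-in for an AST node of a top-level function.
    name: str
    calls: list = field(default_factory=list)  # names called inside the body


def register_globals(files):
    """Step 1: walk the declarations of ALL files and register global symbols."""
    symbols = {}
    for decls in files.values():
        for decl in decls:
            if decl.name in symbols:
                raise NameError(f"redeclaration of global symbol `{decl.name}`")
            symbols[decl.name] = decl
    return symbols


def check_bodies(files, symbols):
    """Step 2: process bodies; every global is already known at this point."""
    unknown = []
    for decls in files.values():
        for decl in decls:
            for callee in decl.calls:
                if callee not in symbols:
                    unknown.append(callee)
    return unknown


# `main` calls `helper`, which is declared later: no forward declaration needed.
files = {
    "main.tolk": [FunctionDecl("main", calls=["helper"]), FunctionDecl("helper")],
}
symbols = register_globals(files)
unknown = check_bodies(files, symbols)
```

Because registration finishes before any body is processed, the second step never has to track "seen but not yet defined" functions.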
Redeclaration of local variables (inside the same scope)
is now also prohibited.
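
One way to picture the redeclaration check is a stack of per-scope name sets. The sketch below is a Python illustration with a hypothetical `Scopes` class; it assumes that a nested scope may still declare a name that exists in an outer scope, which the text above does not state explicitly.

```python
class Scopes:
    """Hypothetical helper: tracks local variable names per lexical scope."""

    def __init__(self):
        self._stack = [set()]  # innermost scope is the last element

    def open(self):
        self._stack.append(set())

    def close(self):
        self._stack.pop()

    def declare(self, name):
        # Only the innermost scope is checked (assumption: shadowing an
        # outer-scope variable from a nested scope is still allowed).
        if name in self._stack[-1]:
            raise NameError(f"redeclaration of local variable `{name}`")
        self._stack[-1].add(name)


scopes = Scopes()
scopes.declare("x")      # first declaration: fine
scopes.open()
scopes.declare("x")      # different (nested) scope: accepted under the assumption
scopes.close()
try:
    scopes.declare("x")  # same scope as the first declaration: rejected
    rejected = False
except NameError:
    rejected = True
```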
Several related changes:
- stdlib.tolk is embedded into the distribution (deb package or tolk-js),
so the user doesn't have to download it and store it as a project file;
it's an important step towards maintaining correct language versioning
- stdlib.tolk is auto-included, so all its functions are
available out of the box
- strict includes: you can't use symbol `f` from another file
unless you've #include'd this file
- drop all C++ global variables holding compilation state,
merge them into a single struct CompilerState located at
compiler-state.h; for instance, stdlib filename is also there
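
The strict-includes rule from the list above can be illustrated with a small visibility check. This is a Python sketch under assumptions: `visible_symbols` and the file and symbol names are hypothetical, and only direct includes (plus the auto-included stdlib) are treated as visible, i.e. includes are assumed not to be transitive.

```python
def visible_symbols(file, includes, declared, auto_included=()):
    """Symbols usable in `file`: its own, those of files it includes
    directly, and those of auto-included files (e.g. the stdlib)."""
    visible = set(declared.get(file, ()))
    for dep in (*includes.get(file, ()), *auto_included):
        visible |= set(declared.get(dep, ()))
    return visible


# Hypothetical project: a.tolk includes b.tolk, b.tolk includes c.tolk.
declared = {
    "stdlib.tolk": {"std_fn"},
    "a.tolk": {"main"},
    "b.tolk": {"f"},
    "c.tolk": {"g"},
}
includes = {"a.tolk": ["b.tolk"], "b.tolk": ["c.tolk"]}

visible_in_a = visible_symbols(
    "a.tolk", includes, declared, auto_included=("stdlib.tolk",)
)
# `f` and `std_fn` are usable in a.tolk; `g` is not, because a.tolk
# never included c.tolk itself.
```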
The new lexer is noticeably faster and more memory-efficient
(although splitting a file into tokens is a negligible part of the whole pipeline).
But the purpose of rewriting the lexer was not just speed,
but to allow writing code without spaces:
`2+2` now evaluates to 4 instead of being a valid identifier, as it was earlier.
The set of characters allowed in identifiers has been greatly reduced
and is now similar to other languages.
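
To illustrate why a restricted identifier alphabet lets `2+2` split into three tokens, here is a toy tokenizer in Python. It is a sketch only: the token classes and the identifier pattern `[A-Za-z_][A-Za-z0-9_]*` are assumptions chosen for the example, not the actual rules of the Tolk lexer.

```python
import re

# Assumed token classes; whitespace is matched so it can be skipped.
TOKEN_RE = re.compile(
    r"""
    (?P<number>\d+)
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<op>[-+*/()=;])
  | (?P<ws>\s+)
    """,
    re.VERBOSE,
)


def tokenize(src):
    """Split `src` into (kind, text) pairs; spaces between tokens are optional."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at offset {pos}")
        if m.lastgroup != "ws":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


tokens = tokenize("2+2")
# [('number', '2'), ('op', '+'), ('number', '2')]
```

Since `+` cannot be part of an identifier in this sketch, `a+b` and `a + b` produce the same token stream.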
SrcLocation is now 8 bytes on the stack everywhere.
Command-line flags were also reworked:
- the Tolk compiler now takes only a single input file; it's parsed, and parsing continues as new #include directives are resolved
- flags like -A, -P and so on are no longer needed
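
The "single input file, then follow the includes" behaviour can be pictured as a worklist. The Python sketch below is an illustration under assumptions: `collect_files` is a hypothetical name, includes are found with a simple regex rather than a real parser, and path resolution is skipped by keeping sources in an in-memory dict.

```python
import re
from collections import deque

# Simplified detection of `#include "path"` lines (a real parser would
# take these from the AST instead).
INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"', re.MULTILINE)


def collect_files(entry, read_file):
    """Start from one entry file and keep going while new includes appear."""
    order = []
    queued = {entry}
    pending = deque([entry])
    while pending:
        path = pending.popleft()
        order.append(path)
        for dep in INCLUDE_RE.findall(read_file(path)):
            if dep not in queued:  # each file is processed once, cycles are safe
                queued.add(dep)
                pending.append(dep)
    return order


# Hypothetical in-memory project used instead of real files.
sources = {
    "main.tolk": '#include "a.tolk"\n#include "b.tolk"\n',
    "a.tolk": '#include "b.tolk"\n',
    "b.tolk": '#include "main.tolk"\n',
}
order = collect_files("main.tolk", sources.__getitem__)
```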
All changes from PR "FunC v0.5.0":
https://github.com/ton-blockchain/ton/pull/1026
Instead of developing FunC, we decided to fork it.
BTW, the first Tolk release will be v0.6,
a nod to FunC v0.5, which never got the chance to be released.
The Tolk Language will be positioned as "next-generation FunC".
It's literally a fork of a FunC compiler,
introducing familiar syntax similar to TypeScript,
but leaving all low-level optimizations untouched.
Note that FunC sources are partially stored
in the parser/ folder (shared with TL/B).
In Tolk, nothing is shared.
Everything from parser/ is copied into tolk/ folder.