This commit introduces nullable types `T?` that are
distinct from non-nullable `T`.
Example: `int?` (int or null) and `int` are different now.
Previously, `null` could be assigned to any primitive type.
Now, it can be assigned only to `T?`.
A non-null assertion operator `!` was also introduced,
similar to `!` in TypeScript and `!!` in Kotlin.
If `int?` still occupies 1 stack slot, `(int,int)?` and
other nullable tensors occupy N+1 slots, the last for
"null precedence". `v == null` actually compares that slot.
Assigning `(int,int)` to `(int,int)?` implicitly creates
a null presence slot. Assigning `null` to `(int,int)?` widens
this null value to 3 slots. This is called "type transitioning".
All stdlib functions prototypes have been updated to reflect
whether they return/accept a nullable or a strict value.
This commit also contains refactoring from `const FunctionData*`
to `FunctionPtr` and similar.
In FunC (and in Tolk before), the assignment
> lhs = rhs
evaluation order (at IR level) was "rhs first, lhs second".
In practice, this did not matter, because lhs could only
be a primitive:
> (v1, v2) = getValue()
Left side of assignment actually has no "evaluation".
Since Tolk implemented indexed access, there could be
> getTensor().0 = getValue()
or (in the future)
> getObject().field = getValue()
where evaluation order becomes significant.
Now evaluation order will be to "lhs first, rhs second"
(more expected from user's point of view), which will become
significant when building control flow graph.
It works both for reading and writing:
> var t = (1, 2);
> t.0; // 1
> t.0 = 5;
> t; // (5, 2)
It also works for typed/untyped tuples, producing INDEX and SETINDEX.
Global tensors and tuples works. Nesting `t.0.1.2` works. `mutate` works.
Even mixing tuples inside tensors inside a global for writing works.
In FunC (and in Tolk before), tensor vars (actually occupying
several stack slots) were represented as a single var in terms
or IR vars (Ops):
> var a = (1, 2);
> LET (_i) = (_1, _2)
Now, every tensor of N stack slots is represented as N IR vars.
> LET (_i, _j) = (_1, _2)
This will give an ability to control access to parts of a tensor
when implementing `tensorVar.0` syntax.
FunC's (and Tolk's before this PR) type system is based on Hindley-Milner.
This is a common approach for functional languages, where
types are inferred from usage through unification.
As a result, type declarations are not necessary:
() f(a,b) { return a+b; } // a and b now int, since `+` (int, int)
While this approach works for now, problems arise with the introduction
of new types like bool, where `!x` must handle both int and bool.
It will also become incompatible with int32 and other strict integers.
This will clash with structure methods, struggle with proper generics,
and become entirely impractical for union types.
This PR completely rewrites the type system targeting the future.
1) type of any expression is inferred and never changed
2) this is available because dependent expressions already inferred
3) forall completely removed, generic functions introduced
(they work like template functions actually, instantiated while inferring)
4) instantiation `<...>` syntax, example: `t.tupleAt<int>(0)`
5) `as` keyword, for example `t.tupleAt(0) as int`
6) methods binding is done along with type inferring, not before
("before", as worked previously, was always a wrong approach)
This is a very big change.
If FunC has `.methods()` and `~methods()`, Tolk has only dot,
one and only way to call a `.method()`.
A method may mutate an object, or may not.
It's a behavioral and semantic difference from FunC.
- `cs.loadInt(32)` modifies a slice and returns an integer
- `b.storeInt(x, 32)` modifies a builder
- `b = b.storeInt()` also works, since it not only modifies, but returns
- chained methods also work, they return `self`
- everything works exactly as expected, similar to JS
- no runtime overhead, exactly same Fift instructions
- custom methods are created with ease
- tilda `~` does not exist in Tolk at all
- split stdlib.tolk into multiple files (tolk-stdlib/ folder)
(the "core" common.tolk is auto-imported, the rest are
needed to be explicitly imported like "@stdlib/tvm-dicts.tolk")
- all functions were renamed to long and clear names
- new naming is camelCase
Lots of changes, actually. Most noticeable are:
- traditional //comments
- #include -> import
- a rule "import what you use"
- ~ found -> !found (for -1/0)
- null() -> null
- is_null?(v) -> v == null
- throw is a keyword
- catch with swapped arguments
- throw_if, throw_unless -> assert
- do until -> do while
- elseif -> else if
- drop ifnot, elseifnot
- drop rarely used operators
A testing framework also appears here. All tests existed earlier,
but due to significant syntax changes, their history is useless.