mirror of https://github.com/ton-blockchain/ton synced 2025-03-09 15:40:10 +00:00

[Tolk] Rewrite the type system from Hindley-Milner to static typing

FunC's type system (and Tolk's before this PR) is based on Hindley-Milner.
This is a common approach for functional languages, where
types are inferred from usage through unification.
As a result, type declarations are not necessary:
() f(a,b) { return a+b; } // a and b are inferred as int, since `+` takes (int, int)

While this approach works for now, problems arise with the introduction
of new types like bool, where `!x` must handle both int and bool.
It will also become incompatible with int32 and other strict integer types.
It would clash with structure methods, struggle with proper generics,
and become entirely impractical for union types.

This PR completely rewrites the type system, targeting the future.
1) the type of any expression is inferred once and never changed
2) this is possible because dependent expressions have already been inferred
3) forall is completely removed, generic functions are introduced
   (they actually work like template functions, instantiated during inference)
4) instantiation `<...>` syntax, for example: `t.tupleAt<int>(0)`
5) `as` keyword, for example `t.tupleAt(0) as int`
6) method binding is done along with type inference, not before
   ("before", as it worked previously, was always the wrong approach)
This commit is contained in:
tolk-vm 2024-12-30 22:31:27 +07:00
parent 3540424aa1
commit 799e2d1265
No known key found for this signature in database
GPG key ID: 7905DD7FE0324B12
101 changed files with 5402 additions and 2713 deletions

@@ -328,6 +328,7 @@ struct ChunkIdentifierOrKeyword final : ChunkLexerBase {
case 2:
if (str == "do") return tok_do;
if (str == "if") return tok_if;
if (str == "as") return tok_as;
break;
case 3:
if (str == "int") return tok_int;
@@ -345,7 +346,6 @@ struct ChunkIdentifierOrKeyword final : ChunkLexerBase {
if (str == "null") return tok_null;
if (str == "void") return tok_void;
if (str == "bool") return tok_bool;
if (str == "auto") return tok_auto;
if (str == "self") return tok_self;
if (str == "tolk") return tok_tolk;
if (str == "type") return tok_type;
@@ -578,6 +578,16 @@ void Lexer::next_special(TokenType parse_next_as, const char* str_expected) {
cur_token = tokens_circularbuf[++cur_token_idx & 7];
}
Lexer::SavedPositionForLookahead Lexer::save_parsing_position() const {
return {p_next, cur_token_idx, cur_token};
}
void Lexer::restore_position(SavedPositionForLookahead saved) {
p_next = saved.p_next;
cur_token_idx = last_token_idx = saved.cur_token_idx;
cur_token = saved.cur_token;
}
void Lexer::error(const std::string& err_msg) const {
throw ParseError(cur_location(), err_msg);
}
@@ -595,7 +605,7 @@ void lexer_init() {
// Hence, it's difficult to measure Lexer performance separately.
// This function can be called just to tick Lexer performance, it just scans all input files.
// There is no sense to use it in production, but when refactoring and optimizing Lexer, it's useful.
void lexer_measure_performance(const AllSrcFiles& files_to_just_parse) {
void lexer_measure_performance(const AllRegisteredSrcFiles& files_to_just_parse) {
for (const SrcFile* file : files_to_just_parse) {
Lexer lex(file);
while (!lex.is_eof()) {