Klareco

Status: Active Development

Klareco is a conversational AI for Esperanto that prioritizes deterministic, explainable processing. It uses Esperanto's regular grammar to replace neural components with programmatic logic, and reserves learned parameters for semantic reasoning.

Overview

Modern large language models are impressive but opaque. They learn everything—including grammar—from data, making their behavior difficult to predict or explain. Klareco asks: what if we separated the learnable from the deterministic?

Esperanto, with its completely regular grammar (16 rules, no exceptions), is the perfect testbed. We can encode grammar directly in code and reserve machine learning for the genuinely hard part: meaning.

Features

  • Esperanto's 16 grammar rules encoded directly in the parser, not learned
  • Deterministic parser achieves 91.8% parse rate on Esperanto text
  • Root embeddings cover 11,121 validated roots with 97.98% accuracy
  • Corpus index spans 4.38M sentences with compositional embeddings
  • ~733K learned parameters handle semantics; grammar is fully programmatic
  • RAG query system for Esperanto question answering
  • Thesis: explicit grammar via ASTs lets a small core match larger models while remaining explainable

Architecture

Input Text
    ↓
[Deterministic Tokenizer]
    ↓
[Rule-Based Parser] ← Esperanto's 16 rules (91.8% parse rate)
    ↓
Abstract Syntax Tree
    ↓
[Compositional Embeddings] ← Learned (~733K params)
    ↓
Semantic Representation
    ↓
[Deterministic Generator]
    ↓
Output Text
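
The flow above can be sketched as a chain of plain functions in which only the embedding step is a swappable, learned component. This is a minimal illustration under stated assumptions, not Klareco's actual API: the names `Node`, `tokenize`, `parse`, and `run_pipeline` are hypothetical, and the parser here is a crude stand-in that only looks at three endings.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Node:
    """One leaf of a toy AST: a token plus a role read off its ending."""
    token: str
    role: str  # "noun", "adjective", "verb", or "other"


# Hypothetical, deliberately tiny rule table (not the project's rule set).
ROLE_BY_ENDING = (("as", "verb"), ("o", "noun"), ("a", "adjective"))


def tokenize(text: str) -> List[str]:
    # Deterministic: lowercase and keep runs of word characters.
    return re.findall(r"\w+", text.lower())


def parse(tokens: List[str]) -> List[Node]:
    # Deterministic: the same input always yields the same nodes.
    nodes = []
    for tok in tokens:
        role = next((r for end, r in ROLE_BY_ENDING if tok.endswith(end)), "other")
        nodes.append(Node(tok, role))
    return nodes


def run_pipeline(
    text: str,
    embed: Callable[[List[Node]], List[float]],          # the only learned stage
    generate: Callable[[List[Node], List[float]], str],  # deterministic
) -> str:
    ast = parse(tokenize(text))
    meaning = embed(ast)
    return generate(ast, meaning)


# Toy stand-ins so the sketch runs end to end; a real system would plug in
# trained embeddings and a grammar-aware generator here.
def toy_embed(nodes: List[Node]) -> List[float]:
    return [
        float(sum(n.role == "noun" for n in nodes)),
        float(sum(n.role == "verb" for n in nodes)),
    ]


def toy_generate(nodes: List[Node], meaning: List[float]) -> str:
    return " ".join(n.token for n in nodes)
```

Passing the learned stage in as an argument keeps every other step inspectable and testable without a model loaded, which is the separation the diagram describes.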

Why Esperanto?

Natural languages are messy. English alone has:

  • Irregular verbs (go/went, be/was/were)
  • Context-dependent parsing ("time flies like an arrow")
  • Exceptions to every rule

Esperanto was designed to be regular:

  • All verbs conjugate identically
  • Word class is marked by ending (-o noun, -a adjective, -e adverb, -i verb infinitive)
  • No irregular forms

This regularity means a hand-coded parser can, in principle, cover all of Esperanto's grammar, which is not feasible for natural languages. In practice, the current parser reaches a 91.8% parse rate on Esperanto text.
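
To make the regularity concrete, here is a small sketch of ending-based analysis: it strips the accusative `-n` and plural `-j`, then reads word class (and tense or mood for verbs) from the final ending. The names `Analysis`, `analyze`, and `conjugate` are hypothetical and not taken from Klareco's code, and `FUNCTION_WORDS` is a short illustrative list rather than a complete inventory of closed-class words.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Closed-class words carry no part-of-speech ending, so they are looked up
# first. Illustrative subset only (assumption, not an exhaustive list).
FUNCTION_WORDS = {"la", "kaj", "de", "en", "al", "ne",
                  "mi", "vi", "li", "ŝi", "ĝi", "ni", "ili"}

# (suffix, word class, verb feature). Two-letter verb endings come first.
ENDINGS = [
    ("as", "verb", "present"),
    ("is", "verb", "past"),
    ("os", "verb", "future"),
    ("us", "verb", "conditional"),
    ("o", "noun", None),
    ("a", "adjective", None),
    ("e", "adverb", None),
    ("i", "verb", "infinitive"),
    ("u", "verb", "imperative"),
]


@dataclass(frozen=True)
class Analysis:
    root: str
    word_class: str
    feature: Optional[str] = None  # tense or mood, verbs only
    plural: bool = False
    accusative: bool = False


def analyze(word: str) -> Analysis:
    w = word.lower()
    if w in FUNCTION_WORDS:
        return Analysis(w, "function")
    accusative = w.endswith("n")
    if accusative:
        w = w[:-1]
        if w in FUNCTION_WORDS:  # e.g. "min" -> "mi" + accusative
            return Analysis(w, "function", accusative=True)
    plural = w.endswith("j")
    if plural:
        w = w[:-1]
    for suffix, word_class, feature in ENDINGS:
        if w.endswith(suffix) and len(w) > len(suffix):
            return Analysis(w[: -len(suffix)], word_class, feature, plural, accusative)
    return Analysis(word.lower(), "unknown")


def conjugate(root: str) -> Dict[str, str]:
    # Every verb root takes the same endings, so one table covers all verbs.
    return {feature: root + suffix
            for suffix, word_class, feature in ENDINGS
            if word_class == "verb"}
```

For example, `analyze("hundojn")` returns the root `hund` as a plural accusative noun, and `conjugate("kur")` produces `kuras`, `kuris`, `kuros`, `kurus`, `kuri`, and `kuru` from one shared table. A full parser still needs complete closed-class lists and handling for compound words, which this sketch leaves out.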

Research Questions

  1. How much of language model capability comes from grammar vs. semantics?
  2. Can explicit structure reduce the parameters needed for competent language use?
  3. Does deterministic grammar improve explainability without sacrificing capability?

Klareco investigates these questions through a working implementation that separates learned semantic reasoning from programmatic linguistic structure.