Klareco - Skeptical Engineering

A conversational AI for Esperanto that prioritizes deterministic, explainable processing. It leverages Esperanto's regular grammar to replace neural components with programmatic logic, and uses learned parameters only for semantic reasoning.

🔗 GitHub Repository

Overview

Modern large language models are impressive but opaque. They learn everything—including grammar—from data, making their behavior difficult to predict or explain. Klareco asks: what if we separated the learnable from the deterministic?

Esperanto, with its highly regular grammar (16 rules, no exceptions), is a natural testbed. We can encode the grammar directly in code and reserve machine learning for the genuinely hard part: meaning.

Features

Esperanto's 16 grammar rules encoded directly in the parser, not learned
Deterministic parser achieves 91.8% parse rate on Esperanto text
Root embeddings cover 11,121 validated roots with 97.98% accuracy
Corpus index spans 4.38M sentences with compositional embeddings
~733K learned parameters handle semantics; grammar is fully programmatic
RAG query system for Esperanto question answering
Thesis: explicit grammar via ASTs lets a small core match larger models while remaining explainable

Architecture

Input Text
    ↓
[Deterministic Tokenizer]
    ↓
[Rule-Based Parser] ← Esperanto's 16 rules (91.8% parse rate)
    ↓
Abstract Syntax Tree
    ↓
[Compositional Embeddings] ← Learned (~733K params)
    ↓
Semantic Representation
    ↓
[Deterministic Generator]
    ↓
Output Text
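
The shape of this pipeline can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Klareco's actual code: the names `WordNode`, `tokenize`, `parse`, `embed`, and `generate`, the abbreviated `ENDINGS` tuple, and the two-dimensional `ROOT_VECTORS` table are all hypothetical. The only point is that every stage except the vector lookup is plain deterministic code.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class WordNode:
    """One leaf of a (very simplified) syntax tree: a root plus its grammatical ending."""
    root: str
    ending: str

# Hypothetical, abbreviated ending list, ordered longest first so "ojn" wins over "o".
ENDINGS = ("ojn", "ajn", "oj", "on", "aj", "an",
           "as", "is", "os", "us", "o", "a", "e", "i", "u")

# Hypothetical stand-in for the learned part: a tiny table of root vectors.
ROOT_VECTORS = {"hund": (1.0, 0.0), "kur": (0.0, 1.0)}

def tokenize(text):
    # Deterministic tokenizer: lowercase, keep runs of word characters.
    return re.findall(r"\w+", text.lower())

def parse(tokens):
    # Rule-based stage: split each token into root + ending.
    # Tokens whose root would be shorter than two letters (e.g. "la") stay whole.
    nodes = []
    for tok in tokens:
        for ending in ENDINGS:
            if tok.endswith(ending) and len(tok) - len(ending) >= 2:
                nodes.append(WordNode(tok[: -len(ending)], ending))
                break
        else:
            nodes.append(WordNode(tok, ""))
    return nodes

def embed(nodes):
    # The only "learned" stage in this sketch: average the vectors of known roots.
    vectors = [ROOT_VECTORS[n.root] for n in nodes if n.root in ROOT_VECTORS]
    if not vectors:
        return None
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))

def generate(nodes):
    # Deterministic generator: re-attach endings and join the words.
    return " ".join(n.root + n.ending for n in nodes)

nodes = parse(tokenize("La hundo kuras."))
print(nodes)
print(embed(nodes))
print(generate(nodes))
```

Because tokenizing, parsing, and generating are ordinary functions, each intermediate result can be printed and inspected, which is the sense of "explainable" this sketch is meant to convey. How Klareco actually composes its embeddings is not specified here; averaging is only a placeholder.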

Why Esperanto?

Natural languages are messy. English alone has:

Irregular verbs (go/went, be/was/were)
Context-dependent parsing ("Time flies like an arrow" has several valid parses)
Exceptions to every rule

Esperanto was designed to be regular:

All verbs conjugate identically
Word class is marked by ending (-o noun, -a adjective, -e adverb, -i verb infinitive)
No irregular forms
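
Because word class is carried by the ending, a classifier can be written as a handful of suffix checks. The sketch below is a minimal illustration, not the project's parser: `classify`, `Analysis`, and the deliberately incomplete `FUNCTION_WORDS` set are hypothetical names, and closed-class words that are missing from that set (other pronouns, numerals such as "tri", correlatives, and so on) would be misclassified.

```python
from typing import NamedTuple, Optional

# Hypothetical and deliberately incomplete closed-class list; a real parser
# would need the full set of articles, prepositions, pronouns, numerals, etc.
FUNCTION_WORDS = {"la", "kaj", "de", "en", "al", "ne", "mi", "vi", "li", "ni", "ili"}

NOMINAL_ENDINGS = {"o": "noun", "a": "adjective", "e": "adverb"}
VERB_ENDINGS = {
    "as": "present", "is": "past", "os": "future",
    "us": "conditional", "u": "imperative", "i": "infinitive",
}

class Analysis(NamedTuple):
    word_class: str
    root: str
    plural: bool = False
    accusative: bool = False
    tense: Optional[str] = None

def classify(word: str) -> Analysis:
    w = word.lower()
    if w in FUNCTION_WORDS:
        return Analysis("function", w)
    # Verb endings never combine with -j or -n, so test them on the whole word.
    for ending, tense in VERB_ENDINGS.items():
        if w.endswith(ending) and len(w) > len(ending):
            return Analysis("verb", w[: -len(ending)], tense=tense)
    # Nouns and adjectives may carry accusative -n and plural -j, in that outer order.
    stem = w
    accusative = stem.endswith("n")
    if accusative:
        stem = stem[:-1]
    plural = stem.endswith("j")
    if plural:
        stem = stem[:-1]
    if len(stem) > 1 and stem[-1] in NOMINAL_ENDINGS:
        return Analysis(NOMINAL_ENDINGS[stem[-1]], stem[:-1], plural, accusative)
    return Analysis("unknown", w)

for example in ["hundojn", "bela", "rapide", "kuris", "paroli", "la"]:
    print(example, classify(example))
```

For instance, `classify("hundojn")` strips the accusative -n and plural -j before reading the -o ending, yielding a plural accusative noun with root `hund`. No lookup table of irregular forms is needed, which is the property the list above describes.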

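This regularity means a hand-coded parser can, in principle, cover the whole of Esperanto's grammar, something that is not feasible for natural languages. In practice, the current parser reaches a 91.8% parse rate on Esperanto text.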

Research Questions

How much of language model capability comes from grammar vs. semantics?
Can explicit structure reduce the parameters needed for competent language use?
Does deterministic grammar improve explainability without sacrificing capability?

Klareco investigates these questions through a working implementation that separates learned semantic reasoning from programmatic linguistic structure.