Introducing Kanon: the world’s best legal AI classifier

tl;dr

We’re announcing Kanon, a new family of legal AI models that beat their closest competitors at legal document classification. Kanon is now available in early access, and we’re looking for design partners to join us in shaping the future of legal AI. Kanon’s public release is slated for early this year. You can get in touch with us here to learn more.

Where Kanon fits in

In the two years following the release of ChatGPT, the legal industry has witnessed an explosion in new vertical AI models, products and startups — most of which are being built atop general-purpose chatbots, often ripped straight from OpenAI and Anthropic.

That isn’t such a bad thing. Generative AI solutions like Lexis+ and Harvey Assistant, both wrappers around OpenAI models, are helping lawyers cut down time spent on mundane research and drafting tasks, allowing them to focus on more complex challenges deserving of their attention.

With that said, chatbots cannot and should not be the answer to every automatable problem.

Many of the most time-consuming tasks lawyers perform — tasks like contract review and discovery — are simply not what chatbots are made for.

On top of that, generative AI models tend to require enormous amounts of computing power to truly excel at legal tasks. So much power that you often need an entire data center’s worth of GPUs to run them. Those are resources that even the largest law firms are unlikely to have on hand.

And, even if you do have the infrastructure necessary to operate compute-hungry chatbots, you may find that their license terms prevent you from running them on your own hardware. This can be a massive barrier for law firms and government entities that handle highly confidential, legally privileged and/or classified information that cannot be allowed to leave their control.

This is where Isaacus and our flagship Kanon model family come in.

We’re a team of seasoned AI and legal experts who, after getting tired of seeing the same old generic chatbot-based legal AI solutions being branded with new names, decided to go at the problem a different way.

We’re focused on building specialized legal AI models and solutions that are:

  • Effective — We are not interested in releasing products that are not best-in-class.
  • Efficient — No matter how cool a demo it might make, we want models that anyone can run on their own hardware, not just Microsoft.
  • Scalable — We will meet you where you are, whether that’s in the cloud, on premises or even inside air-gapped devices.

Our philosophy is simple: do one thing and do it exceptionally well, or don’t do anything at all. We are not and will never be in the business of manufacturing sledgehammers for every size of nail.

Today, we’re proud to introduce Kanon, the first family of legal AI models to fully embrace the Isaacus philosophy.

What is Kanon?

Kanon is a family of small yet highly accurate legal AI models for classifying, extracting information from and assessing the similarity of legal documents, whether they be contracts, cases, legislation, textbooks or anything else.

As of today, the Kanon family includes two models:

  • Kanon: A high-performance, 317-million-parameter model with a context window of 512 tokens, occupying 1.2 gigabytes.
  • Kanon Mini: A lightweight Kanon variant with just 136 million parameters, taking up a mere 441 megabytes, small enough to run on an iPhone.

We’ve also trained two versions of these models for the task of legal document classification:

  • Kanon Universal Classifier and Kanon Universal Classifier Mini: These are the world’s most accurate and efficient universal legal classifiers of their size. They can take a statement like “this clause entitles one to terminate an agreement in the event of circumstances beyond their reasonable control” and evaluate it against thousands of documents in mere seconds, producing startlingly accurate confidence scores — no finetuning necessary.
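
To give a sense of the interface such a classifier exposes, here is a toy sketch. The `classify` function and its signature are hypothetical, and a crude token-overlap (Jaccard) scorer stands in for the model itself, whose internals are not public:

```python
def classify(statement: str, documents: list[str]) -> list[float]:
    """Score each document for how well it matches a plain-English statement.

    Illustrative stand-in only: a real universal classifier would produce
    calibrated confidence scores from a trained model, not token overlap.
    """
    query = set(statement.lower().split())
    scores = []
    for doc in documents:
        tokens = set(doc.lower().split())
        union = query | tokens
        # Jaccard similarity between the statement's and document's tokens.
        scores.append(len(query & tokens) / len(union) if union else 0.0)
    return scores
```

The key point is the shape of the task: one free-form statement in, one confidence score per document out, with no per-label finetuning step in between.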

Despite their compactness, Kanon and Kanon Mini punch far above their weight, achieving 6% and 12% better performance, respectively, than their closest general-purpose counterparts.

Benchmarks

Evaluating a model that is meant to be able to classify almost anything is no easy task. There are no off-the-shelf benchmarking datasets for universal legal classifiers. We solved this by creating our own.

Over the course of six weeks, we had a legal professional construct a massive, diverse and genuinely challenging universal legal classification dataset. We built this dataset adversarially, deliberately selecting examples that would push our own models to their limits.

The dataset came to include 4,802 examples across 23 classifications, covering contractual clauses and legal concepts ranging from indemnities all the way to complex assessments of whether an agreement unilaterally obligates or benefits one side over the other.

For each classification, we crafted a plain English prompt describing what we were looking for. To account for potential bias toward the way we expressed our prompts, we deliberately tested against multiple variations of the same prompt, for example, by changing “This is a contractual provision that…” to “This is a clause that…”. We further augmented all prompts by occasionally removing punctuation and altering capitalization.
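
The augmentation described above can be sketched roughly as follows. The function name, probabilities and exact transformations are illustrative guesses, not the actual pipeline:

```python
import random
import string


def augment_prompt(prompt: str, seed=None) -> str:
    """Produce a variant of a prompt by occasionally stripping punctuation
    and altering capitalization. Hypothetical helper for illustration."""
    rng = random.Random(seed)
    variant = prompt
    # Occasionally remove all punctuation.
    if rng.random() < 0.5:
        variant = variant.translate(str.maketrans("", "", string.punctuation))
    # Occasionally lowercase or uppercase the whole prompt.
    choice = rng.random()
    if choice < 0.25:
        variant = variant.lower()
    elif choice < 0.5:
        variant = variant.upper()
    return variant
```

Run over many seeds, this yields a mix of untouched, punctuation-stripped and re-cased variants of the same underlying prompt.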

We benchmarked Kanon Universal Classifier and Kanon Universal Classifier Mini against the best universal classifiers available today, specifically, Moritz Laurer’s RoBERTa, DeBERTa and ModernBERT zero-shot classifiers.

To measure the models’ efficiency, we fed each model the full text of the U.S. Constitution, Apple’s terms of use and Brown v. Board of Education. We used our semchunk semantic chunking library to chunk the documents and then we ran them through the models 100 times each on an RTX 4090, averaging the results document-wise. To ensure comparisons between models of different architectures were as fair as possible, we used all available optimizations for each architecture, including 16-bit brain floating point automatic mixed precision and BetterTransformer.
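
A minimal version of that timing methodology might look like this. Here `naive_chunk` is a dependency-free stand-in for the semchunk library, `model` is any callable, and GPU-specific optimizations like mixed precision are out of scope:

```python
import statistics
import time


def naive_chunk(text: str, max_words: int = 128) -> list[str]:
    """Split text into fixed word windows; a stand-in for semantic chunking."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def time_model(model, documents: dict[str, str], runs: int = 100) -> dict[str, float]:
    """Run `model` over each chunked document `runs` times and report the
    mean wall-clock time per document, mirroring the methodology above."""
    results = {}
    for name, text in documents.items():
        chunks = naive_chunk(text)
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            for chunk in chunks:
                model(chunk)
            timings.append(time.perf_counter() - start)
        results[name] = statistics.mean(timings)
    return results
```

For quick local experiments, a smaller `runs` value gives a rough but serviceable estimate.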

We found that:

  • Kanon outperformed all other models, including Laurer’s DeBERTa v3 large zero-shot classifier, the world’s best general-purpose universal classification model, despite being 17% smaller. Kanon was also, alongside RoBERTa large, the fastest model of its size.
  • Kanon Mini outperformed all other models of its size. It was also the fastest model of its size, alongside RoBERTa base.

Performance was measured using the macro-Matthews correlation coefficient, widely regarded as the gold standard for evaluating the balanced predictive power of classifiers.
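
One common way to compute a macro-Matthews correlation coefficient is to take a one-vs-rest MCC for each class and average them with equal weight. The sketch below assumes that formulation, as the post does not spell out its exact implementation:

```python
import math


def mcc(y_true: list[int], y_pred: list[int]) -> float:
    """Binary Matthews correlation coefficient from the confusion matrix."""
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def macro_mcc(y_true: list, y_pred: list) -> float:
    """Average one-vs-rest MCC across classes, weighting each class equally."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = [
        mcc([int(t == c) for t in y_true], [int(p == c) for p in y_pred])
        for c in classes
    ]
    return sum(scores) / len(scores)
```

Because every class contributes equally regardless of frequency, a classifier cannot inflate its score by doing well only on the most common labels.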

The full results of our benchmarks can be found in the table below:

How we did it

While we can’t share every detail of how we built Kanon, we can briefly outline some of the key factors that we think contributed the most to its outstanding performance.

High-quality training data

We spent months building the Blackstone Corpus, one of the world’s largest private repositories of contracts, decisions, legislation and other legal and government documents. In constructing our Corpus, we ensured it was:

  • Global — encompassing a wide range of jurisdictions, including the U.S., U.K., Canada, Australia, New Zealand, Ireland, the entire European Union, the United Nations and the International Court of Justice, to name a few.
  • Clean — we coupled handcrafted, dataset-specific regex-based cleaning routines with advanced low-quality and duplicate filtration techniques that harnessed, amongst other things, Kullback-Leibler divergences and simhashing.
  • Licensed — we meticulously cataloged data sources and their licenses to ensure full compliance with rights holders’ copyright.

We then used the Corpus to pretrain and finetune (through the synthetic generation of statements) our models. For finetuning, we further augmented our Corpus with the Pile of Legal Classifications, a private, fully licensed collection of legal classification datasets, all with handcrafted prompts, in addition to a smaller sample of commercially licensed general-purpose classification sets.

Optimized legal tokenizer

We trained our own tokenizer, the Kanon Tokenizer, on the Blackstone Corpus, making it the world’s most space-efficient legal document tokenizer of its size. Because its vocabulary contains only 65,536 tokens, tokenized documents can be stored as unsigned 16-bit integers, dramatically reducing memory requirements compared with larger vocabularies.
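
The arithmetic behind that claim is easy to check with Python’s standard library: any token ID below 65,536 fits in an unsigned 16-bit integer, typically half the width of the 32-bit storage a larger vocabulary would force. The IDs below are made up for illustration:

```python
from array import array

# Illustrative token IDs, not real Kanon output; all fit below 2**16.
token_ids = [101, 7592, 65535, 42, 102]

as_uint16 = array("H", token_ids)  # 'H' = unsigned 16-bit integers
as_uint32 = array("I", token_ids)  # 'I' = unsigned (typically 32-bit) integers

# 2 bytes per token versus (typically) 4.
print(as_uint16.itemsize, as_uint32.itemsize)
```

Over a corpus of billions of tokens, that two-byte difference per token adds up to gigabytes of savings.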

Using this tokenizer means that, although our models only have a context window of 512 tokens, they can fit far more meaning into those 512 tokens than other models can, thereby increasing their effective context window at no efficiency loss.

Greedy checkpointing, merging and self-distillation

We had a legal professional create three separate, highly diverse validation datasets. The first was used to greedily validate almost every training step and sample the best-performing checkpoints. We then used the second and third datasets to greedily validate potential merges of those checkpoints, repeating the process via self-distillation until we arrived at the best checkpoints possible.
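
In outline, that loop might resemble the following. The post does not disclose the actual merge operator; element-wise weight averaging (as in "model soups") is assumed here, and all names are hypothetical:

```python
def merge(a: dict[str, float], b: dict[str, float]) -> dict[str, float]:
    """Merge two checkpoints by element-wise weight averaging (assumed)."""
    return {k: (a[k] + b[k]) / 2 for k in a}


def greedy_merge(checkpoints: list[dict[str, float]], validate) -> dict[str, float]:
    """Start from the best-validating checkpoint, then accept a merge with
    each candidate only when it improves the validation score."""
    best = max(checkpoints, key=validate)
    for ckpt in checkpoints:
        candidate = merge(best, ckpt)
        if validate(candidate) > validate(best):
            best = candidate
    return best
```

Because each merge must beat the current best on held-out data before it is kept, the procedure can only ever move the validation score upward.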

What’s next

As of today, Kanon and Kanon Mini are available for early access to select design partners. You can reach out here to explore potentially partnering with us on our journey to reshape the future of legal AI.

In the coming months, we will:

  • Release Kanon Universal Classifier and Kanon Universal Classifier Mini via a public API as well as through on-premises deployments.
  • Open source the Kanon Tokenizer under a commercially permissive license to encourage legal AI practitioners to start training models with a tokenizer that is fit for purpose.
  • Open source a smaller yet even more diverse universal classification benchmarking dataset to assist with reproducible evaluations of universal legal classifiers.

Later in the year, we’ll also be expanding our offering to include:

  • Text extraction and text embedding models.
  • Universal classification, extraction and search applications.
  • Kanon 2 and Kanon 2 Ultra (more to come soon).

You can stay tuned by following us on LinkedIn. We think 2025 is going to be a transformative year for legal AI 😉.