Isaacus just shipped a new state-of-the-art model, this time focused on reranking for legal RAG.
Although Kanon 2 Embedder already represents the frontier of legal-domain retrieval, we knew that not everyone is ready to re-embed their entire corpus. We also knew there was still accuracy left on the table for teams handling highly sensitive legal work.
Enter Kanon 2 Reranker: the world's best legal reranking model.
We tested it across both production RAG pipelines and standalone retrieval tasks, and the results were remarkable.
Not only does it outperform the competition in a category where there are still very few serious alternatives, it also delivers major retrieval accuracy gains over our standalone embedder. Those improvements translated into exceptional downstream performance.
In our final test, we compared Voyage AI by MongoDB 2.5 Rerank with Kanon 2 Reranker on Legal RAG Bench, using identical embedding models, generative models, and pipeline hyperparameters. The only difference was the reranker.
The result: Kanon 2 Reranker decisively outperformed Voyage 2.5 Rerank.
On holdout questions, the head-to-head margin was one of the most extreme we have seen: for every 1 question Voyage got right and we got wrong, there were 6 questions we got right and Voyage got wrong.
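If you want to run the same kind of head-to-head on your own data, here is a minimal sketch of how such a margin can be counted. It assumes you already have per-question correctness for two pipelines that differ only in the reranker; the function name and the toy outcomes below are hypothetical illustrations, not our benchmark code or results.

```python
from typing import Sequence, Tuple


def discordant_counts(a_correct: Sequence[bool], b_correct: Sequence[bool]) -> Tuple[int, int]:
    """Count questions that exactly one of two systems answered correctly.

    Returns (a_only, b_only). Questions both systems got right, or both
    got wrong, say nothing about which system is better and are ignored.
    """
    if len(a_correct) != len(b_correct):
        raise ValueError("both systems must be scored on the same questions")
    a_only = sum(1 for a, b in zip(a_correct, b_correct) if a and not b)
    b_only = sum(1 for a, b in zip(a_correct, b_correct) if b and not a)
    return a_only, b_only


# Made-up outcomes for eight questions, purely to show the arithmetic.
system_a = [True, True, True, True, True, True, False, True]
system_b = [False, False, False, False, False, False, True, True]

a_only, b_only = discordant_counts(system_a, system_b)
print(a_only, b_only)  # 6 1
if b_only:
    print(f"{a_only / b_only:.1f} wins for A per win for B")
```

Counting only the questions where the two systems disagree keeps the comparison focused on the one component that changed.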
We share an example in the blog post where Voyage Rerank actually underperforms Kanon 2 Embedder on its own, delivering the wrong context to the LLM. In that case, not using a reranker at all would have led to the correct answer.
All in all, I'm immensely proud of the performance gains we've achieved. But as we always say, the best benchmark is your own data.
This awesome visualization by @abdurrahmanbutler tracks how reliant the High Court of Australia has been on UK precedents over time.
Back in the early 1900s, up to 70% of citations in High Court decisions were from the UK. Today, that number sits around 20%.
This change seems to have happened gradually as Australia gained more and more independence from the UK, culminating in the Australia Acts of 1986, where we see a nice bump in the proportion of Australian cases cited.
These insights would not be possible without our latest legal AI model, Kanon 2 Enricher, which we used to extract dates and citations from High Court decisions in isaacus/open-australian-legal-corpus and categorize citations by jurisdiction. You can learn about Kanon 2 Enricher here: https://isaacus.com/blog/kanon-2-enricher.
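For anyone curious how a chart like this can be built once dates and jurisdiction-labelled citations have been extracted, here is a rough sketch of the aggregation step. It assumes the extraction output has already been flattened into `(decision_year, jurisdiction)` pairs; that input shape, the function name, and the toy data are assumptions for illustration and not the Kanon 2 Enricher output format.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def jurisdiction_share_by_decade(
    citations: Iterable[Tuple[int, str]],
) -> Dict[int, Dict[str, float]]:
    """Return, for each decade, the fraction of citations per jurisdiction."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for year, jurisdiction in citations:
        decade = (year // 10) * 10  # e.g. 1905 -> 1900
        counts[decade][jurisdiction] += 1

    shares: Dict[int, Dict[str, float]] = {}
    for decade in sorted(counts):
        total = sum(counts[decade].values())
        shares[decade] = {j: n / total for j, n in counts[decade].items()}
    return shares


# Toy input, invented for the example.
toy_citations = [
    (1905, "UK"), (1905, "UK"), (1908, "AU"),
    (1995, "AU"), (1995, "AU"), (1995, "AU"), (1998, "UK"), (1999, "AU"),
]

for decade, share in jurisdiction_share_by_decade(toy_citations).items():
    print(decade, {j: round(s, 2) for j, s in share.items()})
```

Plotting the UK share per decade from a table like this gives the kind of trend line described above.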
Today we're publicly releasing Kanon 2 Enricher, and with it, an entirely new class of AI model that we're calling a hierarchical graphitization model. This is fundamentally different from both universal extraction models and generative models.
As a hierarchical graphitization model, Kanon 2 Enricher natively outputs a knowledge graph rather than tokens, which makes it architecturally incapable of hallucinating or inventing text that wasn't present in the input.
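To make the intuition concrete, here is a small sketch of why an output made of character offsets cannot contain invented text: every node can only point back into the source. The `Span` and `Edge` classes and their fields are hypothetical names chosen for this example and are not the actual Kanon 2 Enricher schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A half-open range of character offsets into the source text."""
    start: int  # inclusive
    end: int    # exclusive

    def resolve(self, text: str) -> str:
        # A span carries no text of its own; it can only slice the input.
        if not (0 <= self.start <= self.end <= len(text)):
            raise ValueError("span falls outside the source text")
        return text[self.start:self.end]


@dataclass(frozen=True)
class Edge:
    """A labelled link between two spans, e.g. a cross-reference."""
    source: Span
    target: Span
    relation: str


document = "Clause 2 is subject to Schedule 1."

# Locate the offsets with str.index so the example stays self-checking.
clause_start = document.index("Clause 2")
schedule_start = document.index("Schedule 1")
clause = Span(clause_start, clause_start + len("Clause 2"))
schedule = Span(schedule_start, schedule_start + len("Schedule 1"))

link = Edge(source=clause, target=schedule, relation="subject_to")
print(link.source.resolve(document), "->", link.target.resolve(document))
```

Because a span stores offsets rather than strings, anything it resolves to is by construction a substring of the input.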
What that enables in practice is unlike any other model or ML architecture on the market:
• No hallucinations: It cannot hallucinate. All references and links are stored as spans, meaning exact character offsets anchored to the original text.
• Hierarchical segmentation, not just extraction: It deconstructs a document's full nested hierarchy, down to chapters, sections, clauses, schedules, signatures, and even singular sentences, and classifies each span with dozens of contextual features.
• Entity extraction, disambiguation, and linking: It resolves what references actually point to, then links entities, citations, and cross-references into a single coherent graph.
• Graph-first efficiency: Small enough to run locally on a consumer PC with sub-second latency, and it stays reliable on long documents where front