ContractLens: AI Contract Review & Clause Comparison Agent
- 58% Review Time Reduced: first-pass review time
- 94% Clause Deviation Recall: vs. lawyer-reviewed ground truth
- <9% False Positive Rate: flagged clauses that were actually standard
- 1,200+ Contracts Processed: in first 6 months
Illustrative Project
ContractLens is an illustrative example demonstrating an AI document review pattern, not a completed client engagement.
Overview
ContractLens reads incoming contracts and compares each clause against a standard playbook of expected terms, flagging deviations — unusual liability language, missing standard clauses, non-standard payment terms — so a lawyer reviews a short list of genuine deviations instead of reading every contract from scratch.
The Challenge
Contract language varies widely even when the underlying business terms are standard: the same liability cap can be phrased a dozen different ways across contracts. The system needed to recognize when a clause was semantically equivalent to a standard clause despite different wording, while still catching genuinely different terms that happened to be phrased much like a standard clause.
Architecture & Technical Decisions
Clause Segmentation and Embedding
Contracts are segmented into individual clauses, and each clause is embedded into the same vector space as the playbook's standard clauses. The system can then retrieve the most semantically similar standard clause for comparison, rather than relying on rigid pattern matching that breaks across phrasing variation.
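A minimal sketch of the segment-then-retrieve step, under stated assumptions. The numbered-heading split rule, the function names, and the playbook dictionary shape are all hypothetical. The `embed` function is a toy bag-of-words stand-in used only so the example runs without a model; a real system would call an embedding model there.

```python
import math
import re
from collections import Counter


def segment_clauses(contract_text: str) -> list[str]:
    # Assumption: each clause starts with a numbered heading such as
    # "1." or "2.1" at the beginning of a line.
    parts = re.split(r"^\s*\d+(?:\.\d+)*\.?\s+", contract_text, flags=re.MULTILINE)
    return [part.strip() for part in parts if part.strip()]


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_standard_clause(clause: str, playbook: dict[str, str]) -> tuple[str, float]:
    # Return the id and similarity score of the closest playbook clause.
    clause_vec = embed(clause)
    scored = ((cid, cosine(clause_vec, embed(text))) for cid, text in playbook.items())
    return max(scored, key=lambda pair: pair[1])
```

The retrieved clause is only a candidate for comparison. A high similarity score here says the wording overlaps, not that the terms are equivalent.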
LLM-Based Semantic Comparison
Once the most similar standard clause is identified via embedding search, an LLM call compares the two directly and determines whether the contract clause is materially equivalent, a minor acceptable variation, or a meaningful deviation — with reasoning attached explaining the specific difference.
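The comparison step might look roughly like the sketch below. The prompt wording, the three verdict labels, and the `llm` callable (any function that takes a prompt string and returns the model's text reply) are assumptions for illustration, not the project's actual prompt or client. Treating an unparseable reply as a deviation is also an assumed design choice, made so that failures surface to a human rather than passing silently.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Assumed labels for the three outcomes described in the text.
VERDICTS = ("equivalent", "minor_variation", "meaningful_deviation")

PROMPT_TEMPLATE = (
    "Compare the contract clause to the standard clause.\n"
    "Standard clause:\n{standard}\n\n"
    "Contract clause:\n{contract}\n\n"
    'Reply with a JSON object with keys "verdict" and "reasoning". '
    "The verdict must be one of: " + ", ".join(VERDICTS) + "."
)


@dataclass
class Comparison:
    verdict: str
    reasoning: str


def compare_clauses(contract_clause: str, standard_clause: str,
                    llm: Callable[[str], str]) -> Comparison:
    prompt = PROMPT_TEMPLATE.format(standard=standard_clause, contract=contract_clause)
    raw = llm(prompt)
    try:
        data = json.loads(raw)
        verdict = data["verdict"]
        reasoning = str(data["reasoning"])
    except (json.JSONDecodeError, KeyError, TypeError):
        # Fail safe: a malformed reply is routed to a human reviewer.
        return Comparison("meaningful_deviation",
                          "Unparseable model reply; routed to human review.")
    if verdict not in VERDICTS:
        return Comparison("meaningful_deviation",
                          f"Unknown verdict {verdict!r}; routed to human review.")
    return Comparison(verdict, reasoning)
```

Passing the model in as a plain callable keeps the sketch independent of any particular provider and makes the step easy to test with a stub.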
Playbook as a Living, Versioned Document
The standard playbook itself is stored as a structured, versioned set of clauses rather than a static document, so legal can update standard language over time and the system picks up the change immediately without a re-deployment.
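One way to model a versioned playbook is an append-only history per clause, read at comparison time so that a newly published version takes effect on the next lookup. This is a sketch under assumptions: the class and method names are hypothetical, and storage is in memory here where a real system would use a database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ClauseVersion:
    version: int
    text: str
    updated_at: datetime


@dataclass
class Playbook:
    # Maps a clause id to its full version history, oldest first.
    _history: dict[str, list[ClauseVersion]] = field(default_factory=dict)

    def publish(self, clause_id: str, text: str) -> ClauseVersion:
        # Append a new version rather than overwriting the old one.
        versions = self._history.setdefault(clause_id, [])
        entry = ClauseVersion(len(versions) + 1, text, datetime.now(timezone.utc))
        versions.append(entry)
        return entry

    def current(self, clause_id: str) -> ClauseVersion:
        # Latest published version; read at comparison time.
        return self._history[clause_id][-1]

    def at_version(self, clause_id: str, version: int) -> ClauseVersion:
        # Versions are numbered from 1, so older reviews can be reproduced.
        return self._history[clause_id][version - 1]
```

Keeping old versions makes it possible to record which playbook version a past review was judged against. Any cached embedding of a clause would also need refreshing when a new version is published.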
- Clause-level embedding search against a versioned standard playbook
- LLM comparison step for semantic equivalence, not just similarity scoring
- Explicit reasoning attached to every flagged deviation, not just a binary flag
- Severity tiering — some deviations are informational, others require legal sign-off before the contract proceeds
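The severity tiering in the last bullet could be sketched as a small policy function plus a gate on whether the contract may proceed. The tier names, the verdict strings, and the low-risk clause set below are hypothetical; the actual policy would be set by legal.

```python
from enum import Enum


class Severity(Enum):
    NONE = "none"
    INFORMATIONAL = "informational"
    REQUIRES_SIGN_OFF = "requires_sign_off"


# Hypothetical policy: clause types where even a meaningful deviation
# is only informational.
LOW_RISK_CLAUSES = {"notices", "counterparts"}


def severity_for(clause_id: str, verdict: str) -> Severity:
    if verdict == "equivalent":
        return Severity.NONE
    if verdict == "minor_variation" or clause_id in LOW_RISK_CLAUSES:
        return Severity.INFORMATIONAL
    # Default to the strictest tier for anything else.
    return Severity.REQUIRES_SIGN_OFF


def may_proceed(findings: list[tuple[str, str]], signed_off: set[str]) -> bool:
    # True when every sign-off-tier finding has been approved by legal.
    return all(
        clause_id in signed_off
        for clause_id, verdict in findings
        if severity_for(clause_id, verdict) is Severity.REQUIRES_SIGN_OFF
    )
```

Defaulting unknown cases to the strictest tier is a deliberate choice in this sketch: a gap in the policy then costs a lawyer's time rather than letting a risky clause through.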
Results
- 58% reduction in first-pass contract review time
- 94% recall on clause deviations when validated against lawyer-reviewed ground truth — the system rarely missed a genuine deviation
- False positive rate under 9%, meaning legal's review queue stayed focused on real issues rather than noise
- Over 1,200 contracts processed in the first six months of production use
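For readers who want to reproduce this kind of validation, the sketch below computes recall and two possible readings of "false positive rate" from per-clause labels. The text does not spell out which reading was used, so both are shown as assumptions: the classic rate (false positives over all truly standard clauses) and the share of flags that turned out to be standard. The function name and input shape are hypothetical.

```python
def review_metrics(predicted: list[bool], actual: list[bool]) -> dict[str, float]:
    # predicted[i]: the system flagged clause i as a deviation.
    # actual[i]: lawyers marked clause i as a genuine deviation.
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return {
        # Share of genuine deviations the system caught.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Reading 1: share of standard clauses that were wrongly flagged.
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        # Reading 2: share of flags that were actually standard clauses.
        "share_of_flags_standard": fp / (tp + fp) if tp + fp else 0.0,
    }
```

The two readings can differ a lot when most clauses are standard, so it is worth stating which one a reported figure refers to.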
What I Learned
Combining embedding-based retrieval with an LLM comparison step outperformed either technique alone. Embeddings alone couldn't reliably distinguish 'similar wording, different meaning' from genuine equivalence, and an LLM comparing every clause against every playbook entry without retrieval first would have been prohibitively slow and expensive. The two-stage approach combined the speed of retrieval with the judgment quality of an LLM comparison, applied only where it was needed.
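The cost argument comes down to simple arithmetic on the number of LLM calls per contract. The sketch below makes it concrete; the function names and the `top_k` parameter (how many retrieved candidates are sent to the LLM per clause) are assumptions, and the figures in the usage note are illustrative, not project data.

```python
def brute_force_calls(clauses: int, playbook_entries: int) -> int:
    # Every contract clause is compared against every playbook entry.
    return clauses * playbook_entries


def two_stage_calls(clauses: int, top_k: int = 1) -> int:
    # Retrieval narrows each clause to top_k candidates before the LLM step.
    return clauses * top_k
```

For a hypothetical contract with 40 clauses and a playbook of 60 entries, `brute_force_calls(40, 60)` gives 2,400 LLM calls while `two_stage_calls(40)` gives 40, and the gap widens as the playbook grows.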