Two years on from its launch, Relativity's aiR for Review has shifted from a novelty to a tool that review teams are using on real matters. It is worth taking stock of where it works, where the cost-benefit makes sense, and where it still needs careful handling.
What aiR for Review actually does
aiR for Review uses a large language model, integrated into Relativity, to make relevance and issue-coding predictions against a defined set of review criteria. Unlike classic Active Learning, it does not require thousands of human-coded documents to begin producing useful output: a well-written set of instructions and a small validation sample are often enough to get a meaningful first pass. Each prediction comes with a rationale the reviewer can read.
Where it earns its place
Three use cases have become reasonably well established:
- Early case assessment. Running aiR against the corpus very early, on a draft of the relevance criteria, surfaces likely hot documents and gives the legal team something concrete to refine scope and strategy against.
- Issue tagging at scale. Multi-issue matters, where each document might touch several themes, are where the rationale-per-issue output is most valuable. Reviewers spend their time confirming or rejecting structured suggestions rather than starting from a blank tagging panel.
- QC and prioritisation. Even where humans do first-pass review, aiR is being used to flag documents whose human coding looks inconsistent with the surrounding pattern.
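As a rough illustration of the QC use case, the sketch below builds a second-look queue from documents where the human call and the predicted call disagree. It assumes the two sets of coding have been exported side by side; the `CodedDoc` record, its field names and the `qc_queue` function are hypothetical and do not reflect any Relativity export format or API. Putting possible misses first is one reasonable ordering, not the only one.

```python
from typing import NamedTuple


class CodedDoc(NamedTuple):
    """Hypothetical record pairing a human call with a predicted call."""
    doc_id: str
    human_relevant: bool
    predicted_relevant: bool


def qc_queue(docs: list[CodedDoc]) -> list[str]:
    """Return IDs of documents where human and predicted coding disagree.

    Possible misses (human: not relevant, prediction: relevant) are listed
    first, since those are the ones a producing party usually worries about.
    """
    disagreements = [d for d in docs if d.human_relevant != d.predicted_relevant]
    # False sorts before True, so documents the human coded "not relevant"
    # come first; the sort is stable, so original order is kept within groups.
    disagreements.sort(key=lambda d: d.human_relevant)
    return [d.doc_id for d in disagreements]


docs = [
    CodedDoc("DOC-001", True, True),
    CodedDoc("DOC-002", True, False),
    CodedDoc("DOC-003", False, True),
    CodedDoc("DOC-004", False, False),
]
print(qc_queue(docs))  # ['DOC-003', 'DOC-002']
```

A disagreement is only a prompt for a second look: either the human or the prediction may be the one that is wrong.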
Where it still needs careful handling
It is not magic. A few practical points come up repeatedly:
- Instructions are the work. The quality of aiR's output is largely a function of how clearly the relevance and issue criteria are written. Vague criteria produce vague predictions, just as they do for human reviewers.
- Validation is non-negotiable. Defensibility still depends on measuring recall and precision against a human-coded sample, and being able to explain the methodology if challenged.
- Privilege remains sensitive. Most teams continue to apply traditional searches and human review for privilege, with aiR as an additional safety net rather than a replacement.
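The validation point can be made concrete with a minimal sketch of how recall and precision are computed against a human-coded sample. The `ValidationResult` class, the `validate` function and the sample data are illustrative assumptions, not part of any product; a real protocol would also need to address how the sample is drawn, how large it is, and the confidence interval around the estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationResult:
    """Confusion-matrix counts for one validation sample (illustrative)."""
    true_positives: int
    false_positives: int
    false_negatives: int
    true_negatives: int

    @property
    def precision(self) -> float:
        # Of the documents predicted relevant, what share did humans agree with?
        flagged = self.true_positives + self.false_positives
        return self.true_positives / flagged if flagged else 0.0

    @property
    def recall(self) -> float:
        # Of the documents humans coded relevant, what share was predicted relevant?
        relevant = self.true_positives + self.false_negatives
        return self.true_positives / relevant if relevant else 0.0


def validate(human: list[bool], predicted: list[bool]) -> ValidationResult:
    """Compare predicted relevance calls with human coding on the same sample."""
    if len(human) != len(predicted):
        raise ValueError("human and predicted coding must cover the same documents")
    tp = fp = fn = tn = 0
    for h, p in zip(human, predicted):
        if h and p:
            tp += 1
        elif p:
            fp += 1
        elif h:
            fn += 1
        else:
            tn += 1
    return ValidationResult(tp, fp, fn, tn)


# Made-up eight-document sample, in the same document order for both lists.
human_calls = [True, True, True, False, False, True, False, False]
predicted_calls = [True, True, False, True, False, True, False, False]

result = validate(human_calls, predicted_calls)
print(result.precision, result.recall)  # 0.75 0.75
```

Eight documents is far too few for a real matter; the point is only that both figures come from the same side-by-side comparison, which is what makes the methodology explainable if challenged.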
Wider context
aiR is not the only player. Reveal, DISCO and others have shipped comparable generative-AI features, and EDRM has published practical guidance on validation in its model updates. The UK courts have not, to date, treated AI-assisted review as inherently more suspect than other technology-assisted review approaches, provided the methodology can be explained.
Bottom line
Generative-AI review tools earn their place where they let smaller review teams handle larger, more complex matters without sacrificing defensibility. They do not remove the need for a properly scoped collection, clear relevance criteria or human judgement on privilege and key documents. Matters where any of those are weak are not the ones to try these tools on first.
