Relativity aiR and generative AI in document review: where it actually helps

Two years on from its launch, Relativity's aiR for Review has shifted from a novelty to a tool that review teams are using on real matters. It is worth taking stock of where it works, where the cost-benefit makes sense, and where it still needs careful handling.

What aiR for Review actually does

aiR for Review uses a large language model, integrated into Relativity, to make relevance and issue-coding predictions against a defined set of review criteria. Unlike classic Active Learning, it does not require thousands of human-coded documents to begin producing useful output: a well-written set of instructions and a small validation sample are often enough to get a meaningful first pass. Each prediction comes with a rationale the reviewer can read.

Where it earns its place

Three use cases have become reasonably well established:

Early case assessment. Running aiR against the corpus very early, on a draft of the relevance criteria, surfaces likely hot documents and gives the legal team something concrete to refine scope and strategy against.
Issue tagging at scale. Multi-issue matters, where each document might touch several themes, are where the rationale-per-issue output is most valuable. Reviewers spend their time confirming or rejecting structured suggestions rather than starting from a blank tagging panel.
QC and prioritisation. Even where humans do first-pass review, aiR is being used to flag documents whose human coding looks inconsistent with the surrounding pattern.
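
As a rough illustration of the QC use case, the sketch below flags documents where the human coding and the tool's prediction disagree, so they can be routed for a second look. This is a deliberately simpler check than comparing against a surrounding pattern, and it is not Relativity's own mechanism. The field names (doc_id, human_code, ai_prediction) and the sample records are hypothetical, standing in for whatever export the platform provides.

```python
def flag_for_second_look(docs):
    """Return the ids of documents where human coding and the AI prediction differ.

    `docs` is a list of dicts with hypothetical keys:
    doc_id, human_code, ai_prediction.
    """
    return [d["doc_id"] for d in docs if d["human_code"] != d["ai_prediction"]]


# Illustrative records only; not real matter data.
coded = [
    {"doc_id": "DOC-101", "human_code": "Relevant", "ai_prediction": "Relevant"},
    {"doc_id": "DOC-102", "human_code": "Not Relevant", "ai_prediction": "Relevant"},
    {"doc_id": "DOC-103", "human_code": "Relevant", "ai_prediction": "Not Relevant"},
]

flagged = flag_for_second_look(coded)
print(flagged)
```

A disagreement does not say which side is wrong. It only identifies documents worth a senior reviewer's time.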

Where it still needs careful handling

It is not magic. A few practical points come up repeatedly:

Instructions are the work. The quality of aiR's output is largely a function of how clearly relevance and issues are written. Vague criteria produce vague predictions, just as they do for human reviewers.
Validation is non-negotiable. Defensibility still depends on measuring recall and precision against a human-coded sample, and on being able to explain the methodology if challenged.
Privilege remains sensitive. Most teams continue to apply traditional searches and human review for privilege, with aiR as an additional safety net rather than a replacement.
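
The validation point can be made concrete with a minimal sketch, assuming a human-coded sample has been exported alongside the tool's predictions. The SampleDoc structure and the sample values are hypothetical, and the human coding is treated as ground truth. Recall is the share of truly relevant documents the tool found. Precision is the share of documents the tool called relevant that really were.

```python
from dataclasses import dataclass


@dataclass
class SampleDoc:
    doc_id: str
    human_relevant: bool  # reviewer's coding, treated here as ground truth
    ai_relevant: bool     # the tool's prediction for the same document


def validation_metrics(sample):
    """Compute recall and precision of the AI predictions against human coding."""
    tp = sum(1 for d in sample if d.human_relevant and d.ai_relevant)
    fp = sum(1 for d in sample if not d.human_relevant and d.ai_relevant)
    fn = sum(1 for d in sample if d.human_relevant and not d.ai_relevant)
    # Guard against empty denominators (e.g. no relevant documents in the sample).
    recall = tp / (tp + fn) if (tp + fn) else None
    precision = tp / (tp + fp) if (tp + fp) else None
    return {"tp": tp, "fp": fp, "fn": fn, "recall": recall, "precision": precision}


# Illustrative sample only; a real validation sample would be far larger
# and drawn at random from the review population.
sample = [
    SampleDoc("DOC-001", True, True),
    SampleDoc("DOC-002", True, False),
    SampleDoc("DOC-003", False, True),
    SampleDoc("DOC-004", False, False),
    SampleDoc("DOC-005", True, True),
]

metrics = validation_metrics(sample)
print(metrics)
```

Keeping the raw counts next to the ratios makes the methodology easier to explain if it is challenged, since anyone can check the arithmetic against the sample.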

Wider context

aiR is not the only player. Reveal, DISCO and others have shipped comparable generative-AI features, and EDRM has published practical guidance on validation in its model updates. The UK courts have not, to date, treated AI-assisted review as inherently more suspect than other technology-assisted review approaches, provided the methodology can be explained.

Bottom line

Generative-AI review tools earn their place where they let smaller review teams handle larger, more complex matters without sacrificing defensibility. They do not remove the need for a properly scoped collection, clear relevance criteria or human judgement on privilege and key documents. Matters where any of those are weak are not the matters to try them on first.