CAL vs Generative AI for Document Review

A practical decision matrix for choosing CAL, TAR, or generative AI for defensible, budget-conscious document review.

For small businesses and in-house counsel, document review is no longer a simple choice between “manual” and “technology-assisted.” Today, the real decision is whether your matter calls for Continuous Active Learning (CAL), traditional TAR, a generative AI-assisted workflow, or some hybrid review model that balances cost, speed, and defensibility. That choice matters because discovery is often where litigation budgets expand fastest, and the wrong workflow can either over-spend on attorney hours or create avoidable defensibility risk. If you need broader context on how AI has reshaped review workflows, start with AI and the evolution of document review and production.

The good news is that there is no one-size-fits-all answer. The better news is that there is a practical decision matrix that can guide the choice. In this guide, we break down how CAL, traditional TAR, and generative AI differ across the criteria that matter most to business buyers: cost, accuracy, speed, transparency, and courtroom defensibility. We also explain where hybrid models make the most sense, especially when your matter has mixed privilege, messy data, or a tight budget and an aggressive production deadline.

Pro Tip: The review plan with the lowest sticker price is not always the lowest-cost plan. A workflow that misses hot documents, requires re-review, or invites a challenge can cost more than a more disciplined AI-assisted process.

1. What the three review approaches actually do

Traditional TAR: model-first, review-second

Traditional Technology-Assisted Review, often abbreviated TAR, usually begins with a seed set of documents reviewed by lawyers. Those labels train a predictive model, which then ranks or classifies the rest of the corpus. In many matters, this approach still performs well, especially where the data is fairly uniform and the legal team can invest time upfront. But TAR is often more static than CAL: once the initial training phase is complete, the model may not keep adapting to later reviewer decisions. For teams building an eDiscovery strategy, it helps to compare TAR to broader workflow and data architecture principles, much like a real-time vs batch architecture tradeoff.

Traditional TAR can be attractive because it is familiar to eDiscovery vendors, litigators, and opposing counsel. That familiarity can help in negotiating search protocols and defending reasonableness if the process is documented well. However, a traditional TAR workflow may require more early human supervision, and its effectiveness can vary depending on how representative the seed set is. In practical terms, TAR works best when the legal issue is relatively clear, the document universe is manageable, and the team has enough time to establish reliable training criteria.

Continuous Active Learning: iterative ranking with review feedback

Continuous Active Learning (CAL) takes the TAR concept further by continuously retraining the model based on each reviewer decision. Instead of waiting for a complete initial training cycle, CAL prioritizes documents that appear most likely to be relevant, while the model keeps learning from every coding decision. That creates a more dynamic workflow, which often means the review team reaches high-value documents earlier and can reduce the number of documents that need full manual review. CAL has become the standard approach in many sophisticated review programs because it better aligns human effort with the highest-probability documents.
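The feedback loop described above can be sketched in a few lines of Python. This is a toy illustration, not how any particular review platform works: the token-counting scorer, the `cal_review` function, the `seed_ids` parameter, the sample documents, and the "stop after a batch with no relevant documents" rule are all assumptions made for the example. A real protocol would use a proper classifier and a validated stopping criterion.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def score(text, rel, non):
    # Smoothed log-odds: tokens seen in relevant coded docs push the score up,
    # tokens seen in non-relevant coded docs push it down.
    return sum(math.log((rel[t] + 1) / (non[t] + 1)) for t in set(tokenize(text)))

def cal_review(docs, reviewer, seed_ids, batch_size=2):
    """Rank, review a batch, update the model, repeat (hypothetical sketch)."""
    rel, non = Counter(), Counter()
    coded, order = {}, []
    batch = list(seed_ids)
    while batch:
        for doc_id in batch:
            coded[doc_id] = reviewer(docs[doc_id])   # human coding decision
            order.append(doc_id)
            target = rel if coded[doc_id] else non
            target.update(set(tokenize(docs[doc_id])))  # model learns at once
        if not any(coded[doc_id] for doc_id in batch):
            break  # illustrative stopping rule only
        remaining = [d for d in docs if d not in coded]
        remaining.sort(key=lambda d: score(docs[d], rel, non), reverse=True)
        batch = remaining[:batch_size]
    return coded, order

# Made-up corpus; the "reviewer" stands in for an attorney's relevance call.
docs = {
    1: "pricing agreement with competitor",
    2: "lunch menu for friday",
    3: "competitor pricing call notes",
    4: "office party friday",
    5: "agreement on pricing terms",
    6: "parking garage closed",
    7: "holiday schedule update",
    8: "printer out of toner",
}
coded, order = cal_review(docs, lambda text: "pricing" in text, seed_ids=[1, 2])
print(order)
```

In this toy run the likely-relevant documents are reviewed right after the seed batch, and two low-ranked documents are never sent to a reviewer, which is the prioritization effect the paragraph describes.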

For litigators and in-house teams, the main operational benefit is not simply “AI speed.” It is prioritization. CAL helps focus expensive lawyer attention on documents that matter most while still preserving a controlled, reviewable process. When implemented well, it can reduce time to insight and help the team understand key facts before depositions, motions, or settlement discussions begin. Think of CAL as a disciplined feedback loop rather than a one-time machine decision.

Generative AI-assisted review: language understanding and synthesis

Generative AI changes the review equation by adding capabilities beyond classification. It can summarize, cluster, answer questions about a collection, extract concepts, draft issue tags, and help lawyers quickly triage large data sets. That makes it especially useful for first-pass analysis, witness prep, privilege spotting, case chronology building, and reviewing broad collections where the goal is to understand content rapidly. For a deeper operational lens on deploying AI in a real business environment, see From One-Off Pilots to an AI Operating Model.

However, generative AI is not the same thing as a defensible document ranking system. It can hallucinate, over-generalize, or miss context if used as a substitute for a review protocol. In other words, generative AI is strongest as an assistant, not always as the sole decision-maker. That is why many legal teams use it for workflow acceleration, issue clustering, and summaries, while relying on CAL or TAR for the core production decision process.

2. The litigation budget equation: where costs actually arise

Review labor usually dominates, but not always

In most matters, the biggest cost driver is attorney or contract reviewer time. This is where traditional linear review becomes expensive fast, especially when the corpus includes millions of emails, chat exports, spreadsheets, scanned PDFs, and cloud documents. Even when offshore review is available, it still depends on humans reading documents one by one, and that model scales poorly. The financial services world has learned similar lessons in document-heavy workflows; see Document AI for Financial Services for a useful parallel on extracting signal from noisy records.

What often surprises buyers is that review labor is only one layer of cost. Hosting, data processing, de-duplication, privilege logging, quality control, expert supervision, and late-stage re-review all add up. A workflow that looks inexpensive up front may become expensive if it generates too many false positives, forces repeated QC passes, or requires heavy attorney intervention to explain its results to the client or court. For a small business or in-house legal department, the hidden cost is often management time, not just vendor invoices.

Speed has value because it changes legal leverage

Litigation speed has direct financial value. Faster review can support earlier motion practice, faster settlement analysis, quicker compliance with discovery deadlines, and better management of business disruption. If the right documents are surfaced early, counsel can evaluate exposure sooner and avoid over-lawyering weak issues. A hybrid strategy is often the best way to preserve speed without surrendering process control; the same logic appears in other operational decisions such as capacity planning for hosting teams and even how a business deploys limited resources in uncertainty.

Generative AI can improve speed dramatically at the summary and triage stage, but the fastest workflow is not always the most defensible. CAL often offers the best blend of speed and statistical discipline, especially when the matter is document-heavy and the team expects meaningful production volumes. Traditional TAR may still be sufficient if deadlines are more relaxed and the data is relatively clean. The question is not just how quickly you can review documents, but how quickly you can reach a defensible, production-ready decision set.

Budget planning should assume iteration, not a straight line

One of the biggest budgeting mistakes is treating document review like a linear task with a fixed endpoint. In reality, review programs change as facts emerge, privilege issues surface, and custodians produce new data. That means your budget should include room for protocol changes, seed-set adjustments, and quality control. If you are building an internal playbook, a structured operating model can help; designing an AI-powered upskilling program is relevant because the right reviewer training reduces rework and mistakes.

Budget owners should also plan for the people costs of oversight. Even a strong AI workflow requires someone to validate training decisions, monitor production quality, and document why the chosen method was reasonable. That oversight is not wasted spend. It is what turns an AI-assisted process into a litigation-ready process.
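The cost layers discussed in this section can be combined into a rough budget estimate. The sketch below is only an arithmetic aid: the function name, the parameters, the 15% iteration reserve, and every dollar figure are hypothetical placeholders, not benchmarks from this guide or from any vendor.

```python
def estimate_review_budget(docs_to_review, docs_per_hour, reviewer_rate,
                           fixed_costs, oversight_hours, oversight_rate,
                           iteration_reserve=0.15):
    """Rough budget: review labor + fixed costs + oversight + a reserve for rework."""
    labor = docs_to_review / docs_per_hour * reviewer_rate
    oversight = oversight_hours * oversight_rate
    subtotal = labor + fixed_costs + oversight
    reserve = subtotal * iteration_reserve  # room for protocol changes and re-review
    return {"labor": labor, "oversight": oversight,
            "reserve": reserve, "total": subtotal + reserve}

# Hypothetical comparison: reviewing everything vs. a prioritized subset
# that needs more oversight hours and slightly higher fixed costs.
linear = estimate_review_budget(200_000, 50, 60, 20_000, 40, 300)
prioritized = estimate_review_budget(60_000, 50, 60, 25_000, 80, 300)
print(round(linear["total"]), round(prioritized["total"]))
```

Even with double the oversight hours, the prioritized scenario comes out lower in this made-up example because review labor dominates. With a much smaller corpus the fixed and oversight costs could reverse that result, which is why the inputs should come from the actual matter.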

3. Defensibility: what opposing counsel, regulators, and judges care about

Transparency and repeatability matter more than buzzwords

Defensibility in eDiscovery usually means you can explain what you did, why you did it, and how you know it was reasonable. Judges and opposing counsel are typically less interested in whether a tool is branded as AI and more interested in whether the workflow was documented, validated, and proportionate to the matter. That is why traditional TAR and CAL remain important: they are generally easier to describe in terms of training data, ranking logic, and quality control steps. When teams need to audit a sensitive workflow, the control mindset described in Forensics for Entangled AI Deals offers a useful model for preserving evidence integrity.

Generative AI can still be defensible, but usually only when it is used in a bounded and documented way. For example, a team might use it to summarize clusters, generate issue codes for attorney review, or accelerate first-pass triage, while keeping final responsiveness decisions under a controlled protocol. If you cannot explain how prompts were used, how outputs were validated, and whether the system hallucinated, then defensibility is weakened. In high-stakes matters, that uncertainty matters more than any productivity gain.

CAL often has the strongest defensibility story for large-scale review

CAL tends to be attractive because it can be described as an iterative, learning-based review method that adapts as reviewers code documents. That makes it more robust than a one-shot seed strategy in many complex matters. It also aligns with the practical reality that legal relevance often becomes clearer only after several rounds of document exposure. For teams looking to build a more resilient process, lessons from responding to sudden classification rollouts are surprisingly relevant: every classification system needs monitoring and correction when behavior changes.

Still, defensibility is not automatic. CAL must be paired with meaningful quality control, clear training criteria, and a record showing the team monitored recall and precision appropriately. If the workflow was sloppily run, the fact that it used CAL will not save it. The defensible choice is the one you can justify in a declaration, at a meet-and-confer, or in motion practice.
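For readers who want to see what monitoring recall and precision means in arithmetic terms, here is a minimal sketch. Precision and recall follow their standard definitions. The second function assumes the team validates by sampling the unreviewed (discard) set, often called an elusion sample, and it returns only a point estimate with no confidence interval. The function names and all figures are hypothetical.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision: share of documents coded relevant that truly are.
    Recall: share of truly relevant documents that were found."""
    flagged = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall

def recall_from_elusion_sample(found_relevant, discard_size,
                               sample_size, sample_relevant):
    """Point estimate of recall from a random sample of the discard set."""
    elusion_rate = sample_relevant / sample_size   # relevant share of the sample
    estimated_missed = elusion_rate * discard_size
    return found_relevant / (found_relevant + estimated_missed)

# Hypothetical QC result: 80 correct relevant calls, 20 false alarms, 20 misses.
print(precision_recall(80, 20, 20))

# Hypothetical validation: 9,000 relevant documents found; a 1,500-document
# sample of the 100,000 unreviewed documents turns up 15 relevant ones.
print(recall_from_elusion_sample(9_000, 100_000, 1_500, 15))
```

Recording the sample size, the sampling method, and the resulting estimate alongside the review record gives the team concrete numbers to cite if the process is questioned.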

Traditional TAR remains useful when consistency matters more than novelty

There are matters where the predictability of traditional TAR is a feature, not a flaw. If the issue set is stable, custodians are limited, and the production universe is not enormous, a conventional predictive coding workflow may be easier to defend than a more experimental AI stack. That is especially true when the opposing side is conservative or the forum is sensitive to process novelty. For guidance on running a disciplined process review, the approach in How We Review a Local Pizzeria is a surprisingly good analogy: a clear rubric beats vague enthusiasm every time.

For small businesses, the key is not to overcomplicate the matter just because AI is available. If traditional TAR already fits the budget and risk profile, it may be the best defensible option. The goal is not to impress anyone with the latest technology. The goal is to produce relevant documents, protect privileged materials, and avoid procedural mistakes.

4. A practical decision matrix for small businesses and in-house counsel

Use the matrix to match workflow to matter type

The best choice depends on the matter profile, not on the vendor pitch. A contract dispute with 50,000 documents and limited custodians may fit traditional TAR or a light CAL workflow. A multi-party investigation involving chats, attachments, and inconsistent terminology may favor CAL because it learns from evolving labels. A fast-moving internal investigation, meanwhile, may benefit from generative AI-assisted triage before a more controlled review protocol is applied.

Below is a practical comparison framework you can use internally with finance, IT, and outside counsel. It emphasizes the variables that actually drive buy-vs-build and workflow design decisions, not just marketing claims.

Approach	Best Fit	Speed	Defensibility	Cost Profile	Main Risk
Traditional TAR	Stable issues, moderate data volumes, conservative teams	Medium	High when well-documented	Moderate upfront, lower than linear review	Weak seed set or stale training
Continuous Active Learning (CAL)	Large or complex matters, evolving issues, higher recall needs	High	High if QC is disciplined	Often lower review burden over time	Needs oversight and process discipline
Generative AI-assisted review	Early triage, summarization, clustering, first-pass insight	Very high for analysis	Medium unless bounded and validated	Low-to-moderate for analysis; can rise with governance	Hallucinations and uncontrolled prompts
Hybrid CAL + GenAI	Complex matters needing speed plus defensibility	High	High if roles are clearly separated	Often optimal for budget balance	Workflow sprawl without governance
Linear/manual review	Small, simple matters or low-data disputes	Low	High if fully supervised	Highest at scale	Cost escalation and reviewer fatigue

Budget sensitivity changes the answer

If your budget is tightly capped, the right answer may not be “the most advanced system.” It may be the one that reduces attorney review while keeping the process explainable. CAL often gives the best long-run cost control because it learns continuously and can reduce the number of documents sent to expensive reviewers. But if the matter is too small for meaningful model training, traditional TAR or even targeted manual review may be more economical than a more sophisticated AI stack.

For procurement-minded teams, this is where a decision matrix is more useful than a feature list. Ask whether the matter requires early case assessment, privilege-heavy analysis, broad responsiveness identification, or summary generation. Then select the workflow that best matches the most expensive part of the matter. If your team is managing wider operational change, the same prioritization logic is reflected in building an AI operating model, where repeatability matters more than isolated experimentation.

Decision thresholds that are easy to apply

As a rule of thumb, use CAL when the corpus is large, issue patterns are uncertain, and you expect multiple rounds of learning. Use traditional TAR when the subject matter is more stable and you want a familiar, easier-to-explain process. Use generative AI when the team needs fast comprehension, issue spotting, or collection-level synthesis, but not as a standalone substitute for defensible review. And use a hybrid model when the matter is complex enough to justify both process rigor and AI acceleration.
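These rules of thumb can be written down as a small function so that finance, IT, and outside counsel apply them consistently. The sketch below is a simplification under stated assumptions: the `Matter` fields and the function name are invented for illustration, and the 20,000-document cutoff borrows the rough small-matter figure used in the scenarios later in this guide. It is not a substitute for judgment about the specific matter.

```python
from dataclasses import dataclass

@dataclass
class Matter:
    doc_count: int
    issues_stable: bool        # are relevance criteria unlikely to shift?
    needs_fast_insight: bool   # e.g. leadership briefing or early assessment

def recommend_workflow(matter, small_cutoff=20_000):
    """Encode the rule of thumb: pick a core workflow, then decide on GenAI support."""
    if matter.doc_count < small_cutoff:
        core = "traditional TAR or targeted manual review"
    elif matter.issues_stable:
        core = "traditional TAR"
    else:
        core = "CAL"
    if matter.needs_fast_insight:
        # Generative AI accelerates understanding; it never replaces the core workflow.
        return f"hybrid: generative AI for triage and summaries, {core} for production decisions"
    return core

print(recommend_workflow(Matter(15_000, True, False)))
print(recommend_workflow(Matter(500_000, False, True)))
```

The design choice worth noting is that generative AI only ever appears as a layer on top of a core workflow, mirroring the guidance that it should not be a standalone substitute for defensible review.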

This same threshold thinking is useful in other procurement contexts, from when to buy vs. wait for a hardware upgrade to choosing a platform that can genuinely handle scale. The mistake is assuming every AI tool should be adopted simply because it exists. Better results come from matching tool capability to business need.

5. When hybrid review is the smartest option

Hybrid is not a compromise; it is often the optimal design

Hybrid review means combining tools strategically rather than forcing one tool to do everything. A common and effective pattern is to use generative AI for first-pass summarization, topic clustering, or issue extraction, then use CAL to prioritize and validate responsiveness decisions. This can lower the burden on senior lawyers without abandoning defensibility. In larger matters, hybrid workflows often outperform pure approaches because they let each technology do what it does best.

That said, hybrid only works when responsibilities are defined. Generative AI should not secretly become the final decision-maker in a production protocol. CAL should not be treated as a black box that removes the need for QC. The best hybrid designs define where human judgment is mandatory and where the model may accelerate the path to judgment.
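One way to make those responsibilities explicit is to write the routing rules down. The sketch below is purely illustrative: the queue names, the `review_cutoff` value, and the idea of a generative AI privilege flag feeding a mandatory attorney queue are assumptions for the example, not features of any particular platform.

```python
def route_document(cal_score, genai_privilege_flag, review_cutoff=0.5):
    """Decide which queue a document goes to in a hypothetical hybrid workflow.

    cal_score: relevance score from the CAL ranking (0.0 to 1.0).
    genai_privilege_flag: True if a generative AI pass suggested possible privilege.
    """
    if genai_privilege_flag:
        # The model only raises the flag; an attorney makes the privilege call.
        return "attorney privilege review"
    if cal_score >= review_cutoff:
        # Final responsiveness decisions stay with human reviewers.
        return "human responsiveness review"
    # Low-ranked documents are not silently discarded; they are sampled for QC.
    return "unreviewed set (subject to validation sampling)"

print(route_document(0.9, False))
print(route_document(0.2, True))
print(route_document(0.2, False))
```

Every branch ends in either a human decision or a QC step, so neither tool becomes the hidden final decision-maker.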

Examples of strong hybrid use cases

A good hybrid use case is an internal investigation where the legal team needs to understand communications quickly, identify custodians, and isolate likely hot documents before a board update. Another is a litigation matter with a huge email set plus a smaller but important chat archive, where generative AI can help summarize conversations while CAL ranks the broader corpus. A third is a regulatory response where the response window is short and leadership needs quick visibility into the facts. In all of these, hybrid review can reduce time to insight while preserving a disciplined production workflow.

For teams with cross-border or complex records, the hybrid logic becomes even more valuable. Data quality, language differences, and scan-heavy files often benefit from layered workflows rather than a single tool. Similar challenges appear in cross-border healthcare documents, where record heterogeneity makes a one-size-fits-all process unreliable.

Governance is the price of flexibility

Hybrid review requires clear governance, particularly around prompt use, training data, privilege handling, and escalation paths. If reviewers do not know when to rely on the model and when to override it, the workflow will drift. This is why business teams should establish written protocols, define review roles, and retain logs of how decisions were made. For broader organizational readiness, consider the discipline behind AI-powered upskilling and translate it into legal review training.

The payoff is worth it. A governed hybrid workflow can deliver fast early visibility, lower attorney hours, and a more defensible production record. In practice, this is the sweet spot for many small businesses and in-house departments: enough AI to reduce waste, enough process to withstand scrutiny.

6. Implementation checklist: how to avoid costly mistakes

Start with the corpus, not the tool

Before choosing a platform, define the data universe. Identify custodians, file types, time ranges, deduplication rules, and privilege-sensitive categories. If the data is heavily spreadsheet-based, scanned, or multi-language, your workflow should reflect those realities. For example, a content-heavy set with mixed formats may need a more flexible parsing layer similar to the extraction challenges described in Document AI for financial services.

Many teams make the mistake of asking vendors for a product demo before they understand the matter profile. That is backwards. A defensible and budget-aware review strategy begins with an evidence inventory, then a risk assessment, then a workflow decision. The result is a cleaner scope, fewer surprises, and a more credible cost estimate.

Define success metrics upfront

Your success metrics should include more than “speed.” Track recall, precision, reviewer agreement, privilege hit rate, escalation rate, and time to first useful insight. If the matter is production-heavy, also measure how many documents were reviewed per produced document and how often the review protocol changed. Metrics help you determine whether CAL or a generative AI-assisted approach is actually helping or merely shifting work around. In operational terms, it is a lot like choosing real-time vs batch analytics: you need the right metric for the right decision layer.

These metrics also support discussions with finance and senior management. Instead of explaining why the matter “felt faster,” you can show the review path reduced cost per responsive document or shortened the timeline to key factual conclusions. That is the language business buyers understand.
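A few of these metrics reduce to simple ratios that can be tracked in a spreadsheet or a short script. The sketch below is an assumption-laden example: the function name, the input fields, and the figures are hypothetical, and reviewer agreement is computed as a plain agreement rate on a QC sample rather than a chance-corrected statistic.

```python
def review_metrics(reviewed, produced, responsive, total_cost,
                   qc_sampled, qc_agreed):
    """Summarize a review in ratios that finance and management can compare."""
    return {
        "reviewed_per_produced": reviewed / produced,
        "cost_per_responsive": total_cost / responsive,
        "reviewer_agreement": qc_agreed / qc_sampled,
    }

# Hypothetical matter: 40,000 documents reviewed, 10,000 coded responsive,
# 8,000 produced, $60,000 total cost, and a 500-document QC sample in which
# the second reviewer agreed with the first on 460 documents.
metrics = review_metrics(reviewed=40_000, produced=8_000, responsive=10_000,
                         total_cost=60_000, qc_sampled=500, qc_agreed=460)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Computing the same ratios for each matter, or for the same matter before and after a protocol change, shows whether a workflow reduced cost per responsive document or simply moved the work elsewhere.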

Document the process as if you may have to defend it later

Every significant review choice should be recorded: why CAL was selected, how seed data was chosen, what prompts were used for generative AI, what human checks were applied, and what QA thresholds were required. This documentation is not just for opposing counsel. It is also an internal control that helps future matters start from a better baseline. Teams that treat each matter as a learning opportunity create a compounding advantage over time.
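A lightweight way to capture these choices is a structured log entry written whenever a significant decision is made. The sketch below uses only the Python standard library. The `ReviewDecision` fields, the matter name, and the sample entry are hypothetical and would need to match your own protocol and retention rules.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone

@dataclass
class ReviewDecision:
    matter: str
    decision: str       # what was chosen, e.g. workflow, prompt, QC threshold
    rationale: str      # why it was reasonable for this matter
    decided_by: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

def to_log_line(entry):
    """Serialize one decision as a single JSON line for an append-only log."""
    return json.dumps(asdict(entry), sort_keys=True)

entry = ReviewDecision(
    matter="Example v. Sample (hypothetical)",
    decision="Use CAL for responsiveness review",
    rationale="Large email corpus with evolving issues",
    decided_by="Supervising attorney",
)
line = to_log_line(entry)
print(line)
```

Appending one line per decision to a file keeps the record in chronological order and makes it easy to export later for a declaration or an internal post-matter review.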

That mentality is similar to building a repeatable review rubric in other domains. If you want a practical analogy, compare your legal process to a published rating system: the rubric itself is part of the credibility. The same principle applies to document review, only the stakes are much higher.

7. Common scenarios and the best-fit workflow

Scenario one: small case, limited data, high clarity

If your matter has fewer than roughly 20,000 documents, a single custodian group, and a narrow legal issue, traditional TAR or even targeted manual review may be enough. In these cases, the overhead of a more sophisticated CAL deployment may outweigh the benefits. Generative AI can still help with summarization or issue spotting, but it should not add complexity to a simple matter. The objective is to avoid spending more on review infrastructure than on the substantive dispute.

For a small business, this is the “do not overbuy” scenario. The budget is usually better spent on outside counsel strategy, privilege protection, and a tight production protocol. When the issue is narrow and the document set is small, the most elegant solution is often the simplest one.

Scenario two: large, messy matter with shifting facts

When the corpus is large, the facts are evolving, and multiple custodians use inconsistent terminology, CAL is often the strongest core workflow. Add generative AI for clustering, summarization, and faster early case assessment, but keep the production decision process anchored in the CAL loop. This is a classic hybrid case because the legal team needs both speed and confidence. The larger and messier the corpus, the more valuable adaptive learning becomes.

Teams handling difficult evidence should also think like investigators, not just reviewers. As with auditing a defunct AI partner, preserving context and traceability matters at every stage. If the data trail becomes confused, the legal review will suffer.

Scenario three: urgent response with leadership visibility

When leadership needs answers quickly, generative AI can be useful even before the formal review protocol is fully set. It can produce summaries, identify key themes, and help counsel brief executives on likely issue areas. But once the legal stakes become clearer, the workflow should transition to a more controlled process, often CAL or TAR. This scenario is common in regulatory inquiries and internal investigations where the first 48 hours are critical.

Fast insight is valuable, but legal teams should resist the temptation to let urgency erase controls. The best practice is to use generative AI to accelerate understanding, not to replace governance. That is especially important where privilege, employment issues, or public disclosures may be implicated.

8. The bottom line: which approach fits your litigation budget?

Choose CAL when scale and defensibility both matter

CAL is usually the best fit when your matter is large enough to benefit from continuous model improvement and when you need a defensible, adaptive workflow that reduces review waste. It is especially attractive for matters with uncertain relevance patterns, large email volumes, or repeated learning opportunities. For many in-house teams, CAL offers the best balance of speed, accuracy, and cost discipline.

Choose traditional TAR when the matter is stable and conservative

Traditional TAR remains a solid choice when the issues are clearer, the scope is smaller, and you want a familiar method that many parties already understand. It may not be as adaptive as CAL, but it can still deliver strong value when used with proper QC. In budget-constrained matters, simplicity can be an advantage.

Choose generative AI as an accelerator, not a substitute

Generative AI is most valuable when it helps humans think faster: summarizing documents, clustering themes, drafting issue outlines, and supporting early analysis. It should generally be treated as a review assistant, not the sole mechanism for production decisions. When used this way, it can materially reduce time-to-insight without undermining the core review protocol.

For ongoing operational maturity, teams should think in terms of repeatable playbooks, not ad hoc experiments. The best organizations build a review framework, train staff on it, measure outcomes, and improve it matter by matter. That is how you turn AI from a promising tool into a reliable litigation capability.

FAQ

What is the main difference between CAL and traditional TAR?

CAL continuously learns from reviewer decisions and reprioritizes documents throughout the review, while traditional TAR typically relies more heavily on an initial seed set and a more static predictive process. CAL is generally better for large, evolving matters, while TAR can work well for more stable review populations.

Is generative AI defensible for document review?

Yes, but usually only in bounded, documented use cases. Generative AI is strongest for summarization, issue extraction, clustering, and first-pass analysis. If it is used for final responsiveness decisions, or if prompts and outputs are not validated, defensibility can weaken significantly.

Which workflow is cheapest for a small business?

The cheapest workflow depends on the size and complexity of the data. For small, simple matters, traditional TAR or even focused manual review may be cheapest. For larger matters, CAL can reduce total review cost by lowering the number of documents requiring expensive human review.

When does a hybrid review make sense?

Hybrid review makes sense when you need both speed and defensibility. A common pattern is to use generative AI for summarization or clustering, then use CAL for the core review and production process. Hybrid is especially useful in internal investigations, urgent regulatory responses, and large mixed-format matters.

How should in-house counsel evaluate vendor claims?

Ask for evidence, not slogans. Request information on training methodology, QC steps, prompt governance, validation procedures, and how the system handles privilege and duplicates. Also ask how the vendor documents decisions so your team can explain the workflow later if challenged.

What should be documented for defensibility?

You should document the data scope, review criteria, seed or training approach, model settings where applicable, QC thresholds, escalation rules, and any use of generative AI. The more clearly you can show what was done and why, the easier it is to defend the workflow later.

Related Reading

AI and the evolution of document review and production - A useful primer on how TAR, CAL, and generative AI fit into modern discovery.
From One-Off Pilots to an AI Operating Model - Practical guidance for turning experimental AI use into repeatable operations.
Document AI for Financial Services - A useful analogy for handling structured and semi-structured document sets at scale.
Forensics for Entangled AI Deals - Lessons on preserving evidence and traceability in complex AI-adjacent situations.
Healthcare Predictive Analytics: Real-Time vs Batch - A helpful framework for thinking about review system tradeoffs and operational latency.

Jordan Mercer
