Case Study: How We Automated 80% of a Law Firm's Document Workflow (Legal AI)
Julián Bagilet
April 23, 2026
A mid-size Argentine law firm specializing in M&A and real estate faced a problem: 60% of billable hours went to document preparation, review, and management, work that was repetitive, error-prone, and bottlenecked on paralegal capacity. Within six months of deploying AI-powered document automation, the firm grew matter throughput 3.4× with the same team. Here's how.
The Firm: Context and Scale
- 120–150 active matters per month (M&A and real estate)
- 18 lawyers, 6 paralegals
- Average matter: 2–3 weeks, involves 15–40 documents
- 60% of paralegal time: document prep, clause review, contract assembly
- Key pain point: document creation and review was sequential, bottlenecking lawyer productivity
Four Workflows Automated
Workflow 1: Contract Generation from Templates
- Before: Paralegal manually fills client data, party names, dates, and amounts into Word templates. Average time: 3–4 hours per contract.
- Process: Intake form (web + PDF upload). Extract client data (address, tax ID, representative names, deal amount). Look up matching records in the firm's CRM. Populate the template with the extracted values. Claude Haiku (fast, cheap) handles routine extractions; humans review ambiguities.
- After: 25 minutes to generate first draft. Human lawyer reviews and approves in 10 minutes.
- Impact: Draft-to-approval time dropped from 3–4 hours to about 35 minutes, roughly an 85% reduction. Paralegals freed for higher-value tasks (negotiation support, document tracking).
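The template-population step above reduces to a placeholder fill over the extracted intake data. This is an illustrative sketch, not the firm's code: the `ContractData` shape and the `{{field}}` placeholder syntax are assumptions.

```typescript
// Hypothetical shape of the data extracted from the intake form / CRM.
type ContractData = {
  clientName: string;
  taxId: string;
  dealAmount: number;
  closingDate: string; // ISO date, e.g. "2026-06-01"
};

// Replace {{field}} placeholders with values from the intake data.
// Placeholders with no matching field are left intact and reported,
// mirroring the human-review step for ambiguities.
function fillTemplate(
  template: string,
  data: ContractData
): { draft: string; unresolved: string[] } {
  const unresolved: string[] = [];
  const draft = template.replace(/\{\{(\w+)\}\}/g, (match, key) => {
    const value = (data as Record<string, unknown>)[key];
    if (value === undefined) {
      unresolved.push(key); // keep placeholder, flag for a human
      return match;
    }
    return String(value);
  });
  return { draft, unresolved };
}
```

Keeping unresolved placeholders visible in the draft (rather than silently dropping them) is what lets the 10-minute lawyer review catch gaps quickly.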
Workflow 2: Clause Extraction and Flagging
- Before: Lawyer manually reads 30-page contract from counterparty, extracts key clauses (liability, indemnification, arbitration, termination), compares against standard firm language. Time: 1.5–2.5 hours per contract.
- Process: Fine-tuned Claude model (trained on 8,000 Argentine commercial contracts + firm's past deals). Extracts: termination conditions, liability caps, indemnification language, arbitration/jurisdiction, confidentiality terms, payment terms. Flags any non-standard language (red-flag words: "unlimited liability", "no remedy", "entire agreement" without carve-outs). Links extracted clause to source page.
- Model training: Firm created labeled dataset of 500 contracts with hand-annotated clause locations and risk levels. Fine-tuned for 2 weeks. Accuracy: 94% on test set.
- After: AI generates clause summary + flags in 8 minutes. Lawyer reviews and approves in 20 minutes (reads pre-extracted summary instead of full contract).
- Impact: End-to-end clause review time dropped 77% (2 hours to 28 minutes); the AI extraction step alone runs in 8 minutes, a 92% reduction. Fewer missed red flags (the AI is consistent; tired humans skip details).
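The deterministic part of the red-flag pass can be sketched as a keyword scan over extracted clauses, each linked back to its source page. A sketch only: in the pipeline above the fine-tuned model does this work, and the `Clause` shape is illustrative. The phrase list uses the examples from the text (the "entire agreement without carve-outs" case needs real clause-level reasoning and is omitted here).

```typescript
// Red-flag phrases taken from the examples in the text; a real
// deployment relies on the fine-tuned model, with a deterministic
// scan like this as a cheap backstop.
const RED_FLAGS = ["unlimited liability", "no remedy"];

type Clause = { id: string; page: number; text: string };
type Flag = { clauseId: string; page: number; phrase: string };

function flagClauses(clauses: Clause[]): Flag[] {
  const flags: Flag[] = [];
  for (const clause of clauses) {
    const lower = clause.text.toLowerCase();
    for (const phrase of RED_FLAGS) {
      if (lower.includes(phrase)) {
        // Link each flag to its source page, as the pipeline does.
        flags.push({ clauseId: clause.id, page: clause.page, phrase });
      }
    }
  }
  return flags;
}
```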
Workflow 3: Due Diligence Document Classification and Summarization
- Before: M&A deals involve 100–500 documents from counterparty. Paralegals manually categorize (contracts, tax docs, litigation, compliance, financials) and summarize. Time: 2–3 days per deal.
- Process: Upload documents to Supabase (PDF → text extraction). Claude Sonnet + RAG for complex reasoning. Classify document type (contract, financial statement, tax return, etc.). Summarize content in 3–5 bullet points. Extract key dates (expiry, renewal, termination). Link to source document.
- RAG setup: Vector embeddings of past classified documents. If new document is similar to historical ones, use prior classification as prompt hint. Accuracy improves over time.
- After: Classification + summary in 4 hours (vs 2 days). Searchable index of all deal documents. Lawyers can query: "show me all contracts with termination rights" or "which documents mention liability?"
- Impact: Deal analysis time dropped 75%. Lawyers are confident they've reviewed all material docs (the AI doesn't skip documents the way a fatigued reviewer can).
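The RAG "prompt hint" lookup above amounts to a nearest-neighbor search over embeddings of previously classified documents. In production this is a pgvector query; the in-memory scan, `ClassifiedDoc` shape, and 0.85 similarity threshold below are assumptions for illustration.

```typescript
type ClassifiedDoc = { label: string; embedding: number[] };

// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the most similar prior document's label as a prompt hint,
// or null if nothing is similar enough to be trusted.
function classificationHint(
  query: number[],
  history: ClassifiedDoc[],
  minSimilarity = 0.85 // assumed threshold, not the firm's value
): string | null {
  let best: ClassifiedDoc | null = null;
  let bestScore = -1;
  for (const doc of history) {
    const score = cosine(query, doc.embedding);
    if (score > bestScore) { bestScore = score; best = doc; }
  }
  return best && bestScore >= minSimilarity ? best.label : null;
}
```

Because every confirmed classification is added back to `history`, the hint hit rate (and therefore accuracy) improves over time, as the text notes.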
Workflow 4: Automatic Court Filing Organization and Deadline Tracking
- Before: Litigation matters involve 50+ filings. Paralegal manually tracks deadlines (response dates, discovery cutoffs, trial prep). Spreadsheet-based tracking. Missed deadlines: 2–3 per quarter (liability spike).
- Process: AI extracts filing details from emails and documents: document type (complaint, motion, response), filing deadline, discovery deadline, response required. Calendar integration. Slack notification 10 days before deadline, 3 days before, 1 day before.
- After: Automated deadline calendar. Zero missed deadlines in 6-month pilot (vs 2–3 per quarter before).
- Impact: Risk mitigation (liability avoidance). Lawyer peace of mind.
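The 10/3/1-day notification policy reduces to computing three reminder dates per extracted deadline. A minimal sketch; the real system feeds these dates into the Google Calendar and Slack integrations.

```typescript
// Notification offsets from the policy above: 10, 3, and 1 days out.
const REMINDER_OFFSETS_DAYS = [10, 3, 1];

// Compute the reminder dates for one filing deadline (UTC-based,
// so results don't depend on the server's local timezone).
function reminderDates(deadline: Date): Date[] {
  return REMINDER_OFFSETS_DAYS.map((days) => {
    const d = new Date(deadline.getTime());
    d.setUTCDate(d.getUTCDate() - days);
    return d;
  });
}
```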
Tech Stack
- Frontend: React + Supabase (React Query for caching, Tanstack Table for document index)
- Backend: Supabase Edge Functions (serverless, auto-scaling) + PostgreSQL for document metadata + audit logs
- Vector DB: pgvector (Postgres extension) for RAG + semantic search
- Orchestration: n8n self-hosted (workflows: PDF extraction, classification, email triggers, calendar sync)
- AI models: Claude Haiku (routine extractions, fast + cheap), Claude Sonnet (complex reasoning, clause extraction, summarization)
- Document processing: PDFKit (PDF → text), Playwright (web scraping if needed)
- Auth: Supabase Auth (firm's team login)
- Storage: Supabase Storage (encrypted S3 backend) for sensitive legal docs
- Monitoring: OpenTelemetry + Datadog (track API latency, error rates, cost per classification)
Six-Month Results
Timeline to Automation:
- Months 1–2: Discovery + labeled dataset creation (500 sample contracts). Fine-tuning Claude on clause extraction task.
- Months 2–3: MVP (Workflows 1 & 2 only). Alpha test with 20 contracts. 94% accuracy on clause extraction. Deploy to 6 paralegals.
- Months 3–4: Rollout + feedback loop. Paralegals report missing clauses (15 instances). Retrain fine-tuned model with new examples. Accuracy jumps to 97%.
- Months 4–5: Add Workflow 3 (due diligence) and Workflow 4 (deadline tracking). Integration with Slack + Google Calendar.
- Months 5–6: Stabilization. Matter throughput measured at 3.4× baseline.
Metrics (6-month comparison):
| Metric | Before | After | Improvement |
|---|---|---|---|
| Document prep time per matter | 4 hours | 18 minutes | 93% reduction |
| Clause extraction time per contract | 2 hours | 28 minutes | 77% reduction |
| Due diligence review time per deal | 2 days | 4 hours | 75% reduction |
| Missed litigation deadlines (per quarter) | 2–3 | 0 | Eliminated |
| Matters per month (same team) | 120–150 | 410–500 | 3.4× throughput |
| Additional revenue capacity | Baseline | +USD 620k/year | Billed at market rates |
Cost:
- Development: USD 80k (3 engineers, 4 months)
- Infrastructure: USD 2k/month (Supabase, n8n, Datadog)
- Claude API costs: USD 300–500/month (Haiku + Sonnet, high volume)
- Training: USD 5k (team workshops)
- Year 1 total: USD 128k
ROI:
- Revenue uplift (3.4× matter volume billed): +USD 620k/year (a conservative figure: 40% of the incremental matters are new to the market; the rest come from freed-up capacity)
- Year 1 net: USD 620k - USD 128k = USD 492k profit
- Multiple: 3.8x ROI in year 1. Payback: 2.5 months.
Lessons Learned
1. Fine-tuning Beats Prompting for Domain-Specific Tasks
Initial approach: prompt Claude to extract clauses from contracts. Accuracy: 82% (missed nuanced clauses, false positives). After fine-tuning on 500 firm-specific examples: 97%. Fine-tuning taught the model what a "liability cap" means in the Argentine legal context, when "no remedy" is critical, and how the firm's language differs from generic boilerplate.
2. Humans Must Still Review Critical Decisions
The AI summary and flags are generated in 8 minutes, but a lawyer must still verify them (20 minutes). Removing human review entirely would be negligent, and the liability sits with the firm. The win is parallelization: the AI handles the repetitive parts; the lawyer supplies the judgment.
3. Change Management Is 50% of Success
Paralegals initially worried about job displacement. Reframing as "AI does boring parts, you do interesting parts" was critical. Three paralegals upskilled into AI quality reviewers. Firm's culture shifted from "paralegal = document grunt" to "paralegal = AI trainer + negotiation support".
4. Compliance Requires Immutable Audit Trails
Every AI decision (classification, extraction, summary) is logged with source, model version, confidence score, and human review result. Required for law firm insurance and bar association audit. Adds 5% overhead but non-negotiable.
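A logged record with the fields listed above might look like the following. The exact schema is an assumption; the point is that everything an auditor needs (source, model version, confidence, human verdict, timestamp) is captured at write time and then stored append-only.

```typescript
// Hypothetical audit-trail record; in the described stack this row
// would be inserted into a PostgreSQL audit table, never updated.
type AuditRecord = {
  documentId: string;
  action: "classification" | "extraction" | "summary";
  modelVersion: string;
  confidence: number;        // model confidence in [0, 1]
  humanReview: "approved" | "corrected" | "pending";
  timestamp: string;         // ISO 8601, set at write time
};

function auditRecord(
  documentId: string,
  action: AuditRecord["action"],
  modelVersion: string,
  confidence: number,
  humanReview: AuditRecord["humanReview"] = "pending",
  now: Date = new Date()
): AuditRecord {
  if (confidence < 0 || confidence > 1) {
    throw new Error("confidence must be in [0, 1]");
  }
  return {
    documentId, action, modelVersion, confidence, humanReview,
    timestamp: now.toISOString(),
  };
}
```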
5. Model Drift Is Real
After 3 months, accuracy on new documents dropped to 91% (firm started handling more real estate deals, fewer M&A). Solution: monthly retraining on last 100 deals + human-flagged errors. Now accuracy stays at 97%.
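The monthly retraining trigger can be sketched as a rolling accuracy check over human-reviewed results. The 100-deal window matches the text; the 94% accuracy floor is an assumed threshold, not the firm's stated value.

```typescript
// Decide whether the model has drifted enough to retrain, based on
// human-review verdicts (true = AI output was correct), newest last.
function needsRetraining(
  recentCorrect: boolean[],
  window = 100,        // last 100 deals, per the retraining policy
  accuracyFloor = 0.94 // assumed threshold
): boolean {
  const recent = recentCorrect.slice(-window);
  if (recent.length === 0) return false; // no evidence yet
  const correct = recent.filter(Boolean).length;
  return correct / recent.length < accuracyFloor;
}
```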
Competitive Advantage
Before: 120–150 matters/month at average USD 8k per matter = USD 1–1.2M revenue per month.
After: capacity for 410–500 matters/month at the same pricing = USD 3.3–4M in potential monthly revenue (the realized uplift to date is the conservative USD 620k/year figure above).
Firm can now:
- Undercut competitors on price (same margin, lower cost)
- Maintain pricing and capture the extra throughput as profit
- Hire one more senior lawyer (business development) while maintaining capacity
- Reduce paralegal headcount by 1–2 and upskill remaining team
Competitors who don't automate will lose talent (paralegals leave for AI companies), lose deals (can't match turnaround time), and shrink.
Roadmap (Year 2+)
- Legal research automation: AI searches jurisprudence (Argentine Supreme Court decisions, regional courts) for precedent relevant to current matter. Parallelize with document prep.
- Opposing counsel style analysis: Identify patterns in how opposing counsel structures clauses (aggressive, conservative, etc.). Preempt likely positions in negotiation.
- Contract risk scoring: Quantify risk of each clause. Flag deals with unusual risk profile before lawyer sees them.
- Client portal: Clients upload documents directly. AI classifies and firm sends summary + required actions for signature. Reduce intake calls by 50%.
Risks and Mitigation
Risk 1: AI hallucinates clauses that don't exist
Mitigation: Confidence-score thresholds. Extractions are accepted automatically only above 90% confidence; anything below is flagged for human review. No hallucinations observed in the 6-month pilot.
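The threshold rule reduces to a routing function. The 90% cutoff comes from the text; the `Extraction` shape is illustrative.

```typescript
type Extraction = { clause: string; confidence: number };
type Routed = { auto: Extraction[]; humanReview: Extraction[] };

// Auto-accept extractions above the confidence threshold; everything
// else goes to the human-review queue.
function routeByConfidence(items: Extraction[], threshold = 0.9): Routed {
  const auto: Extraction[] = [];
  const humanReview: Extraction[] = [];
  for (const item of items) {
    (item.confidence > threshold ? auto : humanReview).push(item);
  }
  return { auto, humanReview };
}
```

Note the strict `>`: a borderline 90% extraction falls to human review, which is the conservative side to err on.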
Risk 2: Regulatory pressure (AI in legal decisions)
Mitigation: AI never makes binding decisions. Always human lawyer signs off. Audit trail documents every decision. Firm's insurance covers AI-assisted work (verified with insurer).
Risk 3: Client confidentiality
Mitigation: The fine-tuned model is trained only on the firm's historical data (no external training data). Documents are stored encrypted in Supabase. Matter data is sent to the Claude API only where this has been disclosed to the client and approved.
Bottom Line
Legal service automation isn't about replacing lawyers. It's about multiplying lawyer productivity. This firm's example shows a realistic path: identify 4 repetitive workflows, automate 80% of each, keep humans for judgment and sign-off.
The result: 3.4× capacity with roughly flat operating costs (slightly lower, in fact: fewer paralegals, though in higher-paid roles). Competitive advantage for 12–18 months before competitors copy.
For law firms: the question isn't whether to automate, but which workflows to automate first. Start with highest-volume, lowest-judgment work (document classification, clause extraction). Success breeds organizational buy-in for more ambitious automation (legal research, deal negotiation support).
