Lawyers’ Work Safe from AI Replacement, for Now

Key Takeaways:

  • New benchmarks are being developed to evaluate the ability of large language models (LLMs) to perform legal work in the real world.
  • Current LLMs have critical gaps in their reliability for professional adoption, with the best-performing model scoring only 37% on difficult legal problems.
  • LLMs frequently make inaccurate legal judgments and often reach correct conclusions through incomplete or opaque reasoning processes.
  • Professional benchmarks may still not capture the complexity of real-world legal work, which often involves subjective and challenging questions.
  • LLMs may not be trained to think like lawyers, lacking a mental model of the world and the ability to simulate scenarios and predict outcomes.

Introduction to LLMs in Legal Work
Large language models (LLMs) have drawn growing attention in legal work in recent years, and many believe they could revolutionize the field. New benchmarks, however, aim to measure how well the models handle legal work in the real world. The Professional Reasoning Benchmark, published by Scale AI in November, evaluated leading LLMs on legal and financial tasks designed by professionals in those fields. The study found that the models have critical gaps in their reliability for professional adoption: the best-performing model scored only 37% on the most difficult legal problems. In other words, it earned just over a third of the possible points on the evaluation criteria.
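To make the arithmetic behind a score like this concrete, here is a minimal sketch of rubric-style grading in Python. It assumes a score is simply the points earned on met criteria divided by the points available; the article does not describe the benchmark's actual scoring method, and the `Criterion` class, the criteria, and the point values below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One line of a grading rubric (hypothetical structure)."""
    description: str
    points: int
    met: bool


def rubric_score(criteria):
    """Return the percentage of available rubric points that were earned."""
    total = sum(c.points for c in criteria)
    if total == 0:
        raise ValueError("rubric has no points to award")
    earned = sum(c.points for c in criteria if c.met)
    return 100.0 * earned / total


# Illustrative rubric only; not taken from any published benchmark.
example = [
    Criterion("Identifies the governing statute", 3, True),
    Criterion("Applies the correct legal test", 4, False),
    Criterion("Addresses the strongest counterargument", 3, False),
    Criterion("States a clear conclusion", 1, True),
]

print(f"{rubric_score(example):.0f}%")  # 4 of 11 points earned -> prints "36%"
```

In this made-up case the answer gets the statute and the conclusion right but misses the legal test and the counterargument, so it earns 4 of 11 points. A score in the mid-30s can therefore describe an answer that looks plausible on the surface while failing the criteria that carry the most weight.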

Limitations of Current LLMs
The study’s findings are consistent with other benchmarks that measure the models’ performance on economically valuable work. The AI Productivity Index, published by the data firm Mercor in September and updated in December, found that the models have "substantial limitations" in performing legal work. The best-performing model scored 77.9% on legal tasks, meaning it satisfied roughly four out of five evaluation criteria. A model with such a score might generate substantial economic value in some industries, but it may not be useful at all in fields where errors are costly. For legal work, that leaves a reliability gap still to be closed.

Challenges of Legal Reasoning
Unlike math or coding, in which LLMs have made significant progress, legal reasoning may be challenging for the models to learn. The law deals with messy real-world problems, riddled with ambiguity and subjectivity, that often have no right answer. Making matters worse, a lot of legal work isn’t recorded in ways that can be used to train the models. When it is, the relevant material can span hundreds of pages, scattered across statutes, regulations, and court cases that sit in a complex hierarchy. Learning to apply legal reasoning across sources like these may require significant advances in natural language processing and machine learning.

Shortcomings of Current LLM Training
A more fundamental limitation of current LLMs may be that they are simply not trained to think like lawyers. "The reasoning models still don’t fully reason about problems like we humans do," says Julian Nyarko, a law professor at Stanford Law School. The models may lack a mental model of the world—the ability to simulate a scenario and predict what will happen—and that capability could be at the heart of complex legal reasoning. It’s possible that the current paradigm of LLMs trained on next-word prediction gets us only so far, and that new approaches are needed to develop LLMs that can truly think like lawyers.

Future Directions
More accurate and reliable LLMs for legal work will likely require significant advances in natural language processing and machine learning. They may also require a fundamental shift in how the models are trained, with a focus on simulating scenarios and predicting outcomes. More comprehensive benchmarks that capture the complexity of real-world legal work will be essential for evaluating progress. If these limitations are addressed, LLMs may come to support lawyers and other legal professionals and help improve the efficiency and accuracy of legal services. For now, though, LLMs are not ready to replace human lawyers.
