I'm no data scientist. In fact, I'm not a scientist at all. I'm not a programmer or a linguist. No PhD, no computer science degree. Heck, I'm not even a particularly highly qualified technical person. Nope. I'm just an operations guy and an educator. People working in operations are generally good at getting things done. Hopefully, we get things right the first time and move on to the next project. Sometimes we get things wrong, and the goal then is to learn something and get it right the next time.
It's no different in legal technology and e-discovery. There's been debate throughout the legal industry about which software product is the superior tool for conducting technology-assisted review (TAR). I've been part of more discussions than I care to recount about the TAR process, the available tools, and the people using them. I'm not aware of any scientific study demonstrating that any particular TAR software or algorithm is dramatically better or, more importantly, significantly more accurate, than any other. In the end, it seems to me that the only real problem with TAR tools (all of them) is the people who use them.
That’s not just the opinion of a somewhat cynical operations guy. It’s true. And I would not write it if it weren’t.
I've managed my share of TAR projects. I've used, or seen used, the various flavors of TAR and the outcomes these products produce. To be clear, none of them are perfect, and not all of them exceed all expectations in all circumstances. In addition, for nearly two years I've been associated with a large group of practitioners and thought leaders (lawyers, judges, and e-discovery professionals) who volunteer their time for the EDRM at Duke Law School. There has been healthy debate, a lot of discussion, and even a little discord as we draft a white paper on TAR. We even debated the origin, the use, and the propriety of the TAR phrase itself. At this point, I think I have read most of the literature (the majority of which, by the way, does not originate in the legal industry), and I've gone out of my way to talk to the data scientists, the linguists, and the programmers.
A few days ago, I began wondering what is known to be true about TAR that everyone in the e-discovery space should be able to agree upon.
First, TAR is not artificial intelligence. I know, I know, some folks have taken to lumping TAR under the general umbrella of AI-related tools. And I get it. But when you cut through the chaff of the marketing hype, TAR is machine learning. Nothing more; nothing less. It's the same machine learning that's been in use since the 1960s to analyze documents in other industries. There's nothing artificially intelligent about TAR. It does not think or reason on its own. TAR applications analyze the content of files and categorize them based on the examples used to "train" the software. In other words, you get out of a TAR project exactly what you put into it. Anyone who says otherwise is either not being honest or just doesn't know any better.
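I said I'm no programmer, but even a toy sketch makes the point. Here, roughly, is the kind of supervised learning at the core of a TAR tool, written in Python with the scikit-learn library. The documents and labels are invented for illustration; a real platform wraps workflow, sampling, and validation around a core like this.

```python
# A minimal sketch of the supervised learning at the core of TAR.
# The documents and labels are invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Reviewer-coded training examples (hypothetical).
train_docs = [
    "merger agreement draft attached for review",
    "quarterly earnings call transcript",
    "lunch menu for the office party",
    "fantasy football league standings",
]
train_labels = [1, 1, 0, 0]  # 1 = responsive, 0 = not responsive

# Turn text into features, then fit a classifier on the examples.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_docs)
model = LogisticRegression()
model.fit(X_train, train_labels)

# Score an unreviewed document: the model knows nothing beyond
# what the training examples taught it.
new_doc = ["revised merger agreement for signature"]
score = model.predict_proba(vectorizer.transform(new_doc))[0, 1]
print(f"Predicted probability of responsiveness: {score:.2f}")
```

Feed it good examples and it scores like the examples. That's the whole trick, and it's why the training input matters so much.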
Second, TAR works. Whatever tool you’re using, whichever algorithm you deploy, whether it’s active or passive learning, supervised or unsupervised, the bottom line is the technology works. TAR applications effectively analyze, categorize and rank text-based documents.
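The "active learning" flavor deserves its own sketch, because it's the loop many modern TAR tools run: retrain the model each time the reviewer codes a document, then surface the document the model is least sure about. What follows is a schematic illustration of uncertainty sampling on made-up documents, not any vendor's actual implementation.

```python
# Schematic active-learning loop (uncertainty sampling).
# Documents are invented; a real tool runs this over a whole corpus.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled_docs = ["merger agreement draft", "office party menu"]
labels = [1, 0]  # 1 = responsive, 0 = not responsive
pool = [
    "signed merger agreement",
    "football league standings",
    "earnings call transcript",
]

vectorizer = TfidfVectorizer()
vectorizer.fit(labeled_docs + pool)

for round_num in range(2):
    # Retrain on everything the reviewer has coded so far.
    model = LogisticRegression()
    model.fit(vectorizer.transform(labeled_docs), labels)

    # Surface the pool document whose predicted probability of
    # responsiveness is closest to 0.5 (the least certain one).
    probs = model.predict_proba(vectorizer.transform(pool))[:, 1]
    pick = int(np.argmin(np.abs(probs - 0.5)))
    print(f"Round {round_num}: review '{pool[pick]}' (p={probs[pick]:.2f})")

    # Stand-in for the human decision; in practice a reviewer codes it.
    labeled_docs.append(pool.pop(pick))
    labels.append(1 if "merger" in labeled_docs[-1] else 0)
```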
Third, using a TAR application (any TAR application) saves time and money and results in a reasonable and proportional outcome.
If there is any shortcoming of TAR technologies, the blame may fairly be placed at the feet (and in the minds) of humans. After all, it is human reviewers who select and categorize the example documents that are used to guide the machine's learning. It is humans who operate the software. It is humans who analyze the outcome. Perhaps the single most important component of any TAR process is the thoughtful, deliberate, and consistent input provided to the TAR software by human reviewers. If anything goes wrong in this "training" process, one cannot realistically expect a satisfactory outcome.
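How much does sloppy training input matter? Enough to measure. The sketch below trains the same model twice, once on clean labels and once after simulating reviewers who miss 30 percent of the responsive documents. It uses synthetic data, so the numbers illustrate the effect rather than predict any real matter.

```python
# Illustrative only: synthetic data standing in for a document
# collection, used to show how inconsistent coding hurts recall.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Simulate reviewers who miss responsive documents: flip 30% of
# the responsive training labels to "not responsive".
rng = np.random.default_rng(0)
y_noisy = y_train.copy()
missed = (y_train == 1) & (rng.random(len(y_train)) < 0.30)
y_noisy[missed] = 0

clean = LogisticRegression(max_iter=1000).fit(X_train, y_train)
noisy = LogisticRegression(max_iter=1000).fit(X_train, y_noisy)

# Recall (the share of truly responsive documents found) is the
# number courts and opposing parties care about in review.
print("Recall, consistent coding: ", recall_score(y_test, clean.predict(X_test)))
print("Recall, 30% missed in training:", recall_score(y_test, noisy.predict(X_test)))
```

Recall tends to be the first casualty of inconsistent coding, and recall is precisely what gets negotiated and defended in discovery.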
All of this raises the question: Why aren't more organizations and firms using TAR to perform document review in discovery? It's complicated. And I've heard every excuse there is. Some say it's too technical; others claim it puts lawyers out of work. More sensibly, some argue TAR is not entirely reliable, that it's a black-box technology no one understands, or that it's a hassle to negotiate the process with opposing parties.
In the end, none of these things poses a real obstacle to the use of TAR. If you really think about it, the TAR process is no different from legal practice and document review 50 years ago, when the lead lawyer on a case propagated her knowledge of the facts to younger associates or paralegals.
In my view, the primary reason more organizations are not using TAR is the failure to properly educate the legal populace about the utility of TAR, once again proving that humans are the weak link. As I suggested here a few weeks ago, legal operations personnel are in the best position to drive the discussion and more widespread use of TAR.
(This post originally appeared on Above the Law with some minor alterations.)