USA: An Introduction to eDiscovery

Legal teams that handle eDiscovery matters today face unprecedented pressure due to various economic, regulatory and technological trends. While these tensions are burdensome, they are also serving as accelerators for innovation and evolution within the eDiscovery space, especially around information governance and document review. This overview outlines the key stressors and challenges impacting legal teams today. We also highlight the innovative ways our clients are evolving to overcome these challenges.

Key Challenges Impacting Legal Teams  

Economic volatility, high inflation rates and the near-constant threat of recession

The current economic climate has led to significantly reduced legal budgets and lay-offs, forcing attorneys to do the same (or higher) volume of work with fewer resources. On the in-house side, corporate legal departments are facing intense pressure to become a value-add to the business, rather than operating primarily as a legal guardian and cost centre. In the same vein, to compete with the growing number of consulting firms and alternative legal service providers, attorneys at law firms are expected to add value in areas beyond what has traditionally been deemed “legal work,” such as information governance. As their clients face the same economic headwinds, pressure to move towards early settlement or matter resolution is also reducing the number of billable hours corporate clients require from their outside counsel. This means that attorneys must expand their client portfolios and skills, bridging into new practice areas, to meet their billable obligations.

Record volumes of cloud data and constantly evolving data sources

The prevalence of cloud-based collaboration and communications platforms since 2020 has dramatically increased corporate data volumes and created a host of novel data challenges for eDiscovery teams. Mobile data, messaging platforms and collaboration tools have added a layer of complexity and nuance to how data must be handled for eDiscovery processes. Employees are increasingly using a wider variety of third-party platforms, applications and devices to collaborate on a project or discuss a single subject matter. These technologies may not be integrated with each other and are often not designed or implemented with downstream legal and eDiscovery implications in mind – leading to significant identification, preservation, collection, processing and review challenges. It is also easier than ever for employees to download and use unsanctioned communication applications for work purposes, creating additional legal risk, including for eDiscovery and enterprise data leakage. Recognising that individuals are increasingly using alternative methods to evade record-keeping and discovery, government agencies and regulators have increased scrutiny around the use and management of third-party systems. Over the past two years, regulators have issued large penalties (often targeted at financial and technology companies) related to record-keeping violations for improper retention of communications data. In February of 2023, the Department of Justice (DOJ) also issued new directives and compliance guidelines around third-party messaging applications for regulated entities.

Expected disruption caused by generative AI 

At the same time, generative AI is poised to disrupt the legal and eDiscovery world on an unprecedented scale. Kick-started by the release of ChatGPT in November of 2022, the unique user accessibility offered by publicly available generative AI tools has led to an AI renaissance. Companies and employees across every industry are testing the technology’s capabilities, looking for untapped opportunities to reduce the burden and risk of manual tasks. While the potential use cases are exciting, the rapid pace and scale of generative AI adoption by corporations (and their employees) raise significant concerns for legal teams, especially around the eDiscovery, data security and client confidentiality impacts and risks. On the eDiscovery side, the groundswell of generative AI technology use has the potential to create a deluge of novel business-related data and applications. However, there are currently more questions than answers regarding how to handle this type of data within the parameters of discovery law (eg, Who is the “custodian” of content created with generative AI? Who takes responsibility for misinformation in generative AI-created content?). It also remains to be seen how agencies and government bodies will regulate and/or oversee the increased use of AI across industries. If the sea change from paper discovery to eDiscovery is any guide, however, tackling the challenges posed by generative AI will require the strategic adoption of advanced analytics and innovative workflows to meet the moment.

Innovative Trends to Overcome Current eDiscovery Challenges

To overcome the challenges outlined above, innovative eDiscovery teams are taking a more holistic and proactive approach to managing the data that may end up in document review queues. Some of the key strategies are set out below. 

Becoming more proactive and strategic about the decisions made to manage in-place data

As noted above, the massive volume and complexity of managing and reviewing data from cloud sources are driving up the cost of document review and increasing risks for eDiscovery teams. This challenge will only increase with the addition of data created by generative AI technology. In-house legal teams are realising the importance of partnering and collaborating with internal and external stakeholders (eg, IT, compliance, data privacy and information governance experts) to ensure that all data is effectively managed from the very beginning of the data lifecycle within the enterprise. Organisations should have a well-thought-out, standardised, and documented information governance framework, including a robust records retention and deletion policy and practical guidance for employees. Company policies must consider enterprise systems (and how they are used by employees) – especially around communication and collaboration platforms.

Partnering with internal and external stakeholders with expertise on applicable cloud-based platforms and other new technologies is also becoming increasingly important to reduce legal and eDiscovery risks. Cloud-based platforms and applications (eg, M365, Google Workspace, Discord, Slack or Signal), and any new generative AI technologies, must be implemented and monitored in a way that ensures controls and measures are in place for appropriate preservation, disposal, and monitoring of data.

Similarly, forward-thinking legal teams are implementing data reduction strategies long before data is exported for review. Data reduction can be faster and more effective when it is applied to in-place data within the corporate enterprise. Native search and reporting tools within Microsoft Purview and Google Workspace, for example, can be used to interrogate in-place data and identify correct data sources, vet search results before committing to a large collection, and sample and validate data collections – all before data is exported for review. Well-crafted electronically stored information (ESI) protocols can also be used to narrow the scope of what data sources are collected before data is even identified or collected for review.
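As a rough illustration of what sampling and validating a candidate collection could look like, the sketch below draws a simple random sample from a set of search hits and estimates the proportion that a reviewer judges relevant, with a normal-approximation confidence interval. The function name `estimate_hit_rate`, the `is_relevant` callback (a stand-in for a human reviewer's call) and the sample size are all hypothetical; the sketch does not reflect the behaviour of Microsoft Purview, Google Workspace or any other named tool, and the normal approximation is only one of several ways to compute such an interval.

```python
import math
import random


def estimate_hit_rate(doc_ids, is_relevant, sample_size, seed=0, z=1.96):
    """Estimate the share of relevant documents from a simple random sample.

    Returns (point_estimate, lower_bound, upper_bound) using a
    normal-approximation interval (z=1.96 is roughly 95% confidence).
    """
    population = list(doc_ids)
    if not population:
        raise ValueError("doc_ids must not be empty")

    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(population, min(sample_size, len(population)))

    n = len(sample)
    relevant = sum(1 for doc_id in sample if is_relevant(doc_id))
    p = relevant / n

    margin = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical usage: 1,000 search hits, with a toy rule standing in
# for the reviewer's relevance judgement on each sampled document.
search_hits = range(1000)
rate, low, high = estimate_hit_rate(search_hits, lambda d: d % 4 == 0, 200)
print(f"Estimated relevance rate: {rate:.1%} (interval {low:.1%} to {high:.1%})")
```

A low estimated rate would suggest refining the search before committing to a large collection; the acceptable threshold is a matter-specific judgement that the sketch does not attempt to set.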

Taking a more technology-forward approach to eDiscovery with advanced AI and analytic eDiscovery technology

The excitement around the public release of tools like OpenAI’s ChatGPT and Google’s Bard ushered in explosive interest in using generative AI technology for eDiscovery tasks. However, this initial public discourse by industry commentators is slightly premature, given the technological, privacy, security and ethical limitations and risks of using those tools to analyse client data in a legal environment. But while legal media headlines, continuing legal education panels, and webinars have primarily discussed newly released generative AI tools as brand-new technology, the AI building blocks behind those tools are not new to experienced tech developers within the eDiscovery space. In recent years, natural language processing (NLP) and large language models such as BERT have been used to develop advanced AI tools built specifically for eDiscovery. Those building blocks are what make tools like ChatGPT and Bard so effective at understanding and analysing human language. And unlike newly released generative AI tools, the advanced AI tools being used in eDiscovery today are purpose-built to accurately learn from and classify corporate data within a closed environment for legal review. In fact, some have already been approved for data reduction and disposition in regulatory settings by government agencies.

As experienced technology providers continue to work on developing generative AI tools to modernise eDiscovery, legal and eDiscovery teams should be learning how to leverage existing advanced AI and analytic tools to manage many of the challenges highlighted above more effectively. Integrating these technologies into review workflows can drastically and defensibly reduce the volume of documents that require “eyes on” human review and help teams review the remaining data faster and more accurately. They are also highly effective at mitigating instances of inadvertently producing sensitive and privileged documents and can automate tasks that have traditionally required manual work.

For example, advanced analytic technology can replace traditional linear review teams with a combination of AI and linguistic models to deliver production-ready responsive review results at a fraction of the price of traditional review. These tools can also be used to build more accurate classification models with rapid stabilisation rates and low training set requirements for privilege review. This enables review teams to evolve past labouring over outdated, overly broad keyword search term reports and attorney name lists, and significantly reduce time spent on expensive privilege review. Similarly, these types of advanced analytic tools can be used to automate the creation of privilege logs and as a quality control check to ensure that existing privilege logs are consistent and accurate.

Equally important, advanced analytic tools can help isolate key documents much earlier on in a matter, and even help prove a negative inference or theory. This early access to information empowers corporate legal teams and law firms to make better, more strategic decisions from the outset of a matter. For organisations with large eDiscovery portfolios, this can translate into millions of dollars in savings year over year.

The unprecedented level of excitement around using generative AI for document review is a clear sign that legal teams are in desperate need of technology that can alleviate the burdens of modern data. Indeed, it is becoming imperative for eDiscovery teams to learn how to integrate existing advanced AI and analytic technology into review workflows to manage today’s growing and dynamic datasets. Forward-thinking attorneys can use this new cultural momentum to push their teams and organisations towards a more technologically forward mindset. In turn, this mindset will enable those teams to evolve alongside AI, putting them in a better position to take advantage of the role that generative AI will undoubtedly play in revolutionising eDiscovery in the future.

Minimising the need to re-review the same data across multiple matters

Despite growing data volumes, legal teams are still paying for review teams to repeatedly review the same data for similar issues across multiple matters – sometimes hundreds or even thousands of times. Due to the challenges outlined above, that approach is becoming untenable. Forward-thinking legal teams are finding ways to challenge the status quo by leveraging the time and money spent on past reviews to inform their work on new matters. This approach helps significantly reduce legal spend and improve consistency across matters. There are a variety of methods that can be used to minimise the need for re-review across matters, ranging from more programmatic strategies that leverage advanced AI tools across matters to more manual strategic review workflows created for clusters of related matters.

On the programmatic end, the advanced AI tools we mentioned above can be used to identify when the same or similar documents appeared in other matters and inform reviewers how that document was coded previously. These tools can also be used to create custom classifiers for content within documents that remains somewhat static across matters (eg, privilege, personally identifiable information and trade secrets). These AI classifiers are trained with the company’s data and continue to learn as each new matter is ingested, rendering their predictions more accurate than out-of-the-box third-party classifiers or outdated search terms. This accuracy translates into reduced review cost and time overall, while greatly minimising the need to re-review the same or similar documents for those categories in new matters. Importantly, they also help transform the costliest portion of eDiscovery (document review) into a value-add. Data and attorney work product from past matters can provide a treasure trove of unprecedented insight – not only for case teams working on ongoing matters but across other sectors of a company’s business (compliance, HR, IT, etc). Tapping into these insights with the power of advanced AI can help in-house and outside counsel stand out by handing those teams the keys to making more strategic and efficient business decisions.
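The simplest form of this cross-matter lookup can be sketched as a content fingerprint matched against an index of earlier coding decisions. The example below is a minimal illustration only: it catches exact duplicates after whitespace and case normalisation, whereas the tools described above also identify similar documents and use trained classifiers, which this sketch does not attempt. The function names, dictionary fields and matter identifiers are hypothetical.

```python
import hashlib
import re


def fingerprint(text):
    """SHA-256 hash of the text after collapsing whitespace and lowercasing."""
    normalised = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def build_coding_index(prior_documents):
    """Map each fingerprint to the (matter_id, coding) pairs from past reviews."""
    index = {}
    for doc in prior_documents:
        key = fingerprint(doc["text"])
        index.setdefault(key, []).append((doc["matter_id"], doc["coding"]))
    return index


def prior_coding(index, text):
    """Return earlier coding decisions for an identical document, if any."""
    return index.get(fingerprint(text), [])


# Hypothetical documents coded in two earlier matters.
past_reviews = [
    {"matter_id": "M-001", "text": "Board minutes, Q1  meeting.", "coding": "privileged"},
    {"matter_id": "M-002", "text": "Lunch menu for Friday", "coding": "non-responsive"},
]
index = build_coding_index(past_reviews)

# A new matter ingests the same minutes with different spacing and case.
history = prior_coding(index, "board minutes, q1 meeting.")
print(history)
```

Surfacing the earlier decision to the reviewer, rather than applying it automatically, leaves the final call with the case team, since the same document may be coded differently depending on the issues in each matter.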

Other methods of minimising the need for redundant review across matters include creating a “junk bank” to avoid paying review teams to repeatedly re-review the same objectively non-responsive documents, working with review experts to create custom workflows across multiple databases for clusters of related matters (eg, multi-district litigation or matters with joint defence teams), and reusing coding on commonly requested documents (eg, corporate board minutes or regulatory documents).

Partnering with specialised experts and technology-forward review teams to create more efficient and effective review workflows

As data volumes become larger and more complex, legal teams are pushing to move past the “one size fits all” approach to document review. Asking document reviewers to review and tag each document in their assigned batch across a myriad of categories (eg, responsiveness, privilege, confidentiality, protected personal information, trade secret, issue codes or hot documents) is neither effective nor efficient given today’s data volumes.

Forward-thinking legal teams are partnering with specialised experts to optimise review by segregating review tasks into distinct workflows for better, faster outcomes.

Linguistic experts can create or optimise keywords and search terms at the outset of a matter to defensibly reduce review datasets by hundreds of thousands to millions of documents. Linguists and information retrieval experts can use linguistic modelling to craft and execute sophisticated search queries that identify matter issues and key documents faster and more accurately than a linear review team approach. This speeds up document review overall and improves accuracy, as reviewers have fewer categories to assess for each document.

AI and advanced analytic experts can defensibly and significantly reduce the amount of data that needs linear, eyes-on review by using the appropriate technology for the given review goals and matter data. They can also help explain the necessity and defensibility of AI and analytic workflows to clients, opposing counsel, judges, and government agencies.

Experienced review strategy consultants understand when and how to bring in specialised experts, manage and motivate review teams, and create an optimised review plan that ties all disparate workflows together cohesively for final production.

Technology-forward document review teams are trained to review with data reduction and efficiency in mind. These teams can identify workflows and data that are slowing review times or creating review challenges before they have more lasting downstream effects on review progress. Using document review teams who have this training reduces the cost, risk, and time of document review overall.


As this overview outlines, innovation and evolution are necessary to address the challenges of economic volatility and emerging data sources. The legal industry is best positioned to innovate and evolve through the powerful combination of AI technology and specialised expertise (such as linguists and information retrieval specialists), which can manage the growing volumes and complexity of data efficiently, effectively, and defensibly. More specifically, with cost and risk reduction in mind, legal teams and eDiscovery providers must:

– focus their innovation on methods to improve the data management and review process;

– leverage AI to effectively identify the information legal teams need, as well as to reduce the amount of “eyes on” costly and time-consuming review required for larger data volumes;

– push more of the data culling and classification upstream into the enterprise and find ways to leverage AI in that process, to reduce data volumes hosted with third-party eDiscovery providers – this will reduce costs and mitigate risks created by corporate data being dispersed across multiple providers; and

– take a holistic approach towards managing the enterprise data lifecycle (ie, use native tools in enterprise systems such as M365 and Google Workspace to reduce downstream costs).