Top eDiscovery Challenges and How to Overcome Them
by Aware
Electronic discovery might have been around since the 1960s, but legal teams still struggle to shift from a reactive to a proactive state. The result is investigations that are scattered, expensive, time-consuming, and incomplete. Shifting left in the Electronic Discovery Reference Model (EDRM) lifecycle helps investigation teams scale, budget, and secure their data more effectively. In this post, we explore what’s holding you back and how to overcome the challenges presented by the digital workplace.
What’s holding legal teams back?
Data management is increasingly complex in workplaces packed with multiple collaboration tools. Research by Enterprise Strategy Group (ESG) found that almost half of organizations used 6-10 such platforms, while a further 37% used 11-20.
That might seem like a lot, but workplace collaboration isn’t just limited to Slack and Microsoft Teams. When considering the likes of Confluence, Jira, Trello, Asana, Salesforce, and Google Workspace, it’s easy to see how the number adds up. And in all of these tools, employees are exchanging messages, documents, and data on a daily basis.
Getting to that data is often a slow, complicated process. Legal teams might not know all the places where employees can collaborate, and even when they do, they might not have the visibility they need to extract that data. Slack eDiscovery, for example, often requires an Enterprise Grid account; otherwise, users will have to petition Slack for their own data, which may require a court order.
Then there’s volume to consider. Collaboration tools have overtaken email as the primary source of workplace communications data, and Aware research shows that just 1,000 employees will send over 400,000 messages every month. That’s more than 4.8 million per year, and all of it is discoverable electronically stored information (ESI). Conducting efficient searches at that scale is an enormous part of the challenge of modern eDiscovery.
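To make that scale concrete, the arithmetic can be sketched in a few lines of Python. The 400,000 messages per month per 1,000 employees is the figure cited above, treated here as a floor. The assumption that volume scales linearly with headcount, and the function name itself, are illustrative only.

```python
# Back-of-envelope estimate of discoverable message volume.
# Assumption (illustrative only): volume scales linearly with headcount.
MESSAGES_PER_1000_EMPLOYEES_PER_MONTH = 400_000  # floor, from the figure cited above


def estimate_message_volume(employees: int) -> dict:
    """Return the estimated minimum monthly and annual message counts."""
    monthly = employees * MESSAGES_PER_1000_EMPLOYEES_PER_MONTH // 1000
    return {"monthly": monthly, "annual": monthly * 12}


if __name__ == "__main__":
    for headcount in (1_000, 5_000, 25_000):
        est = estimate_message_volume(headcount)
        print(f"{headcount:>6} employees: "
              f"{est['monthly']:,}/month, {est['annual']:,}/year")
```

Even as a rough floor, the numbers show why manual review stops being realistic well before an organization reaches enterprise size.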
The data in these tools also represents a new challenge to timely investigations. It comes in nonstandard and complex formats, and—unlike email—collaboration conversations can flow seamlessly between public and private channels, direct messages, and threaded chats. Legacy eDiscovery tools designed for the comparatively rigid structure of emails simply cannot cope with the complexity they encounter in these data sets, and that means information gets lost, missed, and misinterpreted.
People communicate differently in collaboration
It only takes a glance at company Slack or Teams channels to see that this form of workplace communication looks very different from legacy formats like email. It’s chatty, casual, and packed with slang, acronyms, emojis, GIFs, and attachments.
Collaboration tools also encourage the overlap between business and personal conversations. This can complicate the blanket application of rules and searches—two friends using expletives in DMs is not the same as an employee cursing in public channels, for instance.
In collaboration, context is king. Messages often lack the structure of email that can provide contextual clues, such as a title, previous email chain, or attachments. Instead, Slack and Teams chats exist in a vacuum that leaves investigators in the dark. Surfacing the surrounding messages alongside potential policy violations can add clarity to investigations and improve their outcomes.
This is where AI-powered filtering can really make a difference for legal teams, by helping to identify the messages most likely to be relevant or risky when the scale of data sprawl makes manual review an almost impossible task.
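Surfacing the messages around a potential violation is simple to illustrate. The sketch below is a minimal example, not any vendor's implementation: the `Message` fields, the `with_context` helper, and the `is_hit` callback (which stands in for whatever keyword rule or model flags a message) are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Message:
    channel: str
    ts: float   # send time, seconds since epoch
    user: str
    text: str


def with_context(messages, is_hit, before=2, after=2):
    """Pair each flagged message with its neighbours from the same channel.

    Returns a list of (flagged_message, context_messages) tuples, where
    context_messages includes the flagged message itself.
    """
    # Group chronologically by channel so context never mixes conversations.
    by_channel = {}
    for msg in sorted(messages, key=lambda m: m.ts):
        by_channel.setdefault(msg.channel, []).append(msg)

    results = []
    for channel_msgs in by_channel.values():
        for i, msg in enumerate(channel_msgs):
            if is_hit(msg):
                start = max(0, i - before)
                results.append((msg, channel_msgs[start:i + after + 1]))
    return results
```

A reviewer who sees the two messages before and after a hit can usually tell a joke between friends from a genuine policy violation far faster than one who sees the hit alone.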
The meaning of messages is constantly in flux
Language is always evolving, but some parts of it are evolving faster than others. The crying laughing emoji (😂), for instance, has been one of the most widely used over the past 15 years. However, Gen Z and Gen Alpha more often use the sobbing (😭) or skull (💀) emojis in its place.
Understanding language and sentiment fluctuations is an essential part of interpreting context and meaning, but nonstandard language features are making this an increasingly complex task that is often split along generational lines. An older employee may indicate approval or acknowledgement with a 👍, but a younger one may take that as a passive-aggressive slight.
Further complicating matters, courts have now started to define what emojis mean in the workplace.
- In July 2023, a Canadian court ruled that a 👍 emoji was valid acceptance of a binding contract
- In August 2023, a U.S. federal court ruled that 🚀📈💰 emojis counted as financial advice
For legal teams, the problems of emojis (and GIFs, memes, and similar editorial content) may begin even before interpretation. Simply collecting this data from collaboration tools is a challenge for legacy tools. Can your current solution capture inline emojis, or do they only display as Unicode squares? What about emojis as reactions, or custom emojis created by employees? Missing any of these can result in vital context being overlooked.
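One way to check whether emoji data survived collection is to inventory it per message. The sketch below assumes a Slack-style record in which inline emojis appear as `:shortcode:` text and reactions sit in a `reactions` list with `name` and `count` keys. That shape, the `emoji_inventory` helper, and the `custom_names` parameter are assumptions for illustration, so check them against your own export before relying on them.

```python
import re

# Matches :shortcode: tokens; the lookarounds skip times such as 10:30:45.
SHORTCODE = re.compile(r"(?<!\w):([a-z0-9_+\-]+):(?!\w)")
REPLACEMENT_CHAR = "\ufffd"  # left behind when a character fails to decode


def emoji_inventory(message: dict, custom_names=frozenset()) -> dict:
    """Summarize the emoji evidence carried by one exported message."""
    text = message.get("text", "")
    inline = SHORTCODE.findall(text)
    return {
        # Emojis typed into the message body.
        "inline": inline,
        # Subset of inline emojis that are workspace-specific.
        "custom": [name for name in inline if name in custom_names],
        # Emojis added by other users as reactions, with their counts.
        "reactions": {r["name"]: r.get("count", 0)
                      for r in message.get("reactions", [])},
        # A non-zero count suggests emojis were lost somewhere upstream.
        "possibly_lost": text.count(REPLACEMENT_CHAR),
    }
```

Running an inventory like this over a sample of collected data is a quick sanity check: if reactions are always empty or `possibly_lost` is frequently non-zero, context is being dropped before review even starts.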
Collaboration messages aren’t static
Another top concern for legal and compliance officers is the dynamic nature of collaboration. Tools like Slack and Teams leave custodians in charge of their messages. At any point, a user can unilaterally edit or delete past conversations, and if they weren’t captured in real time, they may be gone forever.
When you’re in a reactive state, you may start an investigation already missing half the story. There is no mechanism in collaboration tools to stop employees from exfiltrating data, harassing coworkers, or sharing company-sensitive information in the wrong channels and then erasing the evidence. By simply editing a few words, meaning can be changed, doubt infused, and key context altered—and there is no way to preserve a record of these edits natively.
Emails, by contrast, exist as independent documents. Deleting the email from a sender’s outbox does not delete it from a recipient’s inbox. Moreover, recovering a deleted email is much easier than recovering an edited Slack message. Given Aware’s research shows that approximately 1 in 50 collaboration messages are edited or deleted, the scale of this problem is significant.
The only way to ensure these records are captured is through real-time ingestion. Batch ingesting may be the default model of some DLP solutions, but that is not enough to preserve a complete record in environments where employees can send and delete messages in an instant.
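The difference between real-time and batch capture can be shown with a toy example. The sketch below assumes a hypothetical event stream where each event is a dict with `id`, `ts`, `type` (`posted`, `edited`, or `deleted`), and `text`. Neither the event shape nor the class name reflects a specific product's API.

```python
from collections import defaultdict


class MessageLedger:
    """Append-only record of every version of every message (real-time capture)."""

    def __init__(self):
        # message id -> list of (event_ts, kind, text) in arrival order
        self._versions = defaultdict(list)

    def apply(self, event: dict) -> None:
        kind = event["type"]
        if kind not in {"posted", "edited", "deleted"}:
            raise ValueError(f"unknown event type: {kind}")
        text = None if kind == "deleted" else event["text"]
        # Never overwrite: each change becomes a new entry in the history.
        self._versions[event["id"]].append((event["ts"], kind, text))

    def history(self, message_id):
        return list(self._versions.get(message_id, []))


def batch_snapshot(events):
    """What a one-off export taken after all events would see:
    final text only, with deleted messages missing entirely."""
    state = {}
    for event in sorted(events, key=lambda e: e["ts"]):
        if event["type"] == "deleted":
            state.pop(event["id"], None)
        else:
            state[event["id"]] = event["text"]
    return state
```

Feed both the same events and the gap is obvious: the ledger keeps the original wording of an edited message and the full text of a deleted one, while the snapshot holds only whatever was left at export time.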
Data review is slow and complicated
When you’re under pressure to meet deadlines, collaboration tools present yet another roadblock: raw JSON exports. Slack exports, for example, are not intuitive to read or interpret, so it can take hours to comb through them and make sense of their contents. Worse, they don’t import easily into existing eDiscovery tools or workflows, necessitating additional manual work.
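As a rough illustration of the manual work involved, the sketch below flattens a Slack-style JSON export into CSV rows that a reviewer can open in a spreadsheet. It assumes each message is an object with `ts` (epoch seconds as a string), `user`, `text`, and optional `thread_ts` and `edited` keys. Treat that shape and the `export_to_csv` name as assumptions and verify them against your own export.

```python
import csv
import io
import json
from datetime import datetime, timezone

FIELDS = ["sent_at_utc", "user", "thread_ts", "edited", "text"]


def export_to_csv(export_json: str) -> str:
    """Convert a JSON array of messages into chronologically ordered CSV text."""
    messages = json.loads(export_json)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for msg in sorted(messages, key=lambda m: float(m["ts"])):
        # Epoch strings are unreadable in review; emit ISO 8601 in UTC instead.
        sent = datetime.fromtimestamp(float(msg["ts"]), tz=timezone.utc)
        writer.writerow({
            "sent_at_utc": sent.isoformat(timespec="seconds"),
            "user": msg.get("user", ""),
            "thread_ts": msg.get("thread_ts", ""),  # empty for unthreaded messages
            "edited": "yes" if "edited" in msg else "no",
            "text": msg.get("text", ""),
        })
    return out.getvalue()
```

This only covers the simplest case. Attachments, reactions, and channel metadata each need their own handling, which is where the hours of manual effort tend to go.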
The more collaboration tools are in place across the organization, the greater the challenge investigators face in consolidating all their data into a single, centralized taxonomy for like-for-like comparison.
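Consolidation usually means mapping each tool's records onto one shared schema. The sketch below shows the idea for two sources, assuming simplified Slack-style (`user`, `ts`, `text`) and Teams-style (`from.user.displayName`, `createdDateTime`, `body.content`) records. The `UnifiedMessage` schema and the helper names are hypothetical, and real exports carry many more fields.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UnifiedMessage:
    source: str
    author: str
    sent_at: datetime  # always timezone-aware UTC
    text: str


def from_slack(raw: dict) -> UnifiedMessage:
    # Slack-style timestamps are epoch seconds stored as strings.
    sent = datetime.fromtimestamp(float(raw["ts"]), tz=timezone.utc)
    return UnifiedMessage("slack", raw["user"], sent, raw.get("text", ""))


def from_teams(raw: dict) -> UnifiedMessage:
    # Teams-style timestamps are ISO 8601; normalize a trailing "Z" for fromisoformat.
    sent = datetime.fromisoformat(raw["createdDateTime"].replace("Z", "+00:00"))
    return UnifiedMessage("teams", raw["from"]["user"]["displayName"],
                          sent, raw["body"]["content"])


NORMALIZERS = {"slack": from_slack, "teams": from_teams}


def unify(records):
    """Turn (source, raw_record) pairs into one chronologically sorted timeline."""
    unified = (NORMALIZERS[source](raw) for source, raw in records)
    return sorted(unified, key=lambda m: m.sent_at)
```

Each additional platform needs its own normalizer, so the mapping work grows with every tool the organization adopts.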
Aware simplifies eDiscovery for collaboration
Aware was purpose-built for eDiscovery within collaboration data. Our AI models are trained specifically on this data set and are continually refreshed to reflect the evolution in how we communicate, no matter the channel we use. Using Aware, legal teams can search all their collaboration tools from a single platform. Searches can begin from custodian, keyword, channel type, and more, and results can then be refined by parameters including time, sentiment, message type, and content modifications for more targeted, granular results.
- Real-time ingestion captures a complete record of all messages
- A centralized, defensible archive consolidates data from the entire collaboration stack
- AI/ML workflows enrich messages to enhance federated search and filtering
- Proprietary NLP delivers the industry’s leading sentiment and toxicity analysis
- Ingest and normalize collaboration exports into PDF, RSMF, CSV and more
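The search-then-refine pattern described above can be illustrated generically. The sketch below is not Aware's API. The `Record` fields and the `search` and `refine` functions are hypothetical, and the example covers only a few of the parameters mentioned (custodian, keyword, channel type, time, and content modifications).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Record:
    custodian: str
    channel_type: str  # e.g. "public", "private", "dm"
    sent_at: datetime
    text: str
    edited: bool = False


def search(records, custodian=None, keyword=None, channel_type=None):
    """Broad first pass: every criterion that is supplied must match."""
    hits = []
    for r in records:
        if custodian is not None and r.custodian != custodian:
            continue
        if keyword is not None and keyword.lower() not in r.text.lower():
            continue
        if channel_type is not None and r.channel_type != channel_type:
            continue
        hits.append(r)
    return hits


def refine(results, start=None, end=None, edited_only=False):
    """Narrow an existing result set by time window and edit status."""
    return [
        r for r in results
        if (start is None or r.sent_at >= start)
        and (end is None or r.sent_at <= end)
        and (not edited_only or r.edited)
    ]
```

Keeping the broad search and the refinement as separate steps lets a reviewer tighten or loosen the filters without rerunning the initial query.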
Aware enables investigators to make informed decisions about the content they’re reviewing, reducing time to context and delivering faster, more accurate results. That helps legal teams shift left by proactively preparing for the inevitable in complex collaboration datasets.