As of June 1, 2024, Vestige Digital Investigations is part of ArcherHall, a leading digital forensics, e-discovery, and cybersecurity service provider.
The Vestige team that you know and trust will continue to serve you at ArcherHall. Our expanded team, capabilities, and infrastructure will allow us to serve you and your clients even better.

De-Duplication

Data, data everywhere! Data is so easy to move, copy, replicate, and archive that the modern organization holds multiple copies of much of its data. A study published by the American Society for Information Science and Technology found that 47% of respondents sometimes accidentally keep more than one copy of the same document, regardless of the organization’s policies and the technologies used to control where documents are stored. Documents are duplicated for other reasons as well (e.g., failing to recognize that a document already exists and re-creating it with substantially the same information, or saving email attachments as standalone documents). The study puts the mean level of file duplication within an organization at 21.8%, meaning that “on average, 21.8% of the documents in the file system have the same name as another file”. (https://asistdl.onlinelibrary.wiley.com/doi/full/10.1002/meet.2011.14504801013)

With so much duplicated data in an organization’s corpus of documents, the discovery process often sweeps in superfluous data. This leads to: 1) increased data volume, which results in higher processing and review expenses, and 2) inconsistent reviews. In fact, Vestige has seen manual reviews in which one reviewer codes one copy of a document as non-responsive while another reviewer codes an exact copy as responsive. (We have even seen the SAME reviewer code duplicate copies differently.) Needless to say, this is a significant problem in the electronic discovery of relevant data.

Vestige can quickly and accurately identify duplicate and near-duplicate documents. We can deduplicate data across custodians (horizontal deduplication) and up and down the various sources (vertical deduplication). For example, the same document may exist in the email of 5 different custodians, and several of those custodians may also have it stored outside of the email system: some on external media, others on their hard drive or on a shared drive on a server. That single document may exist within the environment 5, 10, or even 20+ times.
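To make the idea concrete, here is a minimal sketch of exact-duplicate detection by content hashing. It is an illustration only, not a description of Vestige’s tooling: the `Document` record, its field names, and the sample data are all hypothetical, and SHA-256 is simply one common choice of hash. Because every copy is keyed by its content alone, the same grouping catches copies held by different custodians as well as copies held in different sources.

```python
import hashlib
from collections import defaultdict
from typing import NamedTuple


class Document(NamedTuple):
    """Hypothetical record for one collected file (illustrative fields only)."""
    custodian: str  # person whose data was collected
    source: str     # e.g. "email", "laptop", "shared drive"
    path: str
    content: bytes


def content_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def group_duplicates(docs):
    """Group documents by content hash; byte-identical files share a group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[content_hash(doc.content)].append(doc)
    return dict(groups)


def deduplicate(docs):
    """Keep the first copy of each distinct document, in input order."""
    return [copies[0] for copies in group_duplicates(docs).values()]


# Made-up sample data: one agreement held three times, plus one unrelated note.
docs = [
    Document("alice", "email", "inbox/contract.docx", b"Master agreement v1"),
    Document("bob", "email", "inbox/contract.docx", b"Master agreement v1"),
    Document("alice", "shared drive", "legal/contract_final.docx", b"Master agreement v1"),
    Document("bob", "laptop", "notes.txt", b"Call counsel on Friday"),
]

unique = deduplicate(docs)
for doc in unique:
    print(doc.custodian, doc.source, doc.path)
```

In this sample, the three copies of the agreement collapse to one document for review, even though they sit with two custodians, in two kinds of source, and under two file names. A byte-level hash only finds exact copies. Near-duplicates (for example, a draft with one edited sentence) need a similarity measure instead, and email usually needs to be hashed on selected fields rather than raw bytes, since the same message stored in two mailboxes is rarely byte-identical.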

If you’re interested in saving time, energy, resources, and money in your reviews while achieving a higher-quality review, talk to Vestige about how our data deduplication solutions fit your needs.