The Disclosure Menu in a world of big data

The volume of data that is potentially disclosable in construction disputes (and indeed disputes more generally) appears to increase exponentially with every year that passes. As Claire King and Stacy Sinclair explain, in that context choosing the right method for disclosure will not only enable a party to find and marshal the evidence it needs to support its position, but also ensure that the costs of disclosure do not become disproportionate to the value of the case or to the value that the disclosure exercise itself brings to the table.

The English law concept of “Standard Disclosure” (where a party discloses those documents: (i) on which it relies; (ii) which adversely affect its own case; (iii) which adversely affect another party’s case; (iv) which support another party’s case; or (v) which a Practice Direction requires to be disclosed)1 sits somewhere between the very extensive and expensive discovery procedures found in the US and the much narrower civil law disclosure requirements where (broadly speaking) the parties only disclose what they are relying on.

However, parties increasingly need to consider options other than standard disclosure, as well as utilising the new technological tools available, in order to ensure that the costs of standard disclosure do not become totally disproportionate to the value of the claim.

In the case of paper disclosure, parties usually know what paper they have. Here, the problem is merely locating it physically and going through it to produce the documents required by the standard disclosure test. The problem with electronically stored information (“ESI”) is that parties often do not know how much ESI they have, or even all the places where it might be found. This article examines a range of disclosure options available in big data cases, aside from traditional standard disclosure and manual review, including:

(i)     Predictive coding: increasingly an option as the technology continues to improve, and approved by the courts in the recent cases of Pyrrho Investments Limited v MWB Property Ltd2 and Brown v BCA Trading Ltd3;

(ii)    Reliance disclosure: the option favoured in international arbitrations and closely linked to the traditions of the civil law system; and

(iii)   Keys to the warehouse: a lesser known option but one suggested by Lord Justice Jackson in his 2011 lecture on “Controlling the Costs of Disclosure”4 and one which has been ordered in both an arbitration and a TCC claim Fenwick Elliott was involved with.

Before doing so, we briefly review why standard disclosure can be very expensive, and inefficient, especially in cases where there is a high volume of data.

Standard disclosure and manual review

Although solicitors (and indeed Judges and arbitrators) are becoming more and more aware of the options available on disclosure, some remain attached to manual review and standard disclosure. The cost of carrying out such reviews can, however, be disproportionate to the benefits of doing so, even when aided by electronic disclosure. A team of paralegals will still need to sift through the evidence assessing for relevance (although keywords may help to remove obviously irrelevant documents such as junk emails). Their results will then need to be checked, and key documents filtered upwards for the benefit of the core legal team. The parties can try to reduce the data pool by agreeing key custodians and date ranges, but in big claims with numerous custodians and terabytes of data this can only get you so far.

The cost of reviewing data manually for standard disclosure can therefore be extremely high, inevitably involving a large team of paralegals at the coal face whose efficiency will naturally slip if they spend too long on such reviews on any given day. The RAND report, Where the Money Goes, published in 2012 by a US not-for-profit organisation, concluded that some 70% of the costs associated with the e-disclosure process relate to this review function.5

Accordingly, parties need to think very hard about what benefits can be obtained from reviewing documents individually for the purposes of providing standard disclosure when compared against the costs of actually doing so.

Predictive coding

Predictive coding is a document review technology that allows computers to predict particular document classifications (e.g. relevant or privileged) based on coding decisions made by the lawyers running the claims in question. Broadly speaking, a seed set of data is coded by a senior lawyer with in-depth knowledge of the case. The predictive coding software analyses those decisions and generates further sample sets, which are coded in turn to improve the software's accuracy. Once the accuracy is acceptable, the software applies the coding across the whole data set.6
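The workflow described above can be illustrated with a minimal sketch. It is not how any commercial predictive coding product works: it assumes a toy naive Bayes word-count model, a crude whitespace tokenizer and invented example documents, and the names train, relevance_score and next_sample are hypothetical. It shows only the general loop of learning from a lawyer-coded seed set, scoring uncoded documents, and choosing the documents the model is least sure about as the next sample for human coding.

```python
import math
from collections import Counter


def tokenize(text):
    """Lower-case the text and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(seed_set):
    """Fit a tiny naive Bayes model on (text, is_relevant) pairs coded by a reviewer."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in seed_set:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def relevance_score(model, text):
    """Return the model's estimated probability (0 to 1) that a document is relevant."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    log_prob = {}
    for label in (True, False):
        # Add-one smoothing stops unseen words or an empty class from zeroing the estimate.
        log_prob[label] = math.log((doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in the seed set are ignored
                log_prob[label] += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocabulary))
                )
    diff = log_prob[False] - log_prob[True]
    if diff > 700:  # avoid overflow in math.exp for overwhelmingly irrelevant documents
        return 0.0
    return 1 / (1 + math.exp(diff))


def next_sample(model, uncoded, size):
    """Pick the documents the model is least sure about, for the next round of human coding."""
    return sorted(uncoded, key=lambda text: abs(relevance_score(model, text) - 0.5))[:size]


# Hypothetical seed set, as if coded by a senior lawyer (True means relevant).
seed_set = [
    ("delay notice for piling works", True),
    ("extension of time claim for piling delay", True),
    ("office christmas party invitation", False),
    ("lunch menu for friday", False),
]
uncoded = [
    "piling delay programme update",
    "friday party reminder",
    "site meeting minutes",
]

model = train(seed_set)
for text in uncoded:
    print(f"{relevance_score(model, text):.2f}  {text}")
print("Next for human review:", next_sample(model, uncoded, 1))
```

In this toy example the document that shares no words with the seed set is the one sent back for human coding, which mirrors the limitation noted in the text: the predictions can only be as good as the coding of the seed and sample sets.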

The system brings potentially massive cost savings given the high percentage of costs associated with the review function. However, it has limitations. The results are only as good as the coding done on the seed sets of data, which will need checking, and the approach may be less effective on very complex multi-issue cases where no one seed set will cover all the issues in dispute. Having said that, there is now some evidence that predictive coding can in fact lead to more accurate coding than manual review.7

In Pyrrho Investments Ltd v MWB Property Ltd8 the English courts approved the use of predictive coding for the first time. Other jurisdictions have arguably been further ahead of the game in this respect, with the US in particular leading the way. Indeed, the Irish courts had already approved its use the year before.9

In approving its use, Master Matthews noted that: experience in other jurisdictions had shown it can be useful in “appropriate cases”; there was some evidence that predictive coding could be more accurate than manual review alone or manual review combined with keyword searches; the costs of a full manual review of 3.1 million documents would be unreasonable; the claim was worth tens of millions, so the cost of the software was proportionate; and, if the software did not work, there was enough time to resolve the issues.10

In the more recent case of David Brown v BCA Trading (unreported) the court again approved the use of predictive coding. The law firm pushing for predictive coding estimated that it would cost one third of what a manual review would cost.

As technology continues to improve, both the advantages of predictive coding and the frequency of its use in relatively high-value cases with big data sets are set to increase.

Reliance disclosure

Reliance disclosure is a favourite in international arbitration. Here, parties only disclose, in the first instance, those documents on which they rely. In other words, parties only disclose those documents which evidence their arguments and benefit their own case.

The IBA Rules on the Taking of Evidence in International Arbitration11 are commonly used to supplement other institutional arbitration rules. Under the IBA Rules, the parties first submit those documents on which they rely, except for any documents which have already been submitted. Then, parties submit a “Request to Produce”. In this request, the party sets out a description of each document, or of a category of documents, which it is seeking from the other party and which it reasonably believes to exist. The request must state how the documents (or category of documents) are relevant to the case and why the requesting party assumes that the other party has possession of them. The arbitral tribunal then orders production of the requested documents or deals with any objections made.

Some swear by this method of disclosure, but it can cause additional costs as parties do tend to argue over the schedules of documents produced for the “Request to Produce” (otherwise known as Redfern schedules, which are meant to collect collaboratively each party’s position in respect of each document request). Parties might argue that a particular document request is unduly burdensome and/or is not relevant to the issues in the case. In addition, there may be repeated requests for disclosure of specific documents, so that disclosure becomes protracted and costly rather than being confined to a discrete, fixed period. Arguably, these disputes over requests for documents are not so dissimilar to the arguments over keywords in standard disclosure, which unfortunately are all too common.

Keys to the warehouse

The keys to the warehouse is a lesser known option in disclosure and a term that was coined by Lord Justice Jackson in his 2011 lecture on “Controlling the Costs of Disclosure”.12 In that lecture he stated:

“4.7 One possible order under sub-para (f) – the key to the warehouse. One possible order which could be made under rule 31.5 (4) (f) is that each side (after removing privileged documents) should simply hand over the ‘key to the warehouse’. In other words, each party hands over all its documents and the other side can choose which ones it wishes to use. This means that each party devotes its resources to selecting what it regards as helpful from the other side’s store of documents. That is the opposite of standard disclosure, which requires each party to examine its own documents and (in effect) to pick out the ones that it thinks will help the other side. I am aware of one recent case in which a ‘key to the warehouse’ order was made by the Technology and Construction Court.” [Emphasis added]

At its simplest then, keys to the warehouse involves handing over all of the documents potentially relevant to the dispute after having removed privileged information and, to the extent possible without a manual review, junk or completely irrelevant information. What constitutes the warehouse will obviously need to be clearly defined in order to avoid disputes between the parties at a later stage. Parties may, for example, want to agree a list of key custodians and apply date range filters to any collection as well.
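By way of illustration only, the definition of “the warehouse” described above can be thought of as a simple filter over the collected documents. The sketch below is an assumption-laden example rather than a description of any e-disclosure platform: the Document fields, the build_warehouse function and the sample data are all hypothetical, and it assumes privileged documents have already been flagged by a separate privilege screen.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    doc_id: str
    custodian: str
    sent_on: date
    privileged: bool  # assumed to have been flagged by an earlier privilege screen


def build_warehouse(documents, custodians, start, end):
    """Return the documents to hand over: agreed custodians, agreed date range, nothing privileged."""
    return [
        doc
        for doc in documents
        if doc.custodian in custodians
        and start <= doc.sent_on <= end
        and not doc.privileged
    ]


# Hypothetical collection and agreed parameters.
collection = [
    Document("D1", "project manager", date(2015, 3, 1), False),
    Document("D2", "project manager", date(2015, 4, 2), True),   # privileged: withheld
    Document("D3", "site engineer", date(2013, 1, 9), False),    # outside the date range
    Document("D4", "receptionist", date(2015, 5, 5), False),     # not an agreed custodian
]
agreed_custodians = {"project manager", "site engineer"}

warehouse = build_warehouse(collection, agreed_custodians, date(2014, 1, 1), date(2016, 12, 31))
print([doc.doc_id for doc in warehouse])
```

Writing the agreed custodians and date range down as explicit parameters in this way reflects the point made above: the clearer the definition of the warehouse, the less room there is for later disputes about what should have been handed over.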

Keys to the warehouse may become a less useful option as predictive coding becomes more advanced, but it may be useful in disputes with high volumes of documents and extensive lists of issues, which can make predictive coding very difficult. The parties will obviously need to ensure that there is clear agreement between them that, if privileged documents are accidentally disclosed, they will be returned without the other side having read them. (The wording in the TeCSA protocol is ideal in this respect.)13

The inherent issue some parties may have with this is the fear of handing over documents whose content has not been reviewed in detail before being disclosed. However, where the volumes of data are sufficiently large to make the costs of manual review significant and disproportionate, this is a very real alternative. Lawyers and paralegals can still carry out targeted searches, both on their own documents and on those handed over by the other side, to support their pleadings. At the same time, this option avoids the downsides of reliance disclosure (i.e. repeated applications for documents) and the costs of manual review.


Choosing which method of disclosure is right for any particular case, and subsequently reaching agreement with the other side, is never easy. Relevant factors include: the complexity and number of issues in dispute, the number of documents involved and the size of each party’s database, the value of the dispute, the forum of the dispute (litigation/arbitration) and the openness and willingness of each party to use new technologies. Whatever method is employed, careful advance consideration and planning are needed to ensure the process is reasonable, proportionate and efficient. In an age of Artificial Intelligence, with technological advances constantly on the horizon, electronic disclosure is destined to continue to evolve in the near future.

Watch this space…


  • 1. See CPR 31.10
  • 2. [2016] EWHC 256 (Ch)
  • 3. [2016] EWHC 1464 (Ch)
  • 4. See Lord Justice Jackson’s lecture on “Controlling the Costs of Disclosure”, Seventh Lecture in the Implementation Programme, the LexisNexis Conference on Avoiding and Resolving Construction Disputes, 24 November 2011
  • 5. See Nicholas Pace and Laura Zakaras, Where the Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery (the RAND Institute for Civil Justice), April 2012
  • 6. See Kroll OnTrack, Mastering Predictive Coding: The Ultimate Guide, key considerations and best practices to help you increase ediscovery efficiencies and save money with predictive coding 2014
  • 7. See paragraph 33 (2) of Pyrrho Investments Ltd v MWB Property Ltd [2016] EWHC 256 (Ch)
  • 8. [2016] EWHC 256 (Ch)
  • 9. See Irish Bank Resolution Corporation Limited and others v Sean Quinn and others [2015] IEHC 175
  • 10. See paragraph 33 of Pyrrho Investments Limited v MWB Property Limited [2016] EWHC 256 (Ch)
  • 11. The IBA adopted new rules on 29 May 2010, which supersede the 1999 version
  • 12.
  • 13.
