D4 eDiscovery Service Blog
Aug 15

3 Different Document Production Formats and Their Pros and Cons

Let us start with a simple proposition: Not all production requests or productions are created equal. There are good ones, and there are bad ones, and we all know when we’ve been on the receiving end of a bad one. Here is what is generally accepted as far as production formats are concerned. Next time I’ll discuss the different metadata fields that should be requested in productions.

Most productions take one of three forms: (1) tiff/text, (2) searchable pdfs or (3) native file productions. Each of these formats has advantages and disadvantages, which are discussed below. The formats generally look as follows:

1. Tiff/Text: All documents are converted from native files to Black & White Single Page Group IV Tiffs. Separate, document level text files are also provided for each record. Lastly, image (opt) and metadata (dat) load files are provided, containing whatever information was available for each record.

2. Text Searchable PDFs: Essentially the same as above is provided, except that instead of simply exporting the converted images, those images are combined into document level pdfs on export and then OCR’d to make them searchable.

3. Native File Productions: Here, the native file, renamed to its Bates number and often including a confidentiality designation, is provided. Separate document level text files are also provided, as well as a metadata load file.
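To make the load files mentioned above more concrete, here is a minimal Python sketch of what one metadata (dat) line and the image (opt) lines for a single document might look like. The delimiters and column order follow commonly seen Concordance and Opticon conventions, which is an assumption on my part; the function names, Bates prefix, volume label and folder name are all hypothetical, and your vendor's specification controls.

```python
# Delimiters commonly seen in Concordance-style .dat files (assumed here,
# always confirm against the agreed production specification).
FIELD_SEP = "\x14"   # ASCII 20, separates fields
QUOTE = "\u00fe"     # thorn character, wraps each field value


def dat_line(values):
    """Wrap every value in the text qualifier and join with the field separator."""
    return FIELD_SEP.join(f"{QUOTE}{v}{QUOTE}" for v in values)


def opt_lines(prefix, first_number, page_count, volume, image_dir, width=6):
    """Build one Opticon-style line per page image.

    Assumed column order: image key, volume, image path, document break,
    folder break, box break, page count. Only the first page of a document
    carries the document break flag and the page count.
    """
    lines = []
    for i in range(page_count):
        bates = f"{prefix}{first_number + i:0{width}d}"
        doc_break = "Y" if i == 0 else ""
        pages = str(page_count) if i == 0 else ""
        lines.append(
            f"{bates},{volume},{image_dir}\\{bates}.tif,{doc_break},,,{pages}"
        )
    return lines


if __name__ == "__main__":
    # Hypothetical two-page document produced as ABC000001-ABC000002.
    print(dat_line(["ABC000001", "ABC000002", "Custodian A"]))
    for line in opt_lines("ABC", 1, 2, "VOL001", "IMAGES"):
        print(line)
```

The point of the sketch is simply that the dat file carries one row of metadata per document, while the opt file carries one row per page image and marks where each document begins.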


Tiff/Text productions generally cost more, since most vendors charge to convert a native file into static images. They are also prone to error: not everything converts correctly to image format, and outside counsel is often pulling and inserting documents until the very last minute, which can create numbering issues.
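One way to catch the numbering issues described above before a production goes out is to check the begin and end Bates numbers for gaps and overlaps. The following is a rough Python sketch, not a description of any particular tool. It assumes a single Bates prefix for the whole production and that each number ends in digits; the function names are hypothetical.

```python
import re


def parse_bates(bates):
    """Split a Bates number such as 'ABC000123' into ('ABC', 123)."""
    match = re.fullmatch(r"(.*?)(\d+)", bates)
    if match is None:
        raise ValueError(f"Bates number has no trailing digits: {bates!r}")
    return match.group(1), int(match.group(2))


def find_numbering_issues(ranges):
    """Report gaps and overlaps in a list of (begin, end) Bates pairs.

    Assumes all documents share one prefix, so only the numeric part
    is compared.
    """
    parsed = sorted((parse_bates(b)[1], parse_bates(e)[1]) for b, e in ranges)
    issues = []
    expected = None  # next page number we expect to see
    for begin, end in parsed:
        if expected is not None:
            if begin > expected:
                issues.append(("gap", expected, begin - 1))
            elif begin < expected:
                issues.append(("overlap", begin, min(end, expected - 1)))
        expected = max(expected or 0, end + 1)
    return issues


if __name__ == "__main__":
    docs = [
        ("ABC000001", "ABC000003"),
        ("ABC000004", "ABC000004"),
        ("ABC000007", "ABC000008"),  # pages 5-6 were pulled
        ("ABC000008", "ABC000009"),  # page 8 was assigned twice
    ]
    print(find_numbering_issues(docs))
```

A gap is not always an error (documents are sometimes intentionally withheld or slip-sheeted), so treat the output as a list of items to confirm rather than a list of mistakes.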

One tip is that documents with redactions must be re-OCR’d no matter what type of production you are providing; if you fail to re-OCR the redacted images, you will in fact be producing the text that you redacted out of the image as part of the extracted text from the file.
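A simple quality check for this problem can be sketched in a few lines of Python. The sketch assumes you can export, per document, the phrases that were redacted (a hypothetical export; not every review platform offers one) and compares them against the text file you are about to produce. The function name is made up for illustration.

```python
def leaked_redactions(produced_text, redacted_phrases):
    """Return the redacted phrases that still appear in the produced text.

    Comparison ignores case and collapses whitespace, so a phrase that
    wraps across lines in the text file is still caught.
    """
    normalized = " ".join(produced_text.lower().split())
    return [
        phrase
        for phrase in redacted_phrases
        if " ".join(phrase.lower().split()) in normalized
    ]


if __name__ == "__main__":
    text = "The  settlement amount\nis $5 million."
    print(leaked_redactions(text, ["settlement amount is", "John Doe"]))
```

If the function returns anything, the text file was most likely built from the original extracted text rather than from OCR of the redacted image. Because OCR can misread characters, an empty result is reassuring but not proof that nothing leaked.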

For Text Searchable PDF Productions, I have to be frank. I don’t know why these exist in this day and age, but I continue to see them being transferred about in the industry. My best guess is that the opposing party may request them from you when they do not have and are unwilling to spend the money on a review platform. They can open up each pdf, run a text search in it, review it and move on.

The biggest disadvantage is that most processing tools do not provide the ability to incorporate text into the pdf as it is being created from the images. You are then left with using the system to re-OCR the pdfs to make them searchable on the back end. OCR is never as good as extracted text: the OCR engine will always make a mistake somewhere, so it is less desirable than using the text extracted directly from a native file.

For Native File Productions, the biggest fear that counsel has to grapple with is that they may be missing something in the metadata or the hidden text that could be used against them if the other side discovers it. Hidden text encompasses the likes of track changes and speaker notes, which are often rife with the thoughts of the document creator.

It is important to review this information before you produce the file, because it will undoubtedly be contained in both the native file and the extracted text from that file. The review team should examine these files in native form (an option in most robust review platforms) so that they can see this information and be confident when the production rolls around.
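As a supplement to (not a replacement for) native review, a script can flag which files are likely to contain hidden content so reviewers know where to look. The Python sketch below relies on the fact that modern Office files (.docx, .pptx) are zip archives: it looks for tracked-change markup and a comments part in Word files, and for notes slides in PowerPoint files. It assumes the conventional `w:` namespace prefix, does not handle legacy .doc or .ppt files, and the function name is hypothetical.

```python
import zipfile


def hidden_content_flags(path_or_file):
    """Return a set of hidden-content indicators found in a .docx or .pptx."""
    flags = set()
    with zipfile.ZipFile(path_or_file) as archive:
        names = archive.namelist()

        # Word: tracked insertions/deletions live in the main document part.
        if "word/document.xml" in names:
            xml = archive.read("word/document.xml").decode("utf-8", errors="replace")
            # Trailing space avoids matching unrelated tags such as <w:insideH>.
            if "<w:ins " in xml or "<w:del " in xml:
                flags.add("tracked changes")

        # Word: comments are stored in their own part.
        if "word/comments.xml" in names:
            flags.add("comments")

        # PowerPoint: each slide's speaker notes get a notes slide part.
        if any(name.startswith("ppt/notesSlides/") for name in names):
            flags.add("speaker notes")
    return flags
```

A call such as `hidden_content_flags("memo.docx")` returns a set like `{"tracked changes", "comments"}` when those indicators are present. A flag only means the part exists; a notes slide can be empty, so the file still needs a human look in native form.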

The key is keeping all of this in mind when you are negotiating with the other side for form of production. Obviously, some forms of production are more cost efficient, but carry a slightly higher risk of exposure.

Next time, I’ll cover the types of metadata fields that you can and should request from the opposing party.
