Source Code Protective Orders, from the perspective of a source-code examiner
Andrew Schulman
SLC
Written Sept. 2019
This is the first in a series of articles on source code and how it is handled, examined, and analyzed in litigation. This first article addresses the impact that Protective Orders (POs) have on the source-code examination. Separate articles will discuss source-code discovery disputes, aspects of the source-code exam not directly related to the PO, and so on.
Note that any descriptions here of any particular methodology for source-code examination are exemplary, and not intended to describe strict requirements.
Attorneys working in software-related litigation (such as patent, copyright, and trade secret cases) often agree to protective order (PO) restrictions on source code access, without thinking through the implications for how the source code will be examined by experts and consultants. Certain common PO restrictions may dramatically increase source-code examination time and expense, or in some instances even possibly affect its thoroughness. (A forthcoming article will define “source code” for litigation purposes.)
As a simple example, consider a software copyright case, in which source code from the defendant and plaintiff must be compared against each other to look for literal as well as substantial similarity. Sometimes the two sides will blithely agree to a PO in which each side’s source code must be kept on a separate standalone computer; the two source-code computers will likely be at law offices in two separate cities, and the PO prohibits copying between the two computers. Comparing the two bodies of source code under these circumstances is, to say the least, challenging. That such a PO clause was agreed to in the first place may indicate that the PO was viewed too abstractly, without considering how it would impact a crucial part of the case’s fact finding. (By the way, it actually is possible for source-code examiners to do some minimal comparison under these circumstances, using hand-written notes and a tokenization/sampling method, but this is a last resort.)
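To make the parenthetical above more concrete, the following is a minimal sketch of one way a tokenization/sampling comparison could work; it is an assumption for illustration, not a description of the author's actual method. The function names (`tokenize`, `sampled_fingerprints`) and the parameters `k` and `modulus` are hypothetical. The idea is to run the same script on each standalone machine, reduce each code base to a short, sampled list of hash values, and compare the two lists by hand. Whether even hash values may be written down and removed from the room depends on the particular PO.

```python
import hashlib
import re

# Identifiers, runs of digits, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|\S")


def tokenize(text):
    """Split source text into tokens, discarding all whitespace and layout."""
    return TOKEN_RE.findall(text)


def sampled_fingerprints(text, k=8, modulus=16):
    """Hash every run of k consecutive tokens and keep a sample of the hashes.

    Only hashes whose value is divisible by `modulus` are kept, so the list
    stays short enough to write down. Because the sampling rule depends only
    on the hash value, the same run of tokens is kept (or dropped) on both
    machines.
    """
    tokens = tokenize(text)
    kept = set()
    for i in range(len(tokens) - k + 1):
        window = " ".join(tokens[i:i + k])
        digest = hashlib.sha1(window.encode("utf-8")).hexdigest()[:8]
        if int(digest, 16) % modulus == 0:
            kept.add(digest)
    return kept


if __name__ == "__main__":
    plaintiff = "int total = 0; for (i = 0; i < n; i++) total += a[i];"
    defendant = "int total=0;\nfor (i=0; i<n; i++)\n    total += a[i];"
    # modulus=1 keeps every hash, which is only practical for tiny inputs.
    shared = (sampled_fingerprints(plaintiff, k=4, modulus=1)
              & sampled_fingerprints(defendant, k=4, modulus=1))
    print(len(shared), "shared fingerprints")
```

Matching fingerprints only flag candidate regions for closer reading; this sketch does not strip comments or normalize renamed identifiers, and a match says nothing by itself about substantial similarity.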
The producing party’s law firm, which is hosting its client’s source code for inspection by the other side (we’ll here use the terms “receiving party” and “producing party”), may well understand the PO’s impact on the source-code exam, and may be insisting on certain PO restrictions as much for the adverse effect on the receiving party’s source-code exam time and expense, as to actually protect any trade-secret or confidential nature of the source code itself. Restrictive POs may be insisted upon as a tit-for-tat response to the other side’s “fishing expeditions” or discovery abuse, especially when the other side is a so-called “troll” for whom mutuality of discovery burdens doesn’t hold.
This article urges attorneys, especially those receiving source code productions, to think through how a PO will impact the source-code exam, and if possible ask their expert or non-testifying consultant to review, up front, a proposed PO, rather than later presenting the expert with a fait accompli (which one wit has defined as “French for ‘the train has left the station’”).
While the author happens to also be a lawyer, he is mainly a source-code examiner, and this article is based on about 20 years of experience in source-code examinations for patent, copyright, and trade-secret litigation. The source-code exam environment is heavily affected by the PO, and the main point of this article is to discuss this impact. The material here is applicable to software litigation generally, including patent, copyright and trade-secret cases.
Some typical PO restrictions
Some typical PO restrictions on source-code access are listed below; this list includes both reasonable and unreasonable restrictions (the following list does not cover any clauses that might simply spell out e.g. which types of source code are to be produced, or plain-text vs. inside-version-control format):
- “Attorneys eyes only” (AEO), including outside experts; “Outside attorneys eyes only” (OAEO): no access to source code by the opposing party itself, possibly including in-house counsel, and especially including its in-house engineers; and no sharing even of quotations from source code (thus, a party’s own expert reports and claim charts, when based on the other side’s source code, may need to be redacted, or not shared at all with the party itself).
- Patent prosecution bar: no source-code access for litigation attorneys also helping the other side acquire patents, at least in related areas (see Unwired Planet v. Apple; and what has been called “Geotag v. The Known Universe”).
- Source-code provider receives resumes (CVs) of the other side’s proposed source-code examiners, and may object if the examiner could pose a competitive risk (see Symantec v. Acronis); the examiner may even be “tainted” by access to other source code in previous cases. Note that this standard PO clause of course lets the producing party probe what would otherwise (under FRCP 26(b)(4)(D)) be unavailable information on the identity and background of those of the receiving party’s non-testifying consulting experts who will be examining source code. (Of course, this is because the producing party must know who is going to be accessing its source code.)
- Source code only accessible on a “standalone” protected computer, without an internet connection. This is intended to prevent opposing experts and consultants from leaking a company’s “crown jewels” to the internet. While a properly-crafted non-disclosure agreement (NDA), signed by professional source-code examiners, would generally provide many of the same protections as a PO, here the locked-down computer does prevent inadvertent expert leakage through e.g. Google searches.
- Protected source-code computer only accessible M-F 9-6 at producing law firm site or escrow facility. This can have a dramatic effect on an expert’s ability to follow up on 2AM brainstorms.
- “No analysis of source code” outside the restricted source-code room: no printing for later analysis; all analysis, i.e. careful reading of the code, must be done in situ. This restricts whether the testifying expert can read source-code print-outs generated by non-testifying consultants. It is difficult to think of a legitimate reason for such clauses, except perhaps to limit printing (see “Forensic Software Analyst May Not Study Printed Source Code” here).
- No ability to copy anything (not just source code) to or from the protected source-code computer: no accessible USB ports, DVD/CD drives, etc.
- Thus, no ability for an examiner to copy software-examination tools to the source-code computer; tools must be agreed upon beforehand, and attorneys unfortunately often agree to the set of tools, without consulting experts as to the tools suited for the particular job (see below on tools). The result is especially problematic when source code has been produced inside version-control files, but matching version-control software such as SVN is absent.
- Further, no ability for examiner to copy notes, examiner’s own script output, etc. from the source-code computer; see below on hand-written note-taking. Non-adverse clients may wish to permit their outside examiners to copy notes to an encrypted flash drive for removal from the source-code review room. However, it’s possible that the same restrictions, applied to the other side’s examiners, should also be applied to one’s own outside examiners, if the underlying reason for the restriction is a “reasonable security precaution” (RSP) to maintain trade secret (TS) status (see underlying reasons discussed below).
- “No copying of source code” interpreted to mean no direct quotations from source code in expert reports, claim charts, etc.; one work-around is to cite proper nouns (function names, variable names, etc.) from source code, without transcribing lines of code, or any portions of lines that contain “logic” (basically, anything with an equal sign). Sometimes not even this is allowed, and the examiner must paraphrase all function/method names in its work product.
- Any memo or other subsequent document which quotes from, or perhaps even refers to, source code will thereby come under the same PO restrictions as the source code itself. As noted above, this may mean that a party can only see a redacted version of its “own” expert’s report which references the other side’s source code.
- Source-code computer only has standard operating-system-provided tools (e.g., findstr, grep, diff, awk, cscript), or those explicitly agreed to in the PO (e.g., SciTools Understand, PowerGrep, Apple XCode, Microsoft Visual Studio, dtSearch, WinMerge, Sigasi (for VHDL or Verilog code), Cygwin (Linux-like environment for Windows), text editors for printing, such as Notepad++, TextWrangler or EditPadPro, etc.). For a PO naming such tools, see here. Some parties decide (though not addressed in the PO) to turn off command-line access, which seriously impacts source-code examination by some examiners whose regular practice is to employ command-line tools.
- Examiner may not bring a laptop, mobile devices, etc. into the source-code room; usually internet searches must be done in separate “break-out” room (which is typically provided).
- No laptop in the same room as source code means: only hand-written notes allowed; in some cases, the producing party may have an opportunity to inspect or copy examiner’s hand-written notes. See EPL Holdings v. Apple on note-taking. It is important to preserve all hand-written source-code notes, in case the producing party is later permitted to inspect or copy them. An examiner should carefully read the PO before a source-code examination, and should bring a copy of the PO with them to the source-code exam, to check should questions arise regarding note-taking, printing, command-line access, etc.
- Limitations on quantity of source code files, pages, and/or lines which may be printed; the PO may also restrict print-outs to source-code files themselves, not allowing the (often important) printing of directory listings, script output, etc. (see e.g. Apple v. Samsung, 2013 WL 1563253, preventing Apple from taking printouts of output from the “diff” program: “… if the diff reports are Apple experts’ notes, those notes necessarily include source code, and removal of notes with source code references is prohibited under the Protective Order”).
- Printouts are generally reviewed by the producing party, before production to the requesting party; this in some ways is the very purpose of a source-code exam (see forthcoming article on the purposes for source-code examinations); the producing party may thus glean early inferences into the other side’s case, and this in turn yields a perverse incentive to delay printing source-code files until the last-possible minute.
- Proctor often may sit in room with examiner, or just outside door of source-code room; some POs address whether the proctor may actively monitor the source-code computer screen, keystrokes, file accesses, or searches.
- Sometimes lengthy sign-in/sign-out procedures are required even for brief coffee or toilet breaks: sometimes this results from a genuine (if possibly false) mystique around the source code’s value, but one additional goal may be sheer annoyance (see “we’re really sticking it to them” below).
- Lack of an internet connection prevents the examiner from comparing the produced source code with e.g. public open source. Lack of web access further prevents copying VeryLongNamesLikeThisOneHere from the source into Google; such a search is often important to determine if code corresponds to a known standard. However, even querying for a source-code snippet can leak confidential information, and preventing even inadvertent leaks is largely the point of turning off internet access. Lack of web access also blocks internet-based tools, such as JavaScript beautifiers, Java deobfuscators, and Google Translate (useful e.g. for source-code comments and strings in Chinese or Korean). Again, this is consistent with the very purpose of the PO, but attorneys should be aware of the impact on the source-code exam.
- Continuing impact of the PO, past the life of the case, including not only continued protection for source code and for litigation-created materials based on that source code, but also possible attempted restrictions on future work by experts and consultants who have been exposed to the source code (e.g. BIAX v. Brother Int’l. on party’s “arrogant” attempt to restrict opposing expert from similar work for four years, based on exposure to source code; such a restriction would seriously limit the “pool” of experts needed by the patent litigation system).
In other words: no laptop in room, hand-written notes only, no direct quotations from source, no internet connection, no cutting and pasting “veryLongFunctionNamesLikeThis” into Google (instead, write them down on paper, walk down the hall, type them in); limited printing; all analysis must be done in situ.
While some of these restrictions are likely reasonable under some circumstances, as a whole they are reminiscent of the Thatcher Library vault room scene in Citizen Kane.
Typical source-code PO restrictions are also discussed in a law review article by Lydia Pallas Loren & Andy Johnson-Laird, “Computer Software-Related Litigation: Discovery and the Overly-Protective Order,” 6 Fed. Courts L.R. (2012); the article covers e.g. security for printed source code in transit; proscribed items in source-code room; requirement that examiner only take handwritten notes; restriction on “studying” source code outside source-code room; proctors; etc.
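As an illustration of working within the no-internet restriction described above, an examiner might first gather the distinctive long names in the produced code, so that the hand-written list carried to the break-out room is as short and targeted as possible. The following is a minimal sketch under stated assumptions: a scripting language such as Python happens to be among the agreed tools on the source-code computer, and the function names (`long_identifiers`, `scan_tree`), the 20-character threshold, and the file suffixes are all hypothetical choices, not requirements of any PO.

```python
import re
from collections import Counter
from pathlib import Path

IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def long_identifiers(text, min_len=20):
    """Count identifiers of at least min_len characters in one file's text."""
    return Counter(name for name in IDENT_RE.findall(text)
                   if len(name) >= min_len)


def scan_tree(root, suffixes=(".c", ".h", ".java"), min_len=20):
    """Total the long-identifier counts over every matching file under root."""
    totals = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            totals += long_identifiers(path.read_text(errors="replace"),
                                       min_len)
    return totals


if __name__ == "__main__":
    sample = ("x = VeryLongNamesLikeThisOneHere(a);\n"
              "y = VeryLongNamesLikeThisOneHere(b) + short(c);\n")
    for name, count in long_identifiers(sample).most_common():
        print(count, name)
```

Names that occur often and look like standard or library terminology are usually the most useful candidates for a later search; as noted above, even a single searched name can leak information, so the choice of what to look up remains a judgment call for the examiner.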
Blanket source-code PO
Overly restrictive source-code POs often reflect a confusion (or deliberate mystification) of source code, as though “The Source Code” were, in and of itself, a trade secret (TS) and/or confidential business information (CBI), merely by virtue of being source code.
However, much of the information in source code (though certainly not all) is often already public (most clearly in the case of “open source”), or readily ascertainable e.g. by reverse engineering a product on the market (apart from license restrictions on reverse engineering, which may not apply in litigation; see author’s article on reverse engineering to investigate software patent infringement). Consider for example a PO on Java source code for an Android app, when the app itself is publicly available, has not been obfuscated, and thus can be readily decompiled (if nothing else, its decision not to obfuscate its publicly-available code may be a factor in determining whether the producing party truly regards its code as TS).
A blanket PO is typically placed over the entire body of produced source code, even when that produced source largely contains open source (albeit perhaps with significant vendor modifications), and even if the owner has not indicated the presence of any specific trade secrets (e.g. server code only directly accessible behind a firewall) and/or of sensitive materials (e.g. code relating to passwords, security, or encryption) contained within the source code. The most confidential portions of the source code are often “comments” by the programmers, which are generally removed before distribution of software to consumers, and hence not susceptible to reverse engineering. But ironically, such comments are unlikely to hold up to a trade-secret analysis of whether the owner derives economic benefit from others not knowing them.
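One rough way to gauge how much of a produced tree “largely contains open source” is to look for well-known license text near the top of each file. The sketch below is only a first-pass triage under stated assumptions: the marker list is a small, hypothetical sample of common license phrases, the function names (`has_license_marker`, `triage_tree`) are invented for illustration, and the 4000-character header window is arbitrary.

```python
from pathlib import Path

# A few phrases that commonly appear in open-source license headers.
LICENSE_MARKERS = (
    "SPDX-License-Identifier",
    "GNU General Public License",
    "Apache License",
    "Permission is hereby granted, free of charge",       # MIT-style opening
    "Redistribution and use in source and binary forms",  # BSD-style opening
)


def has_license_marker(text, head_chars=4000):
    """Return True if any marker phrase appears in the first head_chars."""
    head = text[:head_chars]
    return any(marker in head for marker in LICENSE_MARKERS)


def triage_tree(root):
    """Split the files under root into (flagged, unflagged) path lists."""
    flagged, unflagged = [], []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            text = path.read_text(errors="replace")
            (flagged if has_license_marker(text) else unflagged).append(str(path))
    return flagged, unflagged


if __name__ == "__main__":
    print(has_license_marker("/* SPDX-License-Identifier: MIT */\nint x;"))
    print(has_license_marker("int vendor_only(void);"))
```

A flagged file may still carry significant vendor modifications, and an unflagged file is not thereby a trade secret; the output is at most a starting point for deciding which portions of the code deserve closer attention.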
Not surprisingly (“if everything is special, then nothing is special”), blanket treatment of a company’s source code as a single TS can easily lead to under-appreciating the specific TSes which really can reside in source code. See below on TS compilations.
A company often doesn’t know its own TS before the issue comes up. Of course, the company as a whole has knowledge of its own technology, but it may not have thought through which portions of its technology are actually TS (“if only we knew what we know”). Unlike patents which the company must proactively demarcate from the rest of its technology, TS (like copyright) does not require such up-front designation.
Further, most attorneys would be hard-pressed to do a TS review of a large volume of source code, similar to a “privilege review” of emails to be produced in litigation. (Note that this discussion of the TS status of source code applies to any litigation involving source code, and is not specific to TS cases, which raise a separate issue of the TS status of source code.)
Thus, it is easier for the producing party to use a blanket or umbrella designation of all source code under a PO, have the receiving party extract/select the specific files it wants to make its case, and then review those (likely few) selected items before printing/production. At this point it likely would be simple to decide whether these few items are TS, and thus whether further PO restrictions (such as “no analysis outside the source-code room”) apply. However, this opportunity for subsequent review is not typically used.
Under the Federal Rules of Civil Procedure (FRCP) 26(c)(1), a party seeking a PO has the burden of demonstrating “good cause” by showing a particularized need for the protection sought; good cause is shown “when it is specifically established that disclosure will cause a clearly defined and serious injury. Broad allegations of harm, unsubstantiated by specific examples … will not suffice” (Glenmede Trust v. Thompson; see also tobacco litigation Cipollone v. Liggett; Prozac litigation In re Eli Lilly).
The need to specifically show “good cause” appears to be ignored once source code is involved, either because source code itself, as a whole, is assumed to be a trade secret (even though no one would ever regard the near-parallel category of “company internal email” as inherently TS), or because it is (with some good reasons) seen as too time-consuming or expensive to figure out beforehand where the trade secrets are within the code as a whole.
As already noted, companies often have little idea what their valuable trade secrets actually are, until the issue comes up when employees walk out the door to start their own company (“you don’t know what you got until it’s gone”), and so many litigants would be unprepared, if pressed, to show particularized good cause (i.e., something more specific than “our source code is our crown jewels”) as typically required in PO case law.
One solution may be for the producing party to have its own engineers review the relevant source code for TS or other confidential information, prior to production to the other side. This would be feasible for smaller amounts of code, especially if the original developers are still available, but may not be feasible for large bodies of code (it’s not as if there were software that could automatically scan files for the presence of TS), and often the developers most knowledgeable about source code have left the company by the time litigation rolls around.
Because of the special blanket protective status typically bestowed on source code as a general category, and because the issue comes up again and again in modern civil litigation, there are model POs covering source code. See e.g. N.D. Cal. (section 9 on source code); E.D. Tex. (paragraph 10 on source code); D. Del. (“Default Standard for Access to Source Code”); ITC (“Source Code Provision to be inserted in Model Commission APO”).
Some of these model POs enshrine unnecessarily restrictive source-code examination practices. The scenario noted at the beginning of this article — two source-code trees requiring comparison with each other, but located on two different machines in two different cities — may arise in a copyright or TS case when adopting a court’s model source-code PO that was designed for use in a patent case. In patent litigation, source code is compared to patent claims, not with other source code, so the use of a patent-related model PO for a copyright or TS case — which is likely heavily based on source-to-source comparisons — is misplaced.
Legitimate reasons for PO restrictions on source code
Source-code POs are becoming more restrictive. Some underlying reasons are legitimate; as many appear to be illegitimate. First, some legitimate reasons for PO restrictions on source code (apart from those restrictions which could be put in place through a standard non-disclosure agreement [NDA]). Not surprisingly, all the following reasons boil down to the goal of protecting the producing party’s TS or confidential information, while still providing the receiving party with access to information to make its case in litigation:
- Keeping the other side’s engineers and patent prosecutors from seeing your source code; see AEO and patent prosecution bar above.
- Maintaining TS status (especially vis-a-vis one’s own employees) of source code, by noisily exercising “reasonable security precautions” (RSP) in litigation with outsiders. But then, the producing party’s own outside experts and consultants perhaps should (despite the inconvenience and cost) be similarly restricted.
- Protecting the likely-genuine TS or CBI status of some portions of the source code, coupled with the perceived difficulty of demarcating these portions beforehand (though shouldn’t the source-code owner know some specifics about its “crown jewels”?), may point towards a blanket PO on the source code as a whole.
- Perhaps protecting the source code as a whole as a single TS “compilation.” Even if known to be comprised largely of publicly-known information such as open source, a company’s source code might be viewed as a secret aggregation of otherwise public information (see case). While reasonable sounding, the author has never heard this put forward as a reason for a highly-restrictive PO, as opposed to a standard NDA. TS compilation status likely requires showing that the selection itself is a valuable secret that has been reasonably protected.
- Preventing inadvertent leakages (through e.g. Google searches for function names appearing in the code) that might occur, were the source code viewed on an internet-connected machine.
Even with these legitimate reasons, litigants should ask, not whether protection in the abstract is needed, but whether a highly-restrictive PO fulfills those reasons better than would an NDA, to an extent that justifies the extra burden to both parties of the PO.
Less-legitimate reasons for restrictive POs
Some less-legitimate reasons to seek a restrictive PO include:
- A desire to create an artificial “crown jewels” aura around source code. This may later backfire in damages calculations, e.g. when the source-code owner requests a restrictive PO on the basis of the valuable “crown jewels” or “secret sauce” nature of what it may later, to reduce damages, characterize as its old, stale, no-longer-used code accused of infringement. It may also backfire when the party turns out to have lost some of the source code; because the wheels of justice grind slowly, litigation in this area will often be about partially-missing older code (specific versions and dates are often crucial in source-code examination), which however the owner will have previously held up as its crown jewels.
- A desire to make source-code examination inconvenient, expensive, and time-consuming, for the opposing party (especially if that party is seen as otherwise immune to normal discovery mutuality, as discussed e.g. by Bone, Economics of Civil Procedure).
- The producing party’s law firm may enforce bizarre restrictions to keep its client happy: “we’re really sticking it to them.” Curiously, the law firm sometimes relaxes these restrictions when the client is not present.
- Cynically, the producing party’s law firm may be billing its client by the hour, and/or may charge a markup for expenses. Elaborate PO measures can be very expensive for the producing party as well as for the receiving party (whose examiners may have to fly to another city, and stay in a hotel, to examine the protected source code).
- The producing party may desire access to the receiving party’s source-code examiner’s work product, and the ability to view this may be a “side effect” of the PO.
- Desire to make the other side’s discovery less thorough; a testifying expert may be hesitant to complain of how a PO-restricted source code exam affected their work, or that of their assistants or consulting non-testifying experts. You can just hear the deposition question/answer: “Your source code access was greatly restricted, Dr. Expert, so your report is based on only a partial or limited view of the code, unlike the conclusions of my expert, who had full access to our own source code?” “But wait, you’re the one who put those very restrictions in place, despite knowing that over half the source code was open source.” “Be that as it may, nonetheless your view was restricted and my expert’s view was not?” There may be perfectly good answers, such as “Actually, no, because like I said much of the code was open source, and I examined that, and also your product was shipping with debugging symbols enabled, so the product itself is practically an open book, even without the source code. We needed your produced source code largely to confirm what we already knew from the product itself, and so your restrictions made that confirmation more expensive and time-consuming, but didn’t undermine the basis for my conclusions.” Also note the point made earlier that, if the PO purpose is to protect TS, the producing party’s outside experts perhaps should be made to operate under the same restrictions as the receiving party: “what’s good for the goose is good for the gander.”
It is possible that PO restrictions will interfere with what would otherwise be the examiner’s normal non-litigation practice (which is, in turn, a Daubert-related factor in assessing the reliability of testifying experts). At the very least, the typical source-code PO will affect the examiner’s normal working style: for example, M-F 9-6 access limits the ability to follow up on 2AM brainstorms. Nonetheless, while operating under sometimes strange PO restrictions, the examiner must do a reasonable examination of the source code.
Conclusion
Those negotiating source-code protective orders should not be overly impressed by utterance of the seemingly-magical incantation “source code.” It is far better to think of source code as akin to a company’s confidential internal emails, i.e., as just another type of document (albeit one which definitely has a more direct role in producing the company’s products; see forthcoming article comparing and contrasting source code to other forms of electronic discovery).
Attorneys negotiating source-code protective orders should try to have their technical expert or consultant assess the impact a proposed or model PO will have on the source code exam, and on the ways in which the needs of a given case may depart from the assumptions of a model PO. Attorneys should push back against certain restrictions, such as forbidding a laptop in the same room as the source code; such a ban results in a requirement that all notes be handwritten, which raises costs and could lower accuracy. At the same time, try to limit source-code discovery requests (see forthcoming article on source code discovery and abuse, and on how the receiving party can often beforehand learn the names of, and ask specifically for, relevant files and functions), so that the other side is less tempted to retaliate with, or has less excuse for, overly restrictive source-code protective orders.
See also version of this article at LinkedIn
See also version of this article at DisputeSoft
See also notes for Protective Orders chapter in Source Code book.