Source code ch.09: Discovery

Source Code & Software Patents: A Guide to Software & Internet Patent Litigation for Attorneys & Experts
by Andrew Schulman
Detailed outline for forthcoming book

Chapter 9: Source code discovery

9.1 Introduction

  • Apple v. Samsung (ND Cal 2012): “In a typical patent infringement case involving computer software, few tasks excite a defendant less than a requirement that it produce source code. Engineers and management howl at the notion of providing strangers, and especially a fierce competitor, access to the crown jewels. Counsel struggle to understand even exactly what code exists and exactly how it can be made available for reasonable inspection. All sorts of questions are immediately posed. Exactly who representing the plaintiff gets access—and does this list include patent prosecution counsel, undisclosed experts, and so-called ‘competitive decision makers’? Must requirements and specification documents that explain the functionality implemented by the code be included? What compilation, debugging and analysis tools are required? What about the test database and user manuals? Make files? Build files? Does the code have to [be] produce[d] in a native repository such as CVS or Perforce? Must daily builds in development be produced (and if so, in real-time or batch?) or is production limited only to copies in commercial release? Put simply, source code production is disruptive, expensive, and fraught with monumental opportunities to screw up.”
  • Questions from court in Apple v. Samsung, unpacked:
    • Who gets access?; see chapter 11 on protective order (PO) and chapter 15 on exam environment
    • Include docs & specs?; see WHAT below: definition of source code, narrative/explanation
    • Tools?; see chapters on PO, exam environment
    • Test DB, manuals?; see WHAT below
    • Make/build files?; see WHAT below
    • Native, version control?; see HOW format below
    • Daily builds?; see WHEN below; rolling, supplementary source-code production
  • Other questions:
    • WHY need/want source code?; relevance, “doing things with source code”
    • What specific source code is at issue: versions; partial v. all; see WHICH, HOW MUCH
    • Time, place and manner of production?; see WHEN, WHERE, and HOW (both format e.g. native, and exam environment); time/place mostly in chs. on PO, exam environment
    • Costs, cost-shifting, burdens, inaccessibility: see below on problems (really re: src code & e-discovery?), HOW MUCH
    • Missing code, incomplete production, spoliation: see below on problems
    • WHOSE source code: third-party subpoenas, 3P code intermingled, 3P interventions; depends in part on how far need to drill down; P/C/C
    • How interrelate with other discovery mechanisms, esp. rog and dep?; see WHY
    • Forms of abuse (specific to source-code discovery?), and sanctions
  • Other topics from FRCP, ACN:
    • FRCP 34 — Producing documents, electronically stored information (ESI), and inspections
    • 26(b) — Discovery scope & limits
    • 26(d), (e), (f), (g) — Timing & sequence of discovery; supplementing; conferences; signing
    • 37 — Failure to make disclosures; sanctions
    • 45 — Subpoenas (Third party discovery)
    • 33(d) — Option to produce business records in response to interrogatory: as-kept vs. categorized
  • Other discovery-related topics from Local Patent Rules:
    • Mandatory disclosure, without request, in conjunction with production of party contentions; can this otherwise-mandatory disclosure be held back in response to inadequate (or assertedly-inadequate) contentions/claim charts?
    • xxx
  • Loosely structured around questions of who, what, where, why, how, and how much, though in different order
    • WHY is source code being requested: relevance, usefulness, etc.
    • WHAT, WHICH source code specifically: versions, etc.
    • WHO has the burden of explaining, and extracting/selecting the most relevant code
    • HOW MUCH source code should be requested and produced: fishing, dumping, and the quantity case; partial vs. “all”
    • HOW to request: via PIC or specific request, including rog
    • HOW to respond to request: as-kept, vs. per-request; time, place, and manner of production (see also ch. 11 on PO); format
    • WHOSE source code?: possession, custody and control (PCC); third-party subpoenas; intermingled third-party source code
  • Followed by problems: costs, burdens, inaccessibility, sanctions, etc.
  • Relationship of source-code discovery to e-discovery (see also ASU paper on source code discovery relationship to e-discovery)

9.2 Discovery/production of source code: WHY? (relevance, usefulness, necessity) [separate subsections on how might use, what expect to find in code that not public; vs. case relevance, necessity, best evidence]

  • Know WHY you’re asking for source code: what questions do you expect to answer from source code, which cannot be answered from the accused product itself, or from non-source evidence?
  • “Doing things with source code”; see
  • Thinking through how source code will be used in the case: just something your expert needs as basis for opinion, or do you hope to use it e.g. at deposition?
  • FRCP ACN on “thinking”: ACN re: 26(b)(1)(i): “oblige lawyers to think through their discovery activities in advance”; re: 26(g): “obliges each attorney to stop and think about the legitimacy of a discovery request, a response thereto, or an objection”
  • Often, source code will simply confirm what is known or strongly suspected from other evidence
  • Source code may act as “mere” corroboration (another data point)
  • Source code produced in discovery may act as authenticated evidence (D produced this material in response to discovery request for source code to x, therefore this material is auto-authenticated as the source code for x; see case xxx; but source code produced for inspection may be a “dump” including lots of material that didn’t make it into the product, such as scaffolding and testing code, which may still be relevant e.g. re: making, but not re: selling)
  • When is source code itself the accused instrumentality, rather than “mere evidence” of the structure/operation of an accused product/service?
  • When can’t party make its case without source code?; see xxx on discovery & SJ
  • Interrelation of source code discovery with “informal discovery,” including pre-filing investigation, reverse engineering of accused product
  • Interrelation with other discovery tools, especially rog (see 33(d)) and dep
  • What information is ONLY available from source code: e.g. programmer comments; function/structure names removed in compilation; source code for server processes running behind firewall (code can be tested using network inputs & outputs, but code itself can’t be read)
  • When is source code “necessary,” as opposed to merely useful?
  • For products based on open source, when is source-code production nonetheless necessary or important, e.g. for vendor’s modifications to open source code?; see also ch. 3 on open source
  • Use of standards to bypass need for source code?; e.g. 802.11 cases; but often still need source code to see whether product is really compliant, as marketed, in a way that is relevant to the patent (see case xxx on patent litigation followed by false advertising case)
  • When don’t need source code?
  • When does a party actively NOT want source code? (see case xxx where D tried to force its source code onto P, which didn’t want to see)
  • [Reasons for resisting source-code production go here, or in problems section?]
  • Is source code ever “best evidence,” at least in the loose sense of carrying far more weight than non-source evidence?
  • Legal questions to which source code is or isn’t relevant: e.g. when is P’s source code relevant to D’s non-infringement or invalidity case?
  • Relevance of specific versions of source code: see 9.3 below on source code for past, future versions
  • Relevance of specific portions vs. entirety of source code: see 9.5 below on source-code portions
  • Standard for discoverability: relevant, or “reasonably calculated to lead to the discovery of admissible evidence” (the long-standing FRCP 26(b)(1) wording, removed by the 2015 amendments)
  • Improper reasons for requesting source code, and improper reasons for resisting its production

9.3 Discovery/production of WHICH source code?; WHAT to ask for, what to produce (specificity of request; versions)

  • Tendency of attorneys on both sides to refer to “the source code” without sufficient specificity: precise product name, version number, platform (Windows, OSX, iOS, Android, etc.)
  • Know what you’re asking for: specific products, version numbers, platforms, components (avoiding “they need to give us their codes” vague requests)
  • Source code production should include input to, and output from, any code generators, templates, in-house compilers, etc.
  • Avoiding the temptation to ask for “all code which infringes our patent” (asking D to draw legal conclusions and figure out P’s case)
  • “Asking” for source code via sufficiently-specific preliminary/initial infringement contentions, under Local Patent Rule mandatory-disclosure rules
  • Framing source-code discovery requests with specificity, based on pre-filing reverse engineering of accused product

9.3.1 Relevance of source code for past, future, and unreleased versions (WHEN is code from?)

  • Relevance of source code for past versions, including older than SOL/laches date (case xxx)
  • Relevance of source code for future versions, e.g. injunctive relief (case xxx)
  • Relevance of source code for unreleased versions, internal test versions, etc.
  • Intermingled versions (fewer source code “trees” than there are products, with #ifdef etc. for versions)
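
To make the intermingled-versions point concrete: when one source tree serves several products, the conditional-compilation macros are often the only markers of which code belongs to which version. The following is a minimal sketch in Python, not a vetted tool, that lists the macros tested by #ifdef, #ifndef, and #if defined(...) lines in C-style source text. The macro names PRODUCT_PRO and PRODUCT_LITE and the sample code are hypothetical, and the sketch ignores macros set only in build files.

```python
import re
from collections import defaultdict

# Conditional directives: #ifdef NAME, #ifndef NAME, #if ..., #elif ...
DIRECTIVE = re.compile(r'^\s*#\s*(ifdef|ifndef|if|elif)\b(.*)$')
# Inside #if/#elif lines, pick up defined(NAME) and defined NAME.
DEFINED = re.compile(r'defined\s*\(?\s*([A-Za-z_]\w*)')

def macros_by_line(source):
    """Map each macro tested in a conditional directive to the 1-based lines that test it."""
    found = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        m = DIRECTIVE.match(line)
        if not m:
            continue
        kind, rest = m.group(1), m.group(2)
        if kind in ("ifdef", "ifndef"):
            names = rest.split()[:1]
        else:
            names = DEFINED.findall(rest)
        for name in names:
            found[name].append(lineno)
    return dict(found)

# Hypothetical sample: two products built from one tree.
SAMPLE = """\
#ifdef PRODUCT_PRO
void sync_cloud(void);
#endif
#if defined(PRODUCT_LITE) || defined(PRODUCT_PRO)
void render(void);
#endif
"""

if __name__ == "__main__":
    for macro, lines in sorted(macros_by_line(SAMPLE).items()):
        print(macro, lines)
```

A list like this can help frame a request in terms of specific build variants rather than “the source code,” though which macros are actually defined for a given released product still has to come from the make/build files.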

9.3.2 Definition of “source code” for discovery & PO purposes

  • Definitions of source code in typical protective orders (POs)
  • Does “source code” include xxx, xxx?
  • Non-source docs intermingled with source-code production
  • Source code intermingled with non-source document production
  • Get experts involved early in discovery requests: xxx

9.4 Discovery/production of source code: WHOSE explanatory/selection burdens?

9.4.1 Explanatory burden, including production by request category rather than as-kept xxx [explain why this re: explanation]

  • When is the producing party required to explain or “roadmap” its source code, as part of the source-code production, rather than waiting for e.g. deposition after source-code production?
  • Hand-holding xxx
  • “Teaching” docs xxx
  • 33(d) rog response with source instead of narrative: allowed to produce as-kept vs. aligned per-request?
  • If not produced as kept, then MUST provide some organization?; see 9.7 below

9.4.2 Extraction/selection burden

  • Whose responsibility is it to select relevant source code? (D should know its code best, but P should know best what it’s looking for; case where D expected to predict P’s case xxx)
  • Burden of creating software to extract xxx
  • Purpose of source-code “review” is for one party to select/extract from the other party’s material: see xxx
  • Source-code production is usually a “review,” not quite a normal document production, and not quite an in situ inspection
  • Burden of extraction/selection may implicate e-discovery inaccessibility factors; is other party willing to do the extraction, if first party is willing to dump all under PO?

9.5 Discovery/production of HOW MUCH source code? (partial v. all; selection; fishing, dumping, and the quantity case)

  • overly burdensome requests for “all”
  • what does “all” even mean in the context of frequently-revised software?
  • overly burdensome dumping of all, “you go find it”
  • quantity case (hundreds of products, try to agree beforehand on representative instrumentalities for src code production as well as ICs)
  • be careful what you ask for, you just might get it
  • how far need to “drill down” in source code, should depend in part on level of detail in PIC? (but perhaps lack of detail in PIC arguably shows need for more detailed source-code production?)

9.6 Production of source code: time, place, and manner (WHEN, WHERE, HOW)

  • mandatory disclosure under LPR, following sufficient PIC
  • time: including supplementary requests, supplementary production w/o request, duties of supplementation, rolling production, daily builds, etc.; see ch. 14 on scheduling/timing
  • timing “games”: see xxx below on problems
  • when okay to wait; are there really tactical advantages to waiting, or really better for D to produce?; see problems xxx below
    • WHEN: need separate subsection on discovery timing, e.g. discovery & SJ
  • where: see PO chapter
  • manner: see exam environment chapter
  • format: see next section

9.7 Production of source code: format (native; version control)

  • if don’t produce as-kept, then must organize per-request?
  • As kept (native) vs. categorized by request; usable/searchable format
  • version control system
  • see ch. 15 on tools, exam environment
  • “redaction” of comments
  • producing source code as printed rather than electronic form (case xxx); sometimes both parties will agree to production “in native form,” and then production turns out to be a PDF file
  • producing source code as just another doc in non-source production (“source-codeness” has been waived?)

9.8 Discovery of source code from WHOM? (third parties; possession, custody & control)

  • Third party source-code discovery under FRCP 45 subpoena
  • Possession, custody & control of source code
  • Intermingled third-party source code (cases with third-party intervenors)
  • Depends in part on how far need to drill-down: e.g. if parties informally stipulate that function named x does x, may not need lower-level third-party code on which x relies

9.9 Source-code discovery/production problems (costs & cost-shifting; burdens; inaccessibility; missing code; fishing expeditions; sanctions)

  • Poorly-worded requests, or insufficiently detailed PICs
  • Applying Zubulake cost/benefit factors to source-code discovery; see also 9.10 re: e-discovery
  • Inaccessibility, undue cost & burden, proportionality to case need; see also xxx on who extracts/selects
  • Not burden, but “undue” burden: some burden is due; whether undue based on xxx
  • cost is nearly a substantive issue, since undue cost is a matter of proportionality, which in turn is based on what is needed to prove/rebut the case
  • Patent “trolls” and discovery asymmetry
  • Timing abuse, e.g. last-minute production; see also WHEN
  • Dumb-obedience responses and drawn-out production, in response to perceived troll abuse?; but doesn’t FRCP ACN say to read requests in “good faith”? xxx
  • strategic/tactical use of discovery, and motions; Bone on civpro economics; xxx
  • Missing code does not necessarily mean production is “incomplete”; case xxx (significance of “incomplete” production?)
  • Spoliation, missing code; see ch. 10, and ch. 16 on testing completeness of production; how loss of code relates to burdens (e.g. i4i case); inferences from lost code
  • Production of wrong code; see case xxx where parties stipulate that wrong code represents code at issue
  • “gamesmanship” or really don’t know code? (even though “crown jewels”)
  • sanctions appear to be rare; instead, party is given 14 days or 1 month to comply
  • motions to compel; meet & confer; conference; not bothering judge; whose interest to have judge annoyed?
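
One way to probe the missing-code question above is to ask whether the produced code refers to files that were not produced. The following is a rough Python sketch under stated assumptions: the production has already been read into a dictionary of path to text, only quoted C-style #include lines are checked, and files are matched by base name only. The function name and file names are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Quoted includes only; angle-bracket includes are treated as system headers and skipped.
INCLUDE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)

def unresolved_includes(produced):
    """produced: dict mapping produced file path -> file text.
    Returns {path: [included names with no same-named file in the production]}."""
    basenames = {PurePosixPath(p).name for p in produced}
    gaps = {}
    for path, text in produced.items():
        missing = [inc for inc in INCLUDE.findall(text)
                   if PurePosixPath(inc).name not in basenames]
        if missing:
            gaps[path] = missing
    return gaps

if __name__ == "__main__":
    # Hypothetical two-file production.
    production = {
        "src/main.c": '#include "codec.h"\n#include "net/session.h"\n#include <stdio.h>\n',
        "src/codec.h": "int decode(void);\n",
    }
    print(unresolved_includes(production))
```

A hit is only a lead, not proof of an incomplete production: the referenced file may be third-party code, generated at build time, or simply unused in the accused product.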

9.10 Source code & e-discovery

  • How source-code discovery same as, different from other e-discovery, ESI: see ASU article:
    • centralization, dispersal
    • searchability
    • review: inspection then production
    • verification, authentication
    • spoliation: code changes over time, crown jewels get lost?
    • form of production: fewer “native” issues; plain text; but version control (or code within IDE)
    • manner of production: PO, AEO
    • burden of explanation at same time as produce? (any non-source cases e.g. schematics, or all other forms of tech doc production assumed self-explanatory?)
  • Further points, not in ASU article:
    • Code is text, but more than text: how does this affect e-discovery?; search terms used to find, e.g. CamelCase
    • “Metadata”, especially dates; OS file date vs. date inside file; separate file dates, not just in version control, but in xxx
    • Fewer reasons for review, redaction: code unlikely to ever be privileged (is source code written during litigation, e.g. to try design-around, ever work product or attorney/client privileged?; is the cc: attorney gambit ever used with source code?)
    • Cases where comments were “redacted” (e.g. xxx)
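
On code being “more than text”: a whole-word search for a term such as “cache” will miss an identifier like getUserCacheEntry, because the word is embedded in a CamelCase name. The following is a minimal Python sketch, under the assumption that identifiers follow common CamelCase or snake_case conventions, that splits identifiers into word tokens before matching search terms. The identifiers shown are hypothetical.

```python
import re

# Split at lower-to-upper boundaries (getUser -> get, User) and at
# acronym-to-word boundaries (HTTPResponse -> HTTP, Response).
_BOUNDARY = re.compile(r'(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])')

def identifier_tokens(identifier):
    """Break a CamelCase or snake_case identifier into lowercase word tokens."""
    tokens = []
    for part in identifier.split("_"):
        if part:
            tokens.extend(t.lower() for t in _BOUNDARY.split(part))
    return tokens

def matches_term(identifier, term):
    """True if the search term is one of the identifier's word tokens."""
    return term.lower() in identifier_tokens(identifier)

if __name__ == "__main__":
    for name in ("getUserCacheEntry", "parseHTTPResponse", "MAX_CACHE_SIZE", "cachet"):
        print(name, identifier_tokens(name), matches_term(name, "cache"))
```

Splitting on empty matches with re.split requires Python 3.7 or later. Abbreviated names (e.g. “cch” for cache) defeat this kind of matching, which is one reason agreed search terms for source code may need input from someone who has seen the code’s naming conventions.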

9.11 Miscellaneous source-code discovery issues

  • Discovery of expert’s source code, used as basis for forming conclusions, e.g. economist/damages expert models
  • Source code in “discovery about discovery,” e.g. to show in-house records retention/destruction policy
  • Would source code ever be covered by attorney-client or work product privilege? (see 9.10 above)
