Chapter 23: Handling source-code quantity and versions
23.1 Source-code quantity issues
- Plaintiff-generated quantity: the “quantity case” (many products × many patents/claims)
- Defendant-generated quantity: discovery “dumping”; insistence upon duplicating nearly-identical claim charts
- Less is more: large claim charts with duplication contain less information than shorter charts without repetition (but see chapter 26 on claim charts, for judicial pressure toward repetitive claim charts)
- Situations in which requesting party wants less than was produced (e.g. request for base file plus “diffs,” rather than for all raw files)
- Prioritizing & triage; schedule & budget; “just say no”; “know when to say no”; when breadth is the enemy of depth
- Writing small scripts, using tools already available on the source-code machine, to analyze large amounts of source code
- Likely to be a relatively small number of crucial source-code files, but will also need different versions of these files (see below), and will likely need the files which call into the crucial files, and the files containing code called by the crucial files (see chapter 19 on tracing calls-to and calls-from)
- Expert/examiner’s job to sift/extract what matters from much larger quantity
- Sometimes the client appears to want quantity rather than quality
- Discovery abuse, make-work, and “turning the crank” [how does source code relate to empirical studies of discovery time and expense?]
- Attorney obligations under FRCP to “stop and think” (see chapter 9 on discovery)
- Intersection of quantity/version burdens with extra expense/time caused by typical source-code PO restrictions, which can perversely lead to over-requesting and over-printing (see chapters 15 and 24)
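One way to picture the "small scripts" mentioned above is a first-pass triage that finds byte-identical files across a large production, so that duplicated trees can be set aside before any close reading. This is a minimal sketch, assuming Python happens to be among the tools on the source-code machine; the function names are hypothetical, and it detects only exact duplicates (near-identical files still need a diff).

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so very large files do not exhaust memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_identical_files(root):
    """Map each content digest to the sorted relative paths that share it."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            groups[file_digest(full)].append(os.path.relpath(full, root))
    return {digest: sorted(paths) for digest, paths in groups.items()}


def duplicates_only(groups):
    """Keep only the path lists that contain more than one file."""
    return sorted(paths for paths in groups.values() if len(paths) > 1)
```

Calling `duplicates_only(group_identical_files(root))` on the top-level production directory yields lists of paths with identical content, which can help show how much of a "dump" is repetition across versions.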
23.2 Requesting & producing specific versions of “the source code”
- No such thing as “the source code”: specific products, versions, platforms
- Being specific (e.g. in discovery request, and in responding) about product name, version, platform
- Past, present, and future versions
- When are older/shelved versions important?: laches/statute of limitations (SOL); priority; prior-user defense; non-infringing alternatives
- When are unreleased/forthcoming versions important?: injunction; non-infringing alternatives
- Generally, what is the relevance of unreleased software in patent infringement litigation?
- Version number of source code is not always clear from the source code itself
- Mapping internal “codenames” (Denali, Yukon, Tsunami, Mt. Rainier, etc.) to product names/versions
- Ensuring that the produced source-code version matches the accused product; often it does not, and attorneys on both sides often accept the “wrong” version as a proxy (see chapter 22 on matching source code to product)
- Supplemental discovery for additional versions (see chapter 14 on scheduling for impact of delay in producing all relevant versions)
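The points above about being specific (product name, version, platform) and about codenames can be sketched as a simple bookkeeping check of what was requested against what was produced. This is an illustrative sketch only: the `Build` type, the `CODENAMES` table, and the product name "ExampleProduct" are hypothetical placeholders, and a real codename mapping would have to come from discovery responses or testimony rather than from guesswork.

```python
from typing import NamedTuple


class Build(NamedTuple):
    """One specific thing that can be requested or produced."""
    product: str
    version: str
    platform: str


# Hypothetical codename table (placeholder values, not real mappings).
CODENAMES = {
    "yukon": ("ExampleProduct", "2.0"),
    "denali": ("ExampleProduct", "3.0"),
}


def resolve(label, platform):
    """Turn a codename, or a 'Product Version' label, into a Build."""
    key = label.strip().lower()
    if key in CODENAMES:
        product, version = CODENAMES[key]
    else:
        # Assumes the label ends with a space followed by the version.
        product, _, version = label.strip().rpartition(" ")
    return Build(product, version, platform)


def compare(requested, produced):
    """Return (missing, unrequested) builds, each as a sorted list."""
    req, prod = set(requested), set(produced)
    return sorted(req - prod), sorted(prod - req)
```

The `missing` list is a natural starting point for a supplemental request, and the `unrequested` list flags productions where a different version may have been offered as a proxy.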
23.3 Changes to, and evolution of, specific source-code files
- The importance of identifying different versions of the “key” source-code files
- File comparisons, the “diff” utility, and other forms of diff (e.g. WinMerge)
- Creating version “diffs”
- Mapping evolution of relevant functions/data over multiple product versions
- Internal vs. external versions of a source-code file (scrubbed files)
- Vendor’s modified versions of open source files; difficulty of diff/compare on protected source-code machine (see chapter 15 on the source-code examination restrictions created by POs)
- “Versionitis” (“A situation in which there are many different (and possibly incompatible) versions of the same software, file or document”): ensuring that selected source-code files have matching version numbers
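Where the “diff” utility or WinMerge is not available on the source-code machine, a version diff of a key file can be approximated with a scripting language's standard library. The sketch below assumes Python is permitted and uses its `difflib` module; the helper names and the file labels are hypothetical. It produces a unified diff for two versions of a file and a rough added/removed line count across a sequence of versions, as a starting point for mapping how a function evolved.

```python
import difflib


def version_diff(old_text, new_text, old_label, new_label):
    """Return a unified diff (list of lines) between two file versions."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    ))


def count_changes(diff_lines):
    """Count (added, removed) lines, skipping the two header lines."""
    body = diff_lines[2:]  # first two lines are the '---' / '+++' headers
    added = sum(1 for line in body if line.startswith("+"))
    removed = sum(1 for line in body if line.startswith("-"))
    return added, removed


def evolution(versions):
    """For consecutive (label, text) pairs, report (old, new, added, removed)."""
    rows = []
    for (old_label, old_text), (new_label, new_text) in zip(versions, versions[1:]):
        diff = version_diff(old_text, new_text, old_label, new_label)
        rows.append((old_label, new_label) + count_changes(diff))
    return rows
```

Line counts are only a coarse signal: a one-line change can matter more than a large refactoring, so the diff text itself still has to be read.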
23.4 Working with version control systems
- Perforce, Subversion, Git, Mercurial, etc.
- Using version-control command lines
- Acquiring date/timestamp and author information from version control systems
- Recovering source-code files from old version-control databases (PVCS, etc.)
- Version control tracks diffs for a given “branch,” but generally does not handle duplication between branches
- Change logs included in the source-code or non-source productions
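As an illustration of pulling date and author information from a version-control command line, the sketch below wraps `git log` for a single file. It assumes the production includes a Git repository with its history intact and that running Git is permitted under the protective order; other systems have their own commands (e.g. `svn log`, `hg log`, `p4 filelog`) with different output. The `Revision` type and function names are hypothetical.

```python
import subprocess
from typing import NamedTuple


class Revision(NamedTuple):
    commit: str
    author: str
    date: str


# Tab-separated: full commit hash, author name, author date.
LOG_FORMAT = "%H%x09%an%x09%ad"


def parse_log(output):
    """Parse tab-separated `git log` output into Revision records."""
    revisions = []
    for line in output.splitlines():
        if not line.strip():
            continue
        commit, author, date = line.split("\t", 2)
        revisions.append(Revision(commit, author, date))
    return revisions


def file_history(repo_dir, path):
    """Run `git log` for one file, following renames, newest first."""
    result = subprocess.run(
        ["git", "log", "--follow", "--date=iso-strict",
         f"--format={LOG_FORMAT}", "--", path],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

Note that `%ad` is the author date recorded in the commit, which can differ from the committer date (`%cd`); which one matters depends on the question being asked (e.g. priority vs. release timing).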
Source code cases re: code quantity & versions
- 3Com v. D-Link (Realtek) (missing versions)
- St. Clair IP v. Acer (experts mistakenly looked at wrong version of source code?)
- Apple v. Samsung (defendant argues produced code is “representative”; re: not producing all source-code versions; treatment of code examiner’s “diff” output)
- Mediostream v. Microsoft (code produced “as kept in ordinary course of business” = flat “diff” file)
- LaserDynamics v. Asus (opponent requests production of “diff” files, and/or explanation of differences between similar files)