Source code ch.20: Close reading

Source Code & Software Patents: A Guide to Software & Internet Patent Litigation for Attorneys & Experts
by Andrew Schulman (http://www.SoftwareLitigationConsulting.com)
Detailed outline for forthcoming book

Chapter 20: “Close reading” of source code

20.1 General approaches to code reading

  • 20.1.0 What is “close reading” and does it apply to source code?
    • the phrase is often used loosely, without considering, for example, whether it implies reading that is confined “within the four corners” of the document;
    • what is the opposite approach, “remote reading”?;
    • is “close reading” necessarily static examination of the text, without consideration of run-time behavior revealed through dynamic examination?; note cases which hinged on run-time behavior (e.g. WMP encoder in xxx; number of times through loop in xxx)
  • 20.1.1 Code Reading by Diomidis Spinellis
  • 20.1.2 Literate Programming by Donald Knuth, but “Code is Not Literature” (code as “specimen” to be “decoded”)
  • 20.1.3 Other materials on code reading & on semi-automation of code analysis
  • 20.1.4 Is there “only one way” to interpret a given piece of code?; is code ever “ambiguous,” or subject to multiple valid interpretations?
  • 20.1.5 Inferring “intent” from code (though this is more important in antitrust than in patent litigation: see chapter 29 on correlating source code with non-source documents, e.g. emails)
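
The static-versus-dynamic question raised in 20.1.0 can be illustrated with a small C++ sketch. The function drain and its stop_marker parameter are hypothetical names invented for this illustration, not taken from any case. Reading the text alone shows that a loop exists; how many times its body runs (zero, one, or many) depends entirely on run-time data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical routine: walks a queue until it meets a stop marker.
// Returns the number of times the loop body completed.
std::size_t drain(const std::vector<int>& queue, int stop_marker) {
    std::size_t passes = 0;
    for (int item : queue) {
        if (item == stop_marker) {
            break;  // may fire on the very first element
        }
        ++passes;
    }
    return passes;
}
```

With an empty queue, or a queue whose first element is the stop marker, the body never completes a pass, even though the loop is plainly present in the source.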

20.2 Code-reading “gotchas”

  • Preprocessor #if, #ifdef, other means of designating potentially unused or “dead” code
  • Is code controlled by if, while, or for ever executed?
  • Case-sensitive languages: “x” is not “X”
  • Function overloading, overriding, scoping, and namespaces: X() does not necessarily call the immediately-obvious version of X
  • Implicit indirect calls through function pointers: f() may really be (*f)(), with multiple instances of f
  • ifs without if: logic hiding as array indexing: do_tab[cmd](pkt); vtables; table-driven code; state machines
  • Scenarios in which function parameters may be modified
  • Implicit function parameters: call to f(a,b,c); but function body is f(a,b,c,d,e)
  • Aliases or synonyms for variables
  • Operator overloading: hiding (e.g. in C++) of complex operations behind a simple-looking operator such as =
  • Hiding (e.g. in C++) of operations behind implicit/invisible code (object destructor called at scope exit)
  • Virtual or abstract functions, abstract data types (ADT) and templates need an instance to become operative code
  • Similarly, macro code found in include files is not included in the product, unless the macro is used at least once
  • Loops: if the claim calls for a “loop”, the loop should execute at least once, and arguably more than once
  • Plurality: if the claim calls for a “plurality,” make sure there are two or more
  • Executable strings, embedded code for another language (e.g., JavaScript or SQL)
  • Code may in effect create its own mini-language or mini virtual machine, through the use of numbered commands
  • Error handling: when it can be ignored, when not
  • Remember that in the case of structured exception handling (SEH) and other event-handling functions, and e.g. C++ destructors, code may be executed at point x which is not at all visible at point x in the source code
  • Names: do not assume that a name accurately reflects the function, variable, class, etc. named (though the source-code owner will look silly if it denies that its code does what its plain name appears to say it does; see case xxx with “prediction”)
  • Ambiguous function names (e.g. does “FitInRect” test for fit, or does it perform fit?)
  • Comments: do not assume that comments are accurate
  • If order of operation matters, consider the effect of threading
  • When a claim limitation requires an absence (“without”) or exclusivity (“solely”), actively seek out disconfirming instances
  • etc.
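
Several of the gotchas listed above can be seen in one short C++ sketch. Every name here (HYPOTHETICAL_FEATURE, Packet, Session, the trace log, and the handler functions) is an assumption made for illustration; only the do_tab[cmd](pkt) idiom comes from the list itself. The trace vector records what actually ran, so that a static reading can be compared against run-time behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Records which code actually ran.
static std::vector<std::string> trace;

// Gotcha: code guarded by an undefined macro is never compiled.
// HYPOTHETICAL_FEATURE is deliberately left undefined in this sketch.
void maybe_compress() {
#ifdef HYPOTHETICAL_FEATURE
    trace.push_back("compress");
#else
    trace.push_back("compress skipped");
#endif
}

// Gotcha: "ifs without if". The branch is hidden in an array index.
struct Packet { int payload; };
void do_open(Packet&)  { trace.push_back("open"); }
void do_close(Packet&) { trace.push_back("close"); }
using Handler = void (*)(Packet&);
static const Handler do_tab[] = { do_open, do_close };

void dispatch(std::size_t cmd, Packet& pkt) {
    assert(cmd < sizeof do_tab / sizeof do_tab[0]);
    do_tab[cmd](pkt);  // which function runs depends on run-time data
}

// Gotchas: operator overloading and invisible destructor calls.
class Session {
public:
    Session& operator=(const Session& other) {
        trace.push_back("operator= (custom copy logic)");
        id_ = other.id_;
        return *this;
    }
    ~Session() { trace.push_back("~Session"); }
    int id_ = 0;
};

void copy_session() {
    Session a, b;
    a = b;  // looks like plain assignment, but runs Session::operator=
}           // two destructors run here; nothing in the source shows it
```

Calling maybe_compress(), then dispatch(1, pkt), then copy_session() leaves five entries in trace, none of which corresponds to a visible if statement or a visible call at the point where the code runs.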

20.3 When & when not to “drill down”

  • 20.3.1 Sometimes the examiner must drill down to otherwise-irrelevant lower levels, in order to confirm that code at an upper level really does what is implied by its name or comments
  • 20.3.2 Short-circuiting source-code analysis re: use/implementation of a standard (see e.g. 802.11 cases in ch.xxx)
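
The need to drill down described in 20.3.1 can be sketched in C++ using the ambiguous FitInRect name from 20.2. The Rect type, the PlaceImage function, and both function bodies are hypothetical, written only to show the pattern. The upper-level name and comment suggest a resize, but the lower-level code only tests for fit.

```cpp
struct Rect { int w; int h; };

// Lower level: despite a name that could mean "make it fit", this
// hypothetical FitInRect only tests, and modifies neither rectangle.
bool FitInRect(const Rect& inner, const Rect& outer) {
    return inner.w <= outer.w && inner.h <= outer.h;
}

// Upper level: the comment below is deliberately inaccurate.
// Resize image to fit the window.
Rect PlaceImage(Rect image, const Rect& window) {
    FitInRect(image, window);  // result discarded; image is unchanged
    return image;
}
```

An examiner who stopped at PlaceImage and its comment would conclude that resizing occurs; only the body of FitInRect shows that the image comes back with its original dimensions.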