…provides the filename “IMPTIFF.DLL”. The CodeClaim collection of code files contains about 490,000 files that are also in NSRL. One of these 490,000 files has the MD5 hash 2bcbe445d25271e95752e5fde8a69082. In CodeClaim, this file is X:\CD0138\CORELWPA\PROGRAMS\IMPTIFF.DLL; the file-system date is March 23, 1995. Of the 811,000 files with the extension DLL in NSRL, CodeClaim currently has about 27,000. I have begun testing a subset of these: about 9,900 uniquely-named DLL files, with a total size of 2.28 GB. “Uniquely-named” means, for example, that only one file named “kernel32.dll” was used out of the 90 different versions in CodeClaim; this file was selected at random, and is unlikely to be the newest or largest. A “strings” utility was run on the 9,900 DLL files, resulting in about 278 MB of output, about 10% of the size of the underlying code files. This 10% is both an over-estimate and an under-estimate of the usable text to be found, at least in Windows-based code files. It is an over-estimate because the output contains a large amount of junk which merely looked like readable text to the “strings” utility. It is an under-estimate because “strings” is only one of at least a dozen methods of extracting useful text from binary code…
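As a rough illustration of what a “strings”-style pass does, here is a minimal Python sketch. It is not the utility used for the test described above; the function names and the minimum run length of 4 are assumptions chosen for illustration. It looks for runs of printable ASCII bytes, and separately for the same characters stored as UTF-16LE, which is how Windows binaries often store text.

```python
import re


def extract_ascii_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes (space through tilde)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def extract_utf16le_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII characters stored as UTF-16LE,
    i.e. each character byte followed by a zero byte."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Hypothetical sample bytes standing in for the contents of a DLL.
    sample = (
        b"\x01\x02Hello, world\x03ab\x04IMPTIFF.DLL\x05\x06"
        + "kernel32".encode("utf-16-le")
        + b"\x07"
    )
    print(extract_ascii_strings(sample))
    print(extract_utf16le_strings(sample))
```

Any four consecutive bytes of machine code that happen to fall in the printable range will also match, which is one source of the junk mentioned above. Text that is compressed, encoded, or stored in resources would be missed entirely by this approach.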
Reading the article, it may not seem to have anything to do with IP litigation, but the National Software Reference Library could be an important basis for a prior-art software library (that is, a collection not of publications about software, but of text extracted from the software itself, for use as prior art). Modern software generally contains a large amount of useful text. This text would need to be extracted from binary/object files, and then indexed. Article: “The National Software Reference Library” by Barbara Guttman; see also the LinkedIn IP Litigation discussion. The list of products in the collection is available at http://www.nsrl.nist.gov/RDS/rds_2.43/NSRLProd.txt (3 MB text file). Of course, to be useful as searchable prior art, either to litigators or the PTO, more would be needed than this list of products or even the list of individual files comprising the products. I’m going to do some tests of text extraction against some of the files in their collection. The fingerprints right now are file-level MD5, SHA1, etc. The original purpose, as I understand it, was so that criminal investigators would know what files they did NOT need to look at when examining a suspect’s computer. They do seem to be…
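The file-level fingerprint idea can be sketched in a few lines of Python: hash each file, then set aside any file whose hash appears in a set of known hashes. This is only a sketch under stated assumptions. The function names are hypothetical, and `known_md5` is a stand-in for hashes loaded from a reference set; the actual NSRL file layout is not modeled here.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) for a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def split_known_unknown(paths, known_md5):
    """Partition paths into (known, unknown) by MD5 membership in known_md5.

    known_md5 is a hypothetical collection of hex MD5 strings; comparison
    is case-insensitive.
    """
    known_set = {h.lower() for h in known_md5}
    known, unknown = [], []
    for path in paths:
        md5_hex, _sha1_hex = file_digests(path)
        (known if md5_hex in known_set else unknown).append(path)
    return known, unknown
```

Files that land in the `known` list are the ones an examiner could skip. For prior-art purposes the interest runs the other way: the hash identifies which product a file came from, after which its text still has to be extracted and indexed.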
Source Code & Software Patents: A Guide to Software & Internet Patent Litigation for Attorneys & Experts
by Andrew Schulman (http://www.SoftwareLitigationConsulting.com)
Detailed outline for forthcoming book
Chapter 6: Pre-filing investigation of software/internet/mobile products & services: Examining the product/service, without source code
6.1. Background and introduction to notice pleading, FRCP Rule 11, Local Patent Rules, and likely forthcoming legislation
  6.1.0 Conventional wisdom about software products & source code (the “Catch 22” or “chicken-and-egg” problem)
  6.1.1 Notice pleading and FRCP now-defunct Form 18 (Patent Infringement)
  6.1.2 FRCP Rule 11 in patent litigation; with a brief history of Rule 11; proposed changes to Rule 11 in HR 2655
  6.1.2B Fee awards for “exceptional” cases under 35 USC 285, for “objectively baseless” infringement contentions brought in bad faith (see article)
  6.1.3 Local Patent Rules and proposed legislation
  6.1.4 Application of Rule 11 and Local Patent Rule 3-1 to software/internet patent litigation
  6.1.5 Comparison to other instances where plaintiff’s burden is reduced, or presumptions are reversed
  6.1.6 Evidentiary doctrines of solely or “peculiarly” possessing information
  6.1.7 Selection of cases
6.1.1 Notice pleading and FRCP Form 18 (Patent Infringement)
  6.1.1.1 Iqbal, Twombly, and In re Bill of Lading
  6.1.1.2 General background on heightened pleading…