Reverse Engineering book

Reverse Engineering: Purposes, Methodologies, Tools, and Law

by Andrew Schulman

The following are notes for a forthcoming book, which will cover the use of reverse engineering as a fact-gathering tool in litigation when the workings or composition of a product or system are at issue. Contact the author for more information.

Also see articles on “Reverse engineering as a fact-investigation tool in software patent litigation,” “Hiding in plain sight: Using reverse engineering to uncover (or help show absence of) software patent infringement,” and “Open to inspection: Using reverse engineering to uncover software prior art.”

The forthcoming book will include detailed coverage of hardware reverse engineering (based on the work of GreyB, such as “How we used electrical signal analysis” to detect smartphone processes), and of reverse engineering for non-litigation purposes.

The outline as currently planned, with major sections on simple vs. static vs. dynamic examination, puts more emphasis on specific tools than the book’s goal warrants. That goal is to stress what one is trying to accomplish with reverse engineering (what types of questions it can answer), and to de-emphasize how to use this or that NiftyTool with this or that specific version of a target product.

Summary outline

Part One: An overview of reverse engineering purposes, methodologies, tools, and law

  1. Introduction, with several newsworthy examples of reverse engineering
  2. Benefits of the outsider’s perspective
  3. Defining reverse engineering: what it is, and is not
  4. How reverse engineering relates to other means for learning about technology
  5. Reverse engineering methodology and heuristics (including methods/heuristics for source-code examination)
  6. Why reverse engineering?: Purposes and goals
  7. Legal and ethical questions
  8. Types of reverse engineering, and important distinctions (as-built vs. as-designed; dynamic vs. static analysis; “behavioral” vs. code-based analysis; etc.)
  9. Reverse engineering tools, and general tool concepts
  10. Teardowns and composition analysis: Using components and modularity in reverse engineering
  11. Acquiring the target: The sometimes-surprisingly-difficult task of obtaining the product or process to be examined
  12. Formulating narrow technical questions that can be answered with reverse engineering

Part Two: Simple software reverse engineering: Treating code as data

  1. Software reverse engineering as an example of reverse engineering generally
  2. Code is also data: “Unstructured” or format-agnostic inspection
  3. Hex dumpers and editors
  4. Text inside binaries: strings
  5. “Magic numbers,” signatures, and scanning

Part Three: Simple software reverse engineering with format-specific tools

  1. “Structured” inspection: executable file formats
  2. Using dynamic-linking and shared-library import and export headers
  3. Mapping inter-module dependencies
  4. Using debug symbol files and library files
  5. Inspecting menus, dialogs, and other resources
  6. Inspecting Apple OSX and iOS binaries
  7. Inspecting .NET, Android, and ELF binaries

Part Four: Using the output of simple reverse-engineering tools

  1. Reverse engineering is a tool for answering questions, not an end in itself
  2. Using the command line (CLI), and tools with plain-text output
  3. Correlating reverse engineering with public information (and with non-public documents such as company internal emails accessed during the discovery phase of litigation)
  4. Scripting to answer specific questions
  5. Repositories and “Big Code”: Building databases, and the importance of continuity
  6. Moving to static and dynamic reverse engineering; legal implications of simple reverse engineering

Part Five: Static reverse engineering with disassemblers

  1. Introduction to static reverse engineering: disassembly and decompilation
  2. “Use the Source, Luke” (UTSL): Source code or near-source code may already be available
  3. Producing a disassembly listing
  4. Navigating a disassembly listing: calls and jumps
  5. Navigating an Apple OSX/iOS Objective-C disassembly listing
  6. ARM, other processors, and special languages
  7. Scripting to extract information from disassembly listings [see ancient example of NiceDbg]
  8. Understanding and improving a disassembly listing
  9. Using symbols, strings, “magic numbers” and signatures to identify code, including library code and compiled open source
  10. Recognizing basic C/C++ constructions in assembly language
  11. Code/data separation, data structures, and tables
  12. Function pointers, jump tables, on-event handlers, and hooks

Part Six: Static reverse engineering with decompilers

  1. Introduction to decompilation with Java and Android
  2. Decompiling .NET (COM/OCX/OLE) code
  3. Decompiling with NSA Ghidra and IDA Pro
  4. Code obfuscation and de-obfuscation, including Java and JS deobfuscators
  5. Using source-code tools with decompilation listings
  6. Moving from simple and static, to dynamic reverse engineering

Part Seven: Dynamic reverse engineering with monitoring tools

  1. Introduction to dynamic reverse engineering, and contrast to static reverse engineering
  2. Network monitoring (“packet sniffing”)
  3. Web monitoring with Fiddler, including AJAX client/server traffic
  4. Encrypted web traffic (HTTPS), and mobile devices (iOS & Android)
  5. Wireshark, pcap, and non-web protocols
  6. Inferring server operation from client/server communications
  7. Operating-system monitoring and logging tools
  8. Walking live OS data structures
  9. Monitoring application programming interface (API) usage
  10. Mobile OS logging: Android, iOS, and Bluetooth
  11. Event hooking
  12. Memory inspection/forensics
  13. Module removal and replacement: shimming, code injection, and other intrusive/active methods

Part Eight: Dynamic reverse engineering with debuggers

  1. How using a debugger for reverse engineering differs from normal developer debugging
  2. Web-browser debuggers and the document object model (DOM)
  3. OS-level debuggers: breakpoints and intrusive testing
  4. Back-tracing: “How did I get here?”
  5. Debugging for Apple OSX/iOS and Android
  6. Combining static and dynamic reverse engineering methods

Part Nine: Hardware reverse engineering [tentative outline; this section possibly to be written by GreyB]

  1. Introduction to hardware reverse engineering: how it resembles and differs from software reverse engineering
  2. Microscopy and spectrometry tools: SEM/TEM, EDX, XPS, AFM, TOF, dynamic SIMS
  3. Other tools: signal generators and oscilloscopes
  4. Product teardown: Identifying internal boards, components, and ICs
  5. Material categorization and composition
  6. Thin-film layer categorization: electrical and magnetic properties
  7. Chip-level circuit analysis
  8. IC signal analysis
  9. Chip-level code analysis: HDLs

Part Ten: Next steps in reverse engineering

  1. Security and RE
  2. Static & dynamic inspection to find security holes
  3. Static inspection of known malware
  4. Malware detection methods
  5. Overcoming encryption and obfuscation; legal issues
  6. Examining software from embedded devices (firmware)
  7. Reverse engineering as a tool for litigation-related investigation
  8. Project management: Time/budget to reverse engineer
  9. Possible futures for reverse engineering: AI, reverse engineering machine learning (ML) models, “algorithmic transparency,” supply-chain traceability & transparency, visualization

Appendices

  1. Glossary
  2. Summary of key points about software reverse engineering
  3. Common reverse-engineering errors
  4. Bibliography

Posted in Uncategorized | Comments closed

A blast from the past (1994?!): Disassembling DOS

from Undocumented DOS: A Programmer’s Guide to Reserved MS-DOS Functions and Data Structures by Andrew Schulman et al. (2nd edition, Addison-Wesley, 1994)

This nearly-ancient text (along with other selected chapters from Undocumented DOS and Undocumented Windows) is being presented as a case study in some methodologies of software reverse engineering, applied to mass-market software. Note that this chapter appeared in the 2nd edition of the book, not in the 1st edition:

http://www.softwarelitigationconsulting.com/wp-content/uploads/2020/08/schulman_blast_from_the_past_disassembling_dos.html

A tiny amount of testing was done in the DOS box of a Windows 95 virtual machine from PCjs.org. The disks from Undocumented DOS and Undocumented Windows have been provided at PCjs.org and can be loaded into the PCjs VM’s A: drive, so that the utilities can be run inside the VM.

For those still interested in this twenty-five-year-old computer book, second-hand copies are available from Amazon, and the entire book is available for loan from the Internet Archive.


Computer software source code in litigation

Slides available here (though they may not entirely make sense without the audio, available as on-demand continuing legal education from ProLawCLE).
