Section 6.2. Toolkit 101

6.2. Toolkit 101

The symptomatic code approach requires a combination of manual and automated testing tools. At a minimum, these tools must include the following:

Source code viewer: The tester uses this tool, which typically is a text editor, to browse through the source or drill down a particular piece of code flagged by static analysis tools. When available, an Integrated Development Environment (IDE) is a powerful tool for quickly navigating through sources and tracing method-call hierarchies.
Vulnerability tracking database: This isn't a testing tool, but no discussion of source code analysis is complete without mentioning the need to track identified vulnerabilities. Tracking can range from recording issues in a simple text file to logging them in a bug-tracking database such as Bugzilla. At a minimum, the database should provide a place to document the vulnerability, including file location and line number of the insecure code, and steps for reproducing the vulnerability. Documenting this type of information can be a nuisance. You realize its true value only when presenting findings to management or developers.
Static analysis tools: These tools assist the tester by pointing to specific lines of code, which can be examined more closely within the source code viewer. Database-driven scanning tools that have plug-ins for popular IDEs are ideal. From the IDE console, they allow the tester to launch and view scan results as well as drill down on individual instances of flagged code with a single click.

Static analysis tools are the core component of the tester's toolkit. At a minimum, these tools employ pattern-matching technology common to utilities such as grep, and most database-driven source code scanning tools such as Flawfinder and RATS. Patterns constructed for these tools can represent a simple string or a complex regular expression. The primary benefit of a utility such as grep is ad hoc searches of the source, whereas scanning tools provide a default set of rules for identifying insecure code. Some scanning tools have knowledge about the semantics of the target code, allowing for more intelligent analysis than traditional pattern-matching utilities. grep is valuable when database-driven scanning tools are not available for the target source. This is often the case for web application scripting technologies such as Active Server Pages (VBScript).

The output from static analysis tools produced at the beginning of the review provide an initial road map for identifying known or suspected patterns of insecure code. These tools facilitate tracking down instances of custom code that the tester might otherwise notice only once he's familiar with the source. Compiling a robust symptom code database improves the effectiveness of static analysis tools.

6.2.1. Symptom Code Databases

A symptom code database serves as an initial test plan at the start of each code review and can be continuously updated as new symptoms are discovered. How you construct symptom code depends on which static analysis tool you use and the programming languages it supports. Pattern-matching tools describe symptom code as a combination of regular expressions, and you can build custom regular expressions for any programming language (VBScript, C#, VB.NET, Java, PHP, etc.). Table 6-3 is an updated version of Table 6-2 that includes examples of Perl 5 regular expressions representing potential Java symptom code.

This is not a complete list of potential symptom code regular expressions. In fact, some of these examples might produce false positives, and others might produce false negatives. All special characters that are to be treated as literals are escaped with the \ character.

Table 6-3. Java symptom code

Symptom

Perl 5 regexes for Java code

Vulnerability/attack

Dynamic SQL

select.+from insert.+into update.+set

SQL injection

Methods for executing commands

(Runtime|getRuntime\(\)){0,1}\.exec

Command injection

File I/O methods

new\s+(java\.io\.){0,1}File\s*\( new\s+(java\.io\.){0,1}FileReader\s*\(

Arbitrary file creation, reading

Writing inline request objects

\<\s*%\s*=.+request

Cross-site scripting

Cookie access methods

getCookies addCookie

Broken access control

Plaintext database connection strings

jdbc\:

Information leakage, unauthorized access

You should also build regular expressions to flag code that might indicate secure coding practices, such as possible sanitization attempts. By quickly identifying possible sanitization techniques, you might save time overall by avoiding blind exploitation attempts and tailoring attacks to subvert known validation logic. An example of this might be the inclusion of a single JSP file that houses methods for certain input validation routines:

\@\s*include\s+file\s*=\s*\"validate\.jsp\"

As you become more familiar with the code base during the review, you can tune the regular expressions to more accurately capture symptom code. For example, if the code is well documented, it might be useful to search for all instances of a particular developer's name. The analysis tool can run multiple times against the same source tree, revealing new symptom code on each pass. A systematic and iterative approach to source code analysis ensures greater code coverage, increased symptom code detection, and ultimately, real vulnerability identification.

Source code analysis tools and symptom code databases are just components of the symptomatic code approach and they can't find all vulnerabilities. The tool is only as good as its symptom database and the tester's ability to construct meaningful regular expressions. It's important to remember that source code analysis tools and symptom code databases are intended to equip and enable the tester, not to provide a complete solution.