6.2. Toolkit 101
Static analysis tools are the core component of the tester's toolkit. At a minimum, these tools employ the pattern-matching technology found in utilities such as grep and in most database-driven source code scanners, such as Flawfinder and RATS. A pattern constructed for these tools can be a simple string or a complex regular expression. The primary benefit of a utility such as grep is ad hoc searching of the source, whereas scanning tools provide a default set of rules for identifying insecure code. Some scanning tools also understand the semantics of the target code, which allows more intelligent analysis than traditional pattern-matching utilities. grep is valuable when no database-driven scanning tool is available for the target source, which is often the case for web application scripting technologies such as Active Server Pages (VBScript).
The output that static analysis tools produce at the beginning of the review provides an initial road map for identifying known or suspected patterns of insecure code. These tools help track down instances of custom code that the tester might otherwise notice only after becoming familiar with the source. Compiling a robust symptom code database improves the effectiveness of static analysis tools.
6.2.1. Symptom Code Databases
A symptom code database serves as an initial test plan at the start of each code review and can be continuously updated as new symptoms are discovered. How you construct symptom code depends on which static analysis tool you use and the programming languages it supports. Pattern-matching tools describe symptom code as a combination of regular expressions, and you can build custom regular expressions for any programming language (VBScript, C#, VB.NET, Java, PHP, etc.). Table 6-3 is an updated version of Table 6-2 that includes examples of Perl 5 regular expressions representing potential Java symptom code.
This is not a complete list of potential symptom code regular expressions. In fact, some of these examples might produce false positives, and others might produce false negatives. All special characters that are to be treated as literals are escaped with the \ character.
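To make the idea concrete, here is a minimal sketch of a symptom code database and a line-based scanner written in Python. The category names, the patterns, and the `scan_source` helper are illustrative assumptions, not the entries of Table 6-3 and not the rule format of any particular tool. As in the table, characters meant literally (such as `.` and `(`) are escaped with `\`.

```python
import re

# Hypothetical symptom database: category -> list of regular expressions.
# The patterns are illustrative assumptions for Java source code.
SYMPTOM_DB = {
    "untrusted input": [r"request\.getParameter\s*\("],
    "command execution": [r"Runtime\.getRuntime\(\)\.exec\s*\("],
    "sql execution": [r"\.createStatement\s*\(", r"\.executeQuery\s*\("],
}


def scan_source(text, db=SYMPTOM_DB):
    """Return (line_number, category, line) for each symptom match."""
    compiled = [(cat, re.compile(p)) for cat, pats in db.items() for p in pats]
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, rx in compiled:
            if rx.search(line):
                findings.append((lineno, category, line.strip()))
    return findings


# Small made-up Java fragment used only to exercise the scanner.
SAMPLE_JAVA = "\n".join([
    'String user = request.getParameter("user");',
    "Statement st = conn.createStatement();",
    "ResultSet rs = st.executeQuery(query);",
    'Runtime.getRuntime().exec("ping " + host);',
])

if __name__ == "__main__":
    for lineno, category, line in scan_source(SAMPLE_JAVA):
        print(f"{lineno}: [{category}] {line}")
```

Keeping the patterns in a plain dictionary makes it easy to add new symptoms as the review uncovers them. A line-based matcher like this one will miss statements that span several lines, which is one source of the false negatives mentioned above.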
You should also build regular expressions to flag code that might indicate secure coding practices, such as possible sanitization attempts. By quickly identifying possible sanitization techniques, you might save time overall by avoiding blind exploitation attempts and tailoring attacks to subvert known validation logic. One example is the inclusion of a single JSP file that houses methods for certain input validation routines.
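As a rough illustration of flagging possible sanitization, the Python sketch below looks for JSP static include directives and for calls whose names hint at validation or encoding. The file name `validation.jsp`, the method name `validateInput`, the keyword list, and the `find_sanitization_hints` helper are all hypothetical; tune them to the naming conventions of the code under review.

```python
import re

# Assumed pattern for a JSP static include directive, capturing the file name.
INCLUDE_RX = re.compile(r'<%@\s*include\s+file\s*=\s*"([^"]+)"\s*%>')

# Assumed pattern for calls whose name suggests validation or encoding.
SANITIZER_RX = re.compile(
    r"\b\w*(?:validat|sanitiz|escap|encod)\w*\s*\(", re.IGNORECASE
)


def find_sanitization_hints(text):
    """Return (line_number, kind, name) for possible sanitization code."""
    hints = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        include = INCLUDE_RX.search(line)
        if include:
            hints.append((lineno, "include", include.group(1)))
        for call in SANITIZER_RX.finditer(line):
            # Drop the trailing "(" and whitespace to keep only the name.
            hints.append((lineno, "call", call.group(0).rstrip("( \t")))
    return hints


# Made-up JSP fragment used only to exercise the patterns.
SAMPLE_JSP = "\n".join([
    '<%@ include file="validation.jsp" %>',
    '<% String name = validateInput(request.getParameter("name")); %>',
    '<% out.println(request.getParameter("id")); %>',
])

if __name__ == "__main__":
    for hint in find_sanitization_hints(SAMPLE_JSP):
        print(hint)
```

In this made-up fragment, the third line uses a request parameter without passing it through any flagged routine, so it is the kind of line worth examining first.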
As you become more familiar with the code base during the review, you can tune the regular expressions to more accurately capture symptom code. For example, if the code is well documented, it might be useful to search for all instances of a particular developer's name. The analysis tool can run multiple times against the same source tree, revealing new symptom code on each pass. A systematic and iterative approach to source code analysis ensures greater code coverage, increased symptom code detection, and ultimately, real vulnerability identification.
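The tuning step can be sketched in Python as successive passes over the same source. The sample lines, the developer name `jsmith`, and the three patterns below are assumptions made for illustration. The tuned pattern is only a heuristic: it flags any `+` that follows the call on the same line, including one that appears inside a string literal.

```python
import re

# Made-up Java lines standing in for a source tree.
SAMPLE_LINES = [
    'rs = st.executeQuery("SELECT name FROM users");',
    'rs = st.executeQuery("SELECT name FROM users WHERE id=" + id);',
    "// TODO(jsmith): validate id before building the query",
]

# Pass 1: broad pattern that flags every query execution.
BROAD = re.compile(r"\.executeQuery\s*\(")

# Pass 2: tuned pattern that flags only calls followed by a "+" on the
# same line, a rough indicator of string concatenation in the query.
TUNED = re.compile(r"\.executeQuery\s*\(.*\+")

# Pass 3: line comments that mention a hypothetical developer name.
AUTHOR = re.compile(r"//.*\bjsmith\b")


def matching_lines(pattern, lines):
    """Return the 1-based numbers of the lines that match the pattern."""
    return [n for n, line in enumerate(lines, start=1) if pattern.search(line)]


if __name__ == "__main__":
    print("broad :", matching_lines(BROAD, SAMPLE_LINES))
    print("tuned :", matching_lines(TUNED, SAMPLE_LINES))
    print("author:", matching_lines(AUTHOR, SAMPLE_LINES))
```

Each pass narrows or redirects attention: the broad pattern maps out every query, the tuned pattern isolates the ones most likely to be injectable, and the comment search surfaces code tied to a specific developer.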
Source code analysis tools and symptom code databases are only components of the symptomatic code approach, and they can't find every vulnerability. A tool is only as good as its symptom database and the tester's ability to construct meaningful regular expressions. Remember that source code analysis tools and symptom code databases are intended to equip and enable the tester, not to provide a complete solution.