
3.5. GHH — Google Hack Honeypot

The Google Hack Honeypot (GHH) is a new type of honeypot that has been enabled by the ubiquitous knowledge of search engines. Search engines allow an adversary to find sensitive information due to misconfigured web servers or even identify hosts that run vulnerable web applications. The example given by the GHH developers is a search for:

"# -FrontPage-" inurl:service.pwd

If you get results for this search on your favorite search engine, you are most likely looking at the plaintext administrator passwords for Microsoft's FrontPage hosting system.

A large number of these search queries have been categorized and made public by Johnny Long. These queries allow you to find passwords, web cams, vulnerable servers, sensitive log files, and so forth. You can find all of them at http://johnny.ihackstuff.com/index.php?module=prodreviews.

These web searches give an adversary access to sensitive information that may have been accidentally leaked by sites and might even give her full access to the underlying web server or operating system. Clearly, this is an opportunity for further honeypot research. We would like to know how prevalent this kind of activity is and what kind of attacks are being launched based on carefully crafted search queries. This is where the GHH enters the picture. Using GHH, you set up a web server that contains many seemingly vulnerable web applications or other misconfigurations. After you have installed GHH, you wait for the web crawlers to hit your site and put it in the index of their respective search engines. Once your GHH is in the index of a search engine, it will be returned as a result to the queries, and you get to analyze what kind of traffic you get as a result. Besides search engines, GHH will also allow you to detect if others are conducting deep crawls of your website.

To prevent regular visitors from stumbling over GHH and creating false positives, GHH is hidden behind a transparent link that is not visible to humans but can be found by web crawlers.

3.5.1. General Installation

GHH offers different types of web application honeypots. Each comes with its own installation files, but they all follow similar installation procedures. At the time of this writing, GHH offered the following honeypot types:

Everything: This is a tarball that contains all GHH honeypots in one package. If you are thinking about installing GHH, you might as well install all of them.
Haxplorer: This is a honeypot for a web-based file manager that enables browsing the filesystem on the web server and supports file operations like rename, delete, download, and copy. Installations can be found using the following search: filetype:php HAXPLORER "Server Files Browser".
PHP_Ping: This web application sends pings to the specified IP address, and it had some security vulnerabilities in the past. Installations can be found with this search: "Enter ip" inurl:"php-ping.php".
AIMBuddyList: This honeypot simulates an exported home directory that contains your AIM Buddy List. This may reveal sensitive information to somebody with whom you chat regularly. An example search for this honeypot is inurl:BuddyList.blt.
FileUploadManager: This simulates thepeak File Upload Manager, which can be used to download and upload files to a web server. There have also been allegations that it was vulnerable to arbitrary command execution. GHH's sample search for this application is "File Upload Manager v1.3" "rename to".
Passlist.txt: As the name indicates, this honeypot pretends to be an unprotected list of passwords that can be found by the following search: inurl:passlist.txt. GHH generates the seemingly interesting contents semirandomly.
Passwd.list: Just like the preceding, this is another unprotected password list.
PHPBB_Installer: This honeypot targets phpBB installations where the maintainer forgot to remove the installation kit. It can be found with inurl:"install/install.php".
PHPFM: This is a honeypot for PHPFM, a PHP-based file manager. This file manager should normally not be accessible from the Internet or without authentication, but this search finds such unprotected installations: "Powered by PHPFM" filetype:php -username.
PHP_Shell: This is a shell to your system written in PHP. It's supposed to replace telnet or other remote login tools. A honeypot for a shell can yield very interesting information. An adversary might try to download his toolkit onto your machine. The following search may be used to find PHP shell installations: intitle:"PHP Shell *" "Enable stderr" filetype:php.
PhpSysInfo: This is a php script that shows system information. The associated query is inurl:phpSysInfo/ "created by phpsysinfo".
SquirrelMail: This simulates the web login interface of SquirrelMail. This honeypot seems broken at the time of this writing. After typing in the password, GHH generates an error message due to a missing file. It can be found by "SquirrelMail version 1.4.4" inurl:src ext:php.
WebUtil2.7: WebUtil is a collection of networking and convenience tools. For example, WebUtil provides a ping and a traceroute program. You can find installations with inurl:webutil.pl.

In the following, we provide an example of installing Haxplorer. You need a web server that supports PHP for GHH to work on your system. The instructions and necessary changes to get Haxplorer installed are very similar to the steps required for the other honeypots. For this example, we assume that your web server can be reached at http://www.example.org.

1.	Download the GHH Haxplorer from http://prdownloads.sourceforge.net/ghh/GHHv1.1.1-Haxplorer.tar.gz
2.	Extract it in `/tmp/` with `tar -xzf GHHv1.1.1-Haxplorer.tar.gz`.
3.	Find a suitable directory on your web server to install GHH in. For example, if your web server's root directory is `/var/www/`, you could create a subdirectory with a meaningless name like `/var/www/qwol21`.
4.	Make sure that your web server can read it: `chmod a+rx /var/www/qwol21`.
5.	Copy the files from GHH into the new directory by executing `cp -p /tmp/GHHv1.1-Haxplorer/* /var/www/qwol21`. This command will copy three files into the `qwol21` directory: `1.php`, `config.php`, and `README.txt`.
6.	Edit `1.php`, and change the value of `$ConfigFile` to `/var/www/qwol21/config.php` and the value of `$SafeReferer` to `http://www.example.org/qwol21/index.php`.
7.	Remove the `README` file by executing `rm README.txt`.

There are only two steps missing before GHH becomes useful: installing an invisible link that can be picked up by a search engine and configuring logging. Because these steps are somewhat more complicated, we describe each in a separate section.
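The copy-and-cleanup part of the installation (creating the directory, setting permissions, copying the files, and removing the README) can be sketched as a small shell function. This is only a sketch under stated assumptions: the function name `install_ghh` is our own, the extracted directory name in the usage comment is taken from the copy step above, and editing `$ConfigFile` and `$SafeReferer` in `1.php` is left as a manual step.

```shell
#!/bin/sh
# install_ghh SRC_DIR DEST_DIR
# Hypothetical helper (not part of GHH): copies an already extracted
# honeypot from SRC_DIR into DEST_DIR below the web server's root.
install_ghh() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1        # create the honeypot directory
    chmod a+rx "$dest" || return 1      # let the web server read it
    cp -p "$src"/* "$dest"/ || return 1 # copy 1.php, config.php, README.txt
    rm -f "$dest/README.txt"            # the README should not be served
}

# Example, using the paths from the steps above:
#   tar -xzf GHHv1.1.1-Haxplorer.tar.gz -C /tmp
#   install_ghh /tmp/GHHv1.1-Haxplorer /var/www/qwol21
```

Taking both directories as arguments means the same function can be reused for the other honeypot types, each in its own randomly named directory.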

Before we continue, make sure that visiting http://www.example.org/qwol21/1.php shows a screen similar to the one in Figure 3.6. If you just see an empty screen, it's possible that your PHP configuration has RegisterGlobals set to true. That is not very secure and is potentially dangerous. To make our example work anyway, change the RegisterGlobals variable in /var/www/qwol21/config.php to false.

Figure 3.6. A browser screenshot of GHH's Haxplorer. GHH pretends to be a backdoor to your web server's filesystem that can be discovered by using clever search engine queries.


3.5.2. Installing the Transparent Link

We have hidden the GHH behind a randomly named directory so nobody accidentally stumbles across it. To make it known only to adversaries, we need to carefully disclose its existence. One way to do this is to install a hyperlink on the main page of your web server that points to the secret qwol21 directory. The link needs to be invisible to humans but easy to find by search engines.

To do this, insert a link to the honeypot on the main page of the web server. The following simplified web page serves as example for the web server's index page:

<html><head><title>My WebServer</title>
<link rel="stylesheet" type="text/css" href="/styles/layout.css">
</head>
<body><h1>This is my home</h1>
Some text.
</body></html>


Insert the following HTML code into the main index file right before the line starting with </body>:

<div class="invisible"><a href="qwol21/1.php" color="#eeeeee">.</a></div>

The resulting web page should look as follows:

<html><head><title>My WebServer</title>
<link rel="stylesheet" type="text/css" href="/styles/layout.css">
</head>
<body><h1>This is my home</h1>
Some text.
<div class="invisible"><a href="qwol21/1.php" color="#eeeeee">.</a></div>
</body></html>


Because we used a single dot as anchor text to link to GHH's Haxplorer, you might notice an ugly underlined dot on your main page. That is not very subtle, and a regular visitor can see it easily. To truly hide the honeypot from innocent visitors, we need to make it invisible. We can do this by using cascading style sheets. Let's assume that your main page stores its CSS file at /var/www/styles/layout.css. The simplest solution would be to define the following style:

.invisible { display: none }

Unfortunately, some search engines might figure out that this makes the text invisible and refuse to index the link, which defeats the whole purpose of the exercise. A more promising approach might be the following style:

.invisible { background-color: #eeeeee; color: #eeeeee; }
.invisible A:link { color: #eeeeee }
.invisible A:visited { color: #eeeeee }
.invisible A:active { color: #eeeeee }


Upon reloading http://www.example.org/, you should no longer see the dot, and that should be the case for everyone else but search engines. There are other ways to install such a transparent link. You could try to use an invisible image, but it's not always guaranteed that search engines will follow an image link and index the result as an HTML page. Moreover, you could also register this web page directly at a search engine or also install a transparent link at another web page. Before you can expect anyone to access your GHH, you need to wait for search engines to detect the link, follow it, and index it. This might take several days or weeks, depending on how popular your website is.
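Because the link is invisible in a browser, it can be handy to confirm from the command line that the page your server delivers still contains it. The following is a minimal sketch; the function name `has_honeypot_link` is our own, and it does nothing more than search the HTML for the `href` attribute.

```shell
#!/bin/sh
# has_honeypot_link LINK_TARGET
# Hypothetical helper: reads HTML on standard input and succeeds if it
# contains a link whose href is exactly LINK_TARGET.
has_honeypot_link() {
    grep -F -q "href=\"$1\""
}

# Example, fetching the main page with curl:
#   curl -s http://www.example.org/ | has_honeypot_link qwol21/1.php \
#       && echo "link is present"
```

This only checks that the markup is served. Whether a particular search engine follows and indexes the link can only be seen later in the logs.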

3.5.3. Access Logging

Although you can see who accessed your honeypot by looking at your web server's logs, GHH provides its own logging mechanism. It supports logging either to a file or to a MySQL database. The default logging mechanism is CSV (comma-separated values). To enable it, you need to provide a filename for the logs in

/var/www/qwol21/config.php

If your web server writes its logs to /var/logs/httpd, then setting the filename to

/var/logs/httpd/ghh.haxplorer.log

would be a good choice. After you have enabled logging, you should see log entries like this:

HAXPLORER,04-16-2006 06:33:42 PM,192.168.1.1,/qwol21/1.php?cmd=newfile&lastcmd=., http://www.example.org/qwol21/1.php, ..., keep-alive,300,Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.1) Gecko/20060209 Debian/1.5.dfsg+1.5.0.1-2 Firefox/1.5.0.1,


Based on the HTTP Referer header, GHH automatically detects how people found your honeypot and avoids logging requests that might have come from normal visitors. Additionally, GHH detects if your honeypot was discovered by certain search queries that are commonly used to detect vulnerable installations. These are the search queries that we listed for each GHH previously. After your honeypot has been running for a few weeks, you should try these queries on your favorite search engine and see if you can find your own site in the results.
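Once entries accumulate, a quick summary by source address helps to separate a single curious visitor from repeated probing. The sketch below assumes the field order seen in the sample entry above (honeypot name, timestamp, source IP address, request); the function name `top_sources` is our own. Only the first three fields are used, because later fields such as the user agent may themselves contain commas.

```shell
#!/bin/sh
# top_sources [LOGFILE...]
# Hypothetical helper: counts GHH CSV log entries per source IP address
# (third comma-separated field) and prints the busiest sources first.
# Reads standard input when no file is given.
top_sources() {
    awk -F, 'NF >= 3 { count[$3]++ }
             END { for (ip in count) print count[ip], ip }' "$@" |
        sort -rn
}

# Example:
#   top_sources /var/logs/httpd/ghh.haxplorer.log
```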

Instead of logging to a file, it's also possible to log to a MySQL database. MySQL can be quite complicated, and we do not attempt to give you a full introduction into SQL here. However, we will briefly explain how to set up MySQL so that GHH can log to a central database.

1.	In addition to the specific GHH honeypot you installed, we also need to download the following file: GHHv1.1-CentralDatabase.tar.gz
2.	Create a directory on your web server that can only be accessed by you — for example, `/var/www/admin/`.
3.	Change your directory to `/var/www/admin/` and unpack the central database tar file. You should find the following files: `index.php` and `CreateDatabase.sql`.
4.	Connect to MySQL with the following command: `mysql -u root -p`. You will be prompted for the administrator password. After you enter the password correctly, you should get the following prompt: `mysql>`.
5.	We plan on using `ghh` as the database to which the honeypots log their information, so we need to create it with the following command: `create database ghh;`.
6.	Now, we need to create a new MySQL user that has access to that database. We choose `ghh` as the username and `foobar` as the password. This user can be created with the following command: `GRANT ALL PRIVILEGES ON ghh.* TO 'ghh'@'localhost' IDENTIFIED BY 'foobar';`.
7.	Before we can run the table creation script, we switch to the GHH database with `use ghh;`.
8.	We invoke the table creation script by executing `source CreateDatabase.sql;`.

This should be sufficient to set up the MySQL database. You now also need to configure the right username and password in index.php. We set them to ghh and foobar, respectively. The address of the MySQL server is 127.0.0.1 in our example, but it could also be a remote location.
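If you set up several machines, the MySQL statements from the steps above can be generated by a small shell function and piped into the `mysql` client instead of being typed at the prompt. This is a sketch only: the function name `ghh_setup_sql` is our own, and it must be run from the directory that contains `CreateDatabase.sql`. Note that MySQL 8.0 and later no longer accept `IDENTIFIED BY` inside `GRANT`; there the user has to be created with a separate `CREATE USER` statement first.

```shell
#!/bin/sh
# ghh_setup_sql DATABASE USER PASSWORD
# Hypothetical helper: prints the statements from the steps above so
# that they can be reviewed or piped into the mysql client.
ghh_setup_sql() {
    db=$1
    user=$2
    pass=$3
    cat <<EOF
create database $db;
GRANT ALL PRIVILEGES ON $db.* TO '$user'@'localhost' IDENTIFIED BY '$pass';
use $db;
source CreateDatabase.sql;
EOF
}

# Example, with the names chosen in the text:
#   cd /var/www/admin/ && ghh_setup_sql ghh ghh foobar | mysql -u root -p
```

Printing the statements rather than executing them directly makes it easy to inspect what will run before handing it to the database.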

To tell GHH to log to the database, we need to go back to our /var/www/qwol21/ directory and change a few variables in config.php. Follow these steps to turn on logging to the database:

1.	Change `$LogType` to `MySQL`. This instructs GHH to switch logging from CSV to the database.
2.	Change `$Owner` to `haxplorer`. When multiple GHHs are installed, the `Owner` field allows us to disambiguate the log entries.
3.	Change `$Server` to `127.0.0.1`.
4.	Change `$DBUser` to `ghh`.
5.	Change `$DBPass` to `foobar`.

If everything worked correctly, you should be able to go to http://www.example.org/admin/ and after authentication, see each and every access to your GHHs. The last column is probably the most interesting because it tells you where your honeypot was found. Most likely this is going to be a search engine query. For our example, it took about three days before various search engines found the new entry.