
3.5. GHH — Google Hack Honeypot

The Google Hack Honeypot (GHH) is a new type of honeypot made possible by the sheer amount of information that search engines index. Search engines allow an adversary to find sensitive information exposed by misconfigured web servers, or to identify hosts that run vulnerable web applications. The example given by the GHH developers is a search for:

"# -FrontPage-" inurl:service.pwd

If you get results for this search on your favorite search engine, you are most likely looking at the plaintext administrator passwords for Microsoft's FrontPage hosting system.

A large number of these search queries have been categorized and made public by Johnny Long. These queries allow you to find passwords, web cams, vulnerable servers, sensitive log files, and so forth. You can find all of them at http://johnny.ihackstuff.com/index.php?module=prodreviews.

These web searches give an adversary access to sensitive information that may have been accidentally leaked by sites and might even give her full access to the underlying web server or operating system. Clearly, this is an opportunity for further honeypot research. We would like to know how prevalent this kind of activity is and what kind of attacks are being launched based on carefully crafted search queries. This is where the GHH enters the picture. Using GHH, you set up a web server that contains many seemingly vulnerable web applications or other misconfigurations. After you have installed GHH, you wait for the web crawlers to hit your site and put it in the index of their respective search engines. Once your GHH is in the index of a search engine, it will be returned as a result to the queries, and you get to analyze what kind of traffic you get as a result. Besides search engines, GHH will also allow you to detect if others are conducting deep crawls of your website.

To prevent regular visitors from stumbling over GHH and creating false positives, GHH is hidden behind a transparent link that is not visible to humans but can be found by web crawlers.

3.5.1. General Installation

GHH offers different types of web application honeypots. Each comes with its own installation files, but they all follow similar installation procedures. At the time of this writing, GHH offered the following honeypot types:

The following example installs Haxplorer. GHH requires a web server that supports PHP. The instructions and necessary changes are very similar to the steps required for the other honeypots. For this example, we assume that your web server can be reached at http://www.example.org.

1.
Download the Haxplorer archive, GHHv1.1.1-Haxplorer.tar.gz.

2.
Extract it in /tmp/ with tar -xzf GHHv1.1.1-Haxplorer.tar.gz.

3.
Find a suitable directory on your web server to install GHH in. For example, if your web server's root directory is /var/www/, you could create a subdirectory with a meaningless name like /var/www/qwol21.

4.
Make sure that your web server can read it: chmod a+rx /var/www/qwol21.

5.
Copy the files from GHH into the new directory by executing

cp -p "/tmp/GHH v1.1 - Haxplorer"/* /var/www/qwol21

This command will copy three files into the qwol21 directory: 1.php, config.php, and README.txt.

6.
Edit 1.php, and change the value of $ConfigFile to

/var/www/qwol21/config.php

and the value of $SafeReferer to

http://www.example.org/qwol21/index.php

7.
Remove the README file by executing rm README.txt.

8.
There are only two steps missing before GHH becomes useful: installing an invisible link that can be picked up by a search engine and configuring logging. Because these steps are somewhat more complicated, we describe each in a separate section.
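The file-handling part of the steps above (creating the directory, making it readable, copying the files, and removing the README) can be sketched in Python. This is only an illustration under stated assumptions: the function name install_honeypot is made up for this example, and the demonstration runs against throwaway directories rather than /tmp and /var/www. Editing 1.php is still left to you.

```python
import shutil
import tempfile
from pathlib import Path


def install_honeypot(extracted_dir: Path, web_root: Path, name: str) -> Path:
    """Copy the extracted GHH files into web_root/name and return that path.

    Hypothetical helper, not part of GHH itself.
    """
    target = web_root / name
    target.mkdir(parents=True, exist_ok=True)
    # Equivalent of `chmod a+rx`: add read and execute bits for everyone.
    target.chmod(target.stat().st_mode | 0o555)
    for item in extracted_dir.iterdir():
        if item.is_file():
            # copy2 keeps timestamps and permission bits, similar to `cp -p`.
            shutil.copy2(item, target / item.name)
    # The README is not needed on a live honeypot.
    (target / "README.txt").unlink(missing_ok=True)
    return target


# Demonstration with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "extracted"
    src.mkdir()
    for fname in ("1.php", "config.php", "README.txt"):
        (src / fname).write_text("placeholder\n")
    installed = install_honeypot(src, Path(tmp) / "www", "qwol21")
    print(sorted(p.name for p in installed.iterdir()))
```

With real paths, the call would look like install_honeypot(Path("/tmp/GHH v1.1 - Haxplorer"), Path("/var/www"), "qwol21"), assuming the archive unpacked into that directory.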

Before we continue, make sure that visiting http://www.example.org/qwol21/1.php shows a screen similar to the one in Figure 3.6. If you see only an empty screen, it's possible that your PHP configuration has RegisterGlobals set to true. That setting is insecure and potentially dangerous. To make our example work anyway, change the RegisterGlobals variable in /var/www/qwol21/config.php to false.

Figure 3.6. A browser screenshot of GHH's Haxplorer. GHH pretends to be a backdoor to your web server's filesystem that can be discovered by using clever search engine queries.


3.5.2. Installing the Transparent Link

We have hidden the GHH behind a randomly named directory so nobody accidentally stumbles across it. To make it known only to adversaries, we need to carefully disclose its existence. One way to do this is to install a hyperlink on the main page of your web server that points to the secret qwol21 directory. The link needs to be invisible to humans but easy to find by search engines.

To do this, insert a link to the honeypot on the main page of the web server. The following simplified web page serves as an example of the web server's index page:

<html><head><title>My WebServer</title>
<link rel="stylesheet" type="text/css" href="/styles/layout.css">
</head>
<body><h1>This is my home</h1>
Some text.
</body></html>


Insert the following HTML code into the main index file right before the line starting with </body>:

<div class="invisible"><a href="qwol21/1.php" color="#eeeeee">.</a></div>

The resulting web page should look as follows:

<html><head><title>My WebServer</title>
<link rel="stylesheet" type="text/css" href="/styles/layout.css">
</head>
<body><h1>This is my home</h1>
Some text.
<div class="invisible"><a href="qwol21/1.php" color="#eeeeee">.</a></div>
</body></html>
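If you maintain several pages, the insertion can be scripted. The following Python sketch places the hidden link immediately before the closing </body> tag of an HTML string. The function name insert_hidden_link is an assumption made for this example, and the approach is plain text matching rather than real HTML parsing, so it suits only simple pages like the one above.

```python
import re


def insert_hidden_link(html: str, href: str) -> str:
    """Insert the hidden-link div immediately before the first </body> tag."""
    link = f'<div class="invisible"><a href="{href}" color="#eeeeee">.</a></div>\n'
    match = re.search(r"</body\s*>", html, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no </body> tag found")
    return html[:match.start()] + link + html[match.start():]


page = """<html><head><title>My WebServer</title>
<link rel="stylesheet" type="text/css" href="/styles/layout.css">
</head>
<body><h1>This is my home</h1>
Some text.
</body></html>
"""

print(insert_hidden_link(page, "qwol21/1.php"))
```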


Because we used a single dot as anchor text to link to GHH's Haxplorer, you might notice an ugly underlined dot on your main page. That is not very subtle, and a regular visitor can see it easily. To truly hide the honeypot from innocent visitors, we need to make it invisible. We can do this by using cascading style sheets. Let's assume that your main page stores its CSS file at /var/www/styles/layout.css. The simplest solution would be to define the following style:

.invisible { display: none }

Unfortunately, some search engines might figure out that this makes the text invisible and refuse to index the link, which defeats the whole purpose of the exercise. A more promising approach might be the following style:

.invisible { background-color: #eeeeee; color: #eeeeee; }

.invisible A:link{color: #eeeeee}
.invisible A:visited{color: #eeeeee}
.invisible A:active{color: #eeeeee}


Upon reloading http://www.example.org/, you should no longer see the dot, and neither should anyone else; only search engines will still find the link. There are other ways to install such a transparent link. You could try to use an invisible image, but there is no guarantee that search engines will follow an image link and index the result as an HTML page. You could also register the honeypot page directly with a search engine or install a transparent link on another web page. Before you can expect anyone to access your GHH, you need to wait for search engines to detect the link, follow it, and index it. This might take several days or weeks, depending on how popular your website is.

3.5.3. Access Logging

Although you can see who accessed your honeypot by looking at your web server's logs, GHH provides its own logging mechanism. It supports logging either to a file or to a MySQL database. The default is a CSV (comma-separated values) file. To enable it, you need to provide a filename for the logs in

/var/www/qwol21/config.php

If your web server writes its logs to /var/logs/httpd, then setting the filename to

/var/logs/httpd/ghh.haxplorer.log

would be a good choice. After you have enabled logging, you should see log entries like this:

HAXPLORER,04-16-2006 06:33:42 PM,192.168.1.1,/qwol21/1.php?
 cmd=newfilelastcmd=., http//www.example.org/qwol21/1.php, ...,
keep alive,300,Mozilla/5.0 &#40;X11; U; Linux i686; en US; rv:1.8.0.1&#41;
Gecko/20060209 Debian/1.5.dfsg&#43;1.5.0.1 2 Firefox/1.5.0.1,
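Because the log is plain CSV, it is easy to post-process. The sketch below splits one entry with Python's csv module and labels the first five fields. The field names (honeypot, timestamp, source_ip, request, referer) are guesses inferred from the sample entry, not GHH's documented schema, and the sample line contains only the leading fields of the entry shown above.

```python
import csv
from datetime import datetime

# Assumed names for the leading fields; inferred from the sample entry only.
LEADING_FIELDS = ("honeypot", "timestamp", "source_ip", "request", "referer")


def parse_entry(line: str) -> dict:
    """Split one CSV log line and label its first five fields."""
    row = next(csv.reader([line], skipinitialspace=True))
    entry = dict(zip(LEADING_FIELDS, row))
    # The sample timestamp reads as month-day-year with a 12-hour clock.
    entry["timestamp"] = datetime.strptime(entry["timestamp"], "%m-%d-%Y %I:%M:%S %p")
    # Anything after the labeled fields is kept unchanged.
    entry["extra"] = row[len(LEADING_FIELDS):]
    return entry


sample = (
    "HAXPLORER,04-16-2006 06:33:42 PM,192.168.1.1,"
    "/qwol21/1.php?cmd=newfilelastcmd=., http//www.example.org/qwol21/1.php"
)
entry = parse_entry(sample)
print(entry["honeypot"], entry["source_ip"], entry["timestamp"].isoformat())
```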


Based on the HTTP Referer header, GHH automatically detects how people found your honeypot and avoids logging requests that might have come from normal visitors. Additionally, GHH detects if your honeypot was discovered by certain search queries that are commonly used to detect vulnerable installations. These are the search queries that we listed for each GHH previously. After your honeypot has been running for a few weeks, you should try these queries on your favorite search engine and see if you can find your own site in the results.
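To make the idea of Referer-based detection concrete, here is a rough Python sketch. It is not GHH's actual logic: the function classify_referer, its category names, and the assumption that a search engine carries the query in a q parameter are all illustrative simplifications, and the safe Referer value is a stand-in for whatever you configured as $SafeReferer.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlsplit


def classify_referer(referer: str, safe_referer: str) -> Tuple[str, str]:
    """Return (kind, detail); kind is 'direct', 'safe', 'search', or 'other'."""
    if not referer:
        # No Referer header: the URL was typed, bookmarked, or scripted.
        return ("direct", "")
    if referer == safe_referer:
        # The visitor followed the hidden link from our own page.
        return ("safe", referer)
    parts = urlsplit(referer)
    # Assumption: the search terms live in a `q` query parameter.
    terms = parse_qs(parts.query).get("q")
    if terms:
        return ("search", terms[0])
    return ("other", parts.netloc)


safe = "http://www.example.org/qwol21/index.php"  # stand-in for $SafeReferer
hit = "http://www.google.com/search?q=%22%23+-FrontPage-%22+inurl%3Aservice.pwd"
print(classify_referer(hit, safe))
print(classify_referer(safe, safe))
```

A request classified as search is the interesting case: the detail field holds the decoded query that led the visitor to the honeypot.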

Instead of logging to a file, it's also possible to log to a MySQL database. MySQL can be quite complicated, and we do not attempt to give you a full introduction to SQL here. However, we will briefly explain how to set up MySQL so that GHH can log to a central database.

1.
In addition to the specific GHH honeypot you installed, we also need to download the following file:

GHHv1.1-CentralDatabase.tar.gz

2.
Create a directory on your web server that can only be accessed by you — for example, /var/www/admin/.

3.
Change your directory to /var/www/admin/ and unpack the central database tar file. You should find the following files: index.php and CreateDatabase.sql.

4.
Connect to MySQL with the following command: mysql -u root -p. You will be prompted for the administrator password. After you enter the password correctly, you should get the following prompt: mysql>.

5.
We plan on using ghh as the name of the database to which the honeypots log their information, so we create it with the following command: create database ghh;.

6.
Now, we need to create a new MySQL user that has access to that database. We choose ghh as the username and foobar as the password. This user can be created with the following command:

GRANT ALL PRIVILEGES ON ghh.* TO 'ghh'@'localhost' IDENTIFIED BY
 'foobar';

7.
Before we can run the table creation script, we switch to the ghh database with use ghh;.

8.
We invoke the table creation script by executing source CreateDatabase.sql;.

This should be sufficient to set up the MySQL database. You now also need to configure the right user name and password in index.php. We set the user name to ghh and the password to foobar. The address of the MySQL server is 127.0.0.1 in our example, but it could also be a remote location.
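If you set up more than one database, it can help to generate the statements from steps 5 to 8 instead of typing them. The Python sketch below only builds the text of the statements; it does not talk to MySQL. The function name setup_script is an assumption for this example, and the values are inserted verbatim without escaping, so it is suitable only for names and passwords you chose yourself.

```python
def setup_script(database: str, user: str, password: str, host: str = "localhost") -> str:
    """Return the MySQL statements from steps 5 to 8 as one block of text.

    Values are inserted as-is; do not pass untrusted input.
    """
    statements = [
        f"CREATE DATABASE {database};",
        # Same GRANT form as in step 6 of the text.
        f"GRANT ALL PRIVILEGES ON {database}.* TO '{user}'@'{host}' IDENTIFIED BY '{password}';",
        f"USE {database};",
        # Run from the directory that contains CreateDatabase.sql.
        "SOURCE CreateDatabase.sql;",
    ]
    return "\n".join(statements) + "\n"


print(setup_script("ghh", "ghh", "foobar"))
```

The printed statements can be pasted at the mysql> prompt in the order shown.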

To tell GHH to log to the database, we need to go back to our /var/www/qwol21/ directory and change a few variables in config.php. Follow these steps to turn on logging to the database:

1.
Change $LogType to MySQL. This instructs GHH to switch logging from CSV to the database.

2.
Change $Owner to haxplorer. When multiple GHHs are installed, the Owner field allows us to disambiguate the log entries.

3.
Change $Server to 127.0.0.1.

4.
Change $DBUser to ghh.

5.
Change $DBPass to foobar.
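The five changes above can also be applied with a small script. The following Python sketch rewrites simple assignments in a PHP source string. It assumes that config.php contains one assignment per line in the form $Name = ...; which may not match the real file, and the sample content below is invented for the demonstration. The helper names set_php_variable and apply_settings are likewise hypothetical.

```python
import re


def set_php_variable(source: str, name: str, value: str) -> str:
    """Replace the first `$name = ...;` assignment with a single-quoted string."""
    pattern = re.compile(
        r"^([ \t]*\$" + re.escape(name) + r"[ \t]*=[ \t]*)[^;]*;", re.MULTILINE
    )
    # Escape backslashes and single quotes for a PHP single-quoted string.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    updated, count = pattern.subn(
        lambda m: m.group(1) + "'" + escaped + "';", source, count=1
    )
    if count == 0:
        raise KeyError(f"${name} is not assigned in the given source")
    return updated


def apply_settings(source: str, settings: dict) -> str:
    """Apply several variable changes in turn."""
    for name, value in settings.items():
        source = set_php_variable(source, name, value)
    return source


# Invented stand-in for config.php; the real file layout may differ.
sample_config = """<?php
$LogType = 'CSV';
$Owner = '';
$Server = '';
$DBUser = '';
$DBPass = '';
?>"""

print(apply_settings(sample_config, {
    "LogType": "MySQL",
    "Owner": "haxplorer",
    "Server": "127.0.0.1",
    "DBUser": "ghh",
    "DBPass": "foobar",
}))
```

Raising an error for a missing variable is deliberate: a silently skipped setting would leave the honeypot logging to the wrong place.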

If everything worked correctly, you should be able to go to http://www.example.org/admin/ and after authentication, see each and every access to your GHHs. The last column is probably the most interesting because it tells you where your honeypot was found. Most likely this is going to be a search engine query. For our example, it took about three days before various search engines found the new entry.
