
7.2. Potemkin

The Potemkin[2] Virtual Honeyfarm takes scalability to a completely new level [104]. Developed by researchers at the University of California, San Diego, it shares similar goals with Collapsar in that Potemkin aims to provide high-interaction honeypots to very large address spaces. Unlike Collapsar, Potemkin has been used to emulate over 64,000 honeypots in live deployment — using only a few physical machines! In addition to being scalable, Potemkin also tries to solve the problems of fidelity and containment. The three attributes — scalability, fidelity, and containment — are obviously in conflict with one another. Scalability often implies that we have to take shortcuts and might not be able to provide a high level of interaction for an adversary. For example, Honeyd is a system that sacrifices fidelity for scalability. On the other hand, high fidelity makes containment more difficult. Some adversaries — be they human or automated worms — might depend on outbound connectivity to function properly. A containment policy, dictating that outbound traffic is not allowed, reduces the fidelity of the honeypots. However, Potemkin solves this apparent conflict in an elegant fashion, as we explain next.

[2] The name is probably due to Potemkin villages, which were, purportedly, fake settlements erected at the direction of Russian minister Grigori Aleksandrovich Potemkin to fool Empress Catherine II during her visit to Crimea in 1787. Conventional wisdom has it that Potemkin had hollow facades of villages constructed along the desolate banks of the Dnieper River to impress the monarch and her travel party with the value of her new conquests; see http://en.wikipedia.org/wiki/Potemkin_village.

Before we provide an overview of Potemkin's architecture, let's discuss some of the underlying principles that allow Potemkin to achieve its goals. One of the key insights from the Potemkin paper is the following:

To paraphrase Bishop Berkeley: If a host exposes a vulnerability, but no one exploits it, was it really vulnerable? We argue that since dedicated honeypots have no independent computational purpose, only tasks driven by external input have any value.

From that point of view, dedicated high-interaction honeypots waste most of their CPU and memory resources. When a honeypot is idle, that is, not serving any requests, its resources could be put to better use elsewhere. But even when serving traffic, most of a honeypot's CPU and memory remain unused. To achieve efficient resource usage, Potemkin employs late binding of resources. If there are no active connections, Potemkin does not run any active honeypots. When new traffic arrives, a special gateway router binds the destination IP address to one of a number of physical machines. For each active IP address, the physical server creates a new virtual machine by cloning a reference image. Each virtual machine represents a high-interaction honeypot. New physical memory is allocated only if the honeypot starts to diverge from the reference image. To regain resources, a honeypot is reclaimed when it becomes idle. As a result, Potemkin is able to support hundreds of high-interaction honeypots on a single physical machine. For later analysis, Potemkin offers the option to save a snapshot of the virtual machine before it is reclaimed. Note that when an adversary installs a backdoor on a compromised honeypot, reclaiming the virtual machine removes the backdoor, which might make the adversary suspicious when it is gone on the next visit. The first time might be explained away by a vigilant administrator, but repeated incidents might give away the honeypot installation.
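To make the idea of late binding more concrete, the following sketch shows it from the point of view of a single physical server: a honeypot only comes into existence when traffic for its address arrives, and it is reclaimed again once it goes quiet. This is only an illustration under assumed names; the class and methods (HoneyfarmServer, deliver, reclaim_idle, and so on) are hypothetical and not taken from the Potemkin implementation.

    import copy
    import time

    class HoneyfarmServer:
        """Sketch of a physical server that creates honeypot VMs on demand."""

        def __init__(self, reference_image):
            self.reference_image = reference_image  # preinitialized VM snapshot
            self.vms = {}                           # active destination IP -> VM
            self.last_activity = {}                 # destination IP -> timestamp

        def deliver(self, dst_ip, packet):
            vm = self.vms.get(dst_ip)
            if vm is None:
                # Late binding: the honeypot is created only when traffic for
                # its address actually arrives.  A real system would use
                # copy-on-write cloning; deepcopy is just a stand-in here.
                vm = copy.deepcopy(self.reference_image)
                vm.configure(ip=dst_ip)
                self.vms[dst_ip] = vm
            self.last_activity[dst_ip] = time.monotonic()
            vm.receive(packet)

        def reclaim_idle(self, idle_timeout, snapshot_store=None):
            # Honeypots that have gone quiet are destroyed so their memory can
            # back new addresses; optionally a snapshot is kept for analysis.
            now = time.monotonic()
            for dst_ip, seen in list(self.last_activity.items()):
                if now - seen > idle_timeout:
                    if snapshot_store is not None:
                        snapshot_store.save(dst_ip, self.vms[dst_ip])
                    del self.vms[dst_ip]
                    del self.last_activity[dst_ip]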

When a virtual machine becomes compromised, an adversary gains full access to the system and can use it to launch further attacks. The most common example at the moment is a rapidly spreading worm that attempts to infect new targets from the infected honeypot. For a responsible deployment of such a system, it's important that the honeypots cannot be used to cause damage elsewhere. As we have seen with Collapsar, one possible containment policy is to apply heavy bandwidth limits and packet corruption to outbound traffic. Other policies could disallow all outbound traffic or allow traffic only in reply to externally established connections. However, most modern threats, such as botnets or worms, require the ability to contact and receive instructions from command and control hosts. A honeypot that does not allow such transactions will not be able to gain insights into many of these threats, since interesting activity usually happens only after instructions from a remote site have been received. Potemkin places the responsibility of enforcing containment policy onto a gateway router. The gateway router keeps track of all flows and of which physical server is responsible for which active destination IP address. To support successful containment without significantly reducing fidelity, the gateway router also knows how to proxy well-known outbound services such as DNS. Furthermore, it is also able to reflect traffic back into the honeyfarm if the containment policy determines that particular outbound traffic is not allowed. The reflection causes a new virtual honeypot to be created that is responsible for the destination IP address of the denied outbound packet; this new IP address does not even have to be part of the monitored address space. In effect, Potemkin uses reflection to virtualize the whole Internet, which, among other things, makes it possible to observe the propagation behavior of a worm as it tries to spread across the Internet.
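As a rough illustration of such a containment policy, the sketch below proxies DNS, permits replies to connections that were established from the outside, and reflects all other outbound traffic back into the honeyfarm. All names here (contain_outbound, flows, honeyfarm, dns_proxy, uplink, and the packet fields) are assumptions made for the example, not part of Potemkin's actual gateway interface.

    def contain_outbound(pkt, flows, honeyfarm, dns_proxy, uplink):
        """Decide what happens to a packet leaving a honeypot (sketch only)."""
        if pkt.dst_port == 53:
            # Well-known service: a proxy on the gateway answers DNS queries
            # instead of giving the honeypot real outbound connectivity.
            return dns_proxy.handle(pkt)
        if flows.is_reply_to_inbound(pkt):
            # Replies to connections opened from the outside are allowed out.
            return uplink.send(pkt)
        # Anything else is reflected back into the honeyfarm: a new honeypot
        # is created for the destination address, so a spreading worm
        # "infects" further honeypots instead of real hosts on the Internet.
        return honeyfarm.reflect(pkt.dst_ip, pkt)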

To understand how these ideas have been realized, a brief overview of Potemkin's architecture is shown in Figure 7.2. Using GRE tunnels, routers forward traffic from specific address prefixes to the gateway just mentioned. The gateway is responsible for scheduling traffic to a honeyfarm consisting of a number of physical servers. The gateway also keeps track of which physical server is responsible for which destination IP address. Each physical server supports a number of virtual machines that can be created and destroyed on the fly to make best use of the available resources. A virtual machine monitor (VMM) that runs on each physical server is responsible for creating and managing these virtual machines.

Figure 7.2. This figure shows an overview of Potemkin's architecture. Routers all over the Internet are configured to tunnel an address prefix to Potemkin's network gateway. The gateway is responsible for sending traffic to a honeyfarm server. The honeyfarm server will create a new high-interaction honeypot on demand for each active destination IP address. Outbound traffic is subject to policy to prevent abuse. However, even if outbound traffic is not allowed, Potemkin supports redirecting it to another honeypot in the honeyfarm, allowing potential worms to propagate without any danger.


Most of the intelligence lives in the gateway router. It is responsible for four different functions: directing inbound traffic to physical servers, containing outbound traffic, managing resources, and interfacing with detection and analysis components.

Potemkin's gateway router can receive traffic via two separate means. One method requires advertising IP prefixes via BGP (Border Gateway Protocol) and making the gateway the last hop for such routes. The other mechanism requires configuring external routers to forward parts of their address space via GRE to the gateway router. If one has the ability to make BGP announcements, this is probably the simplest way to receive traffic. By using routing to receive traffic, the system also does not incur any additional network latency. The flip side of the coin is that anyone with traceroute can detect the final destination of the traffic, which makes the honeyfarm very visible. Using GRE tunnels is probably more attractive because it allows the honeypots to remain invisible in the network topology. It might also make it easier to get other network operators to participate. On the other hand, as already seen with Collapsar, GRE tunnels add latency to each packet that travels through them.

Once a packet arrives, the gateway router needs to determine to which physical server it should be sent. For an active destination IP address, a physical server is already known. However, if a packet arrives for an IP address that currently has no active virtual machine, the gateway sends the packet to a honeyfarm server with spare capacity. The gateway uses these "IP address to physical server" bindings to load-balance the available resources. To avoid Network Address Translation at the gateway, physical servers are not addressed by IP address but rather by their link-layer address — for example, their Ethernet MAC address. This allows the gateway to forward packets completely unchanged and results in higher performance.
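A minimal sketch of this link-layer dispatch follows. The frame abstraction and helper names (dispatch_inbound, spare_capacity, and so on) are assumptions for the example; the point is that only the Ethernet destination is rewritten, while the IP packet itself travels to the honeyfarm server unmodified.

    def dispatch_inbound(frame, bindings, servers):
        """Forward an inbound frame to a honeyfarm server by MAC address (sketch)."""
        dst_ip = frame.ip.dst
        mac = bindings.get(dst_ip)
        if mac is None:
            # First packet for this destination address: bind it to the
            # server that currently has the most spare capacity.
            server = max(servers, key=lambda s: s.spare_capacity)
            mac = server.mac
            bindings[dst_ip] = mac
        # Only the link-layer destination changes; the IP header and payload
        # are forwarded unchanged, so no NAT state is needed at the gateway.
        frame.eth.dst = mac
        return frame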

Because the gateway is also the only connection to the Internet at large, it is the natural place to implement containment. Potemkin implements several different containment policies: outbound traffic can be dropped entirely, limited to replies to externally established connections, proxied for well-known services such as DNS, or reflected back into the honeyfarm.

As mentioned earlier, the gateway is also responsible for resource management. The main difficulty is not when to create a new virtual machine to serve as a honeypot but rather when to reclaim an existing virtual machine. Usually, the state of a honeypot is interesting only when it has been compromised. If an attack on a honeypot has failed, it should be reclaimed. However, it is not always clear how to detect if an attack was successful. An easy measure is to look for outbound network activity. If a honeypot has not produced any packets after a configurable time period, it could be marked as reclaimable. Conversely, virtual machines that are known to be uncompromised can be reclaimed if they have not received any incoming traffic for a while.
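The paragraph above suggests two simple signals for reclamation. A hedged sketch of how they might be combined is shown below; the attribute names on the VM object are purely illustrative and not taken from the paper.

    def is_reclaimable(vm, now, silence=300.0):
        """Heuristic from the text: reclaim VMs that are unlikely to be interesting."""
        if not vm.ever_sent_outbound and now - vm.created > silence:
            # The attack apparently failed: the honeypot never produced any
            # outbound packets within the configured time window.
            return True
        if vm.known_uncompromised and now - vm.last_inbound > silence:
            # Nobody is talking to this uncompromised honeypot anymore.
            return True
        return False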

Although the gateway is responsible for managing traffic, containment, and the binding of destination IP addresses to physical servers, the VMM on each machine is responsible for efficiently creating new VMs as needed. As Potemkin runs multiple virtual machines on each server, one of the main duties of the VMM is to provide isolation between the different honeypots. A compromise of one honeypot should not affect the state of any other honeypot on that machine at all. When the VMM receives a packet for an IP address that has no corresponding virtual machine yet, it quickly creates one and then delivers the packet to it. If the virtual machine is already running, it receives any packet sent to it directly. When instructed by the gateway, the VMM is also responsible for destroying idle VMs.

The main challenge is to create new honeypots very quickly without requiring too many resources. For example, a single virtual machine running Windows XP might typically require at least several hundred megabytes of main memory. However, in the context of a honeypot, we know that its first activity after being started is to answer network traffic, which usually requires only a small portion of the available CPU and memory. Potemkin exploits this insight to make the creation of new virtual machines extremely fast and inexpensive. Each VMM manages a reference image containing a memory snapshot of a preinitialized virtual machine. When a new virtual machine is needed, it is sufficient to copy the memory snapshot and adjust it to reflect the new IP address, DNS servers, and gateway. Potemkin takes this optimization one step further: instead of copying, memory is only referenced from the immutable snapshot. For the most part, answering a single network packet is not going to change much of the memory anyway. Potemkin therefore uses a technique called copy-on-write (COW) that allocates new memory only when a memory page is about to be changed. As a result, startup time is significantly faster and resource consumption much lower than with traditional virtual machines.
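To make the copy-on-write idea concrete, here is a toy sketch that is far simpler than Potemkin's actual flash cloning inside Xen: every clone starts out with all of its pages mapped to the shared, immutable reference snapshot and only receives private copies of the pages it actually writes. The class and variable names are invented for this example.

    class CowMemory:
        """Toy copy-on-write memory image built on top of a shared snapshot."""

        def __init__(self, reference_pages):
            self.reference = reference_pages  # immutable snapshot shared by all clones
            self.private = {}                 # page number -> privately owned copy

        def read(self, page_no):
            # Reads come from the private copy if one exists, otherwise
            # straight from the shared reference image.
            return self.private.get(page_no, self.reference[page_no])

        def write(self, page_no, data):
            # The first write to a page triggers the copy; later writes go to
            # the clone's own copy, leaving the reference image untouched.
            if page_no not in self.private:
                self.private[page_no] = bytearray(self.reference[page_no])
            self.private[page_no][:len(data)] = data

    # A freshly cloned VM costs almost nothing: it shares every page with the
    # reference image until it starts to diverge.
    reference = [bytes(4096) for _ in range(32)]   # 32 zeroed 4KB pages
    clone = CowMemory(reference)
    clone.write(3, b"new ip address / dns / gateway settings")
    assert len(clone.private) == 1                 # only one page was copied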

The UCSD researchers measured how many virtual machines they could support on a single physical machine by cloning a 128MB Linux reference image as many times as possible. Their VMM consisted of a modified version of Xen. Using their copy-on-write optimization, it was possible to create 116 VMs before running into Xen limitations. In their experiment, the 116 VMs together used only about 98MB plus the 128MB required by the reference image. Without these limitations, it would be feasible to support 1500 VMs on a single 2GB server.

Potemkin was deployed on a /16 network for live testing. When the gateway was configured to recycle virtual machines after 500 milliseconds of inactivity, the steady state operation required about 58 active VMs. However, during peak activity over 10,000 VMs were required. This resulted in the development of a scan filter that significantly reduces the number of required VMs. The scan filter achieves this by limiting how many inbound packets for the same destination port and transport protocol may be sent by an external IP address. When Potemkin receives more than one scan packet in a 60-second time frame, all subsequent scan packets are dropped. As a result, a single IP address cannot create thousands of virtual machines by just scanning a network for a given port.
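A sketch of such a scan filter is shown below. The paper does not describe the data structures or timeout handling at this level of detail, so the names and the fixed-window bookkeeping here are assumptions made for illustration.

    import time

    class ScanFilter:
        """Drop repeated scan packets from the same source (illustrative sketch)."""

        def __init__(self, window=60.0, allowed=1):
            self.window = window    # length of the time frame in seconds
            self.allowed = allowed  # scan packets allowed per source in a window
            self.seen = {}          # (src_ip, proto, dst_port) -> (window start, count)

        def should_drop(self, src_ip, proto, dst_port):
            key = (src_ip, proto, dst_port)
            now = time.monotonic()
            start, count = self.seen.get(key, (now, 0))
            if now - start > self.window:
                start, count = now, 0   # the previous time frame has expired
            count += 1
            self.seen[key] = (start, count)
            # Once the per-window budget is exceeded, further packets from this
            # source to the same port and protocol are dropped, so a single
            # scanner cannot force thousands of VMs into existence.
            return count > self.allowed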

Although no detailed experience reports of running Potemkin are available in the paper, it is clear that Potemkin is a platform that allows for extremely interesting research into worm and potentially botnet behavior. The recycling of unused honeypots, the efficient creation of new virtual machines, and internal reflection are the key mechanisms for providing scalability, fidelity, and containment simultaneously.
