While the move to smaller transistors has been a boon for performance it has dramatically increased the cost to fabricate chips using those smaller transistors. This forces the vast majority of chip design companies to trust a third party—often overseas—to fabricate their design. To guard against shipping chips with errors (intentional or otherwise) chip design companies rely on post-fabrication testing. Unfortunately, this type of testing leaves the door open to malicious modifications since attackers can craft attack triggers requiring a sequence of unlikely events, which will never be encountered by even the most diligent tester. In this paper, we show how a fabrication-time attacker can leverage analog circuits to create a hardware attack that is small (i.e., requires as little as one gate) and stealthy (i.e., requires an unlikely trigger sequence before affecting a chip's functionality). In the open spaces of an already placed and routed design, we construct a circuit that uses capacitors to siphon charge from nearby wires as they transit between digital values. When the capacitors are fully charged, they deploy an attack that forces a victim flip-flop to a desired value. We weaponize this attack into a remotely controllable privilege escalation by attaching the capacitor to a controllable wire and by selecting a victim flip-flop that holds the privilege bit for our processor. We implement this attack in an OR1200 processor and fabricate a chip. Experimental results show that the purposed attack works. It eludes activation by a diverse set of benchmarks and evades known defenses.
The trend toward smaller transistors in integrated circuits, while beneficial for higher performance and lower power, has made fabricating a chip expensive. For example, it costs 15% more to set up the fabrication line for each successive process node and by 2020 it is expected that setting up a fabrication line for the smallest transistor size will require a $20 billion upfront investment.18 To amortize the cost of fabrication development, most hardware companies outsource fabrication.
Outsourcing of chip fabrication opens up hardware to attack. These hardware attacks can evade software checks because software must trust hardware to faithfully implement the instructions.6, 12 Even worse, if there is an attack in hardware, it can contaminate all layers of a system that depend on the hardware and violates high-level security policies correctly implemented by software.
The most pernicious fabrication-time attack is the dopant-level Trojan.2, 10 Dopant-level Trojans convert trusted circuitry into malicious circuitry by changing the dopant ratio on the input pins to victim transistors. Converting existing circuits makes dopant-level Trojans very difficult to detect since there are no added or removed gates or wires. In fact, detecting dopant-level Trojans requires a complete chip delayering and comprehensive imaging with a scanning electron microscope.17 However, this elusiveness comes at the cost of expressiveness. Dopant-level Trojans are limited by existing circuits, making it difficult to implement sophisticated attack triggers.10 The lack of a sophisticated trigger means that dopant-level Trojans are more detectable by post-fabrication functional testing. Thus, dopant-level Trojans represent an extreme on a trade-off space between detectability during a physical inspection and detectability during testing.
To defend against malicious hardware inserted during fabrication, researchers have proposed two fundamental defenses: (1) using side-channel information (e.g., power and temperature) to characterize acceptable behavior in an effort to detect anomalous (i.e., malicious) behavior,1, 7, 13, 15 and (2) adding sensors to the chip that directly measure and characterize features of the chip's behavior (e.g., signal propagation delay) in order to identify dramatic changes in those features (presumably caused by activation of a malicious circuit).3, 8, 11 Using side channels as a defense works well against large Trojans added to purely combinational circuits where it is possible to test all inputs and there exists a reference chip to compare against. While this accurately describes most existing fabrication-time attacks, we show that it is possible to implement a stealthy and powerful processor attack using only a single added gate without affecting features measured by existing on-chip sensors.
We create a new fabrication-time attack that is controllable, stealthy, and small, which borrows the idea of counter-based triggers commonly used to hide design-time malicious hardware19, 20 and adapt it to fabrication-time. Based on analog behaviors, the attack replaces the hundreds of gates required by conventional counter-based digital triggers with analog components—a capacitor and a few transistors wrapped up in a single gate.
This paper presents three contributions. (1) We design and implement the first fabrication-time processor attack that mimics the triggered attacks often added during design time. As a part of our implementation, we are the first to show how a fabrication-time attacker can leverage the empty space common in chip layouts to implement malicious circuits, (2) We show how an analog attack can be much smaller and more stealthy than its digital counterpart. Our attack diverts charge from unlikely signal transitions to implement its trigger, so it is invisible to all known side-channel defenses. Additionally, as an analog circuit, our attack is under the digital layer and missed by functional verification performed on the hardware description language, and (3) We fabricate an openly malicious processor and then evaluate the behavior of our fabricated attacks across many chips and changes in environmental conditions. We compare these results to Simulation Program with Integrated Circuit Emphasis (SPICE) simulation models.
The typical design and fabrication process of integrated circuits is as shown in Figure 1. See Rostami16. This process often involves collaboration between different parties all over the world and each step is likely done by different teams even if they are in the same company. Therefore, the designs are vulnerable to malicious attacks by rogue engineers involved in any of the above steps.
The design house implements the specification for the chip's behavior in some Hardware Description Language (HDL). Once the specification is implemented in an HDL and that implementation has been verified, the design is passed to a back-end house, which places and routes the circuit.
Conventional digital Trojans can only be inserted in design phase and are easier to be detected by design phase verifications. Fabrication-time attacks inserted in back-end and fabrication phases can evade these defenses. Since it is strictly more challenging to implement attacks at the fabrication phase due to limited information and ability to modify the design compared to the back-end phase, we focus on that threat model for our attack.
The attacker starts with a Graphic Database System II (GDSII) file that is a polygon representation of the completely laid-out and routed circuit. Our threat model assumes that the delivered GDSII file represents a perfect implementation—at the digital level of abstraction—of the chip's specification. This is very restrictive as it means that the attacker can only modify existing circuits or—as we are the first to show in this paper—add attack circuits to open spaces in the laid-out design. The attacker can not increase the dimensions of the chip or move existing components around. This restrictive threat model also means that the attacker must perform some reverse engineering to select viable victim flip-flops and wires to tap. After the untrusted fabrication house completes fabrication, it sends the fabricated chips off to a trusted party for post-fabrication testing. Our threat model assumes that the attacker has no knowledge of the test cases used for post-fabrication testing. Such a model dictates the use of a sophisticated trigger to hide the attack.
A hardware attack is composed of a trigger and a payload. The trigger monitors wires and state within the design and activates the attack payload under very rare conditions such that the attack stays hidden during normal operation and testing. Previous research has identified that evading detection is a critical property for hardware Trojans designers.5 Evading detection involves more than just avoiding attack activation during normal operation and testing, it includes hiding from visual/side-channel inspection. There is a tradeoff at play between the two in that the more complex the trigger (i.e., the better that it hides at run time), the larger the impact that trigger has on the surrounding circuit (i.e., the worse that it hides from visual/side-channel inspection).
We propose A2, a fabrication-time attack that is small, stealthy, and controllable. To achieve these outcomes, we develop trigger circuits that operate in the analog domain. The circuits are based on charge accumulating on a capacitor from infrequent events inside the processor. If the charge-coupled infrequent events occur frequently enough, the capacitor will fully charge and the payload is activated to deploy a privilege escalation attack. Our analog trigger is similar to the counter-based triggers often used in digital triggers, except that using the capacitor has the advantage of a natural reset condition due to leakage. Compared to traditional digital hardware Trojans, the analog trigger maintains a high level of stealth and controllability, while dramatically reducing the impact on area, power, and timing due to the attack. An added benefit of a fabrication-time attack compared to a design-time attack (when digital-only triggers tend to get added) is that it has to pass through fewer verification stages.
3.1. Single stage trigger circuit
Based on our threat model, the high-level design objectives of our analog trigger circuit are as follows:
To achieve these design objectives, we propose an attack based on charge accumulation inside capacitors. A capacitor performs analog integration of charge from a victim wire while at the same time being able to reset itself through leakage current. A behavior model of capacitor based trigger circuits comprises charge accumulation and leakage as shown in Figure 2.
Every time the victim wire that feeds the trigger circuit's capacitor toggles, the capacitor increases in voltage by some ΔV. After a number of toggles, the capacitor's voltage exceeds a predefined threshold voltage and enables the trigger's output—deploying the attack payload. The time it takes to activate the trigger is defined as trigger time (Figure 2).
On the other hand, leakage current exists all the time and it dumps charge from the trigger circuit's capacitor. The attacker can design the capacitor's leakage to be weaker than its accumulation when the trigger input is active. On the other hand, when the trigger input is inactive, leakage gradually reduces the capacitor's voltage, eventually disabling an already activated trigger. This mechanism ensures that the attack is not expressed when no intentional attack happens. The time it takes to reset trigger output after trigger input stops is defined as retention time.
Because of leakage, a minimum toggling frequency must be reached to successfully trigger the attack. At the minimum frequency, charge added in each cycle equals charge leaked away. Trigger time and retention time are the two main design metrics in the analog trigger circuits that we can make use of to create flexible trigger conditions and more complicated trigger patterns as discussed in Section 3.2. A stricter triggering condition (i.e., faster toggling rate and more toggling cycles) reduces the probability of a false trigger during normal operation or testing, but non-idealities in circuits and process, temperature and voltage variations can cause the attack to fail—impossible to trigger or trivial to accidentally trigger—for some chips. As a result, a tradeoff should be made between a reliable attack that can be expressed in every chip and a more stealthy attack that can only be triggered for certain chips under certain conditions.
The conventional current-based charge pump is not suitable for the attack due to area and power constraints. A new charge pump circuit based on charge sharing is specifically designed for the attack purpose as shown in Figure 3. During the negative phase of Clk, Cunit is charged to VDD. Then during positive phase of Clk, the two capacitors are shortened together, causing the two capacitors to share charges. After charge sharing, final voltage of the two capacitors is the same and ΔV on Cmain is as,
where V0 is initial voltage on Cmain before the transition happens. We can achieve different trigger time by sizing the two capacitors. The capacitor keeps leaking over time and finally ΔV equals the voltage drop due to leakage, which sets the maximum capacitor voltage.
A transistor-level schematic of the proposed analog trigger is as shown in Figure 4. Cunit and Cmain are implemented with Metal Oxide Semiconductor (MOS) caps. M0 and M1 are the two switches as shown in Figure 3. A detector is used to compare cap voltage with a threshold voltage and can be implemented by inverters or Schmitt triggers. An inverter has a switching voltage depending on its sizing and when the capacitor voltage is higher than the switching voltage, the output is 0; otherwise, the output is 1. A Schmitt trigger is an inverter with hysteresis. It has a large threshold when input goes from low to high and a small threshold when input goes from high to low. The hysteresis is beneficial for our attack because it extends both trigger time and retention time. To balance the leakage current through M0 and M1, an additional leakage path to ground (NMOS M2 as shown in Figure 4) is added to the design.
A SPICE simulation waveform is as shown in Figure 5 to illustrate the operation of our analog trigger circuit after optimization. The operation is same as the behavioral model that we proposed as shown in Figure 2, allowing us to use the behavior model for system-level attack design.
3.2. Multi-stage trigger circuit
The one-stage trigger circuit described in the previous section takes only one victim wire as an input. Using only one trigger input limits the attacker in two ways: (1) Because fast toggling of one signal for tens of cycles triggers the single stage attack, there is still a chance that normal operations or certain benchmarks can expose the attack, and (2) Certain instructions are required to create fast toggling of a single trigger input and there is not much room for a flexible and stealthy attack program.
We note that an attacker can make a logical combination of two or more single-stage trigger outputs to create a variety of more flexible multi-stage analog triggers. Basic operations to combine two triggers include AND and OR. When analyzing the behavior of logic operations on single stage trigger output, it should be noted that the single-stage trigger outputs 0 when triggered. Thus, for AND operation, the final trigger is activated when either A or B triggers fire. For OR operation, the final trigger is activated when both A and B triggers fire. It is possible for an attacker to combine these simple AND and OR-connected triggers into an arbitrarily complex multi-level multi-stage trigger.
3.3. Triggering the attack
For A2, the payload design is independent of the trigger mechanism, so our proposed analog trigger is suitable for various pay-loads to achieve different attacks. Since the goal of this work is to achieve a Trojan that is nearly invisible while providing a powerful foothold for a software-level attacker, we couple our analog triggers to a privilege escalation attack,9 which provides maximum capabilities to an attacker. We propose a simple design to overwrite security critical registers directly by adding one AND/OR gate to asynchronous set or reset pins of the registers. These reset/set pins are specified in original designs for processor reset. These reset signals are asynchronous with no timing constraints so that adding one gate into the reset signal of one register does not affect functionality or timing constraints of the design. Because there are no timing constraints on asynchronous inputs, the payload circuit can be inserted manually after final placement and routing in a manner consistent with our threat model.
3.4. Selecting victims
It is important that the attacker validate their choice of victim signal. This requires verifying that the victim wire has low baseline activity and its activity level is controllable given the expected level of access of the attacker. To validate that the victim wire used in A2 has a low background activity, we use benchmarks from the MiBench embedded systems benchmark suite. For cases where the attacker does not have access to such software or the attacked processor will see a wide range of use, the attacker can follow A2's example and use a multi-stage trigger with wires that toggle in a mutually-exclusive fashion and require inputs that are unlikely to be produced using off-the-shelf tools (e.g., GNU Compiler Collection (GCC)).
Validating that the victim wire is controllable requires that the attacker reason about their expected level of access to the end user system for the attacked processor. In A2, we assume that the attacker can load and execute any unprivileged instruction. This allows us to create hand-crafted assembly sequences that activate the attack. This model works for attackers that have an account on the system, attackers in a virtual machine, or even attackers that can convince users to load code.
To experimentally verify A2, we implement and fabricate an open source processor with the proposed analog Trojans inserted in 65nm General Purpose Complementary Metal-Oxide-Semiconductor (CMOS) technology. Multiple attacks are implemented in the chip. One set of attacks are Trojans aimed at exposing A2's end-to-end operation, while the other set of attacks are implemented outside the processor, directly connected to Input/Output (IO) pins so that we can investigate trigger behavior directly.
4.1. Attacking a real processor
We implemented an open source OR1200 processor14 to verify our A2 attack including software triggers, analog triggers and payload. The OR1200 Central Processing Unit (CPU) is an implementation of the 32-bit OR1K instruction set with a five stage pipeline. The implemented system in silicon consists of a OR1200 core with 128B instruction cache and an embedded 128KB main program memory connected through a Wishbone bus. The OR1K instruction set specifies the existence of a privileged register called the Supervision Register (SR). The SR contains bits that control how the processor operates (e.g., Memory Management Units (MMU) and caches enabled) and flags (e.g., carry flag). One particular bit is interesting for security purposes; SR controls the privilege mode of user, with 0 denoting user mode and 1 denoting supervisor mode. By overwriting the value of this register, an attacker can escalate a user mode process to supervisor mode as a backdoor to deploy various high-level attacks.5, 9 Therefore, we make the payload of our attack setting this bit in the SR to 1 to give a user mode process full control over the processor.
Our analog trigger circuits require trigger inputs that can have a high switching activity under certain (attacker) programs but are almost inactive during testing or common case operation so that the Trojan is not exposed. To search for suitable victim wires as trigger inputs, we run a series of programs from MiBench (see Section 5) on the target processor in an HDL simulator, capturing the toggling rates of all wires. The result shows that approximately 3% of total wires have nearly zero activity rate, which provides a wide range of options for an attacker. The target signals must also be easy to control by attack programs. In our attack, we select divide by zero flag signal as the trigger for the one-stage attack, because it is unlikely for normal programs to continuously perform division-by-zero while it is simple for an attacker to deliberately perform such operations in a tight loop. For the two-stage trigger, we select wires that report whether the division was signed or unsigned as trigger inputs. The attack program alternatively switches the two wires by performing signed, then unsigned division, until both analog trigger circuits are activated, deploying the attack payload.
Triggering the attack in usermode-only code is only the first part of a successful attack. For the second part, the attacker must be able to verify that the triggering software works—without risk of alerting the operating system. To check whether the attack is successful, we take advantage of a special feature of some registers on the OR1200: some privileged registers are able to be read by user mode code, but the value reported has some bits redacted. We use this behavior to let the attacker's code know whether it gets privileged access to the processor or not.
4.2. Analog activity trigger
We implement both the one-stage and two-stage trigger circuits in 65nm GP CMOS technology based on SPICE simulations. Both trigger circuits are inserted into the processor to demonstrate the attack.
Implementation in 65nm GP technology. For prototype purposes, we optimize the trigger circuit towards a reliable version and building a reliable circuit under process, temperature, and voltage (PVT) variations is always more challenging than only optimizing for a certain PVT range—that is, we construct our attacks so that they work in all fabricated processors at all corner-case environments. 65nm CMOS technology is not a favorable technology for our attack because the gate oxide is thinner than older technologies due to dimension scaling and also thinner than latest technologies because high- metal gate techniques now being employed to reduce gate leakage. However, through careful sizing, it's still possible to design a circuit robust across PVT variations, but this requires trading-off trigger time and retention time.
To reduce gate leakage, another solution is to use thick oxide transistors commonly used in IO cells as the MOS cap for Cmain, which shows negligible gate leakage. This option provides larger space for the configuration of trigger time and retention time but requires larger area due to design rules. Trigger circuit using IO device is implemented for the two-stage attack and the one without IO device is used for the one-stage attack in the system.
Inserting A2 into existing chip layouts. Since A2's analog trigger circuit is designed to follow sizing and routing constraints of standard cells and has the area of a single standard cell, inserting the trigger circuit to the layout at fabrication time is not complicated. In typical placement and routing cases, around 60% to 70% of total area is used for standard cells, otherwise routing can not complete due to routing congestions (our chip is more challenging to attack as it has 80% area utilization). Therefore, in any layout of digital designs, empty space exists. This empty space presents an opportunity for attackers as they can occupy the free space with their own malicious circuit. In our case, we require as little space as one cell. There are four steps to insert a trigger into the layout of a design:
The first step is to locate the signals chosen as trigger inputs and the target registers to attack. The insertion of A2 attack can be done at both back-end and fabrication stage. Our threat model focuses on the fabrication stage because it is significantly more challenging and implies a more stealthy attack over compared to attack at back-end stage attacks. The back-end stage attacker has access to the netlist of the design, so locating the desired signal is trivial. But an attack inserted at back-end stage can still be discovered by SPICE simulation and layout checks, though the chance is extremely low if no knowledge about the attack exists. In contrast, fabrication time attacks can only be discovered by post-silicon testing, which is believed to be very expensive and difficult to find small Trojans. To insert an attack during chip fabrication, some insights about the design are needed, which can be extracted from layout through physical verification tools and digital simulations or from a co-conspirator involved in the design phase.
The next step is to find empty space around the victim wire and insert the analog trigger circuit. Unused space is usually automatically filled with filler cells or capacitor cells by placement and routing tools. Removing these cells will not affect the functionality or timing.
To insert the attack payload circuit, the reset wire needs to be cut as discussed in Section 3.3. It has been shown that timing of reset signal is flexible, so the AND or OR gate only need to be placed somewhere close to the reset signal. Because the added gates can be a minimum strength cell, their area is small and finding space for them is trivial.
The last step is to manually do the routing from trigger input wires to analog trigger circuit and then to the payload circuits. There is no timing requirement on this path so that the routing can go around existing wires at same metal layer (jogging) or jump over existing wires by going to another metal layer (jumping). If long and high metal wires become a concern of the attacker due to potentially easier detection, repeaters (buffers) can be added to break long wire into small sections. Furthermore, it is possible that the attacker can choose different trigger input wires and/or payload according to the existing layout of the target design.
In our OR1200 implementation, inserting the attack following the steps above is trivial, even with the design's 80% area utilization. Routing techniques including jogging and jumping are used, but such routing approach is very common for automatic routing tools so the information leaked by such wires is limited.
Side-channel information. For the attack to be stealthy and defeat existing protections, the area, power and timing overhead of the analog trigger circuit should be minimized. High accuracy SPICE simulation is used to characterize power and timing overhead of implemented trigger circuits. Comparisons with several variants of NAND2 and DFlip–Flop standard cells from commercial libraries are summarized in Table 1. The area of the trigger circuit not using IO device is similar to a X4 strength DFlip–Flop. Using an IO device increases trigger circuit size significantly, but area is still similar to the area of two standard cells, which ensures it can be inserted into empty space in final design layout. AC power is the total energy consumed by the circuits when input changes, the power numbers are simulated with SPICE on a netlist including extracted parasitics. Standby power is the power consumption of the circuits when inputs are static, which comes from leakage currents of CMOS devices.
After inserting A2, post-layout simulation with extracted parasitics shows that the extra delay of victim wires is 1.2ps on average, which is only 0.33% of 4ns clock period and well below the process variation and noise range. In practice, such delay difference is nearly impossible to measure, unless a high-resolution time to digital converter is included on chip, which is impractical due to its large area and power overhead.
Comparison to digital-only attacks. If we look at a previously proposed, digital only and smallest implementation of a privilege escalation attack,5 it requires 25 gates and 80μm2 while our analog attack requires as little as one gate for the same effect. Our attack is also much more stealthy as it requires dozens of consecutive rare events, where the other attack only requires two. We also implement a digital only, counter-based attack that aims to mimic A2. The digital version of A2 requires 91 cells and 382μm2, almost two orders-of-magnitude more than the analog counterpart. These results demonstrate how analog attacks can provide attackers the same power and control as existing digital attacks, but much more difficult to catch.
We perform all experiments with our fabricated 2.1mm2 malicious OR1200 processor as shown in Figure 6. Figure 6 also marks the locations of A2 attacks, with two levels of zoom to aide in understanding the challenges of identifying A2 in a sea of non-malicious logic. In fact, A2 occupies less than 0.08% of the chip's area. Our fabricated chip contains two sets of attacks: the first set of attacks are one and two-stage triggers baked-in to the processor that we use to assess the end-to-end impact of A2. The second set of attacks exist outside of the processor and are used to fully characterize A2's operation.
We use the testing setup as shown in Figure 7 to evaluate our attacks' response to changing environmental conditions and a variety of software benchmarks. The chip is packaged and mounted on a custom testing board to interface with a PC. Through a custom scan chain, we can load programs into the processor's memory and also check the values of the processor's registers. The system's clock is provided by an on-chip 240MHz clock generator at the nominal condition (1V supply voltage and 25°C).
5.1. Does the attack work?
To prove the effectiveness of A2, we evaluate it from two perspectives. One is a system evaluation that explores the end-to-end behavior of our attack by loading attack-triggering programs on the processor, executing them in user mode, and verifying that after executing the trigger sequence, they have escalated privilege on the processor. The other perspective seeks to explore the behavior of our attacks by directly measuring the performance of the analog trigger circuit, the most important component in our attack, but also the most difficult aspect of our attack to verify using simulation.
System attack. Malicious programs described in Section 4.1. are loaded to the processor and then we check the target register values. In the program, we initialize the target registers SR (the mode bit) to user mode (i.e., 0) and SR (a free register bit that we can use to test the two-stage trigger) to 1. When the respective trigger deploys the attack, the single-stage attack will cause SR to suddenly have a 1 value, while the two-stage trigger will cause SR to have a 0 value—the opposite of their initial values. Because our attack relies on analog circuits, environmental aspects dictate the performance of our attack. Therefore, we test the chip at six temperatures from –25°C to 100°C to evaluate the robustness of our attack. Measurement results confirm that both the one-stage and two-stage attacks in all ten tested chips successfully overwrite the target registers at all temperatures.
Analog trigger circuit measurement results. Figure 8 shows the measured distribution of retention time and trigger cycles at three different trigger toggling frequencies across ten chips. The results show that our trigger circuits have a regular behavior in the presence of real-world manufacturing variances, confirming SPICE simulation results. retention time at the nominal condition (1V supply voltage and 25°C) is around 1μs for the trigger with only core devices and 5μs for attacks constructed using IO devices. It is verified that the number of cycles to trigger attack for both trigger circuits (i.e., with and without IO devices) are very close in chip measurements and SPICE simulations. The results indicate that SPICE is capable of providing results of sufficient accuracy for these unusual attack circuits.
To verify the implemented trigger circuits are robust across voltage and temperature variations (as SPICE simulation suggests), we characterize each trigger circuit under different supply voltage and temperature conditions. We confirmed that the trigger circuit can be activated when the victim wire toggles between 0.46MHZ and 120MHz, the supply voltage varies between 0.8V and 1.2V, and the ambient temperature varies between –25°C and 100°C.
As expected, different conditions yield different minimum toggling rates to activate the trigger. Temperature has a stronger impact than voltage on the trigger condition because of leakage current's exponential dependence on temperature. At higher temperature, more cycles are required to trigger and higher switching activity is required because leakage from capacitor is larger.
5.2. Is the attack triggered by non-malicious benchmarks?
Another important property for any hardware Trojan is not exposing itself under normal operations. Because A2's trigger circuit is connected only to the trigger input signal, digital simulation of the design is enough to acquire the activity of the signals. However, since we make use of analog characteristics to attack, analog effects should also be considered as potential effects to accidentally trigger the attack. We use MiBench4 as test bench because it targets the class of processor that best fits the OR1200 and it consists of a set of well-understood applications that are popular benchmarks in both academia and in industry. To validate that A2's trigger avoids spurious activations from a wide variety of software, we select five benchmark applications from MiBench, each from a different class. This ensures that we thoroughly test all subsystems of the processor—exposing likely activity rates for the wires in the processor. Again, in all programs, the victim registers are initialized to opposite states that A2 puts them in when its attack is deployed. The processor runs all five programs at six different temperatures from –25°C to 100°C. Results prove that neither the one-stage nor the two-stage trigger circuit is exposed when running these benchmarks across such wide temperature range.
5.3. Existing protections
Existing protections against fabrication-time attacks are mostly based on side-channel information, for example, power, temperature, and delay. In A2, we only add one gate in the trigger, thus minimizing power and temperature perturbations caused by the attack.
Table 2 summarizes the average power consumption measured when the processor runs our five benchmark programs, at the nominal condition (1V supply voltage and 25°C). Direct measurement of trigger circuit power is infeasible in our setup, so simulation is used as an estimation. Simulated trigger power consumption in Table 1 translates to 5.3nW and 0.5μW for trigger circuits constructed with and without IO devices. These numbers are based on the assumption that trigger inputs keep toggling at 1/4 of the clock frequency of 240MHz, which is the maximum switching activity that our attack program can achieve. In the common case of non-attacking software, the switching activity is much lower—approaching zero—and only lasts a few cycles so that the extra power due to our trigger circuit is even smaller. In our experiments, the power of the attack circuit is orders-of-magnitude less than the normal power fluctuations that occur in a processor while it executes different instructions. Further discussions about possible defenses such as split manufacturing and runtime verifications are presented in our original A2 paper.21
Experimental results with our fabricated malicious processor show that a new style of fabrication-time attack is possible, which applies to a wide range of hardware, spans the digital and analog domains, and affords control to a remote attacker. Experimental results also show that A2 is effective at reducing the security of existing software, enabling unprivileged software full control over the processor. Finally, the experimental results demonstrate the elusive nature of A2: (1) A2 is as small as a single gate—two orders of magnitude smaller than a digital-only equivalent; (2) attackers can add A2 to an existing circuit layout without perturbing the rest of the circuit; (3) a diverse set of benchmarks fail to activate A2 and (4) A2 has little impact on circuit power, frequency, or delay.
Our results expose two weaknesses in current malicious hardware defenses. First, existing defenses analyze the digital behavior of a circuit using functional simulation or the analog behavior of a circuit using circuit simulation. Functional simulation is unable to capture the analog properties of an attack, while it is impractical to simulate an entire processor for thousands of clock cycles in a circuit simulator—this is why we had to fabricate A2 to verify that it worked. Second, the minimal impact on the run-time properties of a circuit (e.g., power, temperature, and delay) due to A2 suggests that it is an extremely challenging task for side-channel analysis techniques to detect this new class of attacks. We believe that our results motivate a different type of defense, where trusted circuits monitor the execution of untrusted circuits, looking for out-of-specification behavior in the digital domain.
This work was supported in part by C-FAR, one of the six SRC STARnet Centers, sponsored by MARCO and DARPA. This work was also partially funded by the National Science Foundation. Any opinions, findings, conclusions, and recommendations expressed in this paper are solely those of the authors.
1. Agrawal, D., Baktir, S., Karakoyunlu, D., Rohatgi, P., Sunar, B. Trojan detection using IC fingerprinting. In Symposium on Security and Privacy (S&P, Washington, DC, 2007). IEEE Computer Society, 296–310.
2. Becker, G.T., Regazzoni, F., Paar, C., Burleson, W.P. Stealthy dopant-level hardware Trojans. In International Conference on Cryptographic Hardware and Embedded Systems (CHES, Berlin, Heidelberg, 2013). Springer-Verlag, 197–214.
3. Forte, D., Bao, C., Srivastava, A. Temperature tracking: An innovative run-time approach for hardware Trojan detection. In International Conference on Computer-Aided Design (ICCAD, 2013). IEEE, 532–539.
4. Guthaus, M.R., Ringenberg, J.S., Ernst, D., Austin, T.M., Mudge, T., Brown, R.B. MiBench: A free, commercially representative embedded benchmark suite. In Workshop on Workload Characterization (Washington D.C., 2001). IEEE Computer Society, 3–14.
5. Hicks, M., Finnicum, M., King, S.T., Martin, M.M.K., Smith, J.M. Overcoming an untrusted computing base: Detecting and removing malicious hardware automatically. USENIX;login 35, 6 (Dec. 2010), 31–41.
6. Hicks, M., Sturton, C., King, S.T., Smith, J.M. Specs: A lightweight runtime mechanism for protecting software from security-critical processor bugs. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS, Istanbul, Turkey, 2015). ACM, 517–529.
9. King, S.T., Tucek, J., Cozzie, A., Grier, C., Jiang, W.n., Zhou, Y. Designing and implementing malicious hardware. In Workshop on Large-Scale Exploits and Emergent Threats, volume 1 of LEET (USENIX Association, Apr. 2008).
10. Kumar, R., Jovanovic, P., Burleson, W., Polian, I. Parametric Trojans for fault-injection attacks on cryptographic hardware. In Workshop on Fault Diagnosis and Tolerance in Cryptography (IEEE, FDT, 2014), 18–28.
11. Li, J., Lach, J. At-speed delay characterization for IC authentication and Trojan horse detection. In Hardware-Oriented Security and Trust (HOST, Washington, DC, 2008). IEEE Computer Society, 8–14.
12. Li, M.-L., Ramachandran, P., Sahoo, S.K., Adve, S.V., Adve, V.S., Zhou, Y. Understanding the propagation of hard errors to software and implications for resilient system design. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS, Seattle, WA, Mar. 2008). ACM, 265–276.
13. Narasimhan, S., Wang, X., Du, D., Chakraborty, R.S., Bhunia, S. TeSR: A robust temporal self-referencing approach for hardware Trojan detection. In Hardware-Oriented Security and Trust (HOST, San Diego, CA, June 2011). IEEE Computer Society, 71–74.
14. OpenCores.org. OpenRISC OR1200 processor.
16. Rostami, M., Koushanfar, F., Rajendran, J., Karri, R. Hardware security: Threat models and metrics. In Proceedings of the International Conference on Computer-Aided Design (ICCAD, San Jose, CA, 2013). IEEE Press, 819–823.
17. Sugawara, T., Suzuki, D., Fujii, R., Tawa, S., Hori, R., Shiozaki, M., Fujino, T. Reversing stealthy dopant-level circuits. In International Conference on Cryptographic Hardware and Embedded Systems (CHES, New York, NY, 2014). Springer-Verlag, 112–126.
20. Wang, X., Narasimhan, S., Krishna, A., Mal-Sarkar, T., Bhunia, S. Sequential hardware trojan: Side-channel aware design and placement. In Computer Design (ICCD), 2011 IEEE 29th International Conference on (IEEE, Oct 2011), 297–300.
a. Several layers of metal wires are used in modern CMOS technologies to connect cells together, lower level metal wires are closer to transistors at bottom for short interconnections, while higher metal layers are used for global routing.
The original version of this paper is entitled "A2: Analog Malicious Hardware" and was published in 2016 IEEE International Symposium on Security and Privacy.
©2017 ACM 0001-0782/17/09
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from email@example.com or fax (212) 869-0481.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2017 ACM, Inc.
No entries found