acm-header
Sign In

Communications of the ACM

ACM TechNews

Honeypot Security Technique Can Stop Attacks in Natural Language Processing


View as: Print Mobile App Share: Send by email Share on reddit Share on StumbleUpon Share on Hacker News Share on Tweeter Share on Facebook
Artist's representation of digial protection.

Borrowing a technique commonly used in cybersecurity to defend against these universal trigger-based attacks, researchers at the Penn State College of Information Sciences and Technology developed a machine learning framework that can defend against the same types of attacks in natural language processing applications 99% of the time.

Credit: Adobe Stock

A machine learning framework can proactively counter universal trigger attacks—a phrase or series of words that deceive an indefinite number of inputs—in natural language processing (NLP) applications.

Scientists at Pennsylvania State University (Penn State) and South Korea's Yonsei University engineered the DARCY model to catch potential NLP attacks using a honeypot, offering up words and phrases that hackers target in their exploits.

DARCY searches and injects multiple trapdoors into a textual neural network to detect and thresh out malicious content produced by universal trigger attacks.

When tested on four text classification datasets and used to defend against six different potential attack scenarios, DARCY outperformed five existing adversarial detection algorithms.

From Penn State News
View Full Article

 

Abstracts Copyright © 2021 SmithBucklin, Washington, DC, USA


 

No entries found