MTU Cybersecurity Colloquium

Organizer -

Dr. Bo Chen (Computer Science)

Coordinators -

Dr. Xinyu Lei (Computer Science)

Dr. Kaichen Yang (Electrical and Computer Engineering)

Dr. Ronghua Xu (Applied Computing)


Next Colloquium:

9/26/2025

4PM - 5PM

Rekhi G009

"PD-AutoR: Towards Automatic Restoration of Poisoned Examples in Machine Learning"

Presenter: Haoyang Chen

Abstract: Machine learning (ML)-based systems are increasingly deployed in real-world applications with high societal impact. A pivotal factor in the success of ML techniques is the availability of high-quality training datasets. However, because training datasets are often collected from untrusted data sources, attackers have many vectors for launching data poisoning (DP) attacks against ML systems. One direct consequence of a DP attack is that the quality of the poisoned dataset can deteriorate significantly compared with the original clean dataset. To mitigate this low-data-quality issue, we design a neural network (NN)-based Poisoned Data Automatic Restoration (PD-AutoR) engine that automatically detects and restores poisoned data (PD) prior to ML model training. Our high-level methodology is to develop a pipeline supported by transductive learning, which allows the target PD (the data that needs to be restored) to be used in PD-AutoR training, so that PD-AutoR can achieve very high restoration accuracy. In addition, we design transformer-based networks (with a self-attention mechanism) that enable PD-AutoR to precisely and automatically attend to the areas that need to be restored, so PD-AutoR can restore the PD even if the attacker’s poisoning strategy is unknown. Our theoretical analysis and preliminary experimental results show that PD-AutoR can simultaneously fulfill three design goals: high PD detection accuracy, high PD restoration accuracy, and strong fault tolerance.

 

Past Colloquiums