
Reinforcement Learning for Odd Cycle Transversal via Per-Action Dueling DQN

Achieves a 92.3% success rate on small graphs and 50.3% on medium instances, significantly outperforming greedy and random baselines on TWO CYCLE benchmarks

Abstract

The Odd Cycle Transversal (OCT) problem aims to identify a minimum set of vertices whose removal makes a graph bipartite. This work introduces a per-action Dueling Deep Q-Network (DQN) that learns node-removal strategies through interaction with synthetic graph environments. Each action is represented by compact, interpretable features combining structural properties (degree, clustering coefficient, conflict ratio) and weak cycle hints derived from local odd cycle detection. The agent is trained using n-step Double DQN updates with soft target averaging and a short imitation warm-start guided by a simple heuristic teacher. The proposed model achieves 92.3% success on small, 50.3% on medium, and 8.0% on heavy instances — outperforming both random and greedy baselines by significant margins.
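The three learning components named above (the dueling value/advantage split, the n-step Double DQN target, and soft target averaging) can be illustrated with a minimal pure-Python sketch. This is not the paper's implementation: the function names, the scalar parameter lists, and the illustrative `gamma`/`tau` values are assumptions made for exposition only.

```python
def dueling_q(value, advantages):
    """Dueling aggregation (illustrative): Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a').
    Subtracting the mean advantage keeps V and A identifiable."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def n_step_double_dqn_target(rewards, gamma, q_online_next, q_target_next, done):
    """n-step Double DQN target: the discounted n-step return, plus a bootstrap
    in which the ONLINE network selects the next action and the TARGET network
    evaluates it (this decoupling reduces overestimation bias)."""
    g = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if not done:
        best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
        g += (gamma ** len(rewards)) * q_target_next[best]
    return g

def soft_update(target_params, online_params, tau):
    """Soft target averaging (Polyak): target <- tau * online + (1 - tau) * target,
    shown here on flat parameter lists for clarity."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target_params, online_params)]
```

For example, with `gamma = 0.9`, rewards `[1.0, 0.0, 1.0]`, and a non-terminal state, the target combines the 3-step return `1.81` with a `gamma**3`-discounted bootstrap from the target network's estimate of the online network's greedy action.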

Publication

Authors

Mahsa Sadeghi, Yangjun Chen

Affiliation

Applied Computer Science, University of Winnipeg, Canada

Conference/Journal

Under Review

Keywords

Odd Cycle Transversal, Reinforcement Learning, Dueling DQN, Graph Bipartization, Deep Q-Learning

Key Results

92.3% success rate on small instances

50.3% on medium, 8.0% on heavy graphs (TWO CYCLE benchmark)

3.2x better than greedy heuristic on average

Lightweight per-action Q-learning policy

Establishes a strong baseline for learning-based OCT heuristics
