(1) Gupta, S. Reducing Hardware-Related Interruptions In AI Clusters: Strategies For Resilient GPU Infrastructure. JICRCR 2025, 44-53.