Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs
IEEE Transactions on Computers (2025)
Keywords: Graphics processing units, Streams, Parallel processing, Artificial neural networks, Solid modeling, Resource management, Kernel, Interference, Computational modeling, Runtime, DNN inference, DNN operator parallelism, scheduling, GPU resource utilization