Enabling Switch Memory Management for Distributed Training with In-Network Aggregation.
INFOCOM (2023)
Keywords
distributed training, DT job schedulers, in-network aggregation, INA-empowered DT jobs, INAlloc, JCT, job completion time, nondisruptive runtime switch memory reallocation, physical switch memory, resource allocation, shared clusters, switch memory allocation, switch memory management layer