Shenggan/awesome-distributed-ml

Awesome Distributed Machine Learning System

A curated list of awesome projects and papers for distributed training or inference, especially for large models.

Contents

- Open Source Projects
- Papers
  - Survey
  - Pipeline Parallelism
  - Sequence Parallelism
  - Mixture-of-Experts System
  - Graph Neural Networks System
  - Hybrid Parallelism & Framework
  - Memory Efficient Training
  - Tensor Movement
  - Auto Parallelization
  - Communication Optimization
  - Fault-tolerant Training
  - Inference and Serving
  - Applications

Contribute

All contributions to this repository are welcome. Open an issue or send a pull request.