RedCoast
Red Coast (redco) is a lightweight, user-friendly tool that automates distributed training and inference for large models and simplifies ML pipeline development, without requiring MLSys expertise from users.
RedCoast supports Large Models + Complex Algorithms, in a lightweight and user-friendly way:
- Large Models beyond Transformers, e.g., Stable Diffusion.
- Complex algorithms beyond cross-entropy, e.g., Meta Learning and DP Training.
With RedCoast, defining an ML pipeline takes only three functions:
- Collate function: convert raw data into model inputs (e.g., text tokenization);
- Loss function: execute the model and compute loss (e.g., cross-entropy);
- Predict function: run the model and deliver outcomes (e.g., beam search).
Redco automates the rest of pipeline execution, such as data and model parallelism, multi-host processing, distributed checkpointing, randomness control, and logging.
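
To make the three roles concrete, here is a minimal sketch in plain Python for a toy bag-of-words classifier. It only illustrates the division of labor between collate, loss, and predict functions. Every name and signature in it (`VOCAB`, `collate_fn`, `forward`, `loss_fn`, `pred_fn`, the `params` layout) is hypothetical and is not taken from the redco API, so check the redco examples for the signatures the library actually expects.

```python
import math

# Hypothetical toy setup: none of these names come from the redco API.
VOCAB = {"good": 0, "bad": 1, "movie": 2}


def collate_fn(examples):
    """Collate: convert raw examples into model inputs (bag-of-words counts)."""
    features = []
    for ex in examples:
        counts = [0.0] * len(VOCAB)
        for token in ex["text"].lower().split():
            if token in VOCAB:
                counts[VOCAB[token]] += 1.0
        features.append(counts)
    batch = {"features": features}
    # Labels are only present at training time.
    if all("label" in ex for ex in examples):
        batch["labels"] = [ex["label"] for ex in examples]
    return batch


def forward(params, features):
    """Toy linear model: params holds one weight row per class."""
    return [
        [sum(w * x for w, x in zip(row, feats)) for row in params]
        for feats in features
    ]


def loss_fn(params, batch):
    """Loss: run the model and compute the mean cross-entropy."""
    logits = forward(params, batch["features"])
    total = 0.0
    for row, label in zip(logits, batch["labels"]):
        m = max(row)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[label]
    return total / len(logits)


def pred_fn(params, batch):
    """Predict: run the model and return the argmax class per example."""
    logits = forward(params, batch["features"])
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


if __name__ == "__main__":
    raw = [{"text": "good movie", "label": 1}, {"text": "bad movie", "label": 0}]
    batch = collate_fn(raw)
    params = [[-1.0, 1.0, 0.0], [1.0, -1.0, 0.0]]  # hand-picked weights
    print(loss_fn(params, batch), pred_fn(params, batch))
```

In this sketch nothing is distributed. The point is that the user-written part stops at these three functions, while sharding, checkpointing, and logging are left to the tool.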