Petals

Run large language models at home, BitTorrent-style.

Description:  
Petals is a collaborative system for running large language models such as LLaMA 2 (70B, 70B-Chat), LLaMA-65B, Guanaco-65B, or BLOOM in a BitTorrent-style setup. Each participant loads only a portion of the model and joins other participants who serve the remaining parts, so inference and fine-tuning work without any single machine holding the full model. Single-batch inference runs at up to 6 steps/sec for LLaMA 2 (70B) and about 1 step/sec for BLOOM, up to 10 times faster than offloading. Users can also apply custom fine-tuning and sampling methods, combining the flexibility of PyTorch with the convenience of an API. Petals is part of the BigScience research workshop.
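The idea of splitting a model across participants can be illustrated with a toy simulation. This is a minimal sketch, not Petals' actual API or networking code: the names `Peer`, `make_blocks`, `partition`, `run_distributed`, and `run_local` are hypothetical, each "block" is a simple arithmetic function standing in for a transformer block, and everything runs in one process. It only shows that chaining contiguous slices held by different peers gives the same result as running every block locally.

```python
from dataclasses import dataclass
from typing import Callable, List

Block = Callable[[float], float]


@dataclass
class Peer:
    """Hypothetical participant hosting a contiguous slice of the model's blocks."""
    name: str
    start: int            # index of the first hosted block (inclusive)
    blocks: List[Block]

    @property
    def end(self) -> int:
        # Index one past the last hosted block.
        return self.start + len(self.blocks)

    def forward(self, hidden: float) -> float:
        # Apply only the blocks this peer holds, in order.
        for block in self.blocks:
            hidden = block(hidden)
        return hidden


def make_blocks(n: int) -> List[Block]:
    # Toy stand-ins for transformer blocks: block i maps x to 2 * x + i.
    return [lambda x, i=i: 2 * x + i for i in range(n)]


def partition(blocks: List[Block], peer_names: List[str]) -> List[Peer]:
    """Split the blocks into contiguous, nearly equal slices, one per peer."""
    n, k = len(blocks), len(peer_names)
    peers, start = [], 0
    for idx, name in enumerate(peer_names):
        size = n // k + (1 if idx < n % k else 0)
        peers.append(Peer(name, start, blocks[start:start + size]))
        start += size
    return peers


def run_distributed(peers: List[Peer], hidden: float) -> float:
    """Client side: pass the activation through the peers in block order."""
    next_block = 0
    for peer in sorted(peers, key=lambda p: p.start):
        if peer.start != next_block:
            raise RuntimeError(f"no peer serves block {next_block}")
        hidden = peer.forward(hidden)
        next_block = peer.end
    return hidden


def run_local(blocks: List[Block], hidden: float) -> float:
    """Reference: run every block on one machine."""
    for block in blocks:
        hidden = block(hidden)
    return hidden


if __name__ == "__main__":
    blocks = make_blocks(8)
    peers = partition(blocks, ["alice", "bob", "carol"])
    for p in peers:
        print(f"{p.name} serves blocks {p.start}..{p.end - 1}")
    print(run_distributed(peers, 1.0) == run_local(blocks, 1.0))
```

In this sketch the client only ever sends and receives a single activation value per peer, which mirrors the point that no participant needs the whole model. If a slice is missing, `run_distributed` raises an error instead of silently skipping blocks.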
Tags: Machine Learning, NLP, Chatbot