Title
Add phi-2 to KAITO supported model list.
Glossary
N/A
Summary
- Model description: Launched during Microsoft Ignite last November, Phi-2 model is intended for QA, chat, and code purposes. With only 2.7 billion parameters, Phi-2 impressively surpasses the performance of Mistral and Llama-2 models at 7B and 13B on various aggregated benchmarks. Most notably it achieves better performance compare to the 25x larger llama-2-70B model on particular reasoning tasks like coding and math.
- Model usage statistics: In the past month, phi-2 has garnered 535,163 downloads on Hugging Face, reflecting its widespread popularity.
- Model license: phi-2 is distributed under the MIT license, ensuring broad usability and modification rights.
Requirements
The following table describes the basic model characteristics and the resource requirements of running it.
Field | Notes |
---|---|
Family name | phi-2 |
Type | text generation |
Download site | https://huggingface.co/microsoft/phi-2 |
Version | b10c3eba545ad279e7208ee3a5d644566f001670 |
Storage size | 30GB |
GPU count | 1 |
Total GPU memory | 12GB |
Per GPU memory | N/A |
Runtimes
This section describes how to configure the runtime framework to support the inference calls.
Options | Notes |
---|---|
Runtime | Huggingface Transformer |
Distributed Inference | False |
Custom configurations | Precision: FP16. Can run on one machine with total of 12GB of GPU Memory. |
History
- 02/06/2024: Open proposal PR.