## Title

Add phi-2 to KAITO supported model list.

## Glossary

N/A

## Summary

  • Model description: Launched during Microsoft Ignite last November, the Phi-2 model is intended for QA, chat, and code generation. With only 2.7 billion parameters, Phi-2 surpasses the performance of the 7B and 13B Mistral and Llama-2 models on various aggregated benchmarks. Most notably, it achieves better performance than the 25x larger Llama-2-70B model on particular reasoning tasks such as coding and math.
  • Model usage statistics: In the past month, phi-2 has garnered 535,163 downloads on Hugging Face, reflecting its widespread popularity.
  • Model license: phi-2 is distributed under the MIT license, ensuring broad usability and modification rights.

## Requirements

The following table describes the basic model characteristics and the resource requirements for running it.

| Field | Notes |
|-------|-------|
| Family name | phi-2 |
| Type | text generation |
| Download site | https://huggingface.co/microsoft/phi-2 |
| Version | b10c3eba545ad279e7208ee3a5d644566f001670 |
| Storage size | 30GB |
| GPU count | 1 |
| Total GPU memory | 12GB |
| Per GPU memory | N/A |
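
With these requirements, deploying the model through KAITO amounts to declaring a workspace that requests a matching GPU SKU and names the preset. The sketch below is illustrative only: the `instanceType` value and label names are assumptions chosen to satisfy the 12GB GPU memory requirement above, not part of this proposal.

```yaml
# Hypothetical KAITO workspace for phi-2 (field values are illustrative assumptions).
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-phi-2
resource:
  # Any single-GPU SKU with >= 12GB of GPU memory should satisfy the table above.
  instanceType: "Standard_NC12s_v3"
  labelSelector:
    matchLabels:
      apps: phi-2
inference:
  preset:
    name: "phi-2"
```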

## Runtimes

This section describes how to configure the runtime framework to support the inference calls.

| Options | Notes |
|---------|-------|
| Runtime | Hugging Face Transformers |
| Distributed inference | False |
| Custom configurations | Precision: FP16. Can run on one machine with a total of 12GB of GPU memory. |
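
The runtime configuration above (Hugging Face Transformers, FP16, single device) can be sketched as a minimal inference call. This is a generic Transformers usage example, not the KAITO runtime implementation; the prompt and generation parameters are arbitrary.

```python
# Minimal sketch: load phi-2 in FP16 via Hugging Face Transformers on one device.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # FP16 precision, fits within ~12GB of GPU memory
    device_map="auto",          # place weights on the available GPU
)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```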

## History

  • 02/06/2024: Open proposal PR.