How to Connect the Phi-3 Mini ONNX Model to Semantic Kernel
What is a Connector?
A connector is a core concept of the Semantic Kernel framework: it is the bridge that integrates Semantic Kernel's features with external large language models.
For example, when we reference the Microsoft.SemanticKernel NuGet package, it implicitly includes the Microsoft.SemanticKernel.Connectors.OpenAI connector, giving our applications chat capabilities backed by OpenAI.
Microsoft recently released an ONNX version of the Phi-3 Mini model. To make it easier for programmers to harness this model in Semantic Kernel, I created Connectors.OnnxRuntimeGenAI, a custom connector for Semantic Kernel that simplifies the integration process.
Instructions for Use
Prerequisites
You need to download the required ONNX model, such as Phi-3 Mini-4K-Instruct:
git lfs install
git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx
Demo
First, create a new console application and add the appropriate NuGet package for your hardware configuration:
-- for CPU
feiyun0112.SemanticKernel.Connectors.OnnxRuntimeGenAI.CPU
-- for CUDA
feiyun0112.SemanticKernel.Connectors.OnnxRuntimeGenAI.CUDA
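The setup can be done from the command line; this is one possible sequence (the project name Phi3Demo is just an example):

```shell
# Create a new console project and add ONE connector package
# matching your hardware.
dotnet new console -n Phi3Demo
cd Phi3Demo
dotnet add package feiyun0112.SemanticKernel.Connectors.OnnxRuntimeGenAI.CPU
# or, for NVIDIA GPUs:
# dotnet add package feiyun0112.SemanticKernel.Connectors.OnnxRuntimeGenAI.CUDA
```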
Then, with just a few lines of code, you can start generating chat content:
// Namespaces provided by Semantic Kernel and the connector package;
// adjust the connector namespace if your package version differs.
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OnnxRuntimeGenAI;

// Build a kernel backed by the local ONNX model.
Kernel kernel = Kernel.CreateBuilder()
    .AddOnnxRuntimeGenAIChatCompletion(
        modelPath: @"d:\Phi-3-mini-4k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4")
    .Build();

string prompt = "Write a joke";

// Stream the generated text to the console as it is produced.
await foreach (string text in kernel.InvokePromptStreamingAsync<string>(prompt,
    new KernelArguments(new OnnxRuntimeGenAIPromptExecutionSettings { MaxLength = 2048 })))
{
    Console.Write(text);
}
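Besides invoking a prompt directly, the same connector can be used through Semantic Kernel's chat completion abstraction, which lets you maintain a multi-turn conversation. A minimal sketch, assuming the connector registers an IChatCompletionService and using the same local model path as above:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

Kernel kernel = Kernel.CreateBuilder()
    .AddOnnxRuntimeGenAIChatCompletion(
        modelPath: @"d:\Phi-3-mini-4k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4")
    .Build();

// Resolve the chat completion service registered by the connector.
var chat = kernel.GetRequiredService<IChatCompletionService>();

// Build up a conversation history.
var history = new ChatHistory();
history.AddSystemMessage("You are a helpful assistant.");
history.AddUserMessage("Write a joke");

// Stream the assistant's reply as it is generated.
await foreach (var content in chat.GetStreamingChatMessageContentsAsync(history, kernel: kernel))
{
    Console.Write(content.Content);
}
```

Because the history object accumulates messages, you can append the assistant's reply and further user messages to it and call the service again for follow-up turns.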
Conclusion
The source code is publicly available, and you are welcome to download and use it from https://github.com/feiyun0112/SemanticKernel.Connectors.OnnxRuntimeGenAI.