How to Connect the Phi-3 Mini ONNX Model to Semantic Kernel

feiyun0112
Published in Towards Dev
May 3, 2024

What Is a Connector?

A connector is a core concept of the Semantic Kernel framework. It acts as the bridge that integrates the features of Semantic Kernel with external large language models.

For example, when we reference the Microsoft.SemanticKernel NuGet package, it implicitly includes the Microsoft.SemanticKernel.Connectors.OpenAI connector, which lets our applications use OpenAI's chat models.

Microsoft recently released an ONNX version of the Phi-3 Mini model. To make it easier for programmers to use this model in Semantic Kernel, I created Connectors.OnnxRuntimeGenAI, a custom Semantic Kernel connector that simplifies the integration.

Instructions for Use

Prerequisites

Download the ONNX model you want to use, such as Phi-3 Mini 4K Instruct. The commands below enable Git LFS and then clone the model repository:

git lfs install
git clone https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx

Demo

First, create a new console application and add the NuGet package that matches your hardware:

-- for CPU
feiyun0112.SemanticKernel.Connectors.OnnxRuntimeGenAI.CPU

-- for CUDA
feiyun0112.SemanticKernel.Connectors.OnnxRuntimeGenAI.CUDA
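
If you prefer the command line, the setup above might look like the sketch below. It assumes the .NET SDK is installed. Phi3Demo is only a placeholder project name, and TARGET is a helper variable introduced here to pick between the CPU and CUDA packages.

```shell
# Hardware target: "CPU" or "CUDA", matching the two package variants listed above.
TARGET="CPU"
PACKAGE="feiyun0112.SemanticKernel.Connectors.OnnxRuntimeGenAI.${TARGET}"

if command -v dotnet >/dev/null 2>&1; then
    # Phi3Demo is a placeholder project name; use any name you like.
    dotnet new console -o Phi3Demo
    cd Phi3Demo
    # Add the connector package that matches the chosen hardware target.
    dotnet add package "$PACKAGE"
else
    echo "The .NET SDK was not found; install it, then add $PACKAGE" >&2
fi
```

To get the CUDA variant, set TARGET to "CUDA" and run the same commands. If NuGet reports that only prerelease versions of the package exist, dotnet add package also accepts a --prerelease flag.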

Then, with just a few lines of code, you can start generating chat content:

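using Microsoft.SemanticKernel;
// The connector's types may also need their own using directive; see the project README.

// Register the ONNX chat completion service, pointing at the downloaded model folder.
Kernel kernel = Kernel.CreateBuilder()
    .AddOnnxRuntimeGenAIChatCompletion(
        modelPath: @"d:\Phi-3-mini-4k-instruct-onnx\cpu_and_mobile\cpu-int4-rtn-block-32-acc-level-4")
    .Build();

string prompt = @"Write a joke";

// Stream the response and print each chunk as it arrives.
await foreach (string text in kernel.InvokePromptStreamingAsync<string>(
    prompt,
    new KernelArguments(new OnnxRuntimeGenAIPromptExecutionSettings() { MaxLength = 2048 })))
{
    Console.Write(text);
}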

Conclusion

The source code is publicly available, and you are welcome to download and use it from https://github.com/feiyun0112/SemanticKernel.Connectors.OnnxRuntimeGenAI.
