Confidential LLM Inference

The future of LLM inference is in the cloud. Confidential LLM inference lets any client, whether an AI agent, a human being, or an edge device, use a remote LLM without leaking any data.

The libcofhe library provides a set of APIs for performing encrypted inference using these primitives, and builds on them to offer higher-level APIs for common machine learning tasks. It also ships tools for converting existing machine learning models into encrypted models, and Python bindings are provided for the libcofhe library.

First, we need to preprocess the model by converting it into a format that CoFHE can use. This step encrypts the model parameters and records which layers are encrypted, along with other metadata required for encrypted inference. The following code snippet demonstrates how to preprocess a model:

import cofhe
from transformers import AutoModelForCausalLM

# Initialize the CoFHE library using a configuration file
cofhe.init("path/to/config.json")

# Load the Hugging Face model to convert, e.g. GPT-2
model = AutoModelForCausalLM.from_pretrained("gpt2")

# "layers" lists the indices of the transformer blocks whose
# parameters should be encrypted (all 12 blocks of GPT-2 here)
config = {
    "model": model,
    "layers": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
}
cofhefy_model = cofhe.cofhefy(config)

# Save the cofhe-fied model
cofhe.save_model(cofhefy_model, "/path/to/save")
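
Encrypting every layer gives the strongest confidentiality but also the highest cost, since each encrypted layer adds homomorphic operations at inference time. If the "layers" field accepts an arbitrary subset of block indices (an assumption here; check the API reference for the exact semantics), the same call can be used to encrypt only selected layers:

# Hypothetical variant: encrypt only the first and last four
# transformer blocks, assuming "layers" accepts any subset of indices.
partial_config = {
    "model": model,
    "layers": [0, 1, 2, 3, 8, 9, 10, 11],
}
partial_model = cofhe.cofhefy(partial_config)
cofhe.save_model(partial_model, "/path/to/save_partial")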

The following code snippet demonstrates how to perform encrypted inference using CoFHE:

import cofhe

# Initialize the CoFHE library using a configuration file.
# The configuration file contains the parameters required for network
# initialization: for example, the parameters for homomorphic encryption
# and secure multi-party computation, details of the other nodes, etc.
cofhe.init("path/to/config.json")

# Load a pre-trained model; it must already be cofhe-fied
# (see the preprocessing step above)
model = cofhe.load_model("/path/to/model")

prompt = input("Enter a prompt: ")

# Encrypt the input data
encrypted_input = cofhe.encrypt(prompt)

# Perform inference on the encrypted data
encrypted_output = model(encrypted_input)

# Decrypt the output
output = cofhe.decrypt(encrypted_output)
print(output)

In this example, we first initialize the CoFHE library using a configuration file (see the API reference for the full syntax). We then load a pre-trained model that has been encrypted with CoFHE, encrypt the input data using the cofhe.encrypt function, and run inference directly on the encrypted input. Finally, we decrypt the output using the cofhe.decrypt function and print the result.
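
The exact schema of the configuration file is defined in the API reference; the sketch below only illustrates the kind of information it carries according to the description above. Every key name and value here is an assumption for illustration, not the documented format:

import json

# Hypothetical configuration, for illustration only; the real key names
# and structure are defined in the CoFHE API reference.
config = {
    # parameters for homomorphic encryption (assumed key names)
    "homomorphic_encryption": {"security_level": 128},
    # parameters for secure multi-party computation (assumed key names)
    "smpc": {"threshold": 2},
    # details of the other nodes in the network (assumed key names)
    "nodes": {
        "compute_nodes": ["10.0.0.2:50051"],
        "cofhe_node": "10.0.0.3:50052",
    },
}

with open("path/to/config.json", "w") as f:
    json.dump(config, f, indent=2)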

Here are some samples of encrypted inference in action for the GPT-2 model (not produced by this exact code; they use the C++ CoFHE library directly):

[Screenshots: encrypted GPT-2 inference as seen from the Compute Node and the User Machine]