Planning it all out

Siddarth P.

This past week, I’ve been formulating a plan to actually integrate private set intersection (PSI) into a large language model (LLM).

When integrating PSI into LLMs, the goal is to enable privacy-preserving interactions. Take finance, for example: if a bank wanted to compare a customer’s transaction history against a fraud detection model, it would not want to expose sensitive details when “talking to” the LLM. To define the privacy constraints, I want to adhere to the Payment Card Industry Data Security Standard (PCI DSS).

PCI DSS is a set of security standards ensuring that all companies handling credit card data maintain a secure environment. It helps prevent fraud and data breaches.

Since Elliptic Curve Cryptography (ECC) is typically more efficient than traditional Diffie-Hellman key exchange, the first step is for both parties to generate ECC key pairs. Using ECC means picking a specific curve, and different curves have different strengths and weaknesses. I’m going to use SECP256R1, a widely trusted and standardized curve that’s available in the Python cryptography library.
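Here’s a minimal sketch of that key generation step with the cryptography library (the variable names are just my own illustration of the two parties):

```python
from cryptography.hazmat.primitives.asymmetric import ec

# Each party independently generates a key pair on the SECP256R1 curve.
bank_private_key = ec.generate_private_key(ec.SECP256R1())
bank_public_key = bank_private_key.public_key()

model_side_private_key = ec.generate_private_key(ec.SECP256R1())
model_side_public_key = model_side_private_key.public_key()
```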

Without getting into too much technical jargon, I’ll build the PSI protocol on top of that ECC machinery.
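To give a feel for the core trick without committing to a full implementation, here’s a toy sketch of the commutative masking behind Diffie-Hellman-style PSI. I’ve written it with plain modular exponentiation over a prime field instead of curve points (the cryptography library doesn’t expose raw scalar multiplication on arbitrary hashed points), so the prime, the item names, and the whole structure are illustrative only, not a production protocol:

```python
import hashlib
import secrets

# Toy DDH-style commutative masking over a prime field.
# This prime is illustrative, NOT a vetted production group.
P = 2**255 - 19

def hash_to_group(item):
    # Hash an item, then square it mod P so it lands in the
    # quadratic residue subgroup before masking.
    digest = hashlib.sha256(item.encode()).digest()
    return pow(int.from_bytes(digest, "big"), 2, P)

def mask(value, secret):
    return pow(value, secret, P)

# Each party picks a random secret exponent.
a = secrets.randbelow(P - 2) + 1
b = secrets.randbelow(P - 2) + 1

bank_items = {"txn:1234", "txn:5678", "txn:9012"}
model_items = {"txn:5678", "txn:3456"}

# Each side masks its hashed items and sends the masked set across.
a_masked = {mask(hash_to_group(x), a) for x in bank_items}
b_masked = {mask(hash_to_group(x), b) for x in model_items}

# Each side then applies its own secret to the other's masked values.
# Masking commutes (g**(a*b) == g**(b*a) mod P), so shared items collide
# while everything else stays hidden behind the secret exponents.
doubly_masked_a = {mask(v, b) for v in a_masked}
doubly_masked_b = {mask(v, a) for v in b_masked}

print("items in common:", len(doubly_masked_a & doubly_masked_b))
```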

To then implement this in the LLM, I’ll use an open-source model like LLaMa from Meta, which I can self-host and tweak however I need to. I can’t use something like OpenAI’s ChatGPT because I can’t modify its internal processes.
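For instance, a self-hosted LLaMa-family model can be loaded through Hugging Face transformers. The exact model ID and its access requirements here are assumptions on my part, not a final choice:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint; LLaMa-family weights are gated and require approval.
model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Flag any suspicious pattern in: ...", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```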

So, I’ll integrate the PSI into the LLM’s workflow by sending the PSI-encrypted queries from the users to the LLM backend.

I’ll then ensure that only non-sensitive data is revealed by checking the LLM’s output against the PSI-encrypted queries before anything is returned to the user.
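Here’s a rough, hypothetical sketch of that glue with stubbed-out helpers; every function name and the regex-based redaction rule are my own assumptions rather than a finished design:

```python
import re

def load_records_for(shared_ids):
    # Stub: in the real system this would fetch the bank's transaction
    # records keyed by their masked identifiers.
    return [f"record for {i}" for i in sorted(shared_ids)]

def llm_generate(prompt, context):
    # Stub standing in for a call to the self-hosted LLaMa model.
    return f"(model answer to {prompt!r} using {len(context)} records)"

def redact(text):
    # Hypothetical output filter: strip anything shaped like a raw card
    # number (13 to 19 digits, the kind of data PCI DSS puts in scope).
    return re.sub(r"\b\d{13,19}\b", "[REDACTED]", text)

def answer_query(client_masked_ids, server_masked_ids, prompt):
    # Only records whose masked IDs appear in both sets (the PSI result)
    # are ever surfaced to the model, and the answer is filtered before
    # it leaves the backend.
    shared = client_masked_ids & server_masked_ids
    context = load_records_for(shared)
    return redact(llm_generate(prompt, context))

# Toy demonstration with integer stand-ins for masked IDs.
print(answer_query({101, 202}, {202, 303}, "Is transaction 202 suspicious?"))
```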

Finally, I am still trying to settle on a quantifiable metric for the efficacy of this method. In other words, I need a way to measure how feasible my approach is, for example the runtime overhead PSI adds per query, or how often sensitive values still slip into the output.


Comments:


    Ryan_d
    Your plan to implement PSI is very interesting! Do you think different LLMs other than LLaMa could change how successful the encryptions are due to training or model structure?
    siddarth_p
    Thanks! While using different LLMs could slightly alter results, I believe that how the encryption and comparison processes are structured within the model's workflow is far more important in determining the success of the encryptions.
