InPSItion: A Protocol Within a Protocol
This week, I ran a simulation to test how adding elliptic curve cryptography (ECC) based private set intersection (PSI) affects the efficiency and performance of a large language model (LLM) when handling user input.
I had an LLM process a list of keywords (imitating symptoms in a healthcare use case) with and without PSI. In the PSI runs, the user’s set of keywords is intersected with a server-side list before the LLM forms a response. I measured the processing time across input sizes ranging from 5 to 500 keywords.
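To make the intersection step concrete, here is a minimal sketch of a Diffie-Hellman-style PSI. It is not the simulation code from this post: for brevity it swaps the elliptic curve for plain modular exponentiation over a prime field, which has the same commutative-blinding structure. The modulus, the function names (`hash_to_group`, `blind`, `psi_intersection`), and the sample keywords are all assumptions made for illustration, and the parameters are not vetted for real security. Both parties are simulated in one process.

```python
import hashlib
import secrets

# Toy group: integers modulo the prime 2**255 - 19.
# Illustrative parameter only; a real deployment would use a vetted
# elliptic-curve group and a proper hash-to-curve method.
P = 2**255 - 19


def hash_to_group(keyword: str) -> int:
    """Map a keyword to a group element in the range [2, P - 2]."""
    digest = hashlib.sha256(keyword.strip().lower().encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 3) + 2


def new_secret() -> int:
    """Draw a random secret exponent."""
    return secrets.randbelow(P - 3) + 2


def blind(value: int, secret: int) -> int:
    """Blind a group element by raising it to a secret exponent."""
    return pow(value, secret, P)


def psi_intersection(client_keywords, server_keywords):
    """Return the client keywords that also appear in the server list."""
    a = new_secret()  # client's secret
    b = new_secret()  # server's secret

    # Client blinds its hashed keywords and sends them to the server.
    client_blinded = [blind(hash_to_group(k), a) for k in client_keywords]

    # Server blinds the client's values again (keeping their order)
    # and blinds its own hashed keywords once.
    client_double = [blind(v, b) for v in client_blinded]
    server_blinded = [blind(hash_to_group(k), b) for k in server_keywords]

    # Client applies its secret to the server's values. Because
    # (h^a)^b == (h^b)^a mod P, shared keywords produce equal values.
    server_double = {blind(v, a) for v in server_blinded}
    return [k for k, v in zip(client_keywords, client_double)
            if v in server_double]


if __name__ == "__main__":
    symptoms = ["fever", "cough", "rash"]
    server_list = ["cough", "nausea", "fever"]
    print(psi_intersection(symptoms, server_list))
```

Only blinded values cross between the two sides, so neither party sees the other's raw keywords outside the intersection. The extra exponentiations per keyword are where the added processing time comes from.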
As the results show, using PSI consistently increased processing time. This is expected: the PSI runs perform all of the additional PSI calculations, which the non-PSI runs skip entirely. The overhead (the extra computing time relative to the non-PSI run) ranged from ~28% to over 100%. For example, at 100 keywords, the response time without PSI was approximately 20.56 ms, while with PSI it increased to 28.37 ms, a ~38% increase. The relative difference was usually more pronounced at smaller keyword counts.
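The overhead figure is simply the relative increase over the non-PSI time. A small sketch of that arithmetic, together with a generic timing helper, might look like this. The names `overhead_percent` and `time_ms` are hypothetical, and the helper is not the measurement harness used for the results above.

```python
import time


def overhead_percent(baseline_ms: float, psi_ms: float) -> float:
    """Extra time with PSI, as a percentage of the non-PSI time."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (psi_ms - baseline_ms) / baseline_ms * 100.0


def time_ms(fn, repeats: int = 5) -> float:
    """Average wall-clock time of fn() in milliseconds over several runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats * 1000.0


if __name__ == "__main__":
    # The 100-keyword example from the text: 20.56 ms vs. 28.37 ms.
    print(f"{overhead_percent(20.56, 28.37):.1f}%")  # prints 38.0%
```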
These results show that while PSI does increase latency (processing time), it is still viable, at least for small-scale operations. The performance cost grows linearly as input sizes increase. However, the gain in data confidentiality might outweigh the performance cost. For applications like chatbots or emergency tools, a slight delay may be acceptable, especially when sensitive personal data is involved.
Steering away from the technical side, I’ve also started putting together my PowerPoint presentation this week. I’m focusing on keeping it intuitive, so that even people who aren’t in the cybersecurity space can leave with a good understanding of what I am trying to do here. The final takeaway I want to leave the audience with is: privacy and AI don’t have to be mutually exclusive.