Can LLMs Have a Fever? Investigating The Effects of Temperature on LLM Security

Chan, David; CHOW, BAN HOE; Chan, James; Poh, Phyllis

doi:10.46254/SA05.20240024

Track: High School STEM Competition

Abstract

Large Language Models (LLMs) play a crucial role in increasing productivity across various applications, yet concerns about their security persist. We defined a novel research area for LLM security – grey box testing, which involved adjustments of partially known information about LLMs, i.e. parameters such as temperature. This is the first investigation on the impact of temperature on LLM security, more specifically, jailbreaking. Jailbreak prompts, concatenated with inappropriate questions, were sent into top-performing open-source LLMs – Vicuna, Llama 2, and Mistral, with benchmarking against ChatGPT. Experiments were repeated over 5 temperature points: 0.0, 0.25, 0.5, 0.75 and 1.0, revealing a strong correlation between temperature and jailbreaking success. An increase in temperature generally results in higher jailbreaking success for Llama 2 and ChatGPT, and the converse is true for Vicuna and Mistral. We revised our hypothesis such that LLMs with low jailbreak success rate at 0 temperature are more vulnerable to jailbreaking as temperature increases, while LLMs with high jailbreak success rate at 0 temperature are less susceptible to jailbreaking as temperature rises. This research emphasises the importance of cautious usage of open-source LLMs and highlights the need for further research under grey box settings in other areas including data extraction and hallucinations.

Can LLMs Have a Fever? Investigating The Effects of Temperature on LLM Security

David Chan, BAN HOE CHOW, James Chan & Phyllis Poh

Publisher: IEOM Society International

Track: High School STEM Competition

Abstract

Related Research