Junyuan Hong¹, Jiachen T. Wang², Chenhui Zhang³, Zhangheng Li¹, Bo Li⁴, Zhangyang Wang¹
¹ University of Texas at Austin, ² Princeton University, ³ MIT, ⁴ University of Chicago

Abstract

(Privacy issue in LLM) LLM 은 prompt tuning 을 통해 많은 task 에서 압도적인 성능을 보여준다. 그러나, 민감한 개인 정보에 dependency 가 있는 경우 문제가 생길 수 있다. 하나의 방법은 local LLM 을 host 하여 prompt 에 녹이는 방법이지만, closed-source 일 경우 불가능하다.
( DP-OPT ) 이 논문에서는 Differentially-Private Offsite Prompt Tuning (DP-OPT) 라는 방법론을 통해 문제를 해결한다. 이 방법론은 client side 에서 prompt 를 처리하고, 이 처리된 discrete prompt 를 cloud model 에 보내서 학습을 하는 방법이다. 저자들은 이 방법론이 성능 타협 없이 prompt 를 cloud model 에 잘 전달함을 보인다.
(Differentially-private (DP) ensemble) Prompt 가 개인 정보를 누출(leak)하지 않음을 보장하기 위하여, private prompt generation 메커니즘인 Differentially-private (DP) ensemble 방법을 제안한다.
(Experiment) DP-OPT 방법은 Vicuna-7B 를 통해 privacy-preserving prompt 를 쓰면서도, (private 정보를 쓰지 않은) GPT3.5 혹은 local private prompt tuning 방법과 유사하거나 좋은 성능을 보인다.

1. INTRODUCTIONS

▶ Prompt Engineering
Large Language Model (LLM) 이 강력한 pre-training 으로 방대한 task 에서 매우 압도적인 성능을 보여주지만, prompt engineering 은 cost-efficient 하게 downstream task 에 adatable 하게 할 수 있는 방법이다. Model parameter 를 resource-heavy 하게 optimize 하는 대신, prompt engineering 은 API access 등을 통해 prompts 만을 iteratively refine 해주면 된다. Manual Prompt Engineering 은 많은 task 에서 매우 인상적인 성능을 보여줬지만, legal judgement, healthcare, art 등의 전문가적인 downstream task 에 대해서는 domain knowledge 에 기반한 prompt design 에 human experience 가 개입되어야 하는 단점이 있다. 이를 위해, data-driven prompt tuning 인 soft prompt tuning 이 고안되었고, 이 방법은 prompt 를 trainable embedding vector 로 표현한뒤 training instance 에 따라 embedding vector 를 refine 한다.

▶ Data Privacy Issue
그러나 prompt tuning 의 적용의 major한 장벽이 되는 것이 data privacy 문제이다. ChatGPT 와 같은 LLM API 에 prompt 를 넣을 때, privacy-sensitive 한 정보를 넣게 되면 문제가 발생한다. 예를 들어 1) Data Confidentiality (Confidential data 가 입력이 되는 경우) 나 2)Information Leakage (누출되면 안되는 정보가 누출되는 경우) 등이다. 이름, 주소, 전화번호 같은 개인정보가 pre-training phase 나 fine-tuning data 에 포함된다면, 특정 parmaeter 를 통해 retrieve 될 수 있다.

이 문제 해결을 위한 Straighforward 접근은 local device 에서 entire prompt process 를 진행하는 것이다. 그러나, GPT 시리즈와 같이 closed-source 모델의 경우, substantial cost 는 말할 것도 없이 local hosting 자체가 불가능하다.

▶ Differentially-Private Offsite Prompt Tuning (DP-OPT)

이 문제 해결을 위해 저자들은 Differentially-Private Offsite Prompt Tuning (DP-OPT) 방법론을 제안한다. 이 방법론은 LLM 으로 하여금 private and transferable prompt 를 cloud-hosted LLM 을 위해 가공할 수 있게 한다. 위의 그림과 같이, privacy protection 의 중요한 부분 (crux) 은 client 에서만 운용된다. Confidential datatset 으로, DP-OPT 는 적은 sample 만으로 local LLM 이 prompt 를 생성할 수 있다. 이 local assistant LLM 은 coud-based LLM 에 비해 상대적으로 매우 작다. 또한, 이 prompt generation process 는 Differentially-Private (DP) ensemble of in-context learning 으로 가능하다. 실험 결과, 여러 언어처리 태스크에서, open-source VIcuna-7B 에 tuned 된 prompt 가 closed-source 인 GPT-3.5 나 LLama-2 보다 강력한 성능을 보인다.

2. PRELIMINARIES

2.1. Large Language Models (LLMs) and Prompt Tuning.

GPT, Llama, OPT 와 같은 LLM 은 이전의 context 로 부터 다음 token 을 생성한다. 수식적으로는 conditional probability $p_{LM}^t(y|x)$ 를 생성한다. 여기서 $x$는 prompt 이고, $y$는 output, $t$ 는 temperature 이다. 이때, task intsruction 과 같은 front-end prompt $\pi$를 사용한다면, prompt tuning 은 $F(\pi,x)$ 에서 $\pi$를 potimize 하여, 최종적으로, $p_{LM}^t(y|F(\pi,x))$ 를 향상시키는 것을 목적으로 한다.

2.2. Differential Privacy

Differential Privacy 는 머신 러닝 알고리즘의 privacy guarantee 를 측정하는 de-facto gold standard 이다. 수식적으로, 특정한 space $X$ 에 대해, 두 개의 dataset $D,D’ \in \mathbb{N}^X$ 에 대해, 하나의 data point 부터 다른 data point 를 adding/removing 을 통해 생성할 수 있으면 두 데이터셋은 adjacent 하다고 한다. (e.g. $D=D’ \bigcup z$ for some $z \in X$)

이 definition 이 의미하는 바는, neighboring dataset 의 임의의 pair 에 대하여, DP 알고리즘은 구분할 수 없는(indistinguishable) output distribution 을 내뱉어야 하며, 데이터셋으로부터의 출력을 구분할 수 있는 adversary 를 방지할 수 있어야한다. 이 연구에서는 이 메커니즘 $M$ 이 prompt generation 알고리즘으로 사용된다.

3. METHOD

Assumptions
Cloud model 의 강력한 성능을 이용하기 위해, local client model 에서 prompt tuning 을 하는데 세 가지 가정을 한다.

1) Data Confidentialty : client 는 cloud-model 과 데이터를 공유하지 않는다.
2) Information Privacy : Tuned prompt 는 private info 를 누출하지 않는다.
3) Model Ownership : Cloud model 의 parameter 는 client 와 공유되지 않는다.

Threat Model
Private info 를 얻길 위하는 cloud vendor 를 adversary 로 정의한다. Adverasry 는 client 로부터 tuned prompt 만을 받아서 어떠한 LLM 이든 공격하고자 한다. 몇몇 연구에서 prompt 에서 private info 를 얻어낼 수 있음을 밝혔다.

Main Idea
Data confidentiality 와 privacy 를 보존하기 위해, 저자들은 Differentially-Private Offsite Prompt Tuning (DP-OPT) 를 제안한다. 이는 cloud model 로부터 data 와 prompt tuning 을 분리시키는 방법이다. 앞선 Figure 처럼, 1) Private Prompt Engineering 으로 localized model 에서 prompt $\pi$ 를 학습하고, 2) Prompt Transfer 로 public inference 를 통해 cloud model 로 prompt 를 deploy 한다.

이를 위해선 두 가지 major technical challenge 가 존재한다.

(1) How to engineer a model-transferable prompt?
(2) How to guarantee that the prompts do not leak private information?

3.1. TRANSFERABLE DISCRETE PROMPTS ENABLE OFFSITE PROMPT TUNING

Cloud model 로 prompt 를 transferable 하게 하기 위해서는, 어떠한 model-specific embedding 이나 tokenization 전략이 포함되지 않는 discrete prompt 가 필요하다. 최근 연구에서 discrete prompt 가 domain 에 걸쳐 자연스럽게 transferable 하다는 결과가 있다. Wen et al. 에서는 자신들의 PEZ 라는 방법을 통해 GPT-2 755M 에서의 soft prompt 가 GPT-2 1.3B 의 큰 모델이나 OPT 와 같은 다른 아키텍쳐에 쓰일 수 있음을 보였다. 그러나, 이러한 transfer 는 심각한 performance loss 를 가져온다. 위의 연구에서 밝힌 ppro trasnferability 의 주된 이유는 tuned prompt 의 incoherence 이다. 이는 방법론이 Semantic 한 것을 생성하지 않고 모델의 훈련을 촉구하기 위해서만 embedding space 에 여전히 크게 의존할 수 있음을 의미한다.

따라서 저자들은 semantically transferable prompt 를 찾기 위해 노력한다. Embedding space 의 함정을 피하기 위해, embedding space 에서 backward 하는 것이 아니라 fluent 하고 coherent prompt 를 찾기 위해 노력한다. Automatic Prompt Engineering (APE) 에 영감을 받아, LLM 이 ideal tool 을 스스로 찾게끔 한다. 잘 훈련된 LLM 이라면, APE 가 context 와 prompt smaple 을 입력으로 받아 fluent, coherent, (perhaps) transferable 한 prompt 를 생성하기를 바란다. 즉 다시 말해, LLM 이 해주길 바란다 (discrete prompts crafted by one LLM may transfer to another with target-model-dependent performance on the same task.)

Make LLM Prompt Engineer
최고의 성능을 위해 State-of-the-Art APE method 인 Deep Language Network (DLN) 를 사용한다. DLN 은 gradient-based optimization 을 mimic 하여 forward-backward 방식으로 prompt 를 학습한다. Forward pass 에서 prompt 를 생성하고, backward pass 에서 LLM 의 in-context example 을 통한 prediction 을 통해 $\pi$ 를 sample 한다. Candidate prompt set 에서 DLN-1 은 highest log prob 을 갖는 best prompt 를 선택한다.

LLM-Engineered Prompts Are Transferrable

Vicuna-7B 를 통해 DLN-1 으로 prompt 를 학습시켜보았다. 이후 더 크고 같은 형태의(homogenous-architecture) LLama-2-70B 와, closed-source model 인 Davinci-003 에 적용해보았다. 결과는 위의 표와 같이, DLN-1 은 target model 에 competitive performance 를 보인다. 심지어 Davinci-003 에 대해서는 8% 의 성능 향상도 얻는다.

실제 DLN-1이 생성한 prompt 의 예시는 아래와 같다.

3.2. DIFFERENTIALLY-PRIVATE OFFSITE PROMPT TUNING (DP-OPT)

Private Prompt Generation

Private Selection among Generated Prompts

4. EXPERIMENTS

Tasks

Setup

4.1. PRIVATE OFFSITE PROMPT TUNING

4.2. ABLATION STUDIES

Examples of Privacy Leakage in Generated Prompts

DISCUSSION AND CONCLUSION

With the rising popularity of prompt tuning, our research endeavors to extend this tool to applications with heightened privacy concerns. We introduce the pioneering end-to-end system designed to derive differentially-private prompts from confidential training datasets and deploy these prompts on cloud models. Our approach is underpinned by theoretical validations of its privacy assurances, and through empirical analysis, we highlight the advantageous balance it strikes between utility and data privacy caused by the strong performance of scaled LLMs.

초록색볼드체
초록색배경 빨간색배경

▶

Yongil Kim

[ICLR2024] DP-OPT: MAKE LARGE LANGUAGE MODEL YOUR PRIVACY-PRESERVING PROMPT ENGINEER