Periodic watermarking for copyright protection of large language models in cloud computing security

Pei Gen Ye, Zhengdao Li, Zuopeng Yang, Pengyu Chen, Zhenxin Zhang, Ning Li, Jun Zheng*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

Large Language Models (LLMs) have become integral in advancing content understanding and generation, leading to the proliferation of Embedding as a Service (EaaS) within cloud computing platforms. EaaS leverages LLMs to offer scalable, on-demand linguistic embeddings, enhancing various cloud-based applications. However, the proprietary nature of EaaS makes it a target for model extraction attacks, where the timing of such infringements often remains elusive. This paper introduces TimeMarker, a novel framework that enhances temporal traceability in cloud computing environments by embedding distinct watermarks at different sub-periods, marking the first attempt to identify the timing of model extraction attacks. TimeMarker employs an adaptive watermark strength method based on information entropy and frequency domain transformations to refine the detection accuracy of model extraction attacks within cloud infrastructures. The granularity of time frame identification for theft improves as more sub-periods are used. Furthermore, our approach investigates single sub-period theft and extends to multi-sub-period theft scenarios where attackers steal data across many sub-periods to train their models in cloud settings. Validated across five widely used datasets, TimeMarker is capable of detecting model extraction over various sub-periods and assessing its impact on the accuracy and robustness of large models deployed in the cloud. The results demonstrate that TimeMarker effectively identifies different periods of extraction attacks, enhancing EaaS security within cloud computing and extending traditional watermarking to offer copyright protection for LLMs.
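
The core idea of embedding a distinct watermark in each sub-period can be illustrated with a minimal sketch. It is not the TimeMarker implementation: the abstract does not specify the algorithm, so everything below is an assumption made for illustration. In particular, the names `period_target`, `watermark`, and `guess_period` are hypothetical, the watermark strength is a fixed constant rather than the adaptive, entropy- and frequency-based strength the paper describes, and every embedding is watermarked rather than a selected subset.

```python
import math
import random


def unit(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def period_target(period, dim, seed=0):
    """Hypothetical: derive one secret unit vector per sub-period."""
    rng = random.Random(seed * 1_000_003 + period)
    return unit([rng.gauss(0.0, 1.0) for _ in range(dim)])


def watermark(embedding, period, strength=0.2, seed=0):
    """Blend the embedding with the sub-period's target vector.

    Assumption: `strength` is fixed here; the paper adapts it.
    """
    target = period_target(period, len(embedding), seed)
    base = unit(embedding)
    mixed = [(1.0 - strength) * b + strength * t for b, t in zip(base, target)]
    return unit(mixed)


def guess_period(embeddings, num_periods, seed=0):
    """Return the sub-period whose target is most similar on average."""
    dim = len(embeddings[0])
    scores = []
    for period in range(num_periods):
        target = period_target(period, dim, seed)
        # Both vectors are unit length, so the dot product is the cosine.
        sims = [sum(a * b for a, b in zip(unit(e), target)) for e in embeddings]
        scores.append(sum(sims) / len(sims))
    return max(range(num_periods), key=scores.__getitem__)


if __name__ == "__main__":
    rng = random.Random(42)
    clean = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(50)]
    served = [watermark(e, period=2, strength=0.3) for e in clean]
    print("estimated sub-period:", guess_period(served, num_periods=4))
```

Under these assumptions, embeddings served during one sub-period lean toward that sub-period's secret vector, so averaging cosine similarity against each candidate vector points to the period in which the data was collected. Using more sub-periods narrows the time window each one covers, which matches the abstract's remark about granularity.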

Original language: English
Article number: 103983
Journal: Computer Standards and Interfaces
Volume: 94
Publication status: Published - Aug 2025
