AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach
CoRR (2024)
Abstract
As Large Language Models (LLMs) gain wider adoption in various contexts, it
becomes crucial to ensure they are reasonably safe, consistent, and reliable
for the application at hand. This may require probing or auditing them. Probing
LLMs with varied iterations of a single question could reveal potential
inconsistencies in their knowledge or functionality. However, a tool for
performing such audits with a simple workflow and a low technical barrier is
lacking. In this demo, we introduce "AuditLLM," a novel tool designed to
evaluate the performance of various LLMs in a methodical way. AuditLLM's core
functionality lies in its ability to test a given LLM by auditing it using
multiple probes generated from a single question, thereby identifying any
inconsistencies in the model's understanding or operation. A reasonably robust,
reliable, and consistent LLM should output semantically similar responses for a
question asked differently or by different people. Based on this assumption,
AuditLLM produces easily interpretable results regarding the LLM's
consistency from a single question that the user enters. A certain level of
inconsistency has been shown to be an indicator of potential bias,
hallucinations, and other issues. One could then use the output of AuditLLM to
further investigate issues with the aforementioned LLM. To facilitate
demonstration and practical use, AuditLLM offers two key modes: (1) Live mode,
which allows instant auditing of LLMs by analyzing responses to real-time
queries; and (2) Batch mode, which facilitates comprehensive LLM auditing by
processing multiple queries at once for in-depth analysis. This tool is
beneficial for both researchers and general users, as it enhances our
understanding of LLMs' capabilities in generating responses, using a
standardized auditing platform.
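The abstract does not specify AuditLLM's implementation, but the multi-probe idea it describes can be sketched as follows. This is a minimal illustration under stated assumptions: `query_llm` is a hypothetical stand-in for a real LLM call, and token-level Jaccard overlap is a toy proxy for the semantic-similarity comparison the paper implies.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Token-level Jaccard overlap; a crude stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def audit(question: str, probes: list[str], query_llm, threshold: float = 0.5):
    """Query the model with each rephrased probe of `question` and flag
    response pairs whose similarity falls below `threshold`."""
    answers = [query_llm(p) for p in probes]
    flags = []
    for i in range(len(answers)):
        for j in range(i + 1, len(answers)):
            sim = jaccard_similarity(answers[i], answers[j])
            if sim < threshold:
                # An inconsistent pair: same underlying question,
                # dissimilar answers.
                flags.append((probes[i], probes[j], sim))
    return answers, flags
```

A consistent model should yield an empty `flags` list across paraphrases of one question; flagged pairs mark responses worth investigating for bias or hallucination. In AuditLLM itself the probes are generated automatically from the single input question, and Batch mode would simply run this loop over many questions at once.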