LLMChain: Blockchain-based Reputation System for Sharing and Evaluating Large Language Models
arXiv (2024)
Abstract
Large Language Models (LLMs) have advanced rapidly in language understanding,
generation, and reasoning, while also raising new challenges.
Despite their remarkable performance in natural language
processing-based applications, LLMs are susceptible to undesirable and erratic
behaviors, including hallucinations, unreliable reasoning, and the generation
of harmful content. These flawed behaviors undermine trust in LLMs and pose
significant hurdles to their adoption in real-world applications, such as legal
assistance and medical diagnosis, where precision, reliability, and ethical
considerations are paramount. Such behaviors can also lead to user dissatisfaction,
which is currently inadequately assessed and captured. Therefore, to
effectively and transparently assess users' satisfaction and trust in their
interactions with LLMs, we design and develop LLMChain, a decentralized
blockchain-based reputation system that combines automatic evaluation with
human feedback to assign contextual reputation scores that accurately reflect
each LLM's behavior. LLMChain not only helps users and entities identify the most
trustworthy LLM for their specific needs, but also provides LLM developers with
valuable information to refine and improve their models. To our knowledge, this
is the first time that a blockchain-based distributed framework for sharing and
evaluating LLMs has been introduced. Implemented using emerging tools, LLMChain
is evaluated across two benchmark datasets, showcasing its effectiveness and
scalability in assessing seven different LLMs.