An Empirical Study on Capability of Large Language Models in Understanding Code Semantics

Thu-Trang Nguyen, Thanh Trong Vu, Hieu Dinh Vo, Son Nguyen

Information and Software Technology (2025)

Faculty of Information Technology

Abstract

Large Language Models for Code (code LLMs) have demonstrated remarkable performance across various software engineering (SE) tasks, which has increased their adoption in software development. Despite this success, significant concerns remain about the actual capabilities and reliability of these models, in particular “whether these models really learn the semantics of code from the training data and leverage the learned knowledge to perform the SE tasks”. In this paper, we introduce Empica, a comprehensive framework designed to systematically and empirically evaluate the capabilities of code LLMs in understanding code semantics. Empica introduces controlled modifications/transformations into the input code and examines the models’ responses. In general, code LLMs must be robust to semantically equivalent code inputs and sensitive to non-equivalent ones. Specifically, for every SE task, given an input code snippet c and its semantically equivalent variants, code LLMs must produce consistent/equivalent outputs, while they are expected to generate different outputs for c and its semantically non-equivalent variants. Our experimental results with eight state-of-the-art code LLMs on six representative code understanding tasks reveal that the robustness and sensitivity of code LLMs to code transformations vary significantly across tasks and transformation operators. In addition, code LLMs are more robust to semantic-preserving transformations than they are sensitive to semantic non-preserving transformations. These results highlight the need to enhance the models’ capabilities of understanding code semantics, especially the sensitivity property.
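The robustness and sensitivity properties described in the abstract can be illustrated with a minimal Python sketch. This is not Empica's implementation: the two transformations (variable renaming and comparison flipping), the two stand-in "models", and the rate definitions in `consistency_rates` are all assumptions chosen for illustration, and every name below is hypothetical. It requires Python 3.9+ for `ast.unparse`.

```python
import ast

# Hypothetical input snippet c: returns the smaller of two values.
ORIGINAL = "def smaller(a, b):\n    if a < b:\n        return a\n    return b\n"

class RenameVariable(ast.NodeTransformer):
    """Semantic-preserving transformation: rename one identifier consistently."""
    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        if node.arg == self.old:
            node.arg = self.new
        return node

class FlipComparison(ast.NodeTransformer):
    """Semantic non-preserving transformation: replace `<` with `>=`."""
    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.GtE() if isinstance(op, ast.Lt) else op for op in node.ops]
        return node

def transform(source, transformer):
    """Parse the source, apply the transformer, and return the new source text."""
    tree = transformer.visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))

def executing_model(source):
    """Stand-in for a model that tracks semantics: it runs the code on a fixed input."""
    namespace = {}
    exec(source, namespace)
    return namespace["smaller"](3, 5)

def surface_model(source):
    """Stand-in for a model that only looks at surface features (identifier names)."""
    names = {n.id for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Name)}
    return tuple(sorted(names))

def consistency_rates(model, original, equivalent, non_equivalent):
    """Assumed metrics: the share of equivalent variants with an unchanged output
    (robustness) and the share of non-equivalent variants with a changed output
    (sensitivity)."""
    base = model(original)
    robustness = sum(model(v) == base for v in equivalent) / len(equivalent)
    sensitivity = sum(model(v) != base for v in non_equivalent) / len(non_equivalent)
    return robustness, sensitivity

EQUIVALENT = [transform(ORIGINAL, RenameVariable("a", "x"))]
NON_EQUIVALENT = [transform(ORIGINAL, FlipComparison())]

if __name__ == "__main__":
    for model in (executing_model, surface_model):
        print(model.__name__, consistency_rates(model, ORIGINAL, EQUIVALENT, NON_EQUIVALENT))
```

In this toy setting the executing stand-in is fully robust and fully sensitive, while the surface-level stand-in fails both properties: renaming changes its output and flipping the comparison does not. A real evaluation would replace the stand-ins with calls to a code LLM and compare task-specific outputs instead of exact values.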

Key words

Large language models for code, Code generation, Code understanding, Code semantics, Program transformation
