
Beyond Language Models: Byte Models Are Digital World Simulators

CoRR (2024)

Abstract
Traditional deep learning often overlooks bytes, the basic units of the digital world, where all forms of information and operations are encoded and manipulated in binary format. Inspired by the success of next token prediction in natural language processing, we introduce bGPT, a model with next byte prediction to simulate the digital world. bGPT matches specialized models in performance across various modalities, including text, audio, and images, and offers new possibilities for predicting, simulating, and diagnosing algorithm or hardware behaviour. It has almost flawlessly replicated the process of converting symbolic music data, achieving a low error rate of 0.0011 bits per byte in converting ABC notation to MIDI format. In addition, bGPT demonstrates exceptional capabilities in simulating CPU behaviour, with an accuracy exceeding 99.99% in executing various operations. Leveraging next byte prediction, models like bGPT can directly learn from vast binary data, effectively simulating the intricate patterns of the digital world.
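
As a rough illustration of the two ideas the abstract relies on, the sketch below builds next byte prediction training pairs from raw bytes and computes a bits-per-byte score from the probabilities a model assigns to the true next bytes. It is not the authors' implementation: the helper names `next_byte_pairs` and `bits_per_byte` are hypothetical, and a uniform guess over all 256 byte values stands in for a trained model.

```python
import math


def next_byte_pairs(data: bytes):
    """Return (context, target) pairs: each prefix predicts the byte that follows it."""
    return [(data[:i], data[i]) for i in range(1, len(data))]


def bits_per_byte(probs):
    """Average negative log2 probability assigned to the true next byte."""
    return -sum(math.log2(p) for p in probs) / len(probs)


# Any file or string can be treated as a byte sequence.
data = "bGPT".encode("utf-8")
pairs = next_byte_pairs(data)

# Stand-in for a model: a uniform guess over the 256 possible byte values.
uniform = [1 / 256] * len(pairs)
print(bits_per_byte(uniform))  # 8.0, the worst case for an uninformed predictor
```

Under this metric an uninformed predictor scores 8 bits per byte, so the reported 0.0011 bits per byte corresponds to a model that assigns nearly all probability mass to the correct next byte.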