Basic Information
Bio
I currently work on solving the key challenges preventing Reinforcement Learning algorithms from working on real-world applications at scale. This includes a focus on Reinforcement Learning from Human Feedback (RLHF) in the context of Large Language Models (LLMs).
Research Interests
Papers (48 total)
Daniel J. Mankowitz,Andrea Michi,Anton Zhernov,Marco Gelmi,Marco Selvi,Cosmin Paduraru,Edouard Leurent,Shariq Iqbal,Jean-Baptiste Lespiau,Alex Ahern, Thomas Köppe, Kevin Millikin,
Nature, no. 7964 (2023): 257-263
CoRR, pp. 11144-11172 (2023)
Pengming Wang,Mikita Sazanovich,Berkin Ilbeyi,Phitchaya Mangpo Phothilimthana, Manish Purohit, Han Yang Tay,Ngân Vũ,Miaosen Wang,Cosmin Paduraru,Edouard Leurent,Anton Zhernov,Julian Schrittwieser,
CoRR (2023)
ArXiv (2022)
Jerry Luo,Cosmin Paduraru,Octavian Voicu,Yuri Chervonyi, Scott Munns,Jerry Li,Crystal Qian,Praneet Dutta,Jared Quincy Davis, Ningjia Wu, Xingwei Yang, Chu-Ming Chang,
CoRR (2022)
International Conference on Learning Representations (ICLR) (2022)