Biography
Research Interests
Two broad directions in NLP interest me most, each stemming from a distinct point of view.
First, I am curious about the inner workings of (Large) Language Models. We can throw together a nice-looking loss function, a reasonable training loop, some compute, and lots of data, and voilà! A model starts generating near-fluent text. But what does it learn? Does it reverse-engineer the rules of grammar? In this context, I am interested in two opposing approaches:
- How can we best port human knowledge of natural language (e.g. linguistic structure, disambiguation of context, and so on) to a Language Model by modifying the model, the training process, and/or the data? More practically, can this lead to better parameter and data efficiency?
- Humans find it hard to learn languages without visual cues or explanations, yet LMs do so easily (for a generous definition of easy). Do they know something we don't? Can we reverse-engineer more efficient ways to think about language from them? This question is more abstract, but it excites me as much as the previous one.
Second, as LMs become more commonplace, their potential for both benefit and harm is bound to increase. We want them to be helpful, factual, and relevant, among other desiderata. I am interested in exploring Controllable/Interpretable Text Generation to steer models towards the behavior we want and away from undesirable or harmful behavior (e.g. hallucinations).
More generally, I think NLP research is fascinating in its own right. Many of the current challenges (think ChatGPT hallucinations, lack of logical reasoning, and so on) are daunting but, by the same token, quite thrilling. I believe that, going forward, principled approaches that generalize well are the ones most likely to overcome them.
Publications (9 in total)
Signal Processing (2024): 109233
CoRR (2023): 7053-7074
Citations: 1
arXiv (Cornell University) (2023)
arXiv (2022)
Citations: 21