On the Power of Multitask Representation Learning in Linear MDP
Rui Lu, Gao Huang, Simon S. Du. arXiv:2106.08053, 2021.

Abstract: ... where $\mathcal{C}(\Phi)$ is the complexity measure of the representation class $\Phi$, $d$ is the dimension of the representation (usually $d \ll \mathcal{C}(\Phi)$), and $n$ is the number of samples for the new task. Thus the required $n$ is $O(\kappa d H^4)$ for the sub-optimality to be close to zero, which is much smaller than the $O(\mathcal{C}(\Phi)^2 \kappa d H^4)$ needed in the setting without multitask representation learning, whose sub-optimality gap is $\tilde{O}(H^2\sqrt{\frac{\kappa \mathcal{C}(\Phi)^2 d}{n}})$. This theoretically explains the power of multitask representation learning in reducing sample complexity. Further, we note that to ensure high sample efficiency, the LAFA criterion $\kappa$ should be small; in fact, $\kappa$ varies widely in magnitude depending on the sampling distribution for the new task. This indicates that adaptive sampling techniques are important for making $\kappa$ depend solely on $d$. Finally, we provide empirical results on a noisy grid-world environment to corroborate our theoretical findings.
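
The bound comparison in this abstract can be made explicit by solving each stated sub-optimality gap for the sample size $n$ at a target accuracy $\epsilon$. The LaTeX sketch below does this; treating the multitask gap as the stated gap with the $\mathcal{C}(\Phi)^2$ factor removed is an assumption made here for illustration, and the precise definition of the LAFA criterion $\kappa$ is taken as given.

```latex
% Sketch: solving the stated sub-optimality gaps for the sample size n.
% Assumption (for illustration only): the multitask bound has the same form
% with the C(Phi)^2 factor removed, which is what makes n = O(kappa d H^4 / eps^2) sufficient.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Without multitask representation learning the gap is
$\tilde{O}\!\left(H^{2}\sqrt{\kappa\,\mathcal{C}(\Phi)^{2} d / n}\right)$, so
\[
  H^{2}\sqrt{\frac{\kappa\,\mathcal{C}(\Phi)^{2} d}{n}} \le \epsilon
  \iff
  n \ge \frac{\kappa\,\mathcal{C}(\Phi)^{2} d\, H^{4}}{\epsilon^{2}} .
\]
With a representation learned from the upstream tasks the $\mathcal{C}(\Phi)^{2}$
factor drops out, leaving $n \ge \kappa\, d\, H^{4}/\epsilon^{2}$, i.e.\ a saving
of a factor $\mathcal{C}(\Phi)^{2} \gg 1$ whenever $d \ll \mathcal{C}(\Phi)$.
\end{document}
```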
A Deep Prediction Network for Understanding Advertiser Intent and Satisfaction
Liyi Guo, Rui Lu, Haoqi Zhang, Junqi Jin, Zhenzhe Zheng, Fan Wu, Jin Li, Haiyang Xu, Han Li, Wenkai Lu, Jian Xu, Kun Gai. CIKM 2020, pp. 2501-2508. DOI: 10.1145/3340531.3412681. arXiv:2008.08931.

Abstract: For e-commerce platforms such as Taobao and Amazon, advertisers play an important role in the entire digital ecosystem: their behaviors explicitly influence users' browsing and shopping experience, and, more importantly, their advertising expenditure constitutes a primary source of platform revenue. Providing better services for advertisers is therefore essential to the long-term prosperity of e-commerce platforms. To achieve this goal, the ad platform needs an in-depth understanding of advertisers in terms of both their marketing intents and their satisfaction with advertising performance, based on which further optimization can be carried out to serve advertisers in the right direction. In this paper, we propose a novel Deep Satisfaction Prediction Network (DSPN), which models advertiser intent and satisfaction simultaneously. It employs a two-stage network structure in which the advertiser intent vector and satisfaction are jointly learned from features of the advertiser's actions and advertising performance indicators. Experiments on an Alibaba advertisement dataset and online evaluations show that DSPN outperforms state-of-the-art baselines and maintains stable AUC in the online environment. Further analyses show that DSPN not only predicts advertisers' satisfaction accurately but also learns an explainable advertiser intent, revealing opportunities to further optimize advertising performance.
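
The two-stage structure described in the abstract can be illustrated with a short sketch. The PyTorch-style code below is a minimal reading of the idea, assuming a first stage that encodes advertiser action features into an intent vector and a second stage that predicts satisfaction from that vector together with performance indicators; the layer sizes, feature names (`action_feats`, `perf_feats`), and loss are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical sketch of a two-stage intent/satisfaction network in the spirit
# of DSPN; all names and dimensions here are illustrative assumptions.
import torch
import torch.nn as nn

class IntentSatisfactionNet(nn.Module):
    def __init__(self, action_dim: int, perf_dim: int, intent_dim: int = 32):
        super().__init__()
        # Stage 1: encode advertiser action features into an intent vector.
        self.intent_encoder = nn.Sequential(
            nn.Linear(action_dim, 64), nn.ReLU(),
            nn.Linear(64, intent_dim),
        )
        # Stage 2: predict satisfaction from the intent vector plus
        # advertising performance indicators.
        self.satisfaction_head = nn.Sequential(
            nn.Linear(intent_dim + perf_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, action_feats: torch.Tensor, perf_feats: torch.Tensor):
        intent = self.intent_encoder(action_feats)             # (B, intent_dim)
        logits = self.satisfaction_head(
            torch.cat([intent, perf_feats], dim=-1)
        ).squeeze(-1)                                           # (B,)
        return intent, logits

# Joint training: satisfaction is supervised with a binary label; the intent
# vector is learned implicitly as the shared representation.
model = IntentSatisfactionNet(action_dim=16, perf_dim=8)
action = torch.randn(4, 16)
perf = torch.randn(4, 8)
label = torch.randint(0, 2, (4,)).float()
_, logits = model(action, perf)
loss = nn.functional.binary_cross_entropy_with_logits(logits, label)
loss.backward()
```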
Satisfaction","update_times":{"u_a_t":"2020-08-22T13:05:25.58Z","u_c_t":"2023-11-05T18:06:48.053Z","u_v_t":"2023-04-12T14:21:00.718Z"},"urls":["http:\u002F\u002Fwww.webofknowledge.com\u002F","https:\u002F\u002Farxiv.org\u002Fabs\u002F2008.08931","https:\u002F\u002Fdoi.org\u002F10.1145\u002F3340531.3412681","https:\u002F\u002Fdblp.uni-trier.de\u002Fdb\u002Fconf\u002Fcikm\u002Fcikm2020.html#GuoLZJZWLXLLXG20","https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fpdf\u002F10.1145\u002F3340531.3412681","https:\u002F\u002Fdl.acm.org\u002Fdoi\u002Fabs\u002F10.1145\u002F3340531.3412681","https:\u002F\u002Fdblp.org\u002Frec\u002Fconf\u002Fcikm\u002FGuoLZJZWLXLLXG20","db\u002Fjournals\u002Fcorr\u002Fcorr2008.html#abs-2008-08931"],"venue":{"info":{"name":"CIKM '20: The 29th ACM International Conference on Information and Knowledge Management\n\t\t Virtual Event\n\t\t Ireland\n\t\t October, 2020","name_s":"CIKM"},"volume":"abs\u002F2008.08931"},"venue_hhb_id":"5ea1b61aedb6e7d53c00c891","versions":[{"id":"5f3f9c6291e011d38f9243c8","sid":"2008.08931","src":"arxiv","year":2020},{"id":"5f8d6be69fced0a24bbaaf55","sid":"10.1145\u002F3340531.3412681","src":"acm","vsid":"cikm","year":2020},{"id":"5ff882fc91e011c83267151f","sid":"conf\u002Fcikm\u002FGuoLZJZWLXLLXG20","src":"dblp","vsid":"conf\u002Fcikm","year":2020},{"id":"5ff68c78d4150a363cd2826c","sid":"3102471315","src":"mag","vsid":"1194094125","year":2020},{"id":"645647aed68f896efae386b4","sid":"journals\u002Fcorr\u002Fabs-2008-08931","src":"dblp","year":2020},{"id":"64d46afb3fda6d7f068fb1f7","sid":"WOS:000749561302105","src":"wos","year":2020}],"year":2020},{"abstract":"Speech separation aims to separate individual voices from an audio mixture of multiple simultaneous talkers. Audio-only approaches show unsatisfactory performance when the speakers are of the same gender or share similar voice characteristics. This is due to challenges on learning appropriate feature representations for separating voices in single frames and streaming voices across time. Visual signals of speech (e.g., lip movements), if available, can be leveraged to learn better feature representations for separation. In this paper, we propose a novel audio–visual deep clustering model (AVDC) to integrate visual information into the process of learning better feature representations (embeddings) for Time–Frequency (T–F) bin clustering. It employs a two-stage audio–visual fusion strategy where speaker-wise audio–visual T–F embeddings are first computed after the first-stage fusion to model the audio–visual correspondence for each speaker. In the second-stage fusion, audio–visual embeddings of all speakers and audio embeddings calculated by deep clustering from the audio mixture are concatenated to form the final T–F embedding for clustering. Through a series of experiments, the proposed AVDC model is shown to outperform the audio-only deep clustering and utterance-level permutation invariant training baselines and three other state-of-the-art audio–visual approaches. Further analyses show that the AVDC model learns a better T–F embedding for alleviating the source permutation problem across frames. 
Other experiments show that the AVDC model is able to generalize across different numbers of speakers between training and testing and shows some robustness when visual information is partially missing.
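
The second-stage fusion described in the abstract, concatenating speaker-wise audio-visual embeddings with the mixture's audio-only embeddings before clustering T–F bins, can be sketched as follows. The shapes, variable names, and the use of k-means here are assumptions for illustration, not the paper's exact pipeline.

```python
# Illustrative sketch of the second-stage fusion idea: per T-F bin, concatenate
# speaker-wise audio-visual embeddings with an audio-only deep-clustering
# embedding of the mixture, then cluster the bins to form separation masks.
import numpy as np
from sklearn.cluster import KMeans

def fuse_and_cluster(audio_emb, av_embs, n_speakers):
    """
    audio_emb: (T*F, Da) audio-only embeddings of the mixture's T-F bins.
    av_embs:   list of n_speakers arrays, each (T*F, Dv), speaker-wise
               audio-visual embeddings from the first-stage fusion.
    Returns a (T*F,) array of cluster labels, one speaker per bin.
    """
    fused = np.concatenate([audio_emb] + av_embs, axis=1)    # (T*F, Da + n*Dv)
    labels = KMeans(n_clusters=n_speakers, n_init=10).fit_predict(fused)
    return labels

# Toy usage with random embeddings for a 2-speaker mixture.
tf_bins, Da, Dv = 1000, 40, 20
audio_emb = np.random.randn(tf_bins, Da)
av_embs = [np.random.randn(tf_bins, Dv) for _ in range(2)]
labels = fuse_and_cluster(audio_emb, av_embs, n_speakers=2)
masks = [(labels == k).astype(float) for k in range(2)]      # binary T-F masks
```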
Processing"},"issue":"11","volume":"27"},"venue_hhb_id":"61aaadb650d7dfeeb22893ad","versions":[{"id":"5d44414f275ded87f9801d86","sid":"8762221","src":"ieee","vsid":"6570655","year":2019},{"id":"5d79a6fa47c8f76646d43e90","sid":"journals\u002Ftaslp\u002FLuDZ19","src":"dblp","vsid":"journals\u002Ftaslp","year":2019},{"id":"5dc1497edf1a9c0c414d3ce0","sid":"3370717","src":"acm","vsid":"J1508","year":2019},{"id":"5d9ed7a247c8f76646005212","sid":"2959214850","src":"mag","vsid":"199497470","year":2019},{"id":"5fc9dca0b0d046820d3d3c5f","sid":"WOS:000480309600004","src":"wos","vsid":"IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING","year":2019},{"id":"62376bdc5aee126c0f0b3ca4","sid":"8762221","src":"dblp","vsid":"journals\u002Ftaslp","year":2019},{"id":"5dc1497edf1a9c0c414d3ce0","sid":"10.1109\u002FTASLP.2019.2928140","src":"acm","year":2019}],"year":2019},{"abstract":"Source permutation, i.e., assigning separated signal snippets to wrong sources over time, is a major issue in the state-of-the-art speaker-independent speech source separation methods. In addition to auditory cues, humans also leverage visual cues to solve this problem at cocktail parties: matching lip movements with voice fluctuations helps humans to better pay attention to the speaker of interes...","authors":[{"email":"lur13@mails.tsinghua.edu.cn","id":"63ab9cb7cb0eafdb3179bcd3","name":"Rui Lu","org":"Department of Automation, Tsinghua University, Beijing, China","orgid":"5f71b2881c455f439fe3c860","orgs":["Department of Automation, Tsinghua University, Beijing, China"]},{"email":"zhiyao.duan@rochester.edu","id":"53f43745dabfaeb2ac05ab8d","name":"Zhiyao Duan","org":"Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, USA","orgid":"5f71b7181c455f439fe5c8f6","orgs":["Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, USA"]},{"id":"53f43149dabfaeb2ac01b574","name":"Changshui Zhang","org":"Department of Automation, Tsinghua University, Beijing, China","orgid":"5f71b2881c455f439fe3c860","orgs":["Department of Automation, Tsinghua University, Beijing, China"]}],"citations":{"google_citation":30,"last_citation":23},"create_time":"2018-09-03T02:41:00.669Z","doi":"10.1109\u002FLSP.2018.2853566","flags":[{"flag":"affirm_author","person_id":"53f43149dabfaeb2ac01b574"}],"hashs":{"h1":"lamas","h3":"ss"},"id":"5b8c9f4517c44af36f8b5ef1","isbn":"","issn":"1070-9908","keywords":["Visualization","Predictive models","Lips","Training","Radio frequency","Spectrogram","Source separation"],"lang":"en","num_citation":32,"num_wos_citation":0,"pages":{"end":"1319","start":"1315"},"pdf":"https:\u002F\u002Fstatic.aminer.cn\u002Fupload\u002Fpdf\u002F1690\u002F476\u002F307\u002F5b8c9f4517c44af36f8b5ef1_0.pdf","retrieve_info":{},"title":"Listen and Look: Audio-Visual Matching Assisted Speech Source Separation.","update_times":{"u_a_t":"2019-09-22T17:37:19.067Z","u_c_t":"2023-03-27T13:10:56.564Z","u_v_t":"2023-04-12T14:20:59.429Z"},"urls":["https:\u002F\u002Fdoi.org\u002F10.1109\u002FLSP.2018.2853566","https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F8404105","http:\u002F\u002Fwww.webofknowledge.com\u002F"],"venue":{"info":{"name":"IEEE Signal Processing 
Letters"},"issue":"9","volume":"25"},"venue_hhb_id":"5ea543acedb6e7d53c0344bd","versions":[{"id":"5b8c9f4517c44af36f8b5ef1","sid":"journals\u002Fspl\u002FLuDZ18","src":"dblp","vsid":"journals\u002Fspl","year":2018},{"id":"5ce2ce84ced107d4c62fae7e","sid":"2844030168","src":"mag","vsid":"120629676","year":2018},{"id":"5f31e80a9fced0a24bfc27d4","sid":"8404105","src":"ieee","vsid":"97","year":2018},{"id":"619b60611c45e57ce95d41ac","sid":"WOS:000439625100002","src":"wos","vsid":"IEEE SIGNAL PROCESSING LETTERS","year":2018}],"year":2018}],"profilePubsTotal":17,"profilePatentsPage":0,"profilePatents":null,"profilePatentsTotal":null,"profilePatentsEnd":false,"profileProjectsPage":1,"profileProjects":{"success":true,"msg":"","data":null,"log_id":"2ZMPWKt1w0xD1McTArnT7M6DXHw"},"profileProjectsTotal":0,"newInfo":null,"checkDelPubs":[]}};