' for the convex case and $\\mathcal{O}(1\u002F{T}^{1\u002F4})$ bound in terms of the gradient of the Moreau envelope function for the weakly convex case. Furthermore, we provide convergence results for non-Lipschitz convex and weakly convex objective functions using proper diminishing rules on the step sizes. In particular, when $f$ is convex, we show an $\\mathcal{O}(\\log(k)\u002F\\sqrt{k})$ rate of convergence in terms of the suboptimality gap. With an additional quadratic growth condition, the rate is improved to $\\mathcal{O}(1\u002Fk)$ in terms of the squared distance to the optimal solution set. When $f$ is weakly convex, asymptotic convergence is derived. The central idea is that the dynamics of a properly chosen step size rule fully control the movement of the subgradient method, which leads to boundedness of the iterates, and then a trajectory-based analysis can be conducted to establish the desired results. To further illustrate the wide applicability of our framework, we extend the complexity results to the truncated subgradient, the stochastic subgradient, the incremental subgradient, and the proximal subgradient methods for non-Lipschitz functions. 
","authors":[{"id":"562cb08445cedb3398c96634","name":"Xiao Li"},{"id":"6324850fc03fbd5be1f8d28d","name":"Lei Zhao"},{"id":"5429e123dabfaec7081c3fad","name":"Daoli Zhu"},{"id":"5440fb46dabfae805a717115","name":"Anthony Man-Cho So"}],"create_time":"2023-05-24T04:58:48.942Z","hashs":{"h1":"rsmcc","h3":"lc"},"id":"646d8642d68f896efa0a2f16","num_citation":0,"pdf":"https:\u002F\u002Fcz5waila03cyo0tux1owpyofgoryroob.aminer.cn\u002FEB\u002FD2\u002F56\u002FEBD256D820E9B16B928AB6E51CC9FA13.pdf","title":"Revisiting Subgradient Method: Complexity and Convergence Beyond\n Lipschitz Continuity","urls":["db\u002Fjournals\u002Fcorr\u002Fcorr2305.html#abs-2305-14161","https:\u002F\u002Fdoi.org\u002F10.48550\u002FarXiv.2305.14161","https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.14161"],"venue":{"info":{"name":"CoRR"},"volume":"abs\u002F2305.14161"},"versions":[{"id":"646d8642d68f896efa0a2f16","sid":"2305.14161","src":"arxiv","year":2023},{"id":"6479e3aed68f896efa4e80c9","sid":"journals\u002Fcorr\u002Fabs-2305-14161","src":"dblp","year":2023}],"year":2023},{"authors":[{"id":"544566e8dabfae862da1abc7","name":"Peng Wang"},{"name":"Huikang Liu"},{"id":"5440fb46dabfae805a717115","name":"Anthony Man-Cho So"}],"create_time":"2023-07-31T18:33:01.128Z","hashs":{"h1":"lcpam","h2":"mebe","h3":"1npca"},"id":"64c78b983fda6d7f06db4270","num_citation":0,"pages":{"end":"712","start":"684"},"title":"Linear Convergence of a Proximal Alternating Minimization Method with Extrapolation for \\(\\boldsymbol{\\ell_1}\\) -Norm Principal Component Analysis.","urls":["db\u002Fjournals\u002Fsiamjo\u002Fsiamjo33.html#0098LS23","https:\u002F\u002Fdoi.org\u002F10.1137\u002F21m1434507"],"venue":{"info":{"name":"SIAM J. 
Optim."},"issue":"2","volume":"33"},"versions":[{"id":"64c78b983fda6d7f06db4270","sid":"journals\u002Fsiamjo\u002F0098LS23","src":"dblp","year":2023}],"year":2023},{"abstract":"Given the network and the time-triggered flow requests of a Time Sensitive Network (TSN), configuring the gate control lists (GCL) of IEEE 802.1Qbv for the ports of each node can be formulated as a Job Shop Scheduling Problem, which is NP-hard. At present, most of the existing heuristic solutions for such problems consider scenarios where all given traffic flows can be scheduled. In order to solve the undetermined flow scheduling problem in scenarios no matter whether the flows can be scheduled or not, we propose to maximize the remaining time in conjunction with optimizing the network utilization instead of only minimizing the flowspan. Though the new problem is still NP-hard, it is a unified framework capable of covering general scenarios. On the basis of the new framework, we propose a novel Mixed initial population Genetic Algorithm (MGA) to solve the problem. Extensive simulation evaluation shows that MGA performs better and faster in different network scenarios while other methods prevail only in specific scenarios. 
This feature makes the method attractive in realistic TSN scheduling applications because in most cases it is hard for users to properly classify the problem.","authors":[{"email":"mwyao@xidian.edu.cn","id":"54087157dabfae450f41ab29","name":"Mingwu Yao","org":"Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China","orgs":["Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China"]},{"id":"6537908450dee4c422827e78","name":"Jiamu Liu","org":"Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China","orgs":["Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China"]},{"name":"Jing Du","org":"Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China","orgs":["Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China"]},{"id":"65173f4d768b11dc72aa885e","name":"Dongqi Yan","org":"Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China","orgs":["Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China"]},{"email":"yanxi.zhang@stu.xidian.edu.cn","id":"64db4275b2c670017460f0f7","name":"Yanxi Zhang","org":"Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China","orgs":["Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China"]},{"email":"liuweixd@mail.xidian.edu.cn","id":"544096ffdabfae805a6d7cf2","name":"Wei Liu","org":"Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China","orgs":["Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China"]},{"id":"5440fb46dabfae805a717115","name":"Anthony Man-Cho So","org":"Chinese Univ Hongkong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China","orgs":["Chinese Univ Hongkong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China"]}],"create_time":"2023-07-31T20:53:22.332Z","hashs":{"h1":"ufsmt","h3":"sn"},"id":"64c78b8f3fda6d7f06dafd22","issn":"1389-1286","num_citation":0,"title":"A unified flow scheduling method for time sensitive 
networks.","urls":["https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1016\u002Fj.comnet.2023.109847","http:\u002F\u002Fwww.webofknowledge.com\u002F","db\u002Fjournals\u002Fcn\u002Fcn233.html#YaoLDYZLS23","https:\u002F\u002Fdoi.org\u002F10.1016\u002Fj.comnet.2023.109847"],"venue":{"info":{"name":"Comput. Networks"},"volume":"233"},"versions":[{"id":"64c78b8f3fda6d7f06dafd22","sid":"journals\u002Fcn\u002FYaoLDYZLS23","src":"dblp","year":2023},{"id":"64f96ea03fda6d7f06dcd2f8","sid":"WOS:001034336100001","src":"wos","year":2023},{"id":"655c63f3939a5f4082d5fd91","sid":"10.1016\u002Fj.comnet.2023.109847","src":"acm","year":2023}],"year":2023},{"abstract":" Fine-tuning a pre-trained model (such as BERT, ALBERT, RoBERTa, T5, GPT, etc.) has proven to be one of the most promising paradigms in recent NLP research. However, numerous recent works indicate that fine-tuning suffers from the instability problem, i.e., tuning the same model under the same setting results in significantly different performance. Many recent works have proposed different methods to solve this problem, but there is no theoretical understanding of why and how these methods work. In this paper, we propose a novel theoretical stability analysis of fine-tuning that focuses on two commonly used settings, namely, full fine-tuning and head tuning. We define the stability under each setting and prove the corresponding stability bounds. The theoretical bounds explain why and how several existing methods can stabilize the fine-tuning procedure. In addition to being able to explain most of the observed empirical discoveries, our proposed theoretical analysis framework can also help in the design of effective and provable methods. Based on our theory, we propose three novel strategies to stabilize the fine-tuning procedure, namely, Maximal Margin Regularizer (MMR), Multi-Head Loss (MHLoss), and Self Unsupervised Re-Training (SURT). 
We extensively evaluate our proposed approaches on 11 widely used real-world benchmark datasets, as well as hundreds of synthetic classification datasets. The experiment results show that our proposed methods significantly stabilize the fine-tuning procedure and also corroborate our theoretical analysis. ","authors":[{"id":"542e56c3dabfae4b91c40e9d","name":"Zihao Fu"},{"id":"5440fb46dabfae805a717115","name":"Anthony Man-Cho So"},{"id":"5447fb82dabfae87b7dbaf62","name":"Nigel Collier"}],"create_time":"2023-01-29T09:25:10.493Z","hashs":{"h1":"safpm"},"id":"63d340e790e50fcafd910644","lang":"en","num_citation":1,"pdf":"https:\u002F\u002Fcz5waila03cyo0tux1owpyofgoryroob.aminer.cn\u002F06\u002F64\u002F10\u002F066410D2AD011A08657623EA34BBCDF6.pdf","pdf_src":["https:\u002F\u002Farxiv.org\u002Fpdf\u002F2301.09820"],"title":"A Stability Analysis of Fine-Tuning a Pre-Trained Model","update_times":{"u_a_t":"2023-01-29T09:43:29.053Z","u_c_t":"2023-10-24T05:57:58.391Z"},"urls":["https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.09820"],"versions":[{"id":"63d340e790e50fcafd910644","sid":"2301.09820","src":"arxiv","year":2023}],"year":2023},{"abstract":" This paper investigates the problem of exact community recovery in the symmetric $d$-uniform $(d \\geq 2)$ hypergraph stochastic block model ($d$-HSBM). In this model, a $d$-uniform hypergraph with $n$ nodes is generated by first partitioning the $n$ nodes into $K\\geq 2$ equal-sized disjoint communities and then generating hyperedges with a probability that depends on the community memberships of $d$ nodes. Despite the non-convex and discrete nature of the maximum likelihood estimation problem, we develop a simple yet efficient iterative method, called the \\emph{projected tensor power method}, to tackle it. 
As long as the initialization satisfies a partial recovery condition in the logarithmic degree regime of the problem, we show that our proposed method can exactly recover the hidden community structure down to the information-theoretic limit with high probability. Moreover, our proposed method exhibits a competitive time complexity of $\\mathcal{O}(n\\log^2n\u002F\\log\\log n)$ when the aforementioned initialization condition is met. We also conduct numerical experiments to validate our theoretical findings. ","authors":[{"id":"63245f8bc03fbd5be1f673d3","name":"Jinxin Wang","org":"The Chinese University of Hong Kong","orgid":"5f71b2961c455f439fe3ce4d","orgs":["The Chinese University of Hong Kong"]},{"id":"6372edd1ec88d95668d33f86","name":"Yuen-Man Pun","org":"Australian National University","orgid":"5f71b2971c455f439fe3cead","orgs":["Australian National University"]},{"id":"562b8a1545cedb3398a9cceb","name":"Xiaolu Wang","org":"The Chinese University of Hong Kong","orgid":"62331e330a6eb147dca8a71a","orgs":["The Chinese University of Hong Kong"]},{"id":"544566e8dabfae862da1abc7","name":"Peng Wang","org":"University of Michigan","orgid":"5f71b4bc1c455f439fe4c2d3","orgs":["University of Michigan"]},{"id":"5440fb46dabfae805a717115","name":"Anthony Man-Cho So","org":"The Chinese University of Hong Kong","orgid":"62331e330a6eb147dca8a71a","orgs":["The Chinese University of Hong Kong"]}],"create_time":"2023-07-04T04:59:46.756Z","hashs":{"h1":"ptpmh","h3":"cr"},"id":"64a39885d68f896efa31dfbe","num_citation":1,"pages":{"end":"36307","start":"36285"},"pdf":"https:\u002F\u002Fcz5waila03cyo0tux1owpyofgoryroob.aminer.cn\u002F83\u002FBF\u002F36\u002F83BF362AD3EDE8D7B5FC1C89F44EC3DF.pdf","title":"Projected Tensor Power Method for Hypergraph Community 
Recovery","update_times":{"u_c_t":"2023-10-24T06:43:21.328Z"},"urls":["db\u002Fconf\u002Ficml\u002Ficml2023.html#WangPW0S23","https:\u002F\u002Fproceedings.mlr.press\u002Fv202\u002Fwang23af.html","https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.00210","https:\u002F\u002Ficml.cc\u002FConferences\u002F2023\u002FSchedule?type=Poster"],"venue":{"info":{"name":"ICML 2023"}},"versions":[{"id":"64a39885d68f896efa31dfbe","sid":"2307.00210","src":"arxiv","year":2023},{"id":"64be63cf3fda6d7f063ece8d","sid":"icml2023#24700","src":"conf_icml","year":2023},{"id":"64f561633fda6d7f06f1b6d4","sid":"conf\u002Ficml\u002FWangPW0S23","src":"dblp","year":2023}],"year":2023}],"profilePubsTotal":213,"profilePatentsPage":0,"profilePatents":null,"profilePatentsTotal":null,"profilePatentsEnd":false,"profileProjectsPage":1,"profileProjects":{"success":true,"msg":"","data":null,"log_id":"2Yr5EdnUkhp7cxUmffpEPU8QrXl"},"profileProjectsTotal":0,"newInfo":null,"checkDelPubs":[]}};