Truncated Linear Regression in High Dimensions

Constantinos Daskalakis, Dhruv Rohatgi, Manolis Zampetakis

NeurIPS 2020 (Advances in Neural Information Processing Systems, volume 33), 2020.

Abstract:

As in standard linear regression, in truncated linear regression, we are given access to observations $(A_i, y_i)_i$ whose dependent variable equals $y_i = A_i^{\rm T} \cdot x^* + \eta_i$, where $x^*$ is some fixed unknown vector of interest and $\eta_i$ is independent noise; except we are only given an observation if its dependent variable $y_i$ lies in some "truncation set" $S \subset \mathbb{R}$. The goal is to recover $x^*$ under some favorable conditions on the $A_i$'s and the noise distribution. We prove that there exists a computationally and statistically efficient method for recovering $k$-sparse $n$-dimensional vectors $x^*$ from $m$ truncated samples, which attains an optimal $\ell_2$ reconstruction error of $O(\sqrt{(k \log n)/m})$. As a corollary, our guarantees imply a computationally efficient and information-theoretically optimal algorithm for compressed sensing with truncation, which may arise from measurement saturation effects. Our result follows from a statistical and computational analysis of the Stochastic Gradient Descent (SGD) algorithm for solving a natural adaptation of the LASSO optimization problem that accommodates truncation. This generalizes the works of both: (1) [Daskalakis et al. 2018], where no regularization is needed due to the low-dimensionality of the data, and (2) [Wainwright 2009], where the objective function is simple due to the absence of truncation. In order to deal with both truncation and high-dimensionality at the same time, we develop new techniques that not only generalize the existing ones but we believe are of independent interest.
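
To make the observation model concrete, here is a minimal Python/NumPy sketch of the truncated sampling process the abstract describes: a sample $(A_i, y_i)$ is revealed only when $y_i$ lands in the truncation set $S$. The Gaussian design, the interval chosen for $S$, the noise scale, and all numeric values are illustrative assumptions, not anything specified by the paper.

```python
# Sketch of the truncated observation model: y_i = A_i^T x* + eta_i,
# with (A_i, y_i) observed only if y_i lies in the truncation set S.
import numpy as np

rng = np.random.default_rng(0)

def truncated_samples(x_star, m, S=(-1.0, 1.0), sigma=1.0):
    """Draw m observations (A_i, y_i), keeping a sample only when y_i is in S."""
    n = x_star.shape[0]
    lower, upper = S
    A_rows, ys = [], []
    while len(ys) < m:
        a = rng.standard_normal(n)                       # design vector A_i
        y = a @ x_star + sigma * rng.standard_normal()   # noisy response
        if lower <= y <= upper:                          # truncation: keep only if y in S
            A_rows.append(a)
            ys.append(y)
    return np.array(A_rows), np.array(ys)

# k-sparse ground truth in n dimensions (illustrative sizes)
n, k = 200, 5
x_star = np.zeros(n)
x_star[:k] = rng.standard_normal(k)
A, y = truncated_samples(x_star, m=100)
```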

Keywords: random matrix, Primal-Dual Witness, dependent variable, Stochastic Gradient Descent, unknown vector, reconstruction error, limited dependent variable, linear regression

PDF: https://arxiv.org/abs/2007.14539
Proceedings: https://proceedings.neurips.cc/paper/2020/hash/751f6b6b02bf39c41025f3bcfd9948ad-Abstract.html

Introduction

In the vanilla linear regression setting, we are given $m \geq n$ observations of the form $(A_i, y_i)$, where $A_i \in \mathbb{R}^n$, $y_i = A_i^{\rm T} x^* + \eta_i$, $x^*$ is some unknown coefficient vector that we wish to recover, and $\eta_i$ is random noise that is independent and identically distributed across different observations $i$. Under favorable conditions on the $A_i$'s and the distribution of the noise, it is well known that $x^*$ can be recovered to within $\ell_2$ reconstruction error $O(\sqrt{n/m})$.

The classical model and its associated guarantees might, however, be inadequate to address many situations which frequently arise in both theory and practice. The paper focuses on two common and widely studied deviations from the standard model. First, it is often the case that $m \ll n$, i.e. the number of observations is much smaller than the dimension of the unknown vector $x^*$. In this "under-determined" regime it is impossible to expect a non-trivial reconstruction of the underlying $x^*$, since there are infinitely many $x$ consistent with the observations, unless one exploits additional structure of $x^*$, such as sparsity. Second, as in the abstract above, the observations are truncated: a sample is revealed only when its dependent variable $y_i$ falls in the truncation set $S$.

Figure 1: Truncation in one-dimensional linear regression, along with the linear fit obtained via least squares regression before and after truncation.
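
In the spirit of Figure 1, the following sketch fits least squares before and after truncating the responses and shows the bias truncation induces. The one-dimensional setup, the truncation set $S = (-\infty, 1]$, and all constants are illustrative assumptions, not the paper's experiment.

```python
# Least squares before vs. after truncation, in one dimension (cf. Figure 1).
import numpy as np

rng = np.random.default_rng(1)
slope = 2.0
a = rng.uniform(-2, 2, size=2000)            # scalar covariates
y = slope * a + rng.standard_normal(2000)    # y_i = a_i * x* + eta_i

keep = y <= 1.0                              # truncation set S = (-inf, 1]

def ls_slope(a, y):
    """One-dimensional least squares fit through the origin."""
    return (a @ y) / (a @ a)

print("slope from all samples:      ", ls_slope(a, y))              # close to 2.0
print("slope from truncated samples:", ls_slope(a[keep], y[keep]))  # visibly biased
```

Truncating high responses preferentially discards samples with large covariates and positive noise, which is why the naive fit on truncated data is biased and why the corrected likelihood below is needed.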

Summary

Under standard conditions on the design matrix and the noise distribution, and under mild assumptions on the truncation set $S$, the authors show that the Stochastic Gradient Descent (SGD) algorithm on the truncated LASSO optimization program, their proposed adaptation of the standard LASSO optimization to accommodate truncation, is a computationally and statistically efficient method for recovering $x^*$, attaining an optimal $\ell_2$ reconstruction error of $O(\sqrt{(k \log n)/m})$, where $k$ is the sparsity of $x^*$. Computationally and statistically efficient methods for truncated linear regression were only recently obtained in [9], where it was shown that, under favorable assumptions about the $A_i$'s and the truncation set $S$, and assuming the $\eta_i$'s are drawn from a Gaussian, the negative log-likelihood of the truncated sample can be optimized efficiently and approximately recovers the true parameter vector. The overview briefly describes the approaches of [26] and [9], then outlines the additional challenges that arise when truncation and high dimensionality occur together and how they are addressed.

The first step is statistical recovery: the authors upper bound the number of samples needed for the solution of the truncated LASSO program to be close to the true coefficient vector $x^*$. They define the regularized negative log-likelihood $f : \mathbb{R}^n \to \mathbb{R}$ by $f(x) = \mathrm{nll}(x; A, y) + \lambda \lVert x \rVert_1$ and prove (Proposition 3.2) that $O(k \log n)$ samples suffice: supposing that Assumption I holds, if $(A, y)$ are $m$ samples drawn from Process 1 and $\hat{x}$ is any optimal solution to Program (2), then $\hat{x}$ approximately recovers $x^*$. The proof is a corollary of the result that $A$ satisfies the Restricted Isometry Property of [6] with high probability even when only truncated samples are observed; see Corollary G.6 and the subsequent discussion in Section G.

The second step is computational (Proposition 3.3): the algorithm described in Section 4, projected stochastic gradient descent (PSGD) with projection set $E_r$, efficiently recovers an approximate solution to Program (2). Since $f$ as defined is not strongly convex, the authors need the projection set $E_r$ of Section 4 to convert bounds on the objective value into bounds on the distance to $x^*$. Bounding the number of update steps required for the algorithm to converge to a good estimate of $x^*$ requires solving several statistical problems; combined with Theorem 5.3, these lemmas imply that the PSGD algorithm converges to a good approximation of $x^*$ in a polynomial number of updates.
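
As a rough illustration of the algorithmic template the summary describes, the sketch below runs stochastic gradient descent on the truncated-LASSO objective $f(x) = \mathrm{nll}(x; A, y) + \lambda \lVert x \rVert_1$, using the rejection-sampling gradient estimator of [9] for Gaussian noise: for a sample $(A_i, y_i)$ and $z \sim N(A_i^{\rm T} x, 1)$ conditioned on $z \in S$, the vector $A_i (z - y_i)$ is an unbiased estimate of the gradient of the per-sample negative log-likelihood. The step size, $\lambda$, and the crude $\ell_2$-ball clipping that stands in for the paper's projection set $E_r$ are simplifying assumptions, not the authors' exact algorithm; `A`, `y`, and `x_star` are reused from the sampling sketch above.

```python
# Sketch of SGD on f(x) = nll(x; A, y) + lam * ||x||_1 for truncated
# Gaussian-noise regression, with a rejection-sampling gradient estimate.
import numpy as np

rng = np.random.default_rng(2)

def truncated_lasso_sgd(A, y, S, lam=0.1, steps=20000, eta=1e-3, radius=10.0):
    lower, upper = S
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(steps):
        i = rng.integers(m)
        mu = A[i] @ x
        # rejection-sample z ~ N(mu, 1) conditioned on z in S
        z = mu + rng.standard_normal()
        while not (lower <= z <= upper):
            z = mu + rng.standard_normal()
        # A_i (z - y_i) estimates the nll gradient; add an l1 subgradient
        grad = A[i] * (z - y[i]) + lam * np.sign(x)
        x = x - eta * grad
        norm = np.linalg.norm(x)
        if norm > radius:               # crude stand-in for projecting onto E_r
            x = x * (radius / norm)
    return x

x_hat = truncated_lasso_sgd(A, y, S=(-1.0, 1.0))
print("reconstruction error:", np.linalg.norm(x_hat - x_star))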

Acknowledgments and Funding

This research was supported by NSF Awards IIS-1741137, CCF-1617730 and CCF-1901292, by a Simons Investigator Award, by the DOE PhILMs project (No. DE-AC05-76RL01830), by the DARPA award HR00111990021, by the MIT Undergraduate Research Opportunities Program, and by a Google PhD Fellowship.

References

[1] Takeshi Amemiya. Regression analysis when the dependent variable is truncated normal. Econometrica: Journal of the Econometric Society, pages 997–1016, 1973.
[2] N. Balakrishnan and Erhard Cramer. The Art of Progressive Censoring. Springer, 2014.
[3] Daniel Bernoulli. Essai d'une nouvelle analyse de la mortalité causée par la petite vérole, et des avantages de l'inoculation pour la prévenir. Histoire de l'Acad., Roy. Sci. (Paris) avec Mém., pages 1–45, 1760.
[4] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[5] Richard Breen et al. Regression Models: Censored, Sample Selected, or Truncated Data, volume 111.
[6] Emmanuel J. Candes, Justin K. Romberg, and Terence Tao. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8):1207–1223, 2006.
[7] Sylvain Chevillard. The functions erf and erfc computed with arbitrary precision and explicit error bounds. Information and Computation, 216:72–95, 2012.
[8] A. Clifford Cohen. Truncated and Censored Samples: Theory and Applications. CRC Press, 2016.
[9] Constantinos Daskalakis, Themis Gouleakis, Christos Tzamos, and Manolis Zampetakis. Computationally and statistically efficient truncated regression. In Proceedings of the 32nd Conference on Learning Theory (COLT), 2019.
[10] Mark A. Davenport, Jason N. Laska, Petros T. Boufounos, and Richard G. Baraniuk. A simple proof that random matrices are democratic. arXiv preprint arXiv:0911.0736, 2009.
[11] David L. Donoho. For most large underdetermined systems of linear equations the minimal $\ell_1$-norm solution is also the sparsest solution. Communications on Pure and Applied Mathematics, 59(6):797–829, 2006.
[12] R. A. Fisher. Properties and applications of Hh functions. Mathematical Tables, 1:815–852, 1931.
[13] Francis Galton. An examination into the registered speeds of American trotting horses, with remarks on their value as hereditary data. Proceedings of the Royal Society of London, 62(379-387):310–315, 1897.
[14] Vassilis A. Hajivassiliou and Daniel L. McFadden. The method of simulated scores for the estimation of LDV models. Econometrica, pages 863–896, 1998.
[15] Jerry A. Hausman and David A. Wise. Social experimentation, truncated distributions, and efficient estimation. Econometrica: Journal of the Econometric Society, pages 919–938, 1977.
[16] Michael P. Keane. Simulation estimation for panel data models with limited dependent variables. 1993.
[17] Jason N. Laska. Democracy in Action: Quantization, Saturation, and Compressive Sensing. PhD thesis, 2010.
[18] Gangadharrao S. Maddala. Limited-Dependent and Qualitative Variables in Econometrics. Number 3. Cambridge University Press, 1986.
[19] Karl Pearson. On the systematic fitting of frequency curves. Biometrika, 2:2–7, 1902.
[20] Karl Pearson and Alice Lee. On the generalised probable error in multiple normal correlation. Biometrika, 6(1):59–68, 1908.
[21] Mark Rudelson and Roman Vershynin. Non-asymptotic theory of random matrices: extreme singular values. In Proceedings of the International Congress of Mathematicians 2010 (ICM 2010), pages 1576–1602. World Scientific, 2010.
[22] Helmut Schneider. Truncated and Censored Samples from Normal Populations. Marcel Dekker, Inc., 1986.
[23] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267–288, 1996.
[24] James Tobin. Estimation of relationships for limited dependent variables. Econometrica: Journal of the Econometric Society, pages 24–36, 1958.
[25] Vladislav Voroninski and Zhiqiang Xu. A strong restricted isometry property, with an application to phaseless compressed sensing. Applied and Computational Harmonic Analysis, 40(2):386–395, 2016.
[26] Martin J. Wainwright. Sharp thresholds for high-dimensional and noisy sparsity recovery using $\ell_1$-constrained quadratic programming (Lasso). IEEE Transactions on Information Theory, 55(5):2183–2202, 2009.