Trust-Region Minimization Algorithm for Training Responses (TRMinATR): The Rise of Machine Learning Techniques

Title: Trust-Region Minimization Algorithm for Training Responses (TRMinATR): The Rise of Machine Learning Techniques
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Rafati, Jacob; DeGuchy, Omar; Marcia, Roummel F.
Conference Name: 2018 26th European Signal Processing Conference (EUSIPCO)
Date Published: Sept. 2018
Publisher: IEEE
ISBN Number: 978-9-0827-9701-5
Keywords: approximation theory, computational complexity, computer architecture, computer theory, deep learning, gradient descent methods, gradient methods, Hessian approximations, Hessian matrices, Hessian matrix inversion, learning (artificial intelligence), limited-memory BFGS quasi-Newton method, limited-memory BFGS, line-search methods, machine learning, matrix inversion, memory storage, neural networks, Newton method, nonconvex functions, optimization, quasi-Newton methods, signal processing algorithms, storage management, training, training responses, TRMinATR, trust-region methods, trust-region minimization algorithm
Abstract:

Deep learning is a highly effective machine learning technique for large-scale problems. The optimization of nonconvex functions in the deep learning literature is typically restricted to the class of first-order algorithms. These methods rely on gradient information because of the computational complexity of inverting the second-derivative Hessian matrix and the memory storage required in large-scale data problems. The reward for using second-derivative information is improved convergence properties on features typically found in a non-convex setting, such as saddle points and local minima. In this paper, we introduce TRMinATR, an algorithm based on the limited-memory BFGS quasi-Newton method using trust regions, as an alternative to gradient descent methods. TRMinATR bridges the disparity between first-order and second-order methods by continuing to use gradient information to calculate Hessian approximations. We provide empirical results on the MNIST classification task and show robust convergence with preferred generalization characteristics.
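The limited-memory BFGS Hessian approximation the abstract refers to is typically applied through the standard two-loop recursion, which computes the product of the inverse-Hessian approximation with a gradient using only stored curvature pairs. The sketch below is illustrative of that general technique, not the authors' TRMinATR implementation; the function name and NumPy setup are my own.

```python
import numpy as np

def lbfgs_two_loop(grad, s_list, y_list, gamma):
    """Approximate H_k @ grad via the L-BFGS two-loop recursion.

    s_list, y_list: recent curvature pairs s_i = x_{i+1} - x_i and
                    y_i = g_{i+1} - g_i (oldest first).
    gamma:          scaling of the initial approximation H_0 = gamma * I.
    Only the stored pairs are needed, so the full Hessian is never
    formed or inverted -- the memory savings the abstract mentions.
    """
    q = grad.copy()
    rhos = [1.0 / y.dot(s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * s.dot(q)
        alphas.append(a)
        q -= a * y
    r = gamma * q
    # Second loop: oldest pair to newest (alphas were stored newest-first).
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        beta = rho * y.dot(r)
        r += (a - beta) * s
    return r  # approximates H_k @ grad

# With a single stored pair, the recursion satisfies the secant
# condition H y = s exactly, regardless of the gamma scaling.
s = np.array([1.0, 2.0])
y = np.array([3.0, 1.0])
print(np.allclose(lbfgs_two_loop(y, [s], [y], 0.7), s))  # True
```

In a trust-region variant such as the one the paper describes, this Hessian approximation would be used inside a constrained subproblem at each iteration rather than along a line search; the recursion itself is the same gradient-only machinery in both settings.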

URL: https://ieeexplore.ieee.org/document/8553243
DOI: 10.23919/EUSIPCO.2018.8553243
Citation Key: rafati_trust-region_2018