Natural gradient works efficiently in learning

Cited by: 1955
Authors
Amari, S [1]
Affiliation
[1] RIKEN, Frontier Res Program, Wako, Saitama 35101, Japan
DOI
10.1162/089976698300017746
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When a parameter space has a certain underlying structure, the ordinary gradient of a function does not represent its steepest direction, but the natural gradient does. Information geometry is used for calculating the natural gradients in the parameter space of perceptrons, the space of matrices (for blind source separation), and the space of linear dynamical systems (for blind source deconvolution). The dynamical behavior of natural gradient online learning is analyzed and is proved to be Fisher efficient, implying that it has asymptotically the same performance as the optimal batch estimation of parameters. This suggests that the plateau phenomenon, which appears in the backpropagation learning algorithm of multilayer perceptrons, might disappear or might not be so serious when the natural gradient is used. An adaptive method of updating the learning rate is proposed and analyzed.
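As an illustration of the idea in the abstract, the following is a minimal sketch and is not taken from the paper. It assumes a one-dimensional Gaussian model with parameters (mu, sigma), whose Fisher information matrix is G = diag(1/sigma^2, 2/sigma^2), and it premultiplies the ordinary gradient of the negative log-likelihood by the inverse of G. The function names and the 1/t learning-rate schedule are assumptions chosen for illustration; the adaptive learning-rate method mentioned in the abstract is not reproduced here.

```python
import random


def grad_neg_log_lik(x, mu, sigma):
    """Ordinary gradient of -log N(x | mu, sigma^2) with respect to (mu, sigma)."""
    d_mu = -(x - mu) / sigma**2
    d_sigma = 1.0 / sigma - (x - mu) ** 2 / sigma**3
    return d_mu, d_sigma


def natural_grad(x, mu, sigma):
    """Natural gradient: the ordinary gradient premultiplied by G^{-1}.

    For the (mu, sigma) parameterization of a Gaussian,
    G = diag(1/sigma^2, 2/sigma^2), so G^{-1} = diag(sigma^2, sigma^2 / 2).
    """
    d_mu, d_sigma = grad_neg_log_lik(x, mu, sigma)
    return sigma**2 * d_mu, 0.5 * sigma**2 * d_sigma


def fit_online(samples, mu=0.0, sigma=1.0):
    """Online natural gradient learning, one sample per step.

    The 1/t learning rate is an assumed schedule, not the paper's adaptive rule.
    """
    for t, x in enumerate(samples, start=1):
        n_mu, n_sigma = natural_grad(x, mu, sigma)  # both use the old parameters
        eta = 1.0 / t
        mu -= eta * n_mu
        sigma = max(sigma - eta * n_sigma, 1e-6)  # guard: keep sigma positive
    return mu, sigma


if __name__ == "__main__":
    random.seed(0)
    data = [random.gauss(2.0, 0.5) for _ in range(20000)]
    print(fit_online(data))
```

In this toy setting the natural gradient step for mu reduces to mu + (x - mu)/t, which is the running sample mean, so the online estimate of the mean coincides with the batch maximum likelihood estimate. This is a small instance of the kind of equivalence between online and batch estimation that the abstract describes as Fisher efficiency.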
Pages: 251-276
Page count: 26