DATA MINING AND MACHINE LEARNING IN ASTRONOMY

Cited by: 232
Authors
Ball, Nicholas M. [1]
Brunner, Robert J. [2]
Affiliations
[1] Natl Res Council Canada, Herzberg Inst Astrophys, Victoria, BC V9E 2E7, Canada
[2] Univ Illinois, Dept Astron, Urbana, IL 61801 USA
Source
INTERNATIONAL JOURNAL OF MODERN PHYSICS D | 2010 / Vol. 19 / No. 07
Keywords
Data mining; machine learning; knowledge discovery in databases; astroinformatics; astrostatistics; Virtual Observatory; DIGITAL-SKY-SURVEY; ESTIMATING PHOTOMETRIC REDSHIFTS; AUTOMATED MORPHOLOGICAL CLASSIFICATION; INDEPENDENT COMPONENT ANALYSIS; STAR-GALAXY CLASSIFICATION; ARTIFICIAL NEURAL-NETWORKS; SUPPORT VECTOR MACHINES; PRELIMINARY LUMINOSITY CLASSIFICATION; FAINT OBJECT CLASSIFICATION; COSMIC-RAY HITS;
DOI
10.1142/S0218271810017160
Chinese Library Classification
P1 [Astronomy]
Subject classification code
0704
Abstract
We review the current state of data mining and machine learning in astronomy. Data Mining can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those in which data mining techniques directly contributed to improving science, and important current and future directions, including probability density functions, parallel algorithms, Peta-Scale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box.
Pages: 1049-1106
Page count: 58