Sample compression, learnability, and the Vapnik-Chervonenkis dimension

Cited by: 2
Authors
Floyd, S [1]
Warmuth, M [1]
Affiliation
[1] UNIV CALIF SANTA CRUZ, DEPT COMP SCI, SANTA CRUZ, CA 95064 USA
Keywords
sample compression; Vapnik-Chervonenkis dimension; pac-learning;
DOI
10.1023/A:1022660318680
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Within the framework of pac-learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size k for a concept class C ⊆ 2^X consists of a compression function and a reconstruction function. The compression function receives a finite sample set consistent with some concept in C and chooses a subset of k examples as the compression set. The reconstruction function forms a hypothesis on X from a compression set of k examples. For any sample set of a concept in C, the compression set produced by the compression function must lead to a hypothesis consistent with the whole original sample set when it is fed to the reconstruction function. We demonstrate that the existence of a sample compression scheme of fixed size for a class C is sufficient to ensure that the class C is pac-learnable. Previous work has shown that a class is pac-learnable if and only if the Vapnik-Chervonenkis (VC) dimension of the class is finite. In the second half of this paper we explore the relationship between sample compression schemes and the VC dimension. We define maximum and maximal classes of VC dimension d. For every maximum class of VC dimension d, there is a sample compression scheme of size d, and for sufficiently large maximum classes there is no sample compression scheme of size less than d. We briefly discuss classes of VC dimension d that are maximal but not maximum. It is an open question whether every class of VC dimension d has a sample compression scheme of size O(d).
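To illustrate the compression/reconstruction pairing described in the abstract, the following is a minimal sketch (not from the paper; names and structure are illustrative) of a size-2 sample compression scheme for the concept class of closed intervals on the real line, a class of VC dimension 2:

```python
# Hypothetical sketch of a size-2 sample compression scheme for closed
# intervals on the real line. A labeled sample is a list of (x, label)
# pairs consistent with some interval [a, b].

def compress(sample):
    """Compression function: keep at most two examples, the leftmost
    and rightmost positively labeled points."""
    positives = [x for x, label in sample if label]
    if not positives:
        return []  # empty compression set encodes the empty concept
    return [min(positives), max(positives)]

def reconstruct(compression_set):
    """Reconstruction function: form a hypothesis (a boolean classifier
    on the reals) from the compression set alone."""
    if not compression_set:
        return lambda x: False
    lo, hi = min(compression_set), max(compression_set)
    return lambda x: lo <= x <= hi

# The reconstructed hypothesis is consistent with the whole original
# sample, as the definition of a compression scheme requires:
sample = [(-1.0, False), (0.5, True), (2.0, True), (3.5, False)]
h = reconstruct(compress(sample))
assert all(h(x) == label for x, label in sample)
```

Consistency holds because the reconstructed interval is contained in any interval consistent with the sample, so negatives stay outside it, while every positive lies between the two retained endpoints.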
Pages: 269-304
Page count: 36