A Marketplace for Data: An Algorithmic Solution

被引:148
作者
Agarwal, Anish [1 ]
Dahleh, Munther [1 ]
Sarkar, Tuhin [1 ]
机构
[1] MIT, Cambridge, MA 02139 USA
来源
ACM EC '19: PROCEEDINGS OF THE 2019 ACM CONFERENCE ON ECONOMICS AND COMPUTATION | 2019年
关键词
Data Marketplaces; Value of Data; Shapley Value; Online Combinatorial Auctions; DESIGN;
D O I
10.1145/3328526.3329589
中图分类号
TP39 [计算机的应用];
学科分类号
081203 ; 0835 ;
摘要
In this work, we aim to design a data marketplace; a robust real-time matching mechanism to efficiently buy and sell training data for Machine Learning tasks. While the monetization of data and pre-trained models is an essential focus of industry today, there does not exist a market mechanism to price training data and match buyers to sellers while still addressing the associated (computational and other) complexity. The challenge in creating such a market stems from the very nature of data as an asset: (i) it is freely replicable; (ii) its value is inherently combinatorial due to correlation with signal in other data; (iii) prediction tasks and the value of accuracy vary widely; (iv) usefulness of training data is difficult to verify a priori without first applying it to a prediction task. As our main contributions we: (i) propose a mathematical model for a two-sided data market and formally define the key associated challenges; (ii) construct algorithms for such a market to function and analyze how they meet the challenges defined. We highlight two technical contributions: (i) a new notion of "fairness" required for cooperative games with freely replicable goods; (ii) a truthful, zero regret mechanism to auction a class of combinatorial goods based on utilizing Myerson's payment function and the Multiplicative Weights algorithm. These might be of independent interest.
引用
收藏
页码:701 / 726
页数:26
相关论文
共 32 条
[1]  
Aiello B, 2001, LECT NOTES COMPUT SC, V2045, P119
[2]  
[Anonymous], 2012, P INT WORKSH INT NET, DOI 10.1007/978-3-642-35311-6_28
[3]  
[Anonymous], TECHNICAL REPORT
[4]  
[Anonymous], 2012, J PREDICT MARKETS
[5]  
Arora Sanjeev, 2012, Theory of computing, V8, P121, DOI [DOI 10.4086/TOC.2012.V008A006, 10.4086/toc.2012.v008a006]
[6]  
Auer P., 2003, Journal of Machine Learning Research, V3, P397, DOI 10.1162/153244303321897663
[7]  
Babaioff Moshe, 2012, P 13 ACM C ELECT COM, P92
[8]   Approximating power indices: theoretical and empirical analysis [J].
Bachrach, Yoram ;
Markakis, Evangelos ;
Resnick, Ezra ;
Procaccia, Ariel D. ;
Rosenschein, Jeffrey S. ;
Saberi, Amin .
AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2010, 20 (02) :105-122
[9]  
Balkanski Eric, 2017, P 31 INT C NEURAL IN, V30, P6221
[10]   The Design and Price of Information [J].
Bergemann, Dirk ;
Bonatti, Alessandro ;
Smolin, Alex .
AMERICAN ECONOMIC REVIEW, 2018, 108 (01) :1-48