A theory of learning from different domains

Machine Learning - Volume 79, Issue 1-2, pp. 151-175 - 2010
Shai Ben-David1, John Blitzer2, Koby Crammer3, Alex Kulesza4, Fernando Pereira5, Jennifer Wortman Vaughan6
1David R Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
2Dept. of Computer Science, UC Berkeley, Berkeley, USA
3Department of Electrical Engineering, The Technion, Haifa, Israel
4Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA
5Google Research, Mountain View, USA
6School of Engineering & Applied Sciences, Harvard University, Cambridge, USA

Abstract

Keywords


References

Ando, R., & Zhang, T. (2005). A framework for learning predictive structures from multiple tasks and unlabeled data. Journal of Machine Learning Research, 6, 1817–1853.

Anthony, M., & Bartlett, P. (1999). Neural network learning: theoretical foundations. Cambridge: Cambridge University Press.

Bartlett, P., & Mendelson, S. (2002). Rademacher and Gaussian complexities: risk bounds and structural results. Journal of Machine Learning Research, 3, 463–482.

Batu, T., Fortnow, L., Rubinfeld, R., Smith, W., & White, P. (2000). Testing that distributions are close. In: IEEE symposium on foundations of computer science (Vol. 41, pp. 259–269).

Baxter, J. (2000). A model of inductive bias learning. Journal of Artificial Intelligence Research, 12, 149–198.

Ben-David, S., Eiron, N., & Long, P. (2003). On the difficulty of approximately maximizing agreements. Journal of Computer and System Sciences, 66, 496–514.

Ben-David, S., Blitzer, J., Crammer, K., & Pereira, F. (2006). Analysis of representations for domain adaptation. In: Advances in neural information processing systems.

Bickel, S., Brückner, M., & Scheffer, T. (2007). Discriminative learning for differing training and test distributions. In: Proceedings of the international conference on machine learning.

Bikel, D., Miller, S., Schwartz, R., & Weischedel, R. (1997). Nymble: a high-performance learning name-finder. In: Conference on applied natural language processing.

Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., & Wortman, J. (2007a). Learning bounds for domain adaptation. In: Advances in neural information processing systems.

Blitzer, J., Dredze, M., & Pereira, F. (2007b). Biographies, Bollywood, boomboxes and blenders: domain adaptation for sentiment classification. In: Proceedings of the association for computational linguistics.

Collins, M. (1999). Head-driven statistical models for natural language parsing. PhD thesis, University of Pennsylvania.

Cortes, C., Mohri, M., Riley, M., & Rostamizadeh, A. (2008). Sample selection bias correction theory. In: Proceedings of the 19th annual conference on algorithmic learning theory.

Crammer, K., Kearns, M., & Wortman, J. (2008). Learning from multiple sources. Journal of Machine Learning Research, 9, 1757–1774.

Dai, W., Yang, Q., Xue, G., & Yu, Y. (2007). Boosting for transfer learning. In: Proceedings of the international conference on machine learning.

Das, S., & Chen, M. (2001). Yahoo! for Amazon: extracting market sentiment from stock message boards. In: Proceedings of the Asia pacific finance association annual conference.

Daumé, H. (2007). Frustratingly easy domain adaptation. In: Proceedings of the association for computational linguistics.

Finkel, J. R., & Manning, C. D. (2009). Hierarchical Bayesian domain adaptation. In: Proceedings of the North American association for computational linguistics.

Heckman, J. (1979). Sample selection bias as a specification error. Econometrica, 47, 153–161.

Huang, J., Smola, A., Gretton, A., Borgwardt, K., & Schoelkopf, B. (2007). Correcting sample selection bias by unlabeled data. In: Advances in neural information processing systems.

Jiang, J., & Zhai, C. (2007). Instance weighting for domain adaptation. In: Proceedings of the association for computational linguistics.

Kifer, D., Ben-David, S., & Gehrke, J. (2004). Detecting change in data streams. In: Very large databases.

Li, X., & Bilmes, J. (2007). A Bayesian divergence prior for classification adaptation. In: Proceedings of the international conference on artificial intelligence and statistics.

Mansour, Y., Mohri, M., & Rostamizadeh, A. (2009a). Domain adaptation with multiple sources. In: Advances in neural information processing systems.

Mansour, Y., Mohri, M., & Rostamizadeh, A. (2009b). Multiple source adaptation and the Rényi divergence. In: Proceedings of the conference on uncertainty in artificial intelligence.

McAllester, D. (2003). Simplified PAC-Bayesian margin bounds. In: Proceedings of the sixteenth annual conference on learning theory.

Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of empirical methods in natural language processing.

Ratnaparkhi, A. (1996). A maximum entropy model for part-of-speech tagging. In: Proceedings of empirical methods in natural language processing.

Sugiyama, M., Suzuki, T., Nakajima, S., Kashima, H., von Bünau, P., & Kawanabe, M. (2008). Direct importance estimation for covariate shift adaptation. Annals of the Institute of Statistical Mathematics, 60, 699–746.

Thomas, M., Pang, B., & Lee, L. (2006). Get out the vote: determining support or opposition from congressional floor-debate transcripts. In: Proceedings of empirical methods in natural language processing.

Turney, P. (2002). Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the association for computational linguistics.

Vapnik, V. (1998). Statistical learning theory. New York: Wiley.

Zhang, T. (2004). Solving large-scale linear prediction problems with stochastic gradient descent. In: Proceedings of the international conference on machine learning.