====== Transfer Learning Theory Reading Group ======

**Location**: 1102

**Time**: Saturday 2pm (every other week)

In this biweekly reading group, we will read classic domain adaptation theory papers discussed in the following textbook:

[[https://www.sciencedirect.com/book/9781785482366/advances-in-domain-adaptation-theory | Advances in Domain Adaptation Theory (ADAT), Redko et al.]]

Through the readings, we hope to build a basic understanding of how and why domain adaptation algorithms fundamentally work, and under what conditions they provably fail. A few of the central quantities from these papers are summarized in //Key bounds at a glance// at the end of this page.

===== Reading Schedule =====

==== Background ====

  * ADAT chapter 1 (learning theory background)
  * ADAT chapter 2 (domain adaptation background)

==== Domain adaptation generalization bounds ====

=== Week 1: HΔH-divergence ===

  * ADAT chapter 3.1-3.4
  * Ben-David, Shai, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, and Jennifer Wortman Vaughan. "A theory of learning from different domains." Machine Learning 79, no. 1 (2010): 151-175. //(defines the HΔH-divergence, stated below; a preliminary version appeared in NIPS 2007)//

=== Week 3: Discrepancy distance I ===

  * ADAT chapter 3.5.1-3.5.2
  * Mansour, Yishay, Mehryar Mohri, and Afshin Rostamizadeh. "Domain adaptation: Learning bounds and algorithms." arXiv preprint arXiv:0902.3430 (2009). //(an improved generalization bound using the discrepancy distance, stated below)//

=== Week 5: Discrepancy distance II ===

  * ADAT chapter 3.5.3 //(a discrepancy-distance-based generalization bound for regression problems)//
  * Cortes, Corinna, and Mehryar Mohri. "Domain adaptation in regression." In International Conference on Algorithmic Learning Theory, pp. 308-323. Springer, Berlin, Heidelberg, 2011.
  * See also:
    * Cortes, Corinna, Mehryar Mohri, and Andrés Muñoz Medina. "Adaptation based on generalized discrepancy." The Journal of Machine Learning Research 20, no. 1 (2019): 1-30.
    * Maurer, Andreas. "Transfer bounds for linear feature learning." Machine Learning 75, no. 3 (2009): 327-350.

==== Impossibility theorems for domain adaptation ====

=== Week 7: Impossibility theorems ===

  * ADAT chapter 4.1-4.2
  * Ben-David, Shai, Tyler Lu, Teresa Luu, and Dávid Pál. "Impossibility theorems for domain adaptation." In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 129-136. JMLR Workshop and Conference Proceedings, 2010. //(stated in terms of the HΔH-divergence)//

=== Week 9: Hardness results ===

  * ADAT chapter 4.3-4.4
  * Ben-David, Shai, and Ruth Urner. "On the hardness of domain adaptation and the utility of unlabeled target samples." In International Conference on Algorithmic Learning Theory, pp. 139-153. Springer, Berlin, Heidelberg, 2012.

==== Integral probability metric generalization bounds ====

=== Week 11: Wasserstein distance ===

  * ADAT chapter 5.1-5.3
  * Redko, Ievgen, Amaury Habrard, and Marc Sebban. "Theoretical analysis of domain adaptation with optimal transport." In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 737-753. Springer, Cham, 2017. //(a Wasserstein-distance analogue of the earlier bounds, sketched below)//

===== Other candidate papers =====

  * Baxter, Jonathan. "A model of inductive bias learning." Journal of Artificial Intelligence Research 12 (2000): 149-198. //(shows how multi-task learning can improve generalization, assuming the target task is embedded within an environment of related tasks)//
  * ERM-based multi-source transfer learning //(recent work by Xinyi on the sample complexity of multi-source transfer learning)//
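
===== Key bounds at a glance =====

For quick reference during discussion, here is the flavor of the week 1 result, in notation loosely following ADAT. This is a sketch only; the papers state the exact assumptions. For binary classification with the 0-1 loss, Ben-David et al. (2010) bound the target error of any hypothesis h in the class H as:

<code latex>
% Target-error bound of Ben-David et al. (2010), binary 0-1 loss:
\epsilon_T(h) \;\le\; \epsilon_S(h)
  + \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T)
  + \lambda,
% where the HΔH-divergence measures how much the disagreement between
% any pair of hypotheses can differ across the two marginals:
d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T)
  = 2 \sup_{h, h' \in \mathcal{H}}
    \bigl| \Pr_{x \sim \mathcal{D}_S}[h(x) \ne h'(x)]
         - \Pr_{x \sim \mathcal{D}_T}[h(x) \ne h'(x)] \bigr|,
% and λ is the error of the ideal joint hypothesis:
\lambda = \min_{h \in \mathcal{H}} \bigl( \epsilon_S(h) + \epsilon_T(h) \bigr).
</code>

The bound is vacuous when λ is large, which is exactly the regime that the impossibility and hardness results of weeks 7 and 9 formalize.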
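
Weeks 3 and 5 replace the HΔH term with the discrepancy distance of Mansour, Mohri, and Rostamizadeh (2009), which generalizes it from the 0-1 loss to an arbitrary loss L (again a sketch in our notation):

<code latex>
% Discrepancy distance (Mansour et al., 2009) for a loss L and a
% hypothesis class H. First, the expected loss between two hypotheses
% under a distribution D:
\mathcal{L}_{\mathcal{D}}(h, h')
  = \mathbb{E}_{x \sim \mathcal{D}}\bigl[ L(h(x), h'(x)) \bigr],
% then the discrepancy is the largest change in this quantity when the
% distribution is swapped from source to target:
\mathrm{disc}_L(\mathcal{D}_S, \mathcal{D}_T)
  = \max_{h, h' \in \mathcal{H}}
    \bigl| \mathcal{L}_{\mathcal{D}_S}(h, h')
         - \mathcal{L}_{\mathcal{D}_T}(h, h') \bigr|.
</code>

For the 0-1 loss this reduces to half the HΔH-divergence; unlike the latter, it also applies to regression losses, which is what the week 5 readings exploit.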
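
Week 11 swaps the hypothesis-class-dependent divergence for the Wasserstein-1 distance. Schematically, with constants and the precise Lipschitz assumptions on the loss omitted (see Redko et al. 2017 for the exact statement):

<code latex>
% Schematic Wasserstein-based bound in the spirit of Redko et al.
% (2017): if the losses induced by hypotheses in H are K-Lipschitz in
% x, then for every h in H
\epsilon_T(h) \;\lesssim\; \epsilon_S(h)
  + K \cdot W_1(\mathcal{D}_S, \mathcal{D}_T)
  + \lambda,
% where W_1 is the Wasserstein-1 (earth mover's) distance between the
% source and target marginals, and λ is the joint error as above.
</code>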