Advanced Knowledge Distillation Techniques and Applications

Author: Bahman Moraffah
Estimated Reading Time: 10 min
Published: 2020

Introduction

Knowledge distillation is a model-compression technique in which a smaller, simpler model (the student) is trained to reproduce the behavior of a larger, more complex model (the teacher), typically by matching the teacher's output distribution rather than only the hard class labels. The goal is to retain most of the teacher's accuracy while substantially reducing the compute, memory, and latency required at inference time. The approach was popularized by Hinton et al. [1], who trained the student against the teacher's temperature-softened class probabilities; later work such as FitNets [2] extended the idea by also matching intermediate representations ("hints").
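As a concrete illustration, the classic logit-matching objective from [1] can be sketched in a few lines of PyTorch. This is a minimal sketch, not a prescribed API: the function name and the `temperature` and `alpha` hyperparameters are illustrative choices.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Hinton-style distillation loss (illustrative sketch).

    Blends a KL term between temperature-softened teacher and student
    distributions with ordinary cross-entropy on the hard labels.
    """
    # Softened distributions; T > 1 exposes the teacher's relative
    # probabilities over wrong classes (its "dark knowledge").
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)

    # The KL term is scaled by T^2 so gradient magnitudes stay
    # comparable across temperatures, as noted in Hinton et al. [1].
    kd_term = F.kl_div(soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2

    # Standard supervised loss on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term
```

In a typical training step, a frozen teacher and a trainable student are run on the same batch and gradients flow only through the student; the temperature and the weighting `alpha` are tuned per task.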

References

[1] Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the Knowledge in a Neural Network. arXiv:1503.02531.

[2] Romero, A., Ballas, N., Kahou, S. E., Chassang, A., Gatta, C., & Bengio, Y. (2014). FitNets: Hints for Thin Deep Nets. arXiv:1412.6550.