Transfer learning has been the subject of much recent research. In practice, the models produced by that research are unstable, since they are continually revised whenever new data arrives. This paper offers a very simple "bellwether" transfer learner. Given $N$ datasets, we find the one that produces the best predictions on all the others. This bellwether dataset is then used for all subsequent predictions (when its predictions start failing, one may seek another bellwether). Bellwethers are interesting because they are very simple to find (wrap a for-loop around standard data miners). They also simplify the task of making general policies in software engineering (SE): as long as one bellwether remains useful, stable conclusions for $N$ datasets can be achieved by reasoning over that bellwether. This paper shows that the bellwether approach works for multiple datasets from various domains in SE. From this, we conclude that (1) the bellwether method is a useful (and simple) transfer learner; (2) unlike bellwethers, other, more complex transfer learners do not generalize to all domains in SE; (3) bellwethers are a baseline method against which future transfer learners should be compared; (4) when building increasingly complex automatic methods, researchers should pause and compare their more sophisticated methods against simpler alternatives.
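The "for-loop around standard data miners" idea can be illustrated with a minimal sketch. This is not the paper's implementation. The toy threshold learner, the accuracy score, the use of the mean to aggregate scores, the example data, and all function names below are assumptions chosen only to keep the example self-contained.

```python
from statistics import mean

# Hypothetical toy learner (an assumption, standing in for any standard
# data miner): classify a single numeric feature by thresholding at the
# midpoint between the two class means.
def train_threshold(rows):
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    cut = (mean(pos) + mean(neg)) / 2
    return lambda x: 1 if x > cut else 0

# Fraction of rows whose label the model predicts correctly.
def accuracy(model, rows):
    return mean([1 if model(x) == y else 0 for x, y in rows])

# Train on each dataset in turn, score the model on every other dataset,
# and return the dataset whose model has the best average score.
# Averaging with the mean is an assumption of this sketch.
def find_bellwether(datasets, train=train_threshold, score=accuracy):
    results = {}
    for name, rows in datasets.items():
        model = train(rows)
        results[name] = mean(
            [score(model, other)
             for other_name, other in datasets.items()
             if other_name != name]
        )
    best = max(results, key=results.get)
    return best, results

# Made-up example data: each dataset is a list of (feature, label) pairs.
datasets = {
    "A": [(1, 0), (2, 0), (8, 1), (9, 1)],
    "B": [(0, 0), (1, 0), (2, 1), (3, 1)],
    "C": [(5, 0), (6, 0), (9, 1), (10, 1)],
}

best, scores = find_bellwether(datasets)
print(best, scores)
```

Any learner and any performance measure could be passed in through the `train` and `score` parameters; the loop itself does not depend on them.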
Submitted 17 Mar 2017 to Software Engineering
Published 21 Mar 2017
Author comments: 18 pages
http://arxiv.org/abs/1703.06218
http://arxiv.org/pdf/1703.06218.pdf