Predicting with Sparse Data

Abstract
It is well known that effective prediction of project cost-related
factors is an important aspect of software engineering.  Unfortunately,
despite extensive research over more than 30 years, this remains a
significant problem for many practitioners.  A major obstacle is the
absence of reliable and systematic historical data, yet such data are a sine
qua non for almost all proposed methods: statistical, machine learning
or calibration of existing models.  In this paper we describe our
sparse data method (SDM) based upon a pairwise comparison technique and
Saaty's Analytic Hierarchy Process.  Our minimum data requirement is a
single known point.  The technique is supported by a software tool
known as DataSalvage.  We show, for data from two organisations, how
our approach, although itself based upon expert judgement, adds value
to unaided expert judgement by producing significantly more accurate
and less biased results.  A sensitivity analysis shows that our
approach is robust to
pairwise comparison errors.  We then describe the results of a small
usability trial with a practising project manager.  From this empirical
work we conclude that the technique is promising and may help overcome
some of the present barriers to effective project prediction.

Keywords: prediction, software project effort, expert judgement,
empirical data, sparse data.
 

Martin Shepperd and Michelle Cartwright
Empirical Software Engineering Research Group
School of Design, Engineering & Computing,
Bournemouth University,
Talbot Campus,
Poole, UK
Email: {mshepper, mcartwri}@bournemouth.ac.uk