The Millennium Galaxy Catalogue (MGC; Liske et al. 2003) is designed to provide a comprehensive resource for the study of structure in the Universe on 1–100 kpc scales, i.e. galaxies. Structure in the local Universe on >1 Mpc scales is relatively well understood, thanks to a well-developed cosmological theory (ΛCDM) and extensive surveys such as the Two-degree Field Galaxy Redshift Survey (2dFGRS; Colless et al. 2001). On the smaller 1–100 kpc scales, however, the picture is significantly different and structure is not fully understood. This is not surprising, as these scales constitute a regime dominated by non-linear physical behaviour, where mass concentrations have decoupled from the Hubble flow and where baryon physics becomes critical as the baryons eventually dominate over the dark matter. Numerical models cannot yet grapple with this regime, and will not be able to until the baryon physics can be comprehensively encoded. Instead, models of galaxy formation rely on semi-analytic extensions, which in turn are based on empirical data and a few key analytical recipes (see Baugh 2006). It is therefore important to recognise that empirical datasets are currently driving our understanding of galaxy formation, with the semi-analytic models developing rapidly to accommodate new empirical studies.