Despite their vast importance to inorganic chemistry, materials science, and catalysis, metal-ligand (M-L) bonds remain difficult to model: the accuracy with which their formation or cleavage is described depends strongly on the chosen functional and on the type of bond, in a way that is not systematically understood. To approach high-accuracy DFT for the rational prediction of chemistry and catalysis, such system dependencies need to be resolved. We studied 30 density functionals applied to a "balanced data set" of 60 experimental diatomic M-L bond energies. This data set has no bias toward any d^q configuration, metal, bond type, or ligand, because all of these occur to the same extent, and it therefore allows us to identify accuracy bottlenecks. We show that the performance of a functional depends strongly on the choice of data set, and we dissect these effects by system type. In addition to advocating balanced data sets, we argue that the precision of a functional (rather than just its accuracy), measured by the standard deviation of the errors, is of interest. There are distinct system dependencies in both the ligand and the metal series: hydrides are best described by a very large HF exchange percentage, possibly due to self-interaction error, whereas halides are best described by very small (0-10%) HF exchange fractions, and the double-bond enforcing oxides and sulfides favor 10-25% HF exchange, as is also the average for the full data set. Thus, average HF requirements hide major system-dependent requirements. For the late transition metals Co-Zn, 0-10% HF exchange is favored, whereas for the early transition metals Sc-Fe, hybrid functionals with 20% HF exchange or more are commonly favored. Accordingly, B3LYP is an excellent choice for the early d-block but a poor choice for the late transition metals. We conclude that DFT intrinsically underestimates the bond strengths of late relative to early transition metals, which correlates with increased effective nuclear charge. Thus, the revised functional RPBE, which reduces the over-binding tendency of PBE, is an advantage mainly for the early-to-mid transition metals and much less so for the late transition metals; that is, the relative performance of RPBE vs. PBE, both widely used to study adsorption energetics on metal surfaces, is metal-dependent. Overall, the best-performing functionals are PW6B95, MN15, MN15-L, and the double hybrid B2PLYP.
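
The distinction between accuracy and precision drawn above can be illustrated with a minimal sketch. It assumes signed errors defined as computed minus experimental bond energy in kJ/mol; the functional names and all numbers below are hypothetical placeholders chosen for illustration and are not results from this study.

```python
from statistics import mean, stdev

# Hypothetical signed errors (computed minus experimental bond energy, kJ/mol).
# Illustrative placeholders only, NOT values from the study.
signed_errors = {
    "functional_A": [12.0, -8.0, 15.0, -10.0, 9.0, -14.0],
    "functional_B": [20.0, 22.0, 18.0, 21.0, 19.0, 20.0],
}


def error_summary(errors):
    """Summarize a list of signed errors.

    mean_signed_error: systematic over- or under-binding
    mean_abs_error:    accuracy
    std_dev:           precision (sample standard deviation of the errors)
    """
    return {
        "mean_signed_error": mean(errors),
        "mean_abs_error": mean([abs(e) for e in errors]),
        "std_dev": stdev(errors),
    }


summaries = {name: error_summary(errs) for name, errs in signed_errors.items()}

for name, s in summaries.items():
    print(
        f"{name}: MSE={s['mean_signed_error']:+.1f}  "
        f"MAE={s['mean_abs_error']:.1f}  SD={s['std_dev']:.1f} kJ/mol"
    )
```

In this toy input, functional_A has the smaller mean absolute error but a large spread, whereas functional_B is offset by a nearly constant amount and has a small standard deviation. A nearly constant offset largely cancels in relative energies, which is one reason the spread of the errors can matter as much as their mean. The same summary could be computed per ligand class or per metal to expose system dependencies, given real data.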