## Computational methods

Both data sets presented here were constructed using the same methodology, which is summarized below and described in detail in the paper.

### Geometry optimizations

The geometries of the equilibrium structures were optimized at the B3LYP-D3/def2-QZVP level using a very fine DFT grid and tight convergence thresholds. All resulting structures were verified to be true minima by vibrational analysis. From the equilibrium structures, the dissociation curves were constructed using a protocol described in the paper.

### Benchmark interaction energies

The benchmark results reported for the NCIA data sets are interaction energies computed at the fixed geometry of the dimer, i.e. the deformation energy of the monomers is not included. This approach avoids any issues caused by using different methods for the geometry optimization and for the calculation of the interaction energy, and it treats equilibrium and non-equilibrium geometries consistently.
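As a minimal sketch of the definition above, the interaction energy is the dimer energy minus the energies of the two monomers, each kept at the geometry it has inside the dimer. The function name and the numerical values below are hypothetical and serve only as an illustration; they are not taken from the data sets.

```python
HARTREE_TO_KCAL_MOL = 627.509474  # conversion factor, hartree -> kcal/mol


def interaction_energy(e_dimer, e_monomer_a, e_monomer_b):
    """Interaction energy in hartree at a fixed dimer geometry.

    Both monomer energies are evaluated at the geometries the monomers
    have inside the dimer, so no deformation energy enters the result.
    """
    return e_dimer - e_monomer_a - e_monomer_b


# Placeholder energies in hartree, invented purely for illustration.
e_int = interaction_energy(-152.5400, -76.2650, -76.2670)
e_int_kcal = e_int * HARTREE_TO_KCAL_MOL
```

Because the monomers are never relaxed, the same function applies unchanged to every point of a dissociation curve.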

The interaction energies are calculated using a composite CCSD(T)/CBS scheme constructed from an HF energy, an MP2 correlation energy extrapolated to the CBS limit, and a ΔCCSD(T) correction. All calculations employed the counterpoise correction.
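The way the three terms combine can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function and argument names are hypothetical, ΔCCSD(T) is taken in its conventional sense as the difference between the CCSD(T) and MP2 correlation contributions in the same smaller basis set, and every input is assumed to be a counterpoise-corrected interaction-energy contribution (monomers computed in the full dimer basis set).

```python
def composite_interaction_energy(e_int_hf, e_int_mp2_corr_cbs,
                                 e_int_ccsd_t_corr_small, e_int_mp2_corr_small):
    """Composite CCSD(T)/CBS interaction energy assembled from its terms.

    Every argument is an interaction-energy contribution (dimer minus
    monomers), assumed to be counterpoise-corrected already. All values
    must share the same energy unit.
    """
    # ΔCCSD(T): CCSD(T) minus MP2 correlation contribution, both
    # evaluated in the same smaller basis set (conventional definition).
    delta_ccsd_t = e_int_ccsd_t_corr_small - e_int_mp2_corr_small
    return e_int_hf + e_int_mp2_corr_cbs + delta_ccsd_t
```

Since the interaction energy is a linear combination of total energies, the scheme can be applied either to the total energies of each species or, as here, directly to the interaction-energy contributions.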

For the equilibrium geometries and the closest point of the dissociation curves, the scheme uses the aug-cc-pV5Z basis set for the HF energy, extrapolates the MP2 correlation energy from the aug-cc-pVQZ and aug-cc-pV5Z basis sets, and evaluates the ΔCCSD(T) correction in the heavy-aug-cc-pVTZ basis set. The remaining points of each curve were calculated using smaller basis sets and rescaled to match the high-level benchmark; this procedure has been shown to introduce negligible errors.
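The extrapolation formula is not stated in this summary. As an assumption, the sketch below uses the common two-point inverse-cubic form, E(X) = E_CBS + A·X⁻³, with X = 4 for aug-cc-pVQZ and X = 5 for aug-cc-pV5Z; the paper should be consulted for the formula actually used. The function name is hypothetical.

```python
def extrapolate_two_point(e_small, e_large, x_small=4, x_large=5):
    """Two-point CBS extrapolation of a correlation energy.

    Assumes E(X) = E_CBS + A * X**-3, where X is the cardinal number of
    the basis set (4 for QZ, 5 for 5Z). This functional form is an
    assumption made for this sketch, not a statement about the paper.
    """
    w_small = x_small ** 3
    w_large = x_large ** 3
    # Solve the two equations E(X) = E_CBS + A / X**3 for E_CBS.
    return (w_large * e_large - w_small * e_small) / (w_large - w_small)


# Self-check on synthetic data that follow the assumed form exactly.
e_cbs_true, a = -1.0, 0.5
e_qz = e_cbs_true + a / 4 ** 3
e_5z = e_cbs_true + a / 5 ** 3
e_cbs = extrapolate_two_point(e_qz, e_5z)
```

Only the correlation energy is extrapolated in this sketch; the HF term enters the composite scheme unextrapolated, in line with the description above.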