They probably have a trillion emails with human labels, either applied directly by users or inferable from actions like deleting a message.
With that much data, even a simple Bayesian classifier should work pretty much perfectly.
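
For anyone who hasn't seen one, here is a rough sketch of what a "simple Bayesian classifier" can look like: a multinomial naive Bayes with Laplace smoothing over whitespace-split words. The class name `NaiveBayes`, the `spam`/`ham` labels, and the four toy training messages are made up for illustration. This is not a claim about what any real mail provider runs.

```python
import math
from collections import Counter


class NaiveBayes:
    """Toy multinomial naive Bayes text classifier (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha            # Laplace smoothing constant
        self.word_counts = {}         # label -> Counter of word frequencies
        self.doc_counts = Counter()   # label -> number of training messages
        self.vocab = set()            # every word seen during training

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior: fraction of training messages with this label
            score = math.log(self.doc_counts[label] / total_docs)
            # smoothed log likelihood of each word given the label
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            for w in words:
                score += math.log((counts[w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labels: "spam" could come from a user marking or deleting a message.
clf = NaiveBayes()
clf.train("win free money now", "spam")
clf.train("free prize claim now", "spam")
clf.train("meeting at noon tomorrow", "ham")
clf.train("lunch tomorrow with the team", "ham")

print(clf.predict("free money"))             # spam
print(clf.predict("team meeting tomorrow"))  # ham
```

The model is nothing more than word counts per label, so more labeled mail just means better estimates of those counts, which is the point being made above.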