yen223 a day ago

At my last company I helped build the experimentation platform that processes millions of requests per day. I have some thoughts:

- The most useful resource we found was from Spotify, of all places: https://engineering.atspotify.com/category/data-science/

- For hashing, an md5 hash of (user-id + a/b-test-id) is sufficient. In practice we had no issues with split bias. You should not get too clever with hashing: stick to something reliable and widely supported to make any post-experiment analysis easier. You definitely want to log the user-id-to-version mapping somewhere. (A sketch of this assignment scheme follows the list.)

- As for in-house vs external, I would probably go in-house, though that depends on the system you're A/B testing. In practice the amount of work needed to integrate a third-party tool was roughly the same as building the platform ourselves, but building the platform meant we could test more bespoke features.
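To make the hashing point concrete, here is a minimal sketch of that kind of deterministic assignment. The function and field names are illustrative, not from any particular platform; the comment describes the general md5-bucketing idea:

    import hashlib

    def assign_variant(user_id: str, test_id: str,
                       variants=("control", "treatment")) -> str:
        """Deterministically assign a user to a variant.

        Hashes (user_id + test_id) with md5 so the same user always
        lands in the same bucket for a given test, while different
        tests split users independently of each other.
        """
        digest = hashlib.md5(f"{user_id}:{test_id}".encode("utf-8")).hexdigest()
        # Interpret the hex digest as an integer and bucket it uniformly.
        bucket = int(digest, 16) % len(variants)
        return variants[bucket]

    # Log each assignment so post-experiment analysis can join
    # exposures back to outcomes (hypothetical example values).
    variant = assign_variant("user-42", "checkout-button-color")
    print({"user_id": "user-42", "test_id": "checkout-button-color",
           "variant": variant})

Because the assignment is a pure function of (user-id, test-id), it can be recomputed anywhere, but logging it at exposure time is still what makes the analysis straightforward.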
At my last company I helped build the experimentations platform that processes millions of requests per day. I have some thoughts: - The most useful resource we've found was from Spotify, of all places: https://engineering.atspotify.com/category/data-science/ - For hashing, md5 hash on (user-id + a/b-test-id) is sufficient. In practice we had no issues with split bias. You should not get too clever with hashing. You'll want to stick to something reliable and widely supported to make any post-experiment analysis easier. You definitely want to log the user-id to version mapping somewhere. - As for in-house vs external, I would probably go in-house, though that depends on the system you're A/B testing. In practice the amount of work needed to integrate a third-party tool was roughly the same as building the platform, but building the platform meant we could test more bespoke features. |