
A/B Test Splitting with Hashing


Implements an even split for A/B test group assignment.
http://blog.richardweiss.org/2016/12/25/hash-splits.html
# First, all the imports for the whole example
from tqdm import tqdm_notebook
import hashlib
import pandas
import scipy.stats
from sklearn.metrics import mutual_info_score
import statsmodels.api as sm
def ab_split(id, salt, control_group_size):
    ''' Returns 't' (for test) or 'c' (for control), based on the ID and salt.
    The control_group_size is a float, between 0 and 1, that sets how big the
    control group is. '''
    # Salt the ID so that each experiment gets its own independent split.
    test_id = str(id) + '-' + str(salt)
    test_id_digest = hashlib.md5(test_id.encode('ascii')).hexdigest()
    # Map the first 6 hex digits (24 bits) of the digest onto [0, 1].
    test_id_first_digits = test_id_digest[:6]
    test_id_final_int = int(test_id_first_digits, 16)
    split_fraction = test_id_final_int / 0xFFFFFF
    if split_fraction > control_group_size:
        return 't'
    else:
        return 'c'
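
A quick way to check that the split behaves as intended is to assign a batch of IDs and look at the share that lands in control. The sketch below repeats the function so it runs on its own. The helper `split_counts`, the salt `'experiment-1'`, and the sequential integer IDs are assumptions made for illustration, not part of the original post.

```python
import hashlib
from collections import Counter


def ab_split(user_id, salt, control_group_size):
    """Return 't' (test) or 'c' (control) from the salted MD5 hash of the ID."""
    test_id = str(user_id) + '-' + str(salt)
    digest = hashlib.md5(test_id.encode('ascii')).hexdigest()
    # First 6 hex digits (24 bits) scaled onto [0, 1].
    split_fraction = int(digest[:6], 16) / 0xFFFFFF
    return 't' if split_fraction > control_group_size else 'c'


def split_counts(ids, salt, control_group_size):
    """Hypothetical helper: count how many IDs fall into each group."""
    return Counter(ab_split(i, salt, control_group_size) for i in ids)


# Illustrative inputs: 10,000 sequential IDs and a 30% control group.
ids = range(10000)
counts = split_counts(ids, 'experiment-1', 0.3)
control_share = counts['c'] / len(ids)
print(counts, control_share)
```

The control share should come out close to `control_group_size`, and calling `ab_split` again with the same ID and salt always returns the same group. Changing the salt reshuffles the assignment, which lets separate experiments use independent splits.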


IP location: Beijing · Post #1 · 2019-03-11 17:31