Monday, July 24, 2023

Equal Frequency Binning (Part of Data Analytics Course)

import pandas as pd
l = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
s = pd.Series(l)
s.nunique()


12


bins = pd.qcut(s, 3) # Equal frequency binning
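A quick way to check that the binning really is equal-frequency is to count how many values land in each bin. The sketch below is illustrative only: the names `values`, `counts`, and `edges` are my own, and it assumes the same 12-element list used above.

```python
import pandas as pd

values = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
s = pd.Series(values)

# retbins=True also returns the bin edges computed from the quantiles
bins, edges = pd.qcut(s, 3, retbins=True)

# With 12 values and 3 bins, each bin should hold 12 / 3 = 4 values
counts = bins.value_counts(sort=False)
print(counts)
print(edges)
```

If the counts differ a lot between bins, the data probably has many repeated values, and the quantile edges cannot split them evenly.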

How do we smooth the data using the bin means?

df = pd.DataFrame({'data': l, 'bins': bins})
df
t = df.groupby('bins', observed=True)[['data']].mean()  # mean of 'data' within each bin
t
map_of_mean_values = {}
for interval, row in t.iterrows():
    # type(interval): Interval
    # type(row): Series
    map_of_mean_values[str(interval)] = row['data']
map_of_mean_values

{'(4.999, 14.333]': 9.75, '(14.333, 60.667]': 38.75, '(60.667, 215.0]': 145.75}

df['bins'] = df['bins'].astype(str)
df['smoothed_values'] = df['bins'].apply(lambda x: map_of_mean_values[x])  # replace each value by its bin mean
df
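The same smoothing can probably be done without building the dictionary by hand. The following is a minimal sketch, not the course's method: it swaps the explicit loop for `groupby(...).transform('mean')`, and it assumes the same input list (named `values` here).

```python
import pandas as pd

values = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
df = pd.DataFrame({'data': values})
df['bins'] = pd.qcut(df['data'], 3)  # equal frequency binning

# transform('mean') returns one value per row: the mean of that row's bin
df['smoothed_values'] = (
    df.groupby('bins', observed=True)['data'].transform('mean')
)
print(df)
```

This keeps the `bins` column as intervals instead of converting it to strings, which may be handy if the bin edges are needed later.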
