python - Rolling sum in subgroups of a dataframe (pandas)

I have a sessions DataFrame that contains e-mail and sessions (int) columns.

I need to calculate a rolling sum of sessions per email (i.e. not globally).

Now, the following works, but it's painfully slow:

    emails = set(list(sessions['e-mail']))
    ses_sums = []
    for em in emails:
        email_sessions = sessions[sessions['e-mail'] == em]
        email_sessions.is_copy = False
        email_sessions['session_rolling_sum'] = pd.rolling_sum(email_sessions['sessions'], window=self.window).fillna(0)
        ses_sums.append(email_sessions)
    df = pd.concat(ses_sums, ignore_index=True)

Is there a way of achieving the same in pandas, using pandas operators on the DataFrame instead of creating separate DataFrames for each email and concatenating them?

(Either that, or any other way of making this faster.)

    import numpy as np
    import pandas as pd

    np.random.seed([3,1415])
    df = pd.DataFrame({'e-mail': np.random.choice(list('ab'), 20),
                       'session': np.random.randint(1, 10, 20)})

    # rolling sum over a window of 3, computed separately for each e-mail
    df.groupby('e-mail').session.rolling(3).sum()

    e-mail
    a       0      NaN
            2      NaN
            4     11.0
            5      7.0
            7     10.0
            12    16.0
            15    16.0
            17    16.0
            18    17.0
            19    18.0
    b       1      NaN
            3      NaN
            6     18.0
            8     14.0
            9     16.0
            10    12.0
            11    13.0
            13    16.0
            14    20.0
            16    22.0
    Name: session, dtype: float64
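The grouped result is indexed by (e-mail, original row label), so it cannot be assigned straight back as a column. To get a `session_rolling_sum` column like the one in the question, one option is to drop the group level and let pandas align on the original index. This is only a sketch: the sample data and the `window` variable are made up for illustration, and it assumes a pandas version that supports `groupby(...).rolling(...)`.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({'e-mail': ['a', 'b', 'a', 'a', 'b', 'b', 'a'],
                   'session': [1, 2, 3, 4, 5, 6, 7]})

window = 2  # assumed window size, stands in for self.window in the question

# Rolling sum per e-mail; the result has a MultiIndex of (e-mail, original row label)
rolled = df.groupby('e-mail')['session'].rolling(window).sum()

# Drop the e-mail level so the index matches df again
rolled = rolled.reset_index(level=0, drop=True)

# Assignment aligns on the index; fillna(0) mirrors the question's handling of incomplete windows
df['session_rolling_sum'] = rolled.fillna(0)
```

This keeps the rows in their original order and avoids building and concatenating one DataFrame per e-mail.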
