Pandas - Groupby with conditional formula(Pandas - 带条件公式的 Groupby)
问题描述
Survived SibSp Parch
0 0 1 0
1 1 1 0
2 1 0 0
3 1 1 0
4 0 0 1
Given the above dataframe, is there an elegant way to groupby
with a condition?
I want to split the data into two groups based on the following conditions:
(df['SibSp'] > 0) | (df['Parch'] > 0) = New Group -"Has Family"
(df['SibSp'] == 0) & (df['Parch'] == 0) = New Group - "No Family"
then take the means of both of these groups and end up with an output like this:
SurvivedMean
Has Family Mean
No Family Mean
Can it be done using groupby or would I have to append a new column using the above conditional statement?
An easy way to group that is to use the sum of those two columns. If either of them is positive, the result will be greater than 1. And groupby accepts an arbitrary array as long as the length is the same as the DataFrame's length so you don't need to add a new column.
family = np.where((df['SibSp'] + df['Parch']) >= 1 , 'Has Family', 'No Family')
df.groupby(family)['Survived'].mean()
Out:
Has Family 0.5
No Family 1.0
Name: Survived, dtype: float64
这篇关于Pandas - 带条件公式的 Groupby的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:Pandas - 带条件公式的 Groupby


基础教程推荐
- Python 中是否有任何支持将长字符串转储为块文字或折叠块的 yaml 库? 2022-01-01
- matplotlib 设置 yaxis 标签大小 2022-01-01
- Python,确定字符串是否应转换为 Int 或 Float 2022-01-01
- Kivy 使用 opencv.调整图像大小 2022-01-01
- kivy 应用程序中的一个简单网页作为小部件 2022-01-01
- 在 Django Admin 中使用内联 OneToOneField 2022-01-01
- 比较两个文本文件以找出差异并将它们输出到新的文本文件 2022-01-01
- 在 Python 中将货币解析为数字 2022-01-01
- 对多索引数据帧的列进行排序 2022-01-01
- 究竟什么是“容器"?在蟒蛇?(以及所有的 python 容器类型是什么?) 2022-01-01