如何在Ploly中将多指标数据框绘制为堆叠条形图
本文介绍了如何在Ploly中将多指标数据框绘制为堆叠条形图的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我有一个下面的数据帧,它将被进一步处理以创建数据透视表。现在,我正在尝试在Ploly中绘制多指标透视表数据。但在PLOTLY中,不知何故它没有采用这些值并显示错误。 我需要在x轴上绘制类别‘Development’和‘Developing’,并绘制与这些类别相关的数据。关联的"员工"数据应绘制在每个类别中。‘Y轴必须是’gdp‘,堆栈条必须是’cond_cat‘。下面是供参考的代码。
示例数据帧
import pandas as pd
import numpy as np
s = 200
np.random.seed(365) # so the data is the same each time
df = pd.DataFrame({"Country": np.random.choice(["USA America", "JPY one two", "MEX", "IND", "AUS"], s),
"employee": np.random.choice(["Bob", "Sam", "John", "Tom", "Harry"], s),
"economy_cat": np.random.choice(["developing","develop"], s),
"cond_cat": np.random.choice(["good","bad", 'worse', 'better', 'average'], s),
"gdp": np.random.randint(5, 75, s),
})
df = df[df.Country=='USA America']
# print(df.head())
Country employee economy_cat cond_cat gdp
9 USA America Sam developing better 30
11 USA America Bob developing average 45
21 USA America John develop bad 29
22 USA America Sam develop bad 73
30 USA America Harry develop bad 25
重塑
df_pivot = df.pivot_table(index=['economy_cat','employee'],columns=['cond_cat'],values='gdp',aggfunc='sum')
# print(df_pivot)
cond_cat average bad better good worse
economy_cat employee
develop Bob 6.0 NaN 46.0 NaN NaN
Harry NaN 25.0 9.0 NaN NaN
John 37.0 29.0 NaN NaN NaN
Sam NaN 82.0 NaN NaN 60.0
Tom 48.0 NaN NaN 51.0 NaN
developing Bob 45.0 NaN NaN 45.0 NaN
Harry 75.0 183.0 113.0 NaN NaN
John 16.0 36.0 27.0 67.0 NaN
Sam NaN NaN 30.0 NaN 43.0
Tom 111.0 NaN NaN 77.0 73.0
绘图
fig = make_subplots(rows=1, cols=1)
fig.add_trace(
go.Bar(
x= df_pivot["economy_cat","employee"],
y= df_pivot["cond_cat"],marker_color = "#1f77b4",showlegend=False,
marker_line_color = '#1f77b4',
),
row=1,
col=1,
)
fig.add_trace(
go.Bar(
x= df_pivot["economy_cat","employee"],
y= df_pivot["cond_cat"],marker_color = "rgba(255, 0, 0, 0.6)",showlegend=False,
marker_line_color = "rgba(255, 0, 0, 0.6)",
),
row=1,
col=1,
)
fig.update_layout(barmode = 'stack')
fig.show()
绘制时出错
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
e:Anaconda3libsite-packagespandascoreindexesase.py in get_loc(self, key, method, tolerance)
3360 try:
-> 3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
e:Anaconda3libsite-packagespandas\_libsindex.pyx in pandas._libs.index.IndexEngine.get_loc()
e:Anaconda3libsite-packagespandas\_libsindex.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libshashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libshashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: ('economy_cat', 'employee')
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
C:UsersTRENTO~1.MCKAppDataLocalTemp/ipykernel_18596/2928341867.py in <module>
14 fig.add_trace(
15 go.Bar(
---> 16 x= df_pivot["economy_cat","employee"],
17 y= df_pivot["cond_cat"],marker_color = "#1f77b4",showlegend=False,
18 marker_line_color = '#1f77b4',
e:Anaconda3libsite-packagespandascoreframe.py in __getitem__(self, key)
3456 if self.columns.nlevels > 1:
3457 return self._getitem_multilevel(key)
-> 3458 indexer = self.columns.get_loc(key)
3459 if is_integer(indexer):
3460 indexer = [indexer]
e:Anaconda3libsite-packagespandascoreindexesase.py in get_loc(self, key, method, tolerance)
3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
-> 3363 raise KeyError(key) from err
3364
3365 if is_scalar(key) and isna(key) and not self.hasnans:
KeyError: ('economy_cat', 'employee')
推荐答案
如果我理解正确,以下是您要查找的完整代码。
需要注意的是,图中预期数据框列作为轴,而不是多索引,因此,旋转的数据框索引被重置,然后可以将列传递到x=
导入和DataFrame
import pandas as pd
import numpy as np
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import plotly.express as px
from itertools import cycle
# beginning with df_pivot from the OP, reset the index
df = df_pivot.reset_index()
# print(df)
cond_cat economy_cat employee average bad better good worse
0 develop Bob 6.0 NaN 46.0 NaN NaN
1 develop Harry NaN 25.0 9.0 NaN NaN
2 develop John 37.0 29.0 NaN NaN NaN
3 develop Sam NaN 82.0 NaN NaN 60.0
4 develop Tom 48.0 NaN NaN 51.0 NaN
5 developing Bob 45.0 NaN NaN 45.0 NaN
6 developing Harry 75.0 183.0 113.0 NaN NaN
7 developing John 16.0 36.0 27.0 67.0 NaN
8 developing Sam NaN NaN 30.0 NaN 43.0
9 developing Tom 111.0 NaN NaN 77.0 73.0
打印
# data and colors
columns = df.columns[2:]
palette = cycle(px.colors.qualitative.Alphabet)
# palette = cycle(px.colors.sequential.PuBu)
colors = {c:next(palette) for c in columns}
# subplot setup
fig = make_subplots(rows=1, cols=1)
# add bars
for cols in columns:
fig.add_trace(go.Bar(x=[df['economy_cat'], df['employee']],
y = df[cols],
name = cols,
legendgroup = cols,
marker_color = colors[cols],
showlegend = True
), row = 1, col = 1)
fig.update_layout(barmode='stack')
fig.show()
这篇关于如何在Ploly中将多指标数据框绘制为堆叠条形图的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!