Power BI: How to use Python with multiple tables in the Power Query Editor?
How can you create a new table with a Python script that uses two existing tables as input? For example, by performing a left join using pandas merge?
Some details:
Using Home > Edit queries, you can use Python under Transform > Run Python Script. This opens a Run Python Script dialog box where you're told that "# 'dataset' holds the input data for this script". You'll find the same phrase if you just click OK and look at the formula bar:
= Python.Execute("# 'dataset' holds the input data for this script#(lf)",[dataset=#"Changed Type"])
This also adds a new step under Applied Steps called Run Python script, where you can edit the Python script by clicking the gear symbol on the right.
How can you change that setup to reference more than one table?
Sample data
Here are two tables that can be stored as CSV files and loaded using Home > Get Data > Text/CSV
Table1
Date,Value1
2108-10-12,1
2108-10-13,2
2108-10-14,3
2108-10-15,4
2108-10-16,5
Table2
Date,Value2
2108-10-12,10
2108-10-13,11
2108-10-14,12
2108-10-15,13
2108-10-16,14
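To experiment with the same data outside Power BI, the two tables can be loaded into pandas directly. This is only a minimal sketch: the names TABLE1_CSV and TABLE2_CSV are illustrative, and reading Date as a string is an assumption that mirrors the Text data type used in the answer.

```python
import io

import pandas as pd

# Sample data copied from Table1 and Table2 above.
TABLE1_CSV = """Date,Value1
2108-10-12,1
2108-10-13,2
2108-10-14,3
2108-10-15,4
2108-10-16,5
"""

TABLE2_CSV = """Date,Value2
2108-10-12,10
2108-10-13,11
2108-10-14,12
2108-10-15,13
2108-10-16,14
"""

# Keep Date as text (assumption: this mirrors the Text type used in Power BI).
df1 = pd.read_csv(io.StringIO(TABLE1_CSV), dtype={"Date": str})
df2 = pd.read_csv(io.StringIO(TABLE2_CSV), dtype={"Date": str})

print(df1)
print(df2)
```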
This is the same challenge that has been described for R scripts here. That setup should work for Python too. However, I've found that this approach has one drawback: it stores the new joined or calculated table as an edited version of one of the previous tables. The following suggestion demonstrates how you can produce a completely new calculated table without altering the input tables (except for changing the data type of the Date columns from Date to Text because of this).
Short answer:
In the Power Query editor, follow these steps:
1. Change the data type of the Date columns in both tables to Text.
2. Click Enter Data. Only click OK.
3. Activate the new Table3 and use Transform > Run Python Script. Only click OK.
4. Activate the formula bar and replace what's in it with = Python.Execute("# Python:",[df1=Table1, df2=Table2]). Press Enter.
5. If you're prompted to do so, click Edit Permission and then Run in the next step.
6. Under Applied Steps, in the new step named Run Python Script, click the gear icon to open the Run Python Script editor.
7. Insert the snippet below and click OK.
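Code:
import pandas as pd
# Left join df1 and df2 on the shared Date column
df3 = pd.merge(df1, df2, how='left', on=['Date'])
# Multiply the merged columns so each row's values are matched by Date
df3['Value3'] = df3['Value1'] * df3['Value2']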
Next to df3, click Table, and that's it.
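The join in the snippet can also be checked outside Power BI. The sketch below is a minimal standalone version that builds df1 and df2 by hand from the sample data (in Power BI they would be supplied by the Python.Execute formula). Two choices here are assumptions rather than part of the original answer: validate="one_to_one", which asks pandas to raise an error if a Date appears more than once in either table, and computing Value3 from the merged frame's own columns so that values are paired by Date rather than by row position.

```python
import pandas as pd

# Stand-ins for the frames that Power BI would pass in as df1 and df2.
dates = ["2108-10-12", "2108-10-13", "2108-10-14", "2108-10-15", "2108-10-16"]
df1 = pd.DataFrame({"Date": dates, "Value1": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"Date": dates, "Value2": [10, 11, 12, 13, 14]})

# Left join on Date; validate raises pandas.errors.MergeError if a Date
# is duplicated on either side (an added safety check, not in the original).
df3 = pd.merge(df1, df2, how="left", on="Date", validate="one_to_one")

# Use the merged columns so each product pairs values from the same Date.
df3["Value3"] = df3["Value1"] * df3["Value2"]

print(df3)
```

With the sample data every Date matches exactly once, so the result has one row per Date and no missing values.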
The details:
The list above will have to be followed very carefully to get things working. So here are all of the dirty little details:
1. Load the tables as CSV files in Power BI Desktop using Get Data.
2. Click Edit Queries.
3. In Table1, click the symbol next to the Date column, select Text and click Replace Current.
4. Do the same for Table2.
5. On the Home tab, click Enter Data.
6. In the box that appears, do nothing but click OK.
7. This will insert an empty table named Table3 under Queries, which is exactly what we want.
8. Go to the Transform tab and click Run Python Script.
9. This opens the Run Python Script editor. You could start writing your script right here, but that would make the next steps unnecessarily complicated. So do nothing but click OK.
10. In the formula bar you will see the formula = Python.Execute("# 'dataset' holds the input data for this script#(lf)",[dataset=#"Changed Type"]). Notice that you've got a new step under Applied Steps named Run Python Script.
11. There are several interesting details at this point, but first we're going to break down the arguments of the function = Python.Execute("# 'dataset' holds the input data for this script#(lf)",[dataset=#"Changed Type"]).
The part "# 'dataset' holds the input data for this script#(lf)" simply inserts the comment that you can see in the Python script editor. So it's not important, but you can't just leave it blank either. I like to use something shorter like "# Python:".
The part [dataset=#"Changed Type"] is a pointer to the empty Table3 in the state it is in after the Changed Type step. So if the last thing you do before inserting a Python script is something other than changing data types, this part will look different. The table is then made available in your Python script as a pandas data frame named dataset. With this in mind, we can make some very useful changes to the formula:
12. Change the formula bar to = Python.Execute("# Python:",[df1=Table1, df2=Table2]) and hit Enter. This will make Table1 and Table2 available to your Python script as two pandas dataframes named df1 and df2, respectively.
13. Click the gear (or is it a flower?) icon next to Run Python script under Applied Steps.
14. Insert the following snippet:
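Code:
import pandas as pd
# Left join df1 and df2 on the shared Date column
df3 = pd.merge(df1, df2, how='left', on=['Date'])
# Multiply the merged columns so each row's values are matched by Date
df3['Value3'] = df3['Value1'] * df3['Value2']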
This will join df1 and df2 on the Date column and insert a new calculated column named Value3. Not too fancy, but with this setup you can do anything you want with your data in the world of Power BI and with the power of Python.
15. Click OK. You'll see df3 listed under the input dataframes df1 and df2. If you've assigned any other dataframes as intermediate steps in your Python script, they will be listed here too. To turn df3 into a table that Power BI can use, just click Table next to it.
16. And that's it.
Note that the data type of the Date column is set to Date by default, but you can change that to Text as explained earlier.
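On the pandas side, the Text dates arrive as plain strings. If date logic is needed inside the script, one option is to parse them temporarily and write them back as text before the script ends. The sketch below assumes ISO-formatted strings like those in the sample data; the Weekday column is purely illustrative, and how Power BI types the returned column is not shown here.

```python
import pandas as pd

# A small stand-in for the joined table, with Date held as text.
df3 = pd.DataFrame({"Date": ["2108-10-12", "2108-10-13"], "Value3": [10, 22]})

# Parse the text into real datetimes for date-based calculations.
parsed = pd.to_datetime(df3["Date"], format="%Y-%m-%d")
df3["Weekday"] = parsed.dt.day_name()  # illustrative derived column

# Write the dates back as ISO-formatted text before returning the frame.
df3["Date"] = parsed.dt.strftime("%Y-%m-%d")

print(df3)
```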
Click Home > Close & Apply to exit the Power Query Editor and go back to where it all started in Power BI Desktop.
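As a mental model for steps 12 to 15, the record [df1=Table1, df2=Table2] can be thought of as a set of named variables handed to the script, with every data frame left behind afterwards offered as a table. The sketch below imitates that behaviour in plain Python. The helper run_script is hypothetical and is not how Power BI is implemented; it only illustrates why df1, df2 and df3 all show up after the script runs.

```python
import pandas as pd


def run_script(script: str, inputs: dict) -> dict:
    """Run `script` with `inputs` as variables and return every
    DataFrame present in its namespace afterwards (hypothetical helper)."""
    namespace = dict(inputs)
    exec(script, namespace)
    return {
        name: value
        for name, value in namespace.items()
        if isinstance(value, pd.DataFrame)
    }


dates = ["2108-10-12", "2108-10-13", "2108-10-14"]
df1 = pd.DataFrame({"Date": dates, "Value1": [1, 2, 3]})
df2 = pd.DataFrame({"Date": dates, "Value2": [10, 11, 12]})

script = """
import pandas as pd
df3 = pd.merge(df1, df2, how='left', on=['Date'])
df3['Value3'] = df3['Value1'] * df3['Value2']
"""

tables = run_script(script, {"df1": df1, "df2": df2})
print(sorted(tables))
```

Any extra data frame assigned inside the script would appear in the returned dictionary too, which matches the listing described in step 15.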