callScriptWithTable
Run an external script on a table
Run a Python script or a jar file within an action node. This operator is similar to callScript in that it allows custom programming on the data. Unlike callScript, it is invoked once for the entire data set in the input table, so the script has all the rows of the input data to work with at once.
Operator Usage in Easy Mode
- Click + on the parent node.
- Enter the Call Script With Table operator in the search field and select the operator from the Results to open the operator form.
- In the Input Table drop-down, enter or select the table containing the data to run this operator on.
- In the Script Name drop-down, select the name of the script.
- Click Run to view the result.
- Click Save to add the operator to the playbook.
- Click Cancel to discard the operator form.
Usage Details
callScriptWithTable(parent_table_name, script_name)
Input
parent_table_name
: Name of the parent table.
script_name
: Name of the script (a Python script or a jar file).
Input Data Format
The script receives the absolute path of a file holding the contents of parent_table_name as its first command-line argument. The file is in JSON format, as follows:
{
"columns": ["col1_name", "col2_name"],
"rows": [ { "col1_name": "abc", "col2_name": 1.2 }, { "col1_name": "xyz", "col2_name": 4.3 } ]
}
Output data requirement from the script:
The script must print the absolute path of a file that holds the output table in the same JSON format as the input. If you are not sure where to put the file, write it to /tmp/.
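The input and output requirements above can be sketched as a single round trip: read the table from the path given on the command line, build a new table, write it to a file, and print that file's absolute path. This is only a minimal sketch. It assumes the output file uses the same "columns" and "rows" keys as the input, and the names transform, write_table, and the row_count column are hypothetical, chosen for illustration.

```python
import json
import os
import sys
import tempfile


def transform(table):
    """Return a new table with a hypothetical row_count column on every row."""
    rows = table.get("rows", [])
    total = len(rows)
    # The whole table is available at once, so per-table values can be computed.
    new_rows = [dict(row, row_count=total) for row in rows]
    columns = list(table.get("columns", [])) + ["row_count"]
    return {"columns": columns, "rows": new_rows}


def write_table(table, directory=None):
    """Write the table as JSON to a new file and return its absolute path."""
    # With directory=None the system temporary directory is used.
    fd, path = tempfile.mkstemp(suffix=".json", dir=directory)
    with os.fdopen(fd, "w") as out_file:
        json.dump(table, out_file)
    return os.path.abspath(path)


def main(input_path):
    # input_path is the absolute path of the JSON file holding the input table.
    with open(input_path) as data_file:
        table = json.load(data_file)
    # Print the output file path so the caller can pick it up.
    print(write_table(transform(table)))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Writing to a freshly created temporary file, rather than a fixed name, avoids two runs overwriting each other's output; whether that matters depends on how the script is scheduled.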
Caveats:
- Correlations are not preserved between the input and output of callScriptWithTable(). One consequence is that outputting lhub_id is redundant: there is no need to have an lhub_id column in the output table, because the system generates new random lhub_id values for each row. Correlation downstream will still be done provided that downstream operators support correlation.
- lhub_id is a special internal column name.
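Because the lhub_id values in the output are regenerated anyway, a script may simply leave that column out. The following is a minimal sketch of doing so, assuming the table is a dictionary with "columns" and "rows" keys as in the input format; the helper name drop_internal_id is hypothetical.

```python
def drop_internal_id(table, id_column="lhub_id"):
    """Return a copy of the table without the internal id column."""
    columns = [name for name in table.get("columns", []) if name != id_column]
    rows = [
        {key: value for key, value in row.items() if key != id_column}
        for row in table.get("rows", [])
    ]
    return {"columns": columns, "rows": rows}


example = {
    "columns": ["lhub_id", "sourceIP"],
    "rows": [{"lhub_id": "id1", "sourceIP": "1"}],
}
cleaned = drop_internal_id(example)
```

Here cleaned keeps only the sourceIP column, and the original example dictionary is left unchanged.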
Example 1
Test return
sourceIP | sourcePort | destIP | destPort |
---|---|---|---|
1 | 2 | 3 | 4 |
Script file: pyScript.py
import json

path = '/tmp/outputfile'
table = {
    # schema for the returned table
    "schema": [
        {"name": "col2", "type": "string"},
        {"name": "col3", "type": "string"},
    ],
    # table rows
    "rows": [
        {"col3": "cell 1 3"},
        {"col2": "cell 2 2"},
        {"col1": "cell 3 1"},
    ],
}
with open(path, 'w') as f:
    json.dump(table, f)
# print the path of the output file
print(path)
callScriptWithTable(table, "pyScript.py")
Output
col1 | col2 | col3 |
---|---|---|
 | | cell 1 3 |
 | cell 2 2 | |
cell 3 1 | | |
Example 2
Use the following Python script to access the data from a table.
import sys
import json

# the path of the table file is in sys.argv[1]
with open(sys.argv[1]) as data_file:
    data = json.load(data_file)
# data = {"rows": [{"lhub_id": "id1", ... other columns},
#                  {"lhub_id": "id2", ... other columns},
#                  ...]}
# to access data
lh_ids = [row["lhub_id"] for row in data["rows"]]
if "1" not in lh_ids:
    sys.exit(1)
# to return data, see Example 1