callScriptWithTable

Run an external script on a table

Run a Python script or a jar file within an action node. Like callScript, this operator allows custom programming on the data. Unlike callScript, it is invoked once for the entire data set in the input table, so the script can work with all rows of the input data at once.

Operator Usage in Easy Mode

  1. Click + on the parent node.
  2. Enter the Call Script With Table operator in the search field and select the operator from the Results to open the operator form.
  3. In the Input Table drop-down, enter or select the table containing the data to run this operator on.
  4. In the Script Name drop-down, select the name of the script.
  5. Click Run to view the result.
  6. Click Save to add the operator to the playbook.
  7. Click Cancel to discard the operator form.

Usage Details

callScriptWithTable(parent_table_name, script_name)

Input

parent_table_name: Name of the parent table.
script_name: Name of the script to run (a Python script or a jar file).

Input Data Format

The script receives, as its first command-line argument, the absolute path of a file holding the contents of parent_table_name. The file is in JSON format, as follows:

{
  "columns": ["col1_name", "col2_name"],
  "rows": [ { "col1_name": "abc", "col2_name": 1.2 }, { "col1_name": "xyz", "col2_name": 4.3 } ]
}
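
Because the whole table arrives in one file, a script can compute values that depend on every row. The following is a minimal sketch, assuming the JSON layout shown above. The helper names load_table and column_mean are hypothetical, and writing diagnostics to stderr is an assumption made so that stdout stays free for the script's result.

```python
import json
import sys


def load_table(path):
    """Load the table file whose path is passed as the first argument."""
    with open(path) as data_file:
        return json.load(data_file)


def column_mean(table, column):
    """Average a numeric column over every row that has a value for it."""
    values = [row[column] for row in table["rows"] if row.get(column) is not None]
    return sum(values) / len(values) if values else None


if __name__ == "__main__" and len(sys.argv) > 1:
    table = load_table(sys.argv[1])
    # "col2_name" is the sample column from the format above.
    # Assumption: diagnostics go to stderr, not stdout.
    sys.stderr.write("mean: %s\n" % column_mean(table, "col2_name"))
```

column_mean returns None for an empty table rather than raising, so the caller decides how to handle a table with no rows.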

Output data requirement from the script:

The script must print the absolute path of a file whose contents are in the same JSON format as the input. If you are not sure where to put the file, write it to /tmp/.
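
The read, transform, write, and print cycle can be sketched as follows. This is an illustration under stated assumptions, not the operator's reference implementation: it assumes the output uses the same columns/rows layout as the input, the names write_table and main are hypothetical, and sorting by the sample column col2_name stands in for real logic.

```python
import json
import os
import sys
import tempfile


def write_table(columns, rows, directory=None):
    """Write a table as JSON and return the absolute path of the file."""
    # mkstemp creates a uniquely named file in the system temp directory
    # (or in `directory`), so concurrent runs do not overwrite each other.
    fd, path = tempfile.mkstemp(suffix=".json", dir=directory)
    with os.fdopen(fd, "w") as out:
        json.dump({"columns": columns, "rows": rows}, out)
    return os.path.abspath(path)


def main(argv):
    # argv[1] is the path of the input table file.
    with open(argv[1]) as data_file:
        table = json.load(data_file)

    # Hypothetical transformation: sort all rows by the sample column.
    rows = sorted(table["rows"], key=lambda row: row["col2_name"])

    # Print the output file path, assumed to be the only stdout output.
    print(write_table(table["columns"], rows))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

Using a uniquely named temporary file instead of a fixed name such as /tmp/outputfile avoids collisions if the same script runs more than once at a time.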

Caveats:

  • Correlations are not preserved between the input and output of callScriptWithTable(). As a consequence, an lhub_id column in the output table is redundant: the system generates a new random lhub_id value for each row. Downstream correlation still works, provided that the downstream operators support correlation.

  • lhub_id is a special internal column name.
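
Since new lhub_id values are generated for the output, a script may simply drop the column before writing its result. A minimal sketch, assuming the columns/rows layout described under Input Data Format; the helper name drop_lhub_id is hypothetical.

```python
def drop_lhub_id(table):
    """Return a copy of the table without the internal lhub_id column."""
    columns = [name for name in table.get("columns", []) if name != "lhub_id"]
    rows = [
        {key: value for key, value in row.items() if key != "lhub_id"}
        for row in table["rows"]
    ]
    return {"columns": columns, "rows": rows}
```

The function builds new lists and dictionaries, so the table loaded from the input file is left unchanged.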

Example 1

Test return

sourceIP  sourcePort  destIP  destPort
1         2           3       4

Script file: pyScript.py

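import json

# Output file; any writable absolute path works.
path = '/tmp/outputfile'

# Schema for the returned table.
schema = [
    {"name": "col2", "type": "string"},
    {"name": "col3", "type": "string"},
]

# Table rows; each row sets a single column.
rows = [
    {"col3": "cell 1 3"},
    {"col2": "cell 2 2"},
    {"col1": "cell 3 1"},
]

with open(path, 'w') as f:
    json.dump({"schema": schema, "rows": rows}, f)

# Print the absolute path of the output file.
print(path)

Operator call:

callScriptWithTable(table, "pyScript.py")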

Output

col1      col2      col3
                    cell 1 3
          cell 2 2
cell 3 1

Example 2

Use the following Python script to access the data from a table.

import sys
import json

# The path of the table file is in sys.argv[1].
with open(sys.argv[1]) as data_file:
    data = json.load(data_file)

# data = {"rows": [{"lhub_id": "id1", ... other columns},
#                  {"lhub_id": "id2", ... other columns},
#                  ...]}

# To access the data:
lh_ids = [row["lhub_id"] for row in data["rows"]]

# Exit with a non-zero status if no row has lhub_id "1".
if "1" not in lh_ids:
    sys.exit(1)

# To return data, see Example 1.
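
To try a script outside the platform, one option is to mimic the contract described above: write a sample table to a file, pass its path as the first argument, and read back the file whose path the script prints. The sketch below is an assumption about how such a local check could look, not part of the product; run_script_locally is a hypothetical helper.

```python
import json
import os
import subprocess
import sys
import tempfile


def run_script_locally(script_path, table):
    """Run a script against a sample table and return the table it produces."""
    # Write the sample table to a file, as the input data format describes.
    fd, input_path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as handle:
        json.dump(table, handle)
    try:
        # The input file path is the script's first command-line argument.
        stdout = subprocess.check_output([sys.executable, script_path, input_path])
    finally:
        os.remove(input_path)
    # The script is expected to print the absolute path of its output file.
    output_path = stdout.decode().strip()
    with open(output_path) as handle:
        return json.load(handle)
```

subprocess.check_output raises CalledProcessError when the script exits with a non-zero status, so a script that exits with status 1, as the one above does when the expected lhub_id is missing, surfaces as an exception here.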

© 2017-2021 LogicHub®. All Rights Reserved.