baselineScorerV2

Score events based on past behavior

Scores the events in the given table using the specified groupByField and metricField, as the original BaselineScorer does, and adds several configurable parameters. The higher the score, the more abnormal the event.

Operator Usage in Easy Mode

  1. Click + on the parent node.
  2. Enter the Baseline Scorer operator in the search field and select the operator from the Results to open the operator form.
  3. In the Input Table drop-down, enter or select the table containing the data to run this operator on.
  4. In the Group By Field, enter the name of the column by which to group the rows.
  5. In the Metric Field, enter the column name that contains the metric to be used for scoring.
  6. In the Baseline Table drop-down, enter or select the name of the baseline table.
  7. In the Not In Baseline Score field, enter the score for items not in the baseline. Default is 8.
  8. In the Not Enough Examples Score field, enter the score for items with fewer examples than the notEnoughExamplesThreshold. Default is 6.
  9. In the Not Enough Examples Threshold field, enter the threshold for similar items. Default is 1.
  10. In the Max Std Dev field, enter the maximum value for standard deviation. Default is 10.0.
  11. In the Std Dev Multiplier field, enter the multiplier for standard deviation. Default is 1.0.
  12. In the Min Ratio Between Std Dev And Avg field, enter the minimum ratio between the standard deviation and the average (mean). Default is 0.3.
  13. Click Run to view the result.
  14. Click Save to add the operator to the playbook.
  15. Click Cancel to discard the operator form.

Usage Details

baselineScorerV2(inputTable, groupByField, metricField, baselineTable, notInBaselineScore,
                 notEnoughExamplesScore, notEnoughExamplesThreshold, maxStdDev, stdDevMultiplier,
                 minRatioBetweenStdDevAndAvg)

Input
inputTable (TableReference) - The table containing the data to run this operator on.
groupByField (ColumnReference) - The column by which to group the rows
metricField (ColumnReference) - The column that contains the metric to be used for scoring
baselineTable (TableReference) - The name of the baseline table
notInBaselineScore (Long) - Score for items not in the baseline. Default is 8
notEnoughExamplesScore (Long) - Score for items with fewer examples than the notEnoughExamplesThreshold. Default is 6
notEnoughExamplesThreshold (Long) - Threshold for similar items. Default is 1
maxStdDev (Double) - Maximum value for standard deviation. Default is 10.0
stdDevMultiplier (Double) - Multiplier for standard deviation. Default is 1.0
minRatioBetweenStdDevAndAvg (Double) - Minimum ratio between the standard deviation and the average (mean). Default is 0.3
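
This page does not spell out the scoring formula, so the Python sketch below is only one hypothetical reading of how the parameters might interact. It is not the operator's implementation. The function name baseline_score and every rule in it are assumptions: the sample standard deviation is floored at minRatioBetweenStdDevAndAvg times the mean, scaled by stdDevMultiplier, and the number of standard deviations above the mean is truncated to an integer and capped at maxStdDev. Under these assumptions the sketch gives 4, 0, and 0 for the three rows of the Example section, which matches the lhub_score column there but does not confirm that the operator works this way.

```python
import statistics


def baseline_score(value, baseline_values,
                   not_in_baseline_score=8,
                   not_enough_examples_score=6,
                   not_enough_examples_threshold=1,
                   max_std_dev=10.0,
                   std_dev_multiplier=1.0,
                   min_ratio_between_std_dev_and_avg=0.3):
    """Hypothetical scoring rule; NOT the operator's documented algorithm."""
    # Group has no rows in the baseline table.
    if not baseline_values:
        return not_in_baseline_score
    # Fewer baseline examples than the threshold.
    if len(baseline_values) < not_enough_examples_threshold:
        return not_enough_examples_score

    mean = statistics.mean(baseline_values)
    # Sample standard deviation needs two points (assumption: 0.0 otherwise).
    std = statistics.stdev(baseline_values) if len(baseline_values) > 1 else 0.0
    # Assumption: floor the spread at a fraction of the mean, then scale it.
    spread = max(std, min_ratio_between_std_dev_and_avg * abs(mean))
    spread *= std_dev_multiplier
    if spread == 0:
        return 0 if value <= mean else int(max_std_dev)

    deviations = (value - mean) / spread
    # Assumption: only values above the baseline score, capped at max_std_dev.
    return int(min(max(deviations, 0.0), max_std_dev))


print(baseline_score(50, [12, 22]))  # 4
print(baseline_score(25, [32, 35]))  # 0
print(baseline_score(15, [12, 22]))  # 0
```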

Output
The input table with an additional lhub_score column containing the score.

Example

Input
tableA:

id   user     download_count
x1   emil     12
x2   emil     22
x3   monica   32
x4   monica   35

tableB:

id   user     download_count
v1   monica   25
v2   emil     15
v3   emil     50

baselineScorerV2(tableB, "user", "download_count", tableA, 8, 6, 1, 10.0, 1.0, 0.3)

Output

id   user     download_count   lhub_score   lh_baseline   lhub_confidence_score
v3   emil     50               4            12,22         4
v1   monica   25               0            32,35         4
v2   emil     15               0            12,22         4
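
Reading the output rows above, the lh_baseline values appear to be the download_count values that tableA holds for each user (12 and 22 for emil, 32 and 35 for monica). The Python sketch below illustrates that grouping step only. The helper name collect_baseline and the list-of-dicts table representation are assumptions made for this example, not part of the operator.

```python
from collections import defaultdict

# tableA from the example, represented as a list of row dictionaries.
table_a = [
    {"id": "x1", "user": "emil", "download_count": 12},
    {"id": "x2", "user": "emil", "download_count": 22},
    {"id": "x3", "user": "monica", "download_count": 32},
    {"id": "x4", "user": "monica", "download_count": 35},
]


def collect_baseline(rows, group_by_field, metric_field):
    """Hypothetical helper: gather the metric values seen for each group."""
    baseline = defaultdict(list)
    for row in rows:
        baseline[row[group_by_field]].append(row[metric_field])
    return dict(baseline)


baseline = collect_baseline(table_a, "user", "download_count")
print(baseline)  # {'emil': [12, 22], 'monica': [32, 35]}
```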

© 2017-2021 LogicHub®. All Rights Reserved.