Authoring Ascribe Rule Sets

You use Ascribe™ Rule Sets to modify the output of the text analytics engine, to introduce your own findings into the text analytics results, and to modify comments as they are loaded into Ascribe.  Rule Sets are authored in JavaScript, so creating them requires some knowledge of the language.

The structure of a Rule Set

A Rule Set has an ID, which is the name of the Rule Set.  The ID must be unique among all Rule Sets in the Ascribe account.  Rule Sets also have an Enabled property.  If true, the Rule Set is available for use.  If false, the Rule Set cannot be used.

Rule Sets contain rules, with these types and purposes:

  • Modify Finding: modify the findings of the linguistic analysis.
  • Veto Finding: remove a finding from the linguistic analysis.
  • Add Finding from Finding: insert new findings into the linguistic analysis by examination of the findings produced by the analysis.
  • Add Finding from Response: insert new findings into the linguistic analysis by examination of the response (comment) text.
  • Modify Response on Load: change the text of comments being loaded.
  • Class: code that can be used by any rule in the Rule Set.

The first three rule types listed above operate on findings emitted by the linguistic analysis.  Using these three types of Rules you can tune the results of the analysis to your needs.

Add Finding from Response and Modify Response on Load rules operate on responses (or comments), independent of the linguistic analysis.

Class rules are distinct from the other rule types.  Class rules allow you to add code that can be used by any rule in the Rule Set.

Each rule in the Rule Set, except for Class rules, also has an Enabled property.  If disabled, the rule will be ignored when the Rule Set is executed.

Findings

A finding from the linguistic analysis engine has these properties:

  • The comment that was analyzed to produce the finding. We refer to this interchangeably as the response, meaning the response to a survey question.  In any case it is the text that was input to the linguistic analysis engine.
  • The topic. For sentiment analysis this is the word or phrase about which sentiment was expressed.  For topic analysis this is a topic mentioned in the comment, typically a noun or noun phrase.  The topic may be empty in a sentiment finding.  For example, the comment “It was awful” produces a finding of negative sentiment for which the topic is unknown.
  • The expression. For sentiment analysis this is the expression of sentiment.  The comment “The showers were terrible” produces a topic of “showers” and an expression of “terrible”.  For topic analysis the expression is the word or phrase modifying the topic.  The comment “I have worked out at the other gyms in the area” gives the topic “gym” and expression “have worked out at other”.  For topic analysis the expression may be empty, when a topic is found without a modifying phrase.
  • The extract. This is the segment of the comment that yielded the finding.
  • The sentiment. An integer value between -2 (strong negative) and +2 (strong positive).  Topic findings have a null sentiment score.

A given comment analyzed by the linguistic engine may produce any number of findings, including zero.

Properties of the Finding Type

When a rule is invoked it is passed a predefined object named f of type Finding.  In JScript notation the Finding type would be defined as:

class Finding {
  r: String; // Response (comment) text
  t: String; // Topic
  e: String; // Expression
  x: String; // Extract
  s: int; // Sentiment score
}

An object f of type Finding also has these read-only properties:

f.IsValid: boolean
f.IsInvalid: boolean

The IsInvalid property is true if any of f.t, f.e, or f.x is null, empty, or whitespace.  The IsValid property always returns !IsInvalid.
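
The validity check can be pictured in plain JavaScript as follows.  This is only a sketch for reasoning about which findings count as valid.  The helper names isBlank and isInvalid are hypothetical and are not part of Ascribe; in a rule you simply read f.IsValid or f.IsInvalid.

```javascript
// Hypothetical helper: true when a value is null, undefined, empty, or only whitespace.
function isBlank(s) {
  return s === null || s === undefined || /^\s*$/.test(s);
}

// Sketch of the check described above: a finding is invalid when any of
// topic (t), expression (e), or extract (x) is blank.
function isInvalid(f) {
  return isBlank(f.t) || isBlank(f.e) || isBlank(f.x);
}

// Plain objects standing in for Finding objects
var complete = { t: "showers", e: "terrible", x: "The showers were terrible" };
var blankExpression = { t: "showers", e: "   ", x: "The showers were terrible" };
```

Here isInvalid(complete) is false, while isInvalid(blankExpression) is true because the expression contains only whitespace.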

Sentiment Score

Note that the type of the sentiment score property s is int.  The allowed range of f.s is [-2, 2].  Therefore f.s has five allowed integer values and may also be null or undefined.  If f.s is null or undefined it means there is no sentiment associated with the finding.  This is different than f.s == 0, which means a finding of neutral sentiment.

If f.s is assigned an integer value outside the range [-2, 2] it is treated the same as f.s == null, or no finding of sentiment.
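
One way to keep these cases straight is to think of a function that maps any value of f.s to the sentiment it effectively represents.  The sketch below is plain JavaScript for illustration only; effectiveSentiment is a hypothetical name, and Ascribe applies this interpretation for you.

```javascript
// Hypothetical helper: returns the sentiment a score effectively represents,
// or null when the finding carries no sentiment.
function effectiveSentiment(s) {
  if (s === null || s === undefined) {
    return null; // no sentiment associated with the finding
  }
  if (s < -2 || s > 2) {
    return null; // out-of-range values are treated the same as null
  }
  return s; // one of -2, -1, 0, 1, 2
}
```

With this sketch, effectiveSentiment(0) returns 0 (neutral is still a sentiment), while effectiveSentiment(5) and effectiveSentiment(null) both return null.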

Beware of type mismatch possibilities when assigning a value to f.s.  Without an explicit type assignment to a numeric variable the implicit type is double, which will cause a type mismatch error when it is assigned to f.s.  Examples:

f.s = 1; // OK, the double value 1 can be implicitly converted to int
f.s = 1.5; // Compile time type mismatch error
f.s = int(1.5); // OK, double value 1.5 has been cast to int
var s = 1; f.s = s; // Compile time type mismatch error, s has type double
var s: int = 1; f.s = s; // OK, s is explicitly typed as int
var s = int(1); f.s = s; // Compile time type mismatch error, s still has type double
f.s = 5; // No error, but equivalent to f.s = null

Topic Findings versus Sentiment Findings

Ascribe performs two independent linguistic analyses on each Inspection: topic analysis and sentiment analysis.  Topic analysis never assigns a sentiment score to the findings it generates, whereas sentiment analysis always does.  You can determine whether a finding has been produced by topic analysis by testing the sentiment score for a null value:

if (f.s == null) {
  // it's a topic finding
} else {
  // it's a sentiment finding
}
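
Be careful not to replace the null test with a truthiness test such as if (!f.s).  In JavaScript the value 0 is falsy, so a neutral sentiment finding would be mistaken for a topic finding.  The sketch below uses plain objects in place of Finding objects, and the helper names are hypothetical:

```javascript
// Hypothetical helpers for classifying a finding by its sentiment score.
function isTopicFinding(f) {
  // == null is true for both null and undefined, and false for 0
  return f.s == null;
}

function isNeutralSentiment(f) {
  return f.s === 0;
}

// Plain objects standing in for Finding objects
var neutral = { t: "gym", e: "okay", x: "the gym is okay", s: 0 };
var topicOnly = { t: "gym", e: "other", x: "other gyms", s: null };
```

Here isTopicFinding(neutral) is false and isTopicFinding(topicOnly) is true, whereas !neutral.s and !topicOnly.s are both true and so cannot tell the two apart.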

Constructors of the Finding Type

The Finding type has two constructors.  With no arguments a new Finding object is created with string properties set to an empty (zero length) string, and sentiment to null:

var newFinding = new Finding();
// newFinding.r, newFinding.t, newFinding.e, and newFinding.x are all ""
// newFinding.s == null

The second constructor accepts a single argument of type Finding:

var newFinding = new Finding(f); // properties of newFinding are the same as f

Rule Set Execution

Rule Sets operate on responses and findings as shown in this diagram:

[Figure: Rule Sets Workflow]

When you load data into CX Inspector, the entire workflow is executed.  When you apply a Rule Set after data are loaded into an Inspection, execution begins with the stored responses and findings, and the Modify Response on Load rules are not executed.  When a Rule Set is used with the Ascribe Coder API, only the Modify Response on Load rules are executed.

See Ascribe Rule Set Execution Workflow for a detailed description of Rule Set execution.

Rule Programming Language

Rules are authored in JScript, a superset of ECMAScript 3 JavaScript.  If you are a JavaScript programmer, you can for the most part simply program rules as if they were pure JavaScript.  If you venture into use of Class rules you will probably want to read through the JScript reference.

A key difference between JScript and JavaScript is the introduction of optional type annotations, like TypeScript.  JScript also supports classes, like ECMAScript 2015.  You will occasionally have use for type annotations, such as:

var s: int = 2; // variable s is an integer type

Rule Syntax and Semantics

In this section we will cover the syntax and semantics of all rule types except for Class rules.  Class rules are not directly invoked during Rule Set execution and are described in a later section.

Rules are the bodies of functions generated by the Rule Set compiler.  If we have a Modify Finding rule:

// If topic is "nothing" reverse the sentiment polarity
if (f.s && /^nothing$/i.test(f.t)) {
  f.s = -f.s;
}

The compiler generates this code from the rule:

// Modify Finding
static function Rule1458(f: Finding) {
  // If topic is "nothing" reverse the sentiment polarity
  if (f.s && /^nothing$/i.test(f.t)) {
    f.s = -f.s;
  }
  return f;
}

The compiler has placed the rule body within a function with a single argument: the finding.  It has also added a

return f;

statement at the end of the function body.  You can inspect the source code generated by the Rule Set compiler by opening the Rule Set and clicking the Print icon.

Modify Finding Rules

Modify Finding rules participate in the pipeline of rules described in Ascribe Rule Set Execution Workflow:

Modify Finding ⇒ Veto Finding ⇒ Add Finding from Finding

The compiler introduces a

return f;

statement at the end of the rule body.  This return statement is not introduced by the compiler for other rule types, except for Modify Response on Load rules.

The rule can change the properties of the finding f.  The finding returned by the rule is passed to the next rule in the pipeline.

The finding returned by the rule will be ignored unless a valid finding is returned.  When a finding is ignored for this reason it is as if the rule had not executed.  The finding originally passed to the rule is passed to the next rule in the pipeline.  The finding returned by a Modify Finding rule is ignored if:

  • The finding returned is null,
  • or any of the properties t, e, x is null or whitespace.

The rule may change the value f.r, but any changes to the property will be ignored.  A Modify Finding rule cannot cause the text of the response to change.
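
As an illustration, a Modify Finding rule that folds several related topics into one might look like the sketch below.  It is written as a complete function so that it can be run as plain JavaScript outside Ascribe.  In Ascribe you would enter only the body; the wrapper function and the trailing return f; stand in for what the compiler generates.  The function name and the topic list are made up for this example.

```javascript
// Stand-in for the compiler-generated wrapper; only the if statement is the rule body.
function modifyFindingRule(f) {
  // Fold a few staff-related topics into the single topic "staff"
  if (/^(employees?|personnel|team members?)$/i.test(f.t)) {
    f.t = "staff";
  }
  return f; // added by the compiler for Modify Finding rules
}

// A plain object standing in for a Finding
var sample = { r: "The employees were rude", t: "Employees", e: "rude", x: "The employees were rude", s: -1 };
var modified = modifyFindingRule(sample);
```

After the call, modified.t is "staff" and the other properties are unchanged.  A finding whose topic does not match the pattern passes through untouched.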

Veto Finding Rules

Veto Finding rules also participate in the pipeline of rules described in Ascribe Rule Set Execution Workflow:

Modify Finding ⇒ Veto Finding ⇒ Add Finding from Finding

Veto Finding rules can veto the finding and cause it to be discarded by returning true.  A vetoed finding is discarded from the analysis and the remaining rules in the pipeline are short circuited for that finding.

While a Veto Finding rule can modify the properties of the finding f, any such changes have no effect on the finding in the pipeline.  Only the value returned by the rule is considered.  If the rule returns true the finding is vetoed, otherwise the rule has no effect.

The rule must return a Boolean true value to veto the finding.  Returning a truthy value such as 1 or "veto" will not veto the finding.  Examples:

return true; // vetoed
return 1; // not vetoed
return !0; // vetoed
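
Because only a Boolean true vetoes, it is safest to return the result of an expression that always yields a Boolean.  The sketch below wraps two rule bodies in functions so they can be run as plain JavaScript; the function names and the word list are hypothetical.

```javascript
// Vetoes findings whose topic is too vague to be useful.
function vetoFindingRule(f) {
  // RegExp test() always returns a Boolean, so its result can be returned directly
  return /^(things?|stuff|something)$/i.test(f.t);
}

// Pitfall: String match() returns an array (or null), never true,
// so under the strict rule above this version would not veto anything.
function brokenVetoRule(f) {
  return f.t.match(/^(things?|stuff|something)$/i);
}

var vague = { t: "Stuff", e: "was bad", x: "stuff was bad", s: -1 };
```

Here vetoFindingRule(vague) returns true, which vetoes the finding, while brokenVetoRule(vague) returns a truthy array that is not the Boolean true.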

Add Finding from Finding Rules

Add Finding from Finding rules are the last part of the pipeline of rules described in Ascribe Rule Set Execution Workflow:

Modify Finding ⇒ Veto Finding ⇒ Add Finding from Finding

These rules can add additional findings to the analysis, by inspection of the finding from the linguistic analysis.  Each rule can add up to 1000 findings.  If the rule returns a single valid finding, that finding is added to the analysis.  If the rule returns an array of findings, all the valid findings returned are added to the analysis.

A finding returned by an Add Finding from Finding rule is ignored if:

  • The finding is null,
  • or any of its properties t, e, x is null or whitespace.

An Add Finding from Finding rule with no return statement will do nothing.  The rule is free to modify the properties of the finding f and return that object.  The value returned will be added to the analysis, but the original finding passed to the rule will not be affected.  Add Finding from Finding rules cannot affect the finding that they are passed.

Note that the trivial rule:

return f;

will add duplicates of every finding produced by the linguistic engine to the analysis, doubling the number of findings!

To add multiple findings to the analysis, return an array of Finding objects.  This Add Finding from Finding rule will add five new Findings to the analysis, one with each of the allowed integer sentiment scores.  The other properties of the new findings will be the same as f:

var newFindings = []; // create an empty array
for (var s: int = -2; s <= 2; s++) { // Note s is typed as int
  var nf = new Finding(f); // clone f
  nf.s = s;
  newFindings.push(nf);
}
return newFindings;
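
A rule that returns a single finding only for some inputs might look like the following sketch, which adds a broader "facilities" finding alongside findings about specific amenities.  So that it runs as plain JavaScript outside Ascribe, the sketch defines its own minimal stand-in for the Finding type and wraps the rule body in a function.  The stand-in, the function name, and the topic list are assumptions made for the example.

```javascript
// Minimal stand-in for Ascribe's Finding type, imitating only its two constructors.
function Finding(src) {
  this.r = src ? src.r : "";
  this.t = src ? src.t : "";
  this.e = src ? src.e : "";
  this.x = src ? src.x : "";
  this.s = src ? src.s : null;
}

// Stand-in for the compiler-generated wrapper around the rule body.
function addFindingFromFindingRule(f) {
  // Roll specific amenities up into a broader "facilities" topic
  if (/^(showers?|lockers?|sauna|pool)$/i.test(f.t)) {
    var rollup = new Finding(f); // copy f, then change only the topic
    rollup.t = "facilities";
    return rollup;
  }
  // No return statement is reached for other topics, so nothing is added
}

var original = new Finding({ r: "The showers were terrible", t: "showers", e: "terrible", x: "The showers were terrible", s: -2 });
var added = addFindingFromFindingRule(original);
```

Here added has the topic "facilities" and otherwise matches original, which keeps its topic "showers".  For a finding about any other topic the function returns undefined and no finding is added.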

Add Finding from Response Rules

As described in Ascribe Rule Set Execution Workflow, Add Finding from Response rules execute independently of other rule types.  They are not part of the pipeline:

Modify Finding ⇒ Veto Finding ⇒ Add Finding from Finding

Instead, Add Finding from Response rules operate on all the responses (comments) in the analyzed variables in the Inspection.  These rules let you augment the analysis with findings created by your Rule Set.

Like Add Finding from Finding rules, Add Finding from Response rules can add up to 1000 new findings to the analysis.  Add Finding from Response rules use the same semantics for adding findings as Add Finding from Finding rules.  Returning a single Finding object whose IsValid property is true will add that finding to the analysis.  Returning an array of Finding objects will add all those whose IsValid property is true to the analysis.

The Finding object f passed to an Add Finding from Response rule has only its f.r property set to the text of the response.  The properties f.t, f.e, f.x, and f.s are all null.  Hence you must set f.t, f.e, and f.x to valid values to add a new finding.
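
As a sketch of how such a rule might be written, the example below searches the response text for a mention of a refund and, if one is found, fills in the topic, expression, and extract before returning the finding.  The rule body is wrapped in a function so it can be run as plain JavaScript.  The function name, the keyword, and the choice of values for f.e and f.x are assumptions made for illustration.

```javascript
// Stand-in for the compiler-generated wrapper around the rule body.
function addFindingFromResponseRule(f) {
  // Look for a mention of a refund anywhere in the response text
  var match = /\b(refund(?:s|ed)?)\b/i.exec(f.r);
  if (match) {
    f.t = "refund"; // topic
    f.e = match[1]; // expression: the matched word as written
    f.x = f.r;      // extract: here, simply the whole response
    // f.s is left null, so the new finding carries no sentiment
    return f;
  }
  // No return statement is reached when nothing matches, so nothing is added
}

// A plain object standing in for the Finding passed to the rule
var incoming = { r: "I was never refunded for the class", t: null, e: null, x: null, s: null };
var result = addFindingFromResponseRule(incoming);
```

For this response the returned finding has topic "refund" and expression "refunded".  A response that does not mention a refund produces no finding.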

Modify Response on Load Rules

As described in Ascribe Rule Set Execution Workflow, Modify Response on Load rules execute in a different part of the workflow than other rule types.  These rules allow you to modify the response text before it is stored in Ascribe.  Therefore, these rules can be used to curate the response text, perhaps to remove personally identifiable information, or to correct spelling errors.  These rules can also veto a response, causing it to be discarded and not stored in Ascribe.

Like Add Finding from Response rules, the Finding object f passed to Modify Response on Load rules has only the property f.r populated.  It contains the text of the response being loaded.

The compiler introduces a

return f;

statement at the end of the rule body.  This return statement is not introduced by the compiler for other rule types, except for Modify Finding rules.

If the rule returns a finding, and if the r property of the finding is not null or whitespace, that text will be stored in Ascribe as the response text.  Therefore, an empty rule will store the response unchanged, because of the implicit

return f;

statement at the end of the rule.  If the rule returns anything other than an object of type Finding the response will be discarded and not loaded.  All these statements will cause the response to be discarded:

return false;
return true;
return null;
return;

If the property r of the Finding returned is null or whitespace the response will not be discarded, but the text of the response will not be modified.  The rule performs no action.

Modify Response on Load rules cannot introduce additional responses.  Returning an array of findings will not add multiple responses.  Instead, it will cause the response to be discarded, because the rule did not return an object of type Finding.
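
Putting these semantics together, a Modify Response on Load rule that discards placeholder answers and masks email addresses might look like the sketch below.  It is wrapped in a function so it can be run as plain JavaScript.  The function name, the list of placeholder answers, and the deliberately loose email pattern are assumptions made for illustration.

```javascript
// Stand-in for the compiler-generated wrapper around the rule body.
function modifyResponseOnLoadRule(f) {
  if (f.r == null) {
    return f; // nothing to edit; a null r leaves the response as it is
  }
  // Discard placeholder answers such as "n/a" or "none"
  if (/^\s*(n\/?a|none|no comment)\s*$/i.test(f.r)) {
    return null; // returning anything other than a Finding discards the response
  }
  // Mask anything that looks like an email address
  f.r = f.r.replace(/[^\s@]+@[^\s@]+\.[^\s@]+/g, "[email]");
  return f; // added by the compiler for Modify Response on Load rules
}

var kept = modifyResponseOnLoadRule({ r: "Contact me at jane.doe@example.com please" });
var dropped = modifyResponseOnLoadRule({ r: " N/A " });
```

Here kept.r is "Contact me at [email] please" and dropped is null, so the second response would not be loaded.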

Class Rules

Class rules are not invoked directly by the Rule Set execution workflow.  They allow you to write code that can be used by any rule in your Rule Set.

To author Class rules you will need an understanding of the JScript language.

A Class rule is simply injected into the script at Rule Set compilation time, inside the package that contains the rules.  Hence, a Class rule may contain only class definitions, in accordance with the JScript syntax.

Summary

Authoring a Rule Set requires knowledge of the JavaScript programming language, and the syntax and semantics described in this post.  You can use Rule Sets to tailor your text analyses to your specific needs.  Also see these related posts: Introduction to Ascribe Rule Sets, Testing Ascribe Rule Sets, and Ascribe Rule Set Execution Workflow.
