What is wrong with OR in business rules?

Myth or Reality?  Is it bad to use OR-ed conditions in Business rules?

A friend of mine pinged me last week to get my recommendation on the usage of OR-ed conditions.  There is actually a good technical reason why they should not be used…  and a few semantic arguments too…  Other than that, I am quite happy to use them!

A little background

A business rule is a statement that prescribes a set of actions to take when its conditions are met.

If the shopping cart amount is over $25 and the mode of shipping is “regular”

Then shipping is free

Rule design best practices warn the user about the dangers of mixing alternative conditions that lead to the same actions.  For example:

If the shopping cart amount is over $25 and the mode of shipping is “regular”

or the customer is a “prime” customer

Then shipping is free

Best practices dictate that this piece of logic should be implemented as 2 distinct rules.  The question we will discuss today is why.

RETE does not like OR

I believe this is the source of the heat that ORs have suffered since the dawn of rules engines.  Since the Rete algorithm does not support them natively, an OR-ed rule ends up being replicated to accommodate the equivalent logic.  Business rules management systems do that automatically, but there may be some side effects.
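As a rough illustration of that replication, here is a minimal Python sketch of the shipping example. It is not how any particular rules engine represents rules; the `Order` class, its field names and the function names are assumptions made up for this example. It only shows that one OR-ed rule and two separate rules with the same action reach the same decision.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical data model for the shipping example.
    cart_amount: float
    shipping_mode: str
    customer_tier: str


def free_shipping_single_rule(order: Order) -> bool:
    # The rule as written in the article: one rule, two alternative conditions.
    return (
        order.cart_amount > 25 and order.shipping_mode == "regular"
    ) or order.customer_tier == "prime"


def rule_cart_over_25_regular(order: Order) -> bool:
    # First replicated rule: only the cart/shipping alternative.
    return order.cart_amount > 25 and order.shipping_mode == "regular"


def rule_prime_customer(order: Order) -> bool:
    # Second replicated rule: only the prime-customer alternative.
    return order.customer_tier == "prime"


SPLIT_RULES = [rule_cart_over_25_regular, rule_prime_customer]


def free_shipping_split_rules(order: Order) -> bool:
    # Shipping is free as soon as any one of the replicated rules fires.
    return any(rule(order) for rule in SPLIT_RULES)
```

Both functions return the same answer for any order; the difference lies in how many rules have to be stored, compiled and maintained.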

Logically, this is not a big deal of course.  The example above is simple.  In my career, I have seen individual rules that heavily use ORs with over 10 composite conditions, each made of 10 OR-ed individual conditions.  For instance, consider a rule like this one:

If (A1 or A2 or A3 or A4 or A5 or A6 or A7 or A8 or A9 or A10)

or (B1 or B2 or B3 or B4 or B5 or B6 or B7 or B8 or B9 or B10)

or (C1 or C2 or C3 or C4 or C5 or C6 or C7 or C8 or C9 or C10)

or (D1 or D2 or D3 or D4 or D5 or D6 or D7 or D8 or D9 or D10)

or (E1 or E2 or E3 or E4 or E5 or E6 or E7 or E8 or E9 or E10)

etc.

With ten such groups of ten conditions, this single rule translates to 100 rules in the Rete network, and even more nodes connected in every possible way.  Because each node keeps track of the list of objects that satisfy it, memory usage can skyrocket.  The performance issues can then appear in the rules compiler, as it computes and assembles the Rete network, or at execution time.

This combinatorial explosion can get out of hand when the rulebase involves thousands of rules or more.  Experts in benchmarks and performance tuning typically discourage the usage of ORs for this reason.
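To make the count concrete, here is a small Python sketch that expands a condition tree into OR-free rules and counts them. The tuple representation of conditions and the `expand` and `group` helpers are assumptions invented for this illustration; real engines build their networks differently, so treat it as a way to check the arithmetic only.

```python
from itertools import product


def expand(condition):
    """Return the OR-free rules equivalent to a condition.

    A condition is either a string (one atomic test) or a tuple
    ("or" | "and", [children]). Each returned rule is a list of atomic tests
    that must all hold.
    """
    if isinstance(condition, str):
        return [[condition]]
    operator, children = condition
    expanded = [expand(child) for child in children]
    if operator == "or":
        # Every alternative becomes a rule of its own.
        return [rule for alternatives in expanded for rule in alternatives]
    if operator == "and":
        # Every combination of alternatives becomes a rule of its own.
        return [sum(combo, []) for combo in product(*expanded)]
    raise ValueError(f"unknown operator: {operator}")


def group(letter):
    # Builds (A1 or A2 or ... or A10) for a given letter.
    return ("or", [f"{letter}{i}" for i in range(1, 11)])


ten_groups = [group(letter) for letter in "ABCDEFGHIJ"]

# Ten groups of ten tests, all OR-ed together, as in the rule above.
print(len(expand(("or", ten_groups))))  # 100

# Only three of those groups, AND-ed together instead.
print(len(expand(("and", ten_groups[:3]))))  # 1000
```

In this sketch, OR-ing the groups adds their sizes, while AND-ing them multiplies the sizes, which is where the growth gets out of hand.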

OR gets in the way of Flexibility

Rules architects also discourage the usage of OR-ed conditions because:

  • It could lead to convoluted decisioning logic that hinders the readability of the rule
  • It combines separate business concepts that may need to evolve differently

Let’s get back to the shipping example above.  Over time, marketers might offer free 2-day shipping to prime customers but only free regular shipping for orders that are $25 or more.  It is intuitive that each segment should be managed independently.  There is no need to artificially force them in the same rule.

One of the key premises of business rules technology is to isolate business policies into atomic pieces of logic, instead of keeping them in spaghetti code.  ORs open the door for bad practice as we lose the atomicity of those policies.

What is the ELSE semantic?

Else is another disputed practice, but let’s not digress…  Let’s assume we are using ELSEs in conjunction with ORs.

Although it is absolutely legal from a syntax standpoint, and logically sound, to use both in a single rule, it might take rules writers some effort to wrap their heads around what the ELSE path would be.  This is similar to double negations.  This type of reasoning demands attention and rigor, especially when “special values” such as unknown or unavailable come into play.  What if the shopping cart contains less than $25 of goods, but we do not know whether the shopper is prime or not?
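One way to reason about that last question is three-valued logic, where a test can be true, false or unknown. The Python sketch below is only an illustration under assumptions: it uses `None` for unknown, the helper names are invented, the ELSE action ("shipping is charged") is not stated in the article, and actual products may treat special values differently.

```python
from typing import Optional

Tri = Optional[bool]  # True, False, or None meaning "unknown"


def tri_and(a: Tri, b: Tri) -> Tri:
    # False wins over unknown; otherwise unknown wins over True.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def tri_or(a: Tri, b: Tri) -> Tri:
    # True wins over unknown; otherwise unknown wins over False.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def free_shipping_condition(
    cart_amount: Optional[float],
    shipping_mode: Optional[str],
    is_prime: Tri,
) -> Tri:
    over_25 = None if cart_amount is None else cart_amount > 25
    regular = None if shipping_mode is None else shipping_mode == "regular"
    return tri_or(tri_and(over_25, regular), is_prime)


def decide(condition: Tri) -> str:
    if condition is True:
        return "THEN: shipping is free"
    if condition is False:
        # Assumed ELSE action, for illustration only.
        return "ELSE: shipping is charged"
    return "UNDECIDED: neither branch is clearly right"


# A $20 cart, regular shipping, prime status unknown.
print(decide(free_shipping_condition(20.0, "regular", None)))
```

With a $20 cart and an unknown prime status, the condition is neither true nor false, so a rule writer has to decide explicitly whether that case belongs to the ELSE path.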

The more options are thrown into the same rule, the more complicated it looks, and the more opportunities there are for confusion and mistakes.  This compounds the previous point on maintainability.

So should we ban ORs?

Not so fast…  There are some cases where ORs really belong in the rule definition.  Rule design excellence is about knowing when to use them and what other options are available to replace the ORs we do not want.

Bad Performance?  How about list membership?

Separating the rules manually to avoid the OR explosion does not solve the performance issue.  The example above would perform exactly the same whether the rules are manually or automatically exploded.  The solution is elsewhere…

In most cases, a large number of ORs in a single rule is meant to express a list membership.  For example:

If state is CA or state is NY or state is VA

Then apply sales taxes

The tests all refer to the same attribute — state — and can be replaced by a check for list membership, using a keyword such as IN:

If state in CA, NY, VA

Then apply sales taxes

Because the engine evaluates a single membership test instead of one test per value, the list membership check can dramatically increase runtime performance.  The rule ends up being a little more readable as well.
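The same rewrite can be sketched in Python. This says nothing about how a Rete network evaluates the test; it only shows that the chained ORs and the membership check are equivalent, and that the second form keeps the list of states in one place. The function names and the `TAXED_STATES` constant are made up for this example.

```python
# The states from the example rule, kept in a single collection.
TAXED_STATES = frozenset({"CA", "NY", "VA"})


def applies_sales_tax_with_ors(state: str) -> bool:
    # One equality test per state, chained with OR.
    return state == "CA" or state == "NY" or state == "VA"


def applies_sales_tax_with_membership(state: str) -> bool:
    # A single membership test against the list of states.
    return state in TAXED_STATES
```

Adding a state then means editing the list, not the rule logic.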

Or defining actual group membership?

In some cases, conditions might be looking at different attributes but they might still be defining membership to a “group”.

If gender is female

or race in African-American, Asian American, Hispanic American, Native American

or age is older than 40

or disability is true

Then minority is true

This is quite frequent in business policies that define business terms.  Business Rules implementations tend to capture those definitions directly in the business rules.  We recommend capturing those as actual business terms that can be reviewed by business owners and leveraged consistently in business rules.

[Figure: Business Term]

As illustrated here, you want business terms to be computed and become part of the data model, like any other characteristic, computed or not, of the document to be processed.

The value here is that you isolate the OR-ed computation so that it can be leveraged in simple AND-ed rules as much as possible.  Those membership relationships are typically globally defined and maintained so that makes sense.
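A minimal Python sketch of that idea follows, assuming a hypothetical `Applicant` class. The business term is computed once as a derived attribute, and a downstream rule uses it in a plain AND-ed condition. The `application_complete` field and the `ready_for_program_review` rule are invented for illustration and do not come from the article or from any product.

```python
from dataclasses import dataclass

# Values taken from the example definition above.
MINORITY_RACES = frozenset(
    {"African-American", "Asian American", "Hispanic American", "Native American"}
)


@dataclass
class Applicant:
    gender: str
    race: str
    age: int
    disability: bool
    application_complete: bool  # hypothetical field, used by the rule below

    @property
    def is_minority(self) -> bool:
        # The OR-ed definition lives here, in one place, as a business term.
        return (
            self.gender == "female"
            or self.race in MINORITY_RACES
            or self.age > 40
            or self.disability
        )


def ready_for_program_review(applicant: Applicant) -> bool:
    # Hypothetical downstream rule: AND-ed only, it reads the term like any
    # other attribute of the document being processed.
    return applicant.is_minority and applicant.application_complete
```

If the definition of the term changes, only `is_minority` is edited; the rules that refer to it stay as they are.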

And when you need ORs…

Then you can certainly use them when needed.  The key, in my opinion, is to find ways to easily understand the decisioning logic, to untangle the logic in a manner of speaking.

What I have found effective is the use of Fluid Metaphors, which allow you to turn your original syntax, on demand, into decision tables, trees or graphs.  This really helps address the concerns on maintainability:

  • If rules were created independently but lead to the same action, use a decision graph to see them graphically linked
  • If OR-ed conditions were merged in a single rule, use a decision table or any other metaphor to visualize the individual paths for that rule

The only concern left is really the semantic interpretation of ORs and ELSEs combined.  But I would argue that ELSEs are a worse Evil than ORs…  I guess I have another post already figured out ;-)

14 responses to “What is wrong with OR in business rules?”

  1. Carole-Ann, this is really a good article and you have addressed a very popular issue among most of the BRM projects.

    There is a similar issue that arises when rule premise inheritance is used. Suppose you create a rule R1 using ANDs and ORs, and in another rule R2 you say: if NOT R1 and (remaining conditions). It becomes challenging even to depict the final rule when R1 is itself a complex rule.

    1. Thank you Shishir!

      You are exactly right. A rule reference, especially with a negation, is very similar to the ELSE issue. Great catch!

  2. Carole-Ann, excellent post. I’ve seen rules with 260 ORs, and growing. It’s not obvious which path of the rule actually triggers the rule, while atomic rules are much more straightforward.

    I can understand list membership can improve performance. How are the business terms evaluated in Rete? Are they treated as a single node?

    1. I am not surprised. I have seen those enormous rules in credit card fraud as well.

      This is indeed a great advantage of using list membership: they are evaluated as a single node and therefore do not grow the Rete net exponentially.

      Business Terms might be implemented differently by different tools. I can answer more specifically for SMARTS but would have to defer to the other vendors for their implementations. SMARTS augments the form with business terms such that rule conditions can be expressed as “Customer is Minority”, to refer to the example above, which would be a single node in the Rete net.

      You make an excellent point on traceability. When several rules are merged in one using ORs, it becomes nearly impossible to figure out which path triggered it. And it removes of course any opportunity for rule performance analysis…

      Thanks, Kenny!

      1. Sorry if it’s off topic. I am just curious about business terms. If the combination of conditions is a single node, what if one of the variables changes state?

        In your “minor” example in your article, if we have another rule:
        If born in China, then set race to Asian-American
        I suppose the “minor” node state may change, but how does the algorithm keep track of “race” change?

      2. Kenny, this works exactly the same way that a computation or simple input value change would affect the re-evaluation of the Rete subnet.

  3. Carol-Ann,

    I had one question throughout the article that I think you answered at the end, but I just want to check: how do you display OR logic in rules within a decision table? You seem to say above to do so with different rows in the DT, correct? However, what if you have multiple rules in the same DT? How do you distinguish the parts of one rule from other, perhaps simpler, rules?

    1. Tim,

      Fluid Metaphors keep traceability from the source rules. So, if you start with an OR-ed rule written as syntax such as:

      If age < 16 or income < 10,000
      then…

      You will end up with 2 rows in the decision table (I wish I could paste a pretty picture!):
      Age  | Income | Action
      -----|--------|-------
      < 16 |        | …
           | < 10k  | …

      The decision table could include many more rows, coming from rules written in syntax or from rows added in place in the decision table (or in any other metaphor of course).

      When you switch back to textual rules, the OR-ed rule is re-constructed as it was originally.

      The rules could be as complex as you want of course, in terms of number of conditions & actions, or in terms of expressions.

      Does that answer your question?

  4. CABM:

    One huge “knockout” rule is the “accepted norm” in business rules. It eliminates anything harmful and is usually a series of “OR” statements, which makes it convenient for the business user to add or delete anything in the rules. Also, each of those OR statements contains a logging element so that the user knows WHY the action was taken. However, this does not address the problem of multiple ORs being valid unless the rule engine does a complete evaluation regardless of the first “kickout” conditions. In that event, you would need to write lots of rules to get all the conditions processed. So, the huge OR rule turns out to be a lazy man’s way to do rules. However, “real” KEs and REs try to avoid OR, ELSE and NOT statements wherever possible, knowing the potential for performance problems later.

    SDG
    jco

    1. James,

      This is right. Knowledge Engineers know a lot (no pun intended), but there is little literature on *why* one should not use those constructs. My goal is to shed some light on those best practices, to avoid blind blanket statements (that sometimes need to be violated for a good reason). The exception to the rule ;-)

  5. Hi Carol,

    This is great info on the usage of OR. Thank you.

    Can I assume that a rules engine that uses an approach other than RETE, like the Corticon rules engine, which uses the Deti approach, would handle ‘OR’ efficiently?

    Best Regards,

    1. Kishore, great question!

      Most commercial BRMS products have the option to use Rete or a compiled sequential algorithm. So you always have a way to avoid the combinatorial explosion that is created by Rete for handling ORs…

      BUT I would like to stress that most issues I highlighted here are still valid.

      – Flexibility: OR-ed rules can become difficult to manage over time. Whether future changes will require a separation, or whether you need to track which actual path was taken for traceability, or whether you are facing a huge statement that becomes tricky to read, you may still want to follow best practices and avoid ORs, regardless of the engine that will execute them.

      – List or Group Membership: If you are looking for an application being in any of 10 given states, you would likely save a little by using membership tests rather than ORs. 10 API invocations versus 1… It might be valuable.

      I would not under-estimate the value of managing rules in a way that is friendly to business analysts. The reason rule technology emerged was actually to get away from spaghetti code. Why would you want to serve spaghetti rules, replacing one problem with another?

  6. I like that you are tackling subjects such as these regarding RETE. I wrote a nifty piece of code to explode the rules in my engine, but never thought about the case where the user has as many ORs as above. I shudder at the thought. I am wondering if anyone (hopefully you, as you are great at explaining these things :-)) can tackle aggregation (sums, min, max, avg, etc). I haven’t seen any articles regarding how it is done in a discrimination network, or if it is even supposed to be done. I can see a greater-than operation always returning true after multiple assertions until the last assertion that makes it false; I am not sure what is to be done there. Sorry for the off topic, not sure where else to send this…

    1. Aggregation functions… That is a very interesting topic! I’d be happy to discuss the effect they might have on rule execution. I’d like to expand the scope to more than RETE though, as set operations can surface in many forms in Business Rules projects. Just think about Key Performance Indicators! It looks like I have my work cut out for another blog post…
