Fuzzy Matching
Fuzzy Matching is used when you want to search your captures for the closest match across several different properties.
Because perfect matches are unlikely in real-world scenarios, the fuzzy matching system instead produces a score for each possible match and selects the capture with the lowest overall score.
See AI vs Fuzzy Matching for a discussion of the key benefits and downsides of fuzzy matching compared with using AI.
To get good results from fuzzy matching, you'll often need to adjust the weighting factors that control it, because in most cases some of the properties you're searching on matter much more than others.
Example
For example, let's look at a system that is producing cupcakes.
We're capturing how our system is set up (including line speed, oven temperature, colour tolerance etc.) for every different batch of cupcakes we produce, so we can automatically set up our system correctly for new product runs in the future.
First, we identify which attributes we are going to know (or be able to easily get) before our production run begins.
These include…
- Product Weight
- Product Type (e.g. Blueberry, Chocolate, Banana)
- Product Style (e.g. Plain, Deluxe, Fudgy, Gluten Free)
- Ambient Temperature
- Ambient Humidity
Setting Priorities
If we want to search for the closest match (rather than using AI), we need to set some priorities in the configuration file.
We might decide that weight is the most critical factor, followed by type and style. Temperature and humidity are useful, but less important than the others.
For each attribute we're going to match, we can set a multiplier, together with either a match type or a closeness threshold.
Multiplier
The multiplier is applied to the difference between the attributes when comparing the value you searched for against the value stored in each capture.
For example, we want a 10g difference in weight to be penalised much more than a 10% difference in humidity.
So we give weight differences a large multiplier and humidity differences a small one:
"matching": { "Weight": { "mult": 10, "close": 0.002 }, "Type": { "mult": 1, "type": "equalonly" }, "Style": { "mult": 1, "type": "equalonly" }, "Temperature": { "mult": 0.5, "close": 2 }, "Humidity": { "mult": 0.1, "close": 10 } },
Closeness
The 'close' value defines how much difference is considered 'good enough' to be a match. If the difference between the search value and the captured value is less than this amount, it will be considered an exact match.
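As a rough illustration of the threshold, the Python sketch below treats any difference smaller than 'close' as an exact match that scores 0. The function name numeric_score is hypothetical. How differences at or above the threshold are scored is an assumption here (the full difference times 'mult'); the real system may behave differently.

```python
# Hypothetical sketch of the 'close' threshold; not the product's actual code.
def numeric_score(search_value, captured_value, mult, close=0.0):
    """Score one numeric attribute, treating small differences as exact matches."""
    diff = abs(search_value - captured_value)
    if diff < close:
        # Within the 'good enough' band: counts as an exact match.
        return 0.0
    # Assumption: outside the band, the full difference is penalised.
    return mult * diff


# Using the Temperature settings from the example config: mult 0.5, close 2.
within_band = numeric_score(21, 22.5, mult=0.5, close=2)   # difference of 1.5
outside_band = numeric_score(21, 25, mult=0.5, close=2)    # difference of 4
```

With these values the first comparison scores 0, while the second is penalised in proportion to its difference.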
Match Types
The 'type' value defines how the comparison should be performed.
equalonly
With the equalonly type, the comparison becomes a simple yes/no rather than a scaled difference: the score will be 0 for an exact match, or the 'mult' value for anything else.
This is the ideal method when comparing strings or discrete values.
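The equalonly rule can be sketched in a few lines of Python. The function name equalonly_score is hypothetical, and the sketch assumes a plain, case-sensitive equality test, which may not match how the real system compares strings.

```python
# Hypothetical sketch of the 'equalonly' match type; not the product's actual code.
def equalonly_score(search_value, captured_value, mult):
    """Return 0 for an exact match, otherwise the fixed 'mult' penalty."""
    # Assumption: plain equality, so "chocolate" and "Chocolate" would differ.
    return 0 if search_value == captured_value else mult


same_type = equalonly_score("Chocolate", "Chocolate", mult=1)
different_type = equalonly_score("Chocolate", "Blueberry", mult=1)
```

Here same_type is 0 and different_type is the full 'mult' penalty, regardless of how different the two strings are.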
preference
This indicates that you'd prefer values close to a particular target. Think of it as being able to specify a default search value for when users forget to do so.
For example, you might want to always give priority to the fastest result - if the search finds several very similar matches, we want to use the one that was quickest.
The following rule will do that…
"Paint Line.Speed - Actual": { "type": "preference", "mult": 0.2, "target": 300 }
The target attribute specifies what value you're hoping to achieve. In this case, a small amount will be added to the score whenever the speed is not 300.
It's often a good idea to give these very small 'mult' values - otherwise an unexpected or noisy signal might cause issues.
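The sketch below shows one way a preference rule could break a tie between two otherwise similar captures. It assumes the amount added is proportional to the distance from the target (mult × |value − target|). The text above only says that a small amount is added when the speed is not 300, so the exact formula is an assumption. The names preference_score, run_1, run_2 and base_score are made up for illustration.

```python
# Hypothetical sketch of the 'preference' match type; not the product's actual code.
def preference_score(captured_value, mult, target):
    """Penalty for a captured value that is away from the preferred target."""
    # Assumption: the penalty grows linearly with distance from the target.
    return mult * abs(captured_value - target)


rule = {"type": "preference", "mult": 0.2, "target": 300}

# Two made-up captures whose other attributes scored identically.
candidates = [
    {"name": "run_1", "base_score": 4.0, "speed": 300},
    {"name": "run_2", "base_score": 4.0, "speed": 295},
]

for c in candidates:
    c["total"] = c["base_score"] + preference_score(
        c["speed"], rule["mult"], rule["target"]
    )

# The lowest total wins, so the capture running at the target speed is chosen.
best = min(candidates, key=lambda c: c["total"])
```

Because the 'mult' is small, the preference only decides between captures that are already close. A larger base-score difference would still dominate.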