

Use of inexact names in requests


The comparison operation called "rough equality", written "≈", can be applied to strings. It returns a natural number equal to the deviation of the two strings, or "null" if the two strings are not rough-equal.

The following differences can exist between words: a superfluous letter, a missed letter, and an other letter (i.e. one letter is replaced by another letter). The differences "upper-case letter" vs. "lower-case letter" and "abbreviation" vs. "word in lower-case letters" are treated as a "missed letter" difference (the presence or absence of one of the two control marks that create a capital letter and an abbreviation). Each difference costs 4 points (because irrelevance [see below] takes values in the range from zero to three). Of all possible sets of differences, the one with the minimal sum of points is chosen; this minimal sum is called the incongruity. Words are not rough-equal if more than two "other letter" differences are detected.

select  ...  where        @fld≈"different";
select  ...  where        @fld≈"diferent";
select  ...  where        @fld≈"defferent";
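The word-level scoring described above can be illustrated with a small sketch. It is not the author's implementation: the function names, the two control-mark characters, and the tie-breaking rule (among equally cheap sets of differences, prefer the one with the fewest "other letter" differences) are assumptions made for illustration. Only the 4-point cost and the limit of two "other letter" differences come from the text. Irrelevance is not modelled.

```python
from typing import List, Optional

POINTS_PER_DIFFERENCE = 4   # from the text: each letter difference costs 4 points
MAX_OTHER_LETTERS = 2       # from the text: more than two "other letter" -> not rough-equal

# Hypothetical stand-ins for the two control marks mentioned in the text.
CAPITAL_MARK = "\x01"
ABBREVIATION_MARK = "\x02"


def expand(word: str) -> List[str]:
    """Rewrite a word as lower-case letters plus explicit control marks."""
    if len(word) > 1 and word.isupper():
        return [ABBREVIATION_MARK] + list(word.lower())
    signs: List[str] = []
    for ch in word:
        if ch.isupper():
            signs.append(CAPITAL_MARK)
        signs.append(ch.lower())
    return signs


def incongruity(a: str, b: str) -> Optional[int]:
    """Minimal sum of points between two words, or None if not rough-equal."""
    x, y = expand(a), expand(b)
    p = POINTS_PER_DIFFERENCE
    # Each cell is (points, count of "other letter"); tuples compare
    # lexicographically, so ties in points prefer fewer substitutions.
    prev = [(j * p, 0) for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        cur = [(i * p, 0)]
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                diagonal = prev[j - 1]                            # same sign
            else:
                diagonal = (prev[j - 1][0] + p, prev[j - 1][1] + 1)  # other letter
            missed = (prev[j][0] + p, prev[j][1])                 # sign only in a
            superfluous = (cur[j - 1][0] + p, cur[j - 1][1])      # sign only in b
            cur.append(min(diagonal, missed, superfluous))
        prev = cur
    points, other_letters = prev[-1]
    return None if other_letters > MAX_OTHER_LETTERS else points
```

Under these assumptions, `incongruity("different", "diferent")` and `incongruity("different", "defferent")` both give 4, while a pair such as `"cat"` and `"dog"` needs three "other letter" differences and yields `None`.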
If, in addition, control marks for indexes and limits are used, then the difference in position of the two words relative to the base line is called the irrelevance. It is calculated as follows: the identical leading position signs are discarded, and the maximal length of the remaining stumps is the irrelevance. The sum of incongruity and irrelevance is called the deviation of the two words.

The following differences can exist between phrases: a superfluous word, a missed word, a permutation of two words, an other word (i.e. one word is replaced by another word), and a convolution (of consecutive words into an abbreviation made of the initial letter of each word). Each difference costs 16 points (for a convolution, 16 points per letter of the abbreviation). An attempt is made to break an "other word" difference down into the set of differences between the pair of words in order to reduce the number of points (if the two words are not rough-equal, the "other word" difference is counted). The blank between words can be removed or replaced by a hyphen; each of these transformations costs 1 point. Of all possible sets of differences, the one with the minimal sum of points is chosen; this minimal sum is called the deviation. Phrases are not rough-equal if more than two "missed word" differences are detected.

select  ...  where        @fld≈"algebraic equations";
select  ...  where        @fld≈"equations algebraic";
select  ...  where        @fld≈"AE";
select  ...  where        @fld≈"differential equations";
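The phrase-level scoring can be sketched in the same spirit. This is a simplified illustration, not the author's algorithm. The assumptions are: a "missed word" is a word of the first phrase that is absent from the second; an "other word" costs the smaller of 16 points and 4 points per letter edit (the limit on "other letter" differences is ignored); only adjacent words can be permuted; and the 1-point blank and hyphen transformations are left out. All function names are hypothetical.

```python
from typing import List, Optional

WORD_POINTS = 16        # from the text: each word-level difference
LETTER_POINTS = 4       # from the text: each letter-level difference
MAX_MISSED_WORDS = 2    # from the text: more than two "missed word" -> not rough-equal


def letter_points(a: str, b: str) -> int:
    """4 points per letter edit (plain Levenshtein, case-insensitive)."""
    a, b = a.lower(), b.lower()
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return LETTER_POINTS * prev[-1]


def is_convolution(words: List[str], abbr: str) -> bool:
    """True if abbr is built from the initial letters of the given words."""
    return (abbr.isupper() and len(abbr) == len(words) >= 2
            and all(w[:1].upper() == c for w, c in zip(words, abbr)))


def deviation(a: str, b: str) -> Optional[int]:
    """Minimal sum of points between two phrases, or None if not rough-equal."""
    x, y = a.split(), b.split()
    n, m = len(x), len(y)
    # Each cell is (points, count of "missed word"), compared lexicographically.
    d = [[(0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i > 0:                                   # missed word
                p, k = d[i - 1][j]
                options.append((p + WORD_POINTS, k + 1))
            if j > 0:                                   # superfluous word
                p, k = d[i][j - 1]
                options.append((p + WORD_POINTS, k))
            if i > 0 and j > 0:                         # same or other word
                p, k = d[i - 1][j - 1]
                cost = min(WORD_POINTS, letter_points(x[i - 1], y[j - 1]))
                options.append((p + cost, k))
            if i > 1 and j > 1 and x[i - 1] == y[j - 2] and x[i - 2] == y[j - 1]:
                p, k = d[i - 2][j - 2]                  # permutation of two words
                options.append((p + WORD_POINTS, k))
            if j > 0:                                   # words of a convolved in b
                size = len(y[j - 1])
                if i >= size and is_convolution(x[i - size:i], y[j - 1]):
                    p, k = d[i - size][j - 1]
                    options.append((p + WORD_POINTS * size, k))
            if i > 0:                                   # words of b convolved in a
                size = len(x[i - 1])
                if j >= size and is_convolution(y[j - size:j], x[i - 1]):
                    p, k = d[i][j - size] if False else d[i - 1][j - size]
                    options.append((p + WORD_POINTS * size, k))
            d[i][j] = min(options)
    points, missed = d[n][m]
    return None if missed > MAX_MISSED_WORDS else points
```

With these simplifications, comparing "algebraic equations" with "equations algebraic" gives 16 (one permutation), with "AE" gives 32 (a convolution of two letters), and with "differential equations" gives 16 (one "other word").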

The variance of two strings, at least one of which contains more than one word, is calculated in the same way as the deviation of two phrases, except that a permutation across any punctuation mark (including the full stop at the end of a sentence) costs 64 points, and a full stop at the end of a string (each string may consist of several sentences) is ignored entirely. If one of the compared strings is "null", the variance is equal to 16.

Records included in the result of a request are sorted and returned in order of increasing variance (i.e. the record with the least variance comes first).

select  ...  where        @fld≈"algebraic equations, differential equations";
select  ...  where        @fld≈"algebraic and differential equations.";
select  ...  where        @fld≈"AE, DE";
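The ordering of the result can be sketched as follows. The function `rank_by_variance` and the scorer `toy_variance` are hypothetical names. The scorer is only a stand-in that counts 16 points per word found in one string but not the other; it is not the variance defined in the text. What the sketch takes from the text is the handling of "null" (variance 16), the exclusion of strings that are not rough-equal, and the ascending order.

```python
from typing import Callable, List, Optional

NULL_VARIANCE = 16  # from the text: variance when one compared string is "null"

Variance = Callable[[str, str], Optional[int]]


def rank_by_variance(fields: List[Optional[str]], query: str,
                     variance: Variance) -> List[Optional[str]]:
    """Keep rough-equal fields and order them by increasing variance."""
    scored = []
    for field in fields:
        points = NULL_VARIANCE if field is None else variance(field, query)
        if points is not None:      # None means "not rough-equal": left out
            scored.append((points, field))
    scored.sort(key=lambda pair: pair[0])   # stable: ties keep input order
    return [field for _, field in scored]


def toy_variance(field: str, query: str) -> Optional[int]:
    """Stand-in scorer: 16 points per word present in only one string."""
    f, q = set(field.lower().split()), set(query.lower().split())
    if not f & q:                   # nothing in common: treat as not rough-equal
        return None
    return 16 * len(f ^ q)


fields = ["differential equations", "algebraic equations", None, "linear algebra"]
ranked = rank_by_variance(fields, "algebraic equations", toy_variance)
```

Here the exact match comes first, the "null" field follows with its fixed variance of 16, and the field that shares no word with the request is left out of the result.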


P.S.

The value of the variance can be assigned to a field of a table.

insert  ...  values(      @fld≈"algebraic and differential equations" );
update  ...  set    @fld2=@fld≈"algebraic and differential equations";


Dmitry Turin


