
String Similarity

Author  Kyobong An

Primary Features

This plugin takes a character string and returns the best match from a pre-registered list of strings. It helps recover the “right” character string from an incomplete or inaccurate one. One use case is improving the accuracy of OCR output when the set of valid character strings, such as addresses or names, is known in advance.

Need help?

Technical contact to

May you search all operations,

Input (Required)

  • Correct String List: The set of correct character strings, in CSV format
  • Similar String: One unvalidated character string to be matched

Advanced Input (Optional)

  • csv # headers: Number of header rows – if the input CSV has one or more header rows, the plugin ignores the number of rows specified here.
  • Similarity Threshold: Threshold of similarity – set an integer between 1 and 100 (default 70). If no character string scores higher than the threshold, the plugin returns an error (return code = 1).
  • Assign a column: Select a specific column to match – if the input CSV has multiple columns, the plugin matches the string against the specified column.
  • Case Insensitive: Check to ignore upper/lower case
  • csv output: Option to return a CSV as output – the plugin returns a CSV with all the scores.
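
To illustrate how these options interact, here is a minimal Python sketch. It is not the plugin's implementation: the scoring function uses the standard library's difflib as a stand-in for whatever scorer the plugin applies, and the names best_match, header_rows, column (assumed 0-based here) and case_insensitive are hypothetical.

```python
import csv
import io
from difflib import SequenceMatcher


def score(a, b):
    """Similarity on a 0-100 scale (difflib ratio; a stand-in scorer, an assumption)."""
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def best_match(query, csv_text, header_rows=0, threshold=70, column=0,
               case_insensitive=False):
    """Return (best, scored): best is (string, score) or None if nothing beats the threshold."""
    # Skip the requested number of header rows, then pick the chosen column.
    rows = list(csv.reader(io.StringIO(csv_text)))[header_rows:]
    candidates = [row[column] for row in rows if len(row) > column]

    # Optionally compare lower-cased copies; the original strings are still returned.
    norm = str.lower if case_insensitive else (lambda s: s)
    scored = [(c, score(norm(query), norm(c))) for c in candidates]
    if not scored:
        return None, scored

    best = max(scored, key=lambda pair: pair[1])
    # A match must score strictly higher than the threshold.
    return (best if best[1] > threshold else None), scored


if __name__ == "__main__":
    data = "name,city\nAlice,Seoul\nRobert,Busan\n"
    match, all_scores = best_match("Seuol", data, header_rows=1, column=1)
    print(match)
    print(all_scores)
```

The second return value corresponds loosely to the "csv output" option: it holds a score for every candidate, not only the winner.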

Return Value

  • The best-matching character string (String, CSV, and File)
  • CSV with all scores (String, CSV, and File)

Return Code

                0              Successful execution
                1              No match (no character string scores higher than the threshold)
                99             All other failures
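
The return codes can be pictured with a short Python sketch. This is an illustration under assumptions, not the plugin's code: the function run, the constant names, and the use of difflib as the scorer are all hypothetical, and treating an empty candidate list or an out-of-range threshold as code 99 is a guess about what "all other failures" covers.

```python
from difflib import SequenceMatcher

# Return codes as listed above; the constant names are hypothetical.
RC_OK, RC_NO_MATCH, RC_FAILURE = 0, 1, 99


def run(query, candidates, threshold=70):
    """Return (return_code, best_string_or_None)."""
    try:
        if not 1 <= threshold <= 100:
            raise ValueError("threshold must be between 1 and 100")
        # Score each candidate on a 0-100 scale (difflib is a stand-in scorer).
        scored = [
            (c, round(SequenceMatcher(None, query, c).ratio() * 100))
            for c in candidates
        ]
        # max() raises ValueError on an empty list, which falls through to 99.
        best, best_score = max(scored, key=lambda pair: pair[1])
    except Exception:
        return RC_FAILURE, None

    if best_score > threshold:
        return RC_OK, best
    return RC_NO_MATCH, None


if __name__ == "__main__":
    print(run("Seuol", ["Seoul", "Busan"]))
    print(run("zzzzz", ["Seoul", "Busan"]))
```

A calling workflow would typically branch on the code: continue with the matched string on 0, route the input to manual review on 1, and stop or log on 99.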

Parameter Setting Examples

Text from Image
