# Fuzzy Match

A common use case for the Python tool is fuzzy matching. Fuzzy matching finds strings in a dataset that are approximately similar to a target string, even when they are not an exact match, by using an algorithm to calculate the degree of similarity.
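To make "degree of similarity" concrete, here is a minimal sketch using `difflib.SequenceMatcher` from the Python standard library, the same module used in the example code on this page. The helper name `similarity` and the sample values are illustrative assumptions, not part of the tool.

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Return a ratio from 0.0 (nothing in common) to 1.0 (identical)."""
    return difflib.SequenceMatcher(None, a, b).ratio()


# Hypothetical target string and candidate values to compare against
target = "Apple"
candidates = ["grpe", "appl", "bnna", "orng"]

# Score every candidate, then pick the one with the highest ratio
scores = {c: round(similarity(target, c), 2) for c in candidates}
best = max(scores, key=scores.get)

print(scores)
print("Best match:", best)
```

The comparison is case-sensitive, so `"A"` and `"a"` count as different characters. That is why `"Apple"` and `"appl"` share only three matching characters (`ppl`) and score about 0.67 rather than higher.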

### Input, Output

| Input                                                                             | Output                                                                                                   |
| --------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------- |
| A table from an Analytics tool output with two columns that can be fuzzy matched | The data frames or charts produced by your Python code, with similarity scores, as specified by `return` |

### Example Use Cases

* Matching addresses that may have been entered incorrectly or slightly differently
* Matching names that may have been entered incorrectly or slightly differently
* Assistance in reconciling data between different systems where exact matches aren't guaranteed

### Example Input Table

| A      | B    |
| ------ | ---- |
| Apple  | grpe |
| Banana | appl |
| Orange | bnna |
| Grape  | orng |

### Example Code

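This example code fuzzy matches each value in the first column against the values in the second column and returns only the matches with a similarity ratio of at least 0.5, as set by the `threshold` argument in `def fuzzy_match(list1, list2, threshold=0.5):`. The similarity score in the output is that ratio multiplied by 100. The code always matches the first two columns of the input table, so arrange your data so that the columns to be matched come first.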

```
# Import libraries
import pandas as pd
import difflib

# Function to perform fuzzy matching
def fuzzy_match(list1, list2, threshold=0.5):  # 0.5 is looser than difflib's default cutoff of 0.6
    matches = []
    for item in list1:
        # Keep only the single best candidate that meets the threshold
        match = difflib.get_close_matches(item, list2, n=1, cutoff=threshold)
        if match:
            # Express the similarity ratio (0 to 1) as a score from 0 to 100
            similarity = difflib.SequenceMatcher(None, item, match[0]).ratio() * 100
            matches.append((item, match[0], round(similarity, 2)))
    return matches


# Get the first input table as a pandas DataFrame
tbl = sources[0]
df = tbl.df


# Print the DataFrame to check its structure
print("DataFrame structure:\n", df)


# Select the first two columns by position
list1 = df.iloc[:, 0].tolist()
list2 = df.iloc[:, 1].tolist()

# Perform fuzzy matching
matches = fuzzy_match(list1, list2)


# Create a DataFrame for the results
results_df = pd.DataFrame(matches, columns=['Original Column', 'Match', 'Similarity Score'])


# Print and return the results
print("\nResults:\n", results_df)
return [Table(df=results_df, name="Fuzzy Match Results")]
```

### Example Output Table

| Original Column | Match | Similarity Score |
| --------------- | ----- | ---------------- |
| Apple           | appl  | 66.67            |
| Orange          | orng  | 60.00            |
| Grape           | grpe  | 66.67            |
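
The output above can be checked outside the tool with a small standalone sketch. It assumes the example input table is held in two plain Python lists (`column_a` and `column_b` are illustrative names) and uses only the standard library, so it leaves out the tool-specific `sources`, `Table`, and pandas steps.

```python
import difflib

# The example input table, one list per column (illustrative stand-in for the tool input)
column_a = ["Apple", "Banana", "Orange", "Grape"]
column_b = ["grpe", "appl", "bnna", "orng"]


def fuzzy_match(list1, list2, threshold=0.5):
    """Return (original, best match, score out of 100) for items meeting the threshold."""
    matches = []
    for item in list1:
        match = difflib.get_close_matches(item, list2, n=1, cutoff=threshold)
        if match:
            score = difflib.SequenceMatcher(None, item, match[0]).ratio() * 100
            matches.append((item, match[0], round(score, 2)))
    return matches


results = fuzzy_match(column_a, column_b)
for original, match, score in results:
    print(f"{original:<8} {match:<6} {score:.2f}")

# "Banana" is missing from the results: its closest candidate falls below the 0.5 threshold
banana_ratio = difflib.SequenceMatcher(None, "Banana", "bnna").ratio()
print("Banana vs bnna:", banana_ratio)
```

Rows appear in the order of the first column, and any value without a candidate at or above the threshold is dropped, which is why the output table has three rows rather than four.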

