Name Matching on Large Datasets in Databricks

Hi,
I recently came across Zingg, and it looks like a fantastic product. I see that it has Databricks integration. What I am trying to do is large-scale name matching between names from external data sources that I have ingested into Databricks and the internal names we hold in our own tables. How should I go about this? For instance, in one use case I have brought in about 4,000 external names and want to find the best match for each among our 6,500 internal names. I would also like to know how well this scales, because another use case compares 85,000 external names against 300,000 internal names to find the best matches.
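To give a sense of the scale I am asking about, here is a rough back-of-the-envelope sketch in Python. It assumes a naive approach that scores every external name against every internal name, which is only my assumption about the worst case and not how any particular tool works. `brute_force_pairs` is a hypothetical helper written for this illustration, not part of any library.

```python
# Sizes taken from the two use cases described above:
# (number of external names, number of internal names).
use_cases = {
    "small": (4_000, 6_500),
    "large": (85_000, 300_000),
}


def brute_force_pairs(n_external: int, n_internal: int) -> int:
    """Comparisons needed if every external name is scored against every internal name."""
    return n_external * n_internal


for label, (n_ext, n_int) in use_cases.items():
    print(f"{label}: {brute_force_pairs(n_ext, n_int):,} candidate pairs")
```

The small case comes to 26 million candidate pairs and the large one to 25.5 billion, which is why I am asking how the matching scales rather than only how to set it up.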


Name Matching on Large Datasets in Databricks #1177
