Hi, I'm an expert in cleaning Excel data. To find the unique Company Activity Words for you, here is what I propose: I review the list and come up with 10-20 "text cleaning rules" for you to approve. Then I implement them.
For example, rules for cleaning the list below could include: remove leading spaces; compare while ignoring inter-word spaces, hyphens, "&", "and", plurals, brackets, other punctuation, and upper/lower case; match on the same first 2 words; etc.
AGRIBIOTECH
AGRI BIOTECH
AGRI -BIOTECH
AGRI & BIOTECH
AGRI BIO-TECH
AGRI-BIOTECH
AGRI-BIO TECH
AGRIBIO-TECH
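As a rough illustration of how a few of these rules collapse the variants above into one entry, here is a minimal Python sketch. It is only an example, not the final implementation: the function names `normalize` and `group_variants` are made up for this message, the leading space on the first sample is added to show that rule, and the sketch covers only leading spaces, case, inter-word spaces, hyphens, "&", "and", and punctuation. Plurals and "same first 2 words" would need their own rules, and non-English letters would need extra handling.

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Build a comparison key; the original text itself is not modified."""
    key = name.strip().upper()            # drop leading/trailing spaces, ignore case
    key = re.sub(r"\bAND\b", " ", key)    # treat the whole word "and" like "&"
    key = re.sub(r"[^A-Z0-9]", "", key)   # drop spaces, hyphens, "&", brackets, punctuation
    return key


def group_variants(names):
    """Map each comparison key to the list of original spellings that share it."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return dict(groups)


samples = [
    " AGRIBIOTECH",      # leading space added here for illustration
    "AGRI BIOTECH",
    "AGRI -BIOTECH",
    "AGRI & BIOTECH",
    "AGRI BIO-TECH",
    "AGRI-BIOTECH",
    "AGRI-BIO TECH",
    "AGRIBIO-TECH",
]

groups = group_variants(samples)
for key, variants in groups.items():
    print(key, len(variants))
```

Keeping the original spellings next to each key means you can review every proposed merge before anything is overwritten.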
I propose 5 days so that there is enough time to communicate, get your feedback on the "cleaning rules", and refine them.
I have many more "text, name, and address cleaning rules" based on my long experience. Every data set is different, and I have an eye for details and patterns.
By the way, do you have 400k or 4 million rows of data?
Please message me with any questions.