Compare Strings Using Python (Preferred) or Perl

$30-250 USD

Completed
Posted almost 10 years ago

Paid on delivery
Objective: Use Perl or Python (Python strongly preferred) to compare drug names on both orthographic similarity and phonetic similarity. (This is also a replication of a previous academic study on drug names.) The actual programming is not hard at all. You might need a little time to figure out how to compute each orthographic and phonetic similarity measure in Python or Perl.

Instructions: I have two tab-delimited CSV datasets: TreatName and ControlName. Each dataset has 4 columns. The first column contains the ID of the drug, the second column is the year of production, the third column is the date, and the fourth column is the drug name. You will not be given the two datasets, but excerpts from them are shown below to give you a better idea of what they look like. You can simulate your own datasets if necessary. Each full dataset contains about 800 million rows, so it is crucial that your code is cohesive and efficient.

Dataset 1: TreatName

ID      Year  Date      DrugName
510001  2001  20010101  Axnieo Dex
510002  2001  20010630  Deliow
82468   1999  19990208  Tyleno.A
98465   1999  19991112  Plownix

Dataset 2: ControlName

ID      Year  Date      DrugName
16322   1996  19961111  Olexiny
16358   1999  19991012  Weiliny
47829   2001  20010201  Delexiny.2
78966   2001  20010911  Rexineo Celio

Following the attached PDF named “drug name”, you will write a program to measure both the orthographic similarity and the phonetic similarity between each pair of drug names from TreatName and ControlName in a given year. All measures are listed in the PDF “drug name”. I believe Python and Perl have most of the functions built in already. For each measure, please document the Python or Perl function used for it. In the end, I want you to produce a CSV dataset named Target, containing the orthographic and phonetic similarity results for each pair of drug names. See the attached Word file for a sample snapshot of Target.
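To make the two kinds of measure concrete, here is a minimal Python sketch. The measures required by the PDF are not reproduced in this posting, so the choice of Levenshtein edit distance (orthographic) and classic American Soundex (phonetic) is an assumption, picked only because both are common string-similarity measures. The function names `edit_distance`, `edit_similarity`, and `soundex` are illustrative, and the case-insensitive comparison is also an assumption.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1]; 1.0 means identical.
    Assumption: names are compared case-insensitively."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Soundex digit for each consonant; vowels, H, W, and Y have no digit.
_SOUNDEX_DIGITS = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _SOUNDEX_DIGITS[_ch] = _digit


def soundex(name: str) -> str:
    """Classic four-character Soundex code; non-letters such as '.' or
    digits in a drug name are ignored (an assumption of this sketch)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    previous_digit = _SOUNDEX_DIGITS.get(letters[0], "")
    for ch in letters[1:]:
        digit = _SOUNDEX_DIGITS.get(ch, "")
        if digit and digit != previous_digit:
            code += digit
        if ch not in "HW":  # H and W do not separate repeated codes
            previous_digit = digit
    return (code + "000")[:4]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))  # 3
    print(soundex("Robert"))                   # R163
    print(edit_similarity("Olexiny", "Delexiny.2"))
    print(soundex("Olexiny"), soundex("Delexiny.2"))
```

Two names can then be called phonetically similar when their Soundex codes match, and orthographically similar when `edit_similarity` is close to 1. Any measure from the PDF would slot in as another function with the same two-string signature.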
Basically, for each year, you compare all drug names in dataset TreatName to all drug names in dataset ControlName for that given year. Please let me know if you have any questions.
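The per-year comparison described above can be sketched as follows. This is only an outline under stated assumptions: both files have a header row with the columns `ID`, `Year`, `Date`, `DrugName`; the ControlName side fits in memory once grouped by year; and the output column names are hypothetical, since the sample snapshot of Target is not included in this posting. The `similarity` function is a stand-in that uses `difflib.SequenceMatcher` from the standard library, not one of the measures from the PDF.

```python
import csv
import io
from collections import defaultdict
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in measure (assumption): difflib ratio on lowercased names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def compare_by_year(treat_file, control_file, out_file) -> int:
    """Write one row per (treat, control) pair sharing a year.
    Returns the number of pairs written."""
    # Group control names by year so each treat row only meets its own year.
    control_by_year = defaultdict(list)
    for row in csv.DictReader(control_file, delimiter="\t"):
        control_by_year[row["Year"]].append((row["ID"], row["DrugName"]))

    writer = csv.writer(out_file, lineterminator="\n")
    # Hypothetical output layout for the Target dataset.
    writer.writerow(["Year", "TreatID", "TreatName",
                     "ControlID", "ControlName", "Similarity"])

    pairs = 0
    # Stream the treat side row by row instead of loading it all at once.
    for row in csv.DictReader(treat_file, delimiter="\t"):
        for control_id, control_name in control_by_year.get(row["Year"], []):
            score = similarity(row["DrugName"], control_name)
            writer.writerow([row["Year"], row["ID"], row["DrugName"],
                             control_id, control_name, f"{score:.4f}"])
            pairs += 1
    return pairs


# The excerpts from the posting, with an assumed header row.
TREAT_SAMPLE = (
    "ID\tYear\tDate\tDrugName\n"
    "510001\t2001\t20010101\tAxnieo Dex\n"
    "510002\t2001\t20010630\tDeliow\n"
    "82468\t1999\t19990208\tTyleno.A\n"
    "98465\t1999\t19991112\tPlownix\n"
)
CONTROL_SAMPLE = (
    "ID\tYear\tDate\tDrugName\n"
    "16322\t1996\t19961111\tOlexiny\n"
    "16358\t1999\t19991012\tWeiliny\n"
    "47829\t2001\t20010201\tDelexiny.2\n"
    "78966\t2001\t20010911\tRexineo Celio\n"
)

if __name__ == "__main__":
    target = io.StringIO()
    count = compare_by_year(io.StringIO(TREAT_SAMPLE),
                            io.StringIO(CONTROL_SAMPLE), target)
    print(count)  # 6 pairs: 4 for 2001, 2 for 1999
    print(target.getvalue())
```

With real files, the `io.StringIO` objects would be replaced by files opened with `newline=""`. At the stated scale of about 800 million rows per dataset, the number of pairs per year can be very large, so processing one year at a time and caching scores for repeated name pairs are worth considering.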
Project ID: 6180778

About the project

2 proposals
Remote project
Active 10 yrs ago

Awarded to:
Hi! I am well experienced in Python programming. I believe I can accomplish the task you need. I can deliver high-quality, efficient code in a reasonable time.
$80 USD in 3 days
5.0 (228 reviews)
6.5
2 freelancers are bidding on average $165 USD for this job
Hello, your project is very well explained and clear. I'm a Python developer. I will develop software to monitor and compare the CSV files for you, and we can discuss all the details. Best regards.
$250 USD in 10 days
5.0 (1 review)
1.4

About the client

Atlanta, United States
5.0
2
Payment method verified
Member since Feb 22, 2014
Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)