Data engineer needed to complete the task below; looking for a pro who can also provide logging
₹600-1500 INR
Closed
Posted about 3 years ago
Paid on delivery
Requirement: Provide two solutions for the following scenario –
1) SQL programming language
2) Python scripting**. The Python script should be orchestrated in Airflow. Provide a
README with the details needed to build the Docker image.
1. Deduplicate the data in players.
   o the primary keys in the players table are name, team and position.
   o if two records are available with the same primary key, pick the record that has
     the most data (ignore the rows that have nulls).
2. Replace empty cells for height and weight in the players CSV with the average height
and weight from the data.
3. Generate the first Excel output with the total number of players by position.
4. Generate the second Excel output with:
   o a new column that shows "Yes" for players older than 30 years. The default
     value should be "No".
   o a BMI calculation
   o players and teams merged: the output file will show all columns in players with
     the above two additional columns, plus the columns from the teams Excel file.
5. Also provide code to ingest from an AWS S3 bucket and output to another AWS S3 bucket
instead of reading from and writing to CSV files.
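For illustration only, steps 1 to 4 could be sketched in plain Python roughly as below. This is a sketch under several assumptions that are not in the brief: "maximum data" is read as "most non-empty cells", players have an `age` column, height is in centimetres and weight in kilograms, teams join on `team`, and the sample rows and the `city` column are made up. Writing the Excel files and the S3 input/output are left out.

```python
from collections import Counter

KEY = ("name", "team", "position")  # primary key named in the brief


def is_empty(value):
    return value in (None, "")


def filled(row):
    """Count the non-empty cells in a row."""
    return sum(1 for v in row.values() if not is_empty(v))


def dedupe(rows):
    """Keep, per primary key, the row with the most non-empty cells."""
    best = {}
    for row in rows:
        k = tuple(row[c] for c in KEY)
        if k not in best or filled(row) > filled(best[k]):
            best[k] = row
    return list(best.values())


def fill_with_mean(rows, columns=("height", "weight")):
    """Replace empty height/weight cells with the column average."""
    out = [dict(r) for r in rows]
    for col in columns:
        known = [float(r[col]) for r in out if not is_empty(r[col])]
        mean = sum(known) / len(known) if known else None
        for r in out:
            r[col] = mean if is_empty(r[col]) else float(r[col])
    return out


def enrich(players, teams):
    """Add over_30 and bmi, then left-join the team columns on 'team'."""
    by_team = {t["team"]: t for t in teams}
    merged = []
    for p in players:
        row = dict(p)
        age = p.get("age")  # ASSUMPTION: an 'age' column exists
        row["over_30"] = "Yes" if not is_empty(age) and float(age) > 30 else "No"
        # ASSUMPTION: height in cm, weight in kg
        row["bmi"] = round(p["weight"] / (p["height"] / 100) ** 2, 1)
        for col, val in by_team.get(p["team"], {}).items():
            if col != "team":
                row[col] = val
        merged.append(row)
    return merged


# Made-up sample rows standing in for the real input files.
players = [
    {"name": "A. Rao", "team": "Lions", "position": "G", "height": "190", "weight": "", "age": "31"},
    {"name": "A. Rao", "team": "Lions", "position": "G", "height": "190", "weight": "90", "age": "31"},
    {"name": "B. Sen", "team": "Tigers", "position": "F", "height": "", "weight": "100", "age": "25"},
    {"name": "C. Das", "team": "Tigers", "position": "G", "height": "200", "weight": "95", "age": ""},
]
teams = [{"team": "Lions", "city": "Pune"}, {"team": "Tigers", "city": "Delhi"}]

clean = fill_with_mean(dedupe(players))            # steps 1 and 2
counts = Counter(p["position"] for p in clean)      # step 3: players by position
final = enrich(clean, teams)                        # step 4: flag, BMI, merge
```

In practice a library such as pandas would likely be used for the same steps; the plain-Python version is only meant to make the intended logic explicit.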
Input files: [login to view URL] and [login to view URL]
**If using Jupyter notebooks, please send the Python code in HTML format.
Feel free to reach out for any questions.
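For the SQL solution, one possible shape of the deduplication, average fill and position count is sketched below, run through Python's built-in `sqlite3` module so it can be tried without a database server. The sample rows are made up, the "most data" rule is approximated by counting non-null height and weight values only, and the window function assumes SQLite 3.25 or newer; the target SQL dialect for the actual task is not stated in the brief.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (name TEXT, team TEXT, position TEXT, height REAL, weight REAL);
INSERT INTO players VALUES
  ('A. Rao', 'Lions',  'G', 190,  NULL),
  ('A. Rao', 'Lions',  'G', 190,  90),
  ('B. Sen', 'Tigers', 'F', NULL, 100),
  ('C. Das', 'Tigers', 'G', 200,  95);
""")

# Step 1: rank duplicates per primary key, most complete row first.
DEDUPED = """
WITH ranked AS (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY name, team, position
           ORDER BY (height IS NOT NULL) + (weight IS NOT NULL) DESC
         ) AS rn
  FROM players
),
deduped AS (
  SELECT name, team, position, height, weight FROM ranked WHERE rn = 1
)
"""

# Step 2: fill missing height/weight with the column average (AVG skips NULLs).
filled_rows = conn.execute(DEDUPED + """
SELECT name, team, position,
       COALESCE(height, (SELECT AVG(height) FROM deduped)) AS height,
       COALESCE(weight, (SELECT AVG(weight) FROM deduped)) AS weight
FROM deduped
ORDER BY name
""").fetchall()

# Step 3: total number of players by position.
position_counts = conn.execute(DEDUPED + """
SELECT position, COUNT(*) FROM deduped GROUP BY position ORDER BY position
""").fetchall()
```

The same common table expressions should carry over to most databases that support window functions, though the boolean arithmetic in the `ORDER BY` may need a `CASE` expression elsewhere.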
Although I am new to freelancing, I am currently a Master's student with solid experience in SQL, Python, and Excel from my work in data analysis. I also work in Jupyter Notebook daily and know it well.
I have more than 4 years of experience with Python, ML, and data science. I primarily provide data analytics services to clients, with a focus on solving business problems. Over the past 4 years in data analytics, I have developed predictive models that help draw the right insights from incoming data.
Achievements:
1. Applied an LDA model to review chat data and resolved client issues by 15%
2. Created sentiment analysis for news articles
3. Developed a match rate algorithm to find the nearest skills using a distance-measuring technique.