I need a web scraper written for the .xlsx file in the following directory:
[login to view URL]
The latest .xlsx file within that directory will need to be downloaded.
The name of the file is subject to change, so the script will need to identify it as the most recent file with an .xlsx extension.
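As a rough illustration of the selection step, here is a minimal sketch. It assumes the directory listing can be turned into a list of entries, each with a file name and a modification time. The helper name `latest_xlsx`, the `name` and `mtime` keys, and the sample listing are all hypothetical. The sketch uses `strict` and `warnings` so it runs with core Perl only; the deliverable itself would use Modern::Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: given listing entries (name + modification epoch),
# return the name of the newest .xlsx file, or nothing if there is none.
sub latest_xlsx {
    my @entries = @_;
    my @xlsx = grep { $_->{name} =~ /\.xlsx\z/i } @entries;
    return unless @xlsx;
    my ($newest) = sort { $b->{mtime} <=> $a->{mtime} } @xlsx;
    return $newest->{name};
}

# Made-up listing, for illustration only.
my @listing = (
    { name => 'loads_a.xlsx', mtime => 1_700_000_000 },
    { name => 'notes.txt',    mtime => 1_700_000_900 },
    { name => 'loads_b.xlsx', mtime => 1_700_000_500 },
);

my $name = latest_xlsx(@listing);
print "$name\n";    # prints "loads_b.xlsx"
```

How the listing is actually obtained (HTML index page, file timestamps, and so on) depends on the directory and is left to the bidder.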
The output should be a pipe (|) delimited file with the following column mappings:
origin_city --> data located in the "Starting city" column (column D)
origin_state --> data located in the "Starting State" column (column E)
ship_date --> data located in the "Pickup Date" column (column G), converted to the YYYY-MM-DD format
destination_city --> data located in the "Destination City" column (column I)
destination_state --> data located in the "Destination State" column (column J)
receive_date --> data located in the "Delivery Date" column (column L)
trailer_type --> data located in the "Type of Equipment" column (column A)
load_size --> the literal text "Full"
weight --> data located in the "Weight" column (column O)
length --> leave blank
width --> leave blank
height --> leave blank
trip_miles --> leave blank
pay_rate --> data located in the "Rate" column (column Q)
contact_phone --> leave blank
contact_name --> leave blank
tarp_required --> leave blank
comment --> data located in the "Special Information" column (column R), excluding the pickup and delivery text that starts with "P/U:" and "Del:"
For example, given "REF#: NM75383 P/U: 09/27 19:00- Del: 09/30@06:00 1PICK-1DROP Equip: FTL 53FT VR / 1663 miles - 40K LBS", the text "P/U: 09/27 19:00- Del: 09/30@06:00" should not appear in the "comment" column.
load_number --> data located in the "Order" column (column C)
commodity --> data located in the "Commodity" column (column B)
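The two mappings above that transform data (ship_date and comment) could be handled roughly as in the sketch below. It rests on assumptions: the "Pickup Date" cell is assumed to arrive as MM/DD/YYYY text, and the pickup/delivery text is assumed to always follow the "P/U: date time- Del: date@time" shape shown in the example. The helper names `to_iso_date` and `clean_comment` are hypothetical. The sketch uses `strict` and `warnings` so it runs with core Perl only; the deliverable itself would use Modern::Perl.

```perl
use strict;
use warnings;

# Assumes dates arrive as MM/DD/YYYY text; anything else becomes ''.
sub to_iso_date {
    my ($raw) = @_;
    return '' unless defined $raw;
    return sprintf('%04d-%02d-%02d', $3, $1, $2)
        if $raw =~ m{\A\s*(\d{1,2})/(\d{1,2})/(\d{4})\s*\z};
    return '';
}

# Removes the "P/U: <date> <time>- Del: <date>@<time>" segment, if present.
sub clean_comment {
    my ($text) = @_;
    return '' unless defined $text;
    $text =~ s{P/U:\s*\d{1,2}/\d{1,2}\s+\d{1,2}:\d{2}\s*-?\s*Del:\s*\d{1,2}/\d{1,2}\s*\@\s*\d{1,2}:\d{2}\s*}{};
    $text =~ s/\s+\z//;    # trim trailing whitespace left behind
    return $text;
}

print to_iso_date('09/27/2023'), "\n";    # prints "2023-09-27"
print clean_comment(
    'REF#: NM75383 P/U: 09/27 19:00- Del: 09/30@06:00 1PICK-1DROP Equip: FTL 53FT VR / 1663 miles - 40K LBS'
), "\n";
# prints "REF#: NM75383 1PICK-1DROP Equip: FTL 53FT VR / 1663 miles - 40K LBS"
```

If the spreadsheet stores dates as Excel serial numbers or in another text format, the date helper would need to be adjusted accordingly.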
The first line of the output should contain all of the column headers.
Any field that contains no data should be left blank.
Please do not use words like "null" or "blank" in blank columns.
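To make the header and blank-field requirements concrete, here is a minimal sketch of the output step. The column order follows the mapping list in this description. The helper name `format_row` and the sample record are hypothetical, and the sketch does not handle a pipe character appearing inside a data value. It uses `strict` and `warnings` so it runs with core Perl only; the deliverable itself would use Modern::Perl.

```perl
use strict;
use warnings;

# Output columns, in the order listed in the mapping.
my @columns = qw(
    origin_city origin_state ship_date destination_city destination_state
    receive_date trailer_type load_size weight length width height
    trip_miles pay_rate contact_phone contact_name tarp_required
    comment load_number commodity
);

# Hypothetical helper: join one record into a pipe-delimited line.
# Missing or undefined fields become empty strings, never "null" or "blank".
sub format_row {
    my ($record) = @_;
    return join '|', map { $record->{$_} // '' } @columns;
}

# Placeholder record, for illustration only.
my %record = (origin_city => 'Placeholder City', load_size => 'Full');

print join('|', @columns), "\n";       # header line comes first
print format_row(\%record), "\n";      # then one line per spreadsheet row
```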
Below is a sample output of the first 5 columns using sample data:
The deliverable will be a Perl .pl file that must run on
Ubuntu Linux and must use Modern::Perl. The Perl .pl file
should be called '[login to view URL]' and the output file should be
called '[login to view URL]'
It will be scheduled in cron to run unattended every 15 minutes.
Please specify which language/OS/modules you plan to use.
Also, please include the word "raccoon" in your bid so I know that
you read this description.