Cell Phone Website Scraping

Closed Posted Nov 3, 2006 Paid on delivery

Verizon, Cingular, and Sprint provide cellular customers with password-protected web pages where they can view detailed call data records. We need three separate Perl programs, one per carrier, that log into the personal cellular phone account websites and scrape all current call data records into both a CSV file and a database accessed through Perl DBI. If run more than once on the same call data, the programs should not add duplicate records to the CSV file or the database. The programs should be written to support any DBI-compliant DBD and be tested with SQLite.

The programs should return a status indicating success or the cause of failure, and the HTML contents of the web pages traversed should be saved. After completion, the programs should log out of the website. Command-line arguments should define the phone number to be scraped, login, account password, output directory, allowed retries, and retry interval.

Programs should accommodate accounts with a single phone number as well as accounts with multiple phone numbers. Depending on the number of call records, there may be multiple pages of call records, or there may be a "show all" option to view all the call records on one page.

The programs should validate field data. Date and time fields should be converted to a fixed field width. If data fields are invalid, the program should be able to retry the entire scraping operation, as controlled by the command-line arguments. The programs should handle reasonable changes to the pages or structure of the website without requiring major modifications. The programs should be heavily commented to facilitate Seller’s modifications. The Seller is responsible for providing access to cellular phone accounts for purposes of program development.

Please submit three separate bids as well as a bid for all three programs as a group. Include in your bid a description of how you propose to accomplish the project, including which other Perl modules are to be used. Thank you and best regards.
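To make the duplicate-free and fixed-width requirements above concrete, here is a minimal sketch of one possible approach. It is written in Python with the standard `sqlite3` module purely for illustration; the deliverables themselves are to be Perl programs using DBI. The table name `call_records`, its columns, the record keys, and the input date and time format (`11/3/2006`, `9:05 AM`) are all assumptions, not taken from any carrier's pages. `INSERT OR IGNORE` is SQLite syntax, so a version that supports any DBD would need a portable equivalent, such as checking for the key before inserting.

```python
import sqlite3
from datetime import datetime


def normalize_timestamp(date_text, time_text):
    """Convert scraped date and time text to fixed-width 'YYYY-MM-DD HH:MM:SS'.

    The input format is an assumption; strptime raises ValueError on
    invalid data, which a caller could treat as a trigger for a retry.
    """
    parsed = datetime.strptime(f"{date_text} {time_text}", "%m/%d/%Y %I:%M %p")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) call_records table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS call_records (
               phone      TEXT NOT NULL,
               started_at TEXT NOT NULL,
               dialed     TEXT NOT NULL,
               minutes    INTEGER NOT NULL,
               UNIQUE (phone, started_at, dialed)
           )"""
    )
    return conn


def store_records(conn, records):
    """Insert records, skipping any already stored; return how many were new."""
    before = conn.total_changes
    for rec in records:
        started = normalize_timestamp(rec["date"], rec["time"])
        # The UNIQUE constraint makes a repeated insert a no-op.
        conn.execute(
            "INSERT OR IGNORE INTO call_records VALUES (?, ?, ?, ?)",
            (rec["phone"], started, rec["dialed"], int(rec["minutes"])),
        )
    conn.commit()
    return conn.total_changes - before
```

Running `store_records` twice on the same scraped rows adds them only once, because the uniqueness key (phone, start time, dialed number) identifies a call. The same set of keys could be used to decide which rows still need to be appended to the CSV file.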

## Deliverables

1) Complete and fully-functional working program(s) in executable form (compiled with Perl2Exe [url removed, login to view] or later) as well as complete Perl source code (compatible with ActivePerl [url removed, login to view] or later) of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform

ActiveState ActivePerl [url removed, login to view] for both Windows XP SP2 and Linux (Linux version to be negotiated)

Perl PHP

Project ID: #3899483

About the project

2 proposals Remote project Active Nov 17, 2006

2 freelancers are bidding on average $935 for this job

ashk23

See private message.

$850 USD in 14 days
(0 Reviews)
3.1

metosoft

See private message.

$1020 USD in 14 days
(2 Reviews)
0.0