*/ Project Information /*

You will need to design and program a website product crawler that crawls [login to view URL] and gathers all the product information we have asked for. The project has a deadline of 2 weeks at most. The price for this project is 150 USD, or best offer, payable once the program is complete and confirmed to be 100% working. The person who completes this project and gets it running 100%, so that we can gather the almost 14 million products, will be given a chance to work on our next project, which will be to design a fully working website. More information will be given after the current project is complete.

*/ End Project Information /*
## Deliverables
***************************************************************************************

*/ Site Information /*

[login to view URL] has about 13,878,082 products, not counting Auto, Travel, Services, Online Degree Programs, Deals & Coupons, and Event Tickets. Those 6 main categories are excluded, so we only need 22 of the 28.

- 22 Main categories
- 277 Sub Categories
- 1000 Tri Categories (Estimate)

Sub categories per main category:

- Baby: 11 Sub Categories
- Books: 2 Sub Categories
- Clothing & Accessories: 10 Sub Categories
- Collectibles & Art: 17 Sub Categories
- Computers: 22 Sub Categories
- Electronics: 14 Sub Categories
- Flowers & Plants: 0 Sub Categories
- Wood & Wine: 14 Sub Categories
- Gifts: 0 Sub Categories
- Health & Beauty: 11 Sub Categories
- Home & Garden: 10 Sub Categories
- Jewelry & Watches: 14 Sub Categories
- Magazines: 35 Sub Categories
- Movies: 14 Sub Categories
- Music: 25 Sub Categories
- Musical Instruments: 6 Sub Categories
- Office Products: 18 Sub Categories
- Software: 9 Sub Categories
- Sports & Outdoors: 6 Sub Categories
- Toys: 14 Sub Categories
- Video Games: 25 Sub Categories
- Others: 0 Sub Categories

*/ End Site Information /*

***************************************************************************************

*/ Crawler Information /*

- The crawler will crawl [login to view URL]
- It will gather the 22 Main categories, 277 Sub categories, and approximately 1000 Tri-categories (that is, all of the tri-categories)
- It will gather every product's image URL, name, description, and specifications, plus all the main, sub, and tri categories the product belongs to
- The crawler will create a MySQL database with the following fields, in the order given: MainCat, SubCat, TriCat, Name, Description, Specs, ImageURL

*/ End Crawler Information /*

***************************************************************************************

*/ Programmer Requirements /*

These are our requirements for the programmer and the program:
- Must program a 100% working script
- Must show us how to run the script
- Must run the script on our server
- Must fix ANY problems that are found
- The program should be able to resume from the place where it stopped, including after a server crash
- The program MUST be able to calculate the number of products it is about to crawl
- The program MUST show how many products have been gathered and the percentage done
- Deadline is 2 weeks
- The program must log any errors it encounters
- If the program hits a product it can't get, the error log should show all the information about that product so it can be manually added or fixed
- The program MUST be fast, accurate, and efficient

*/ End Programmer Requirements /*
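To make the resume, progress, and error-log requirements concrete, here is a minimal Python sketch of one way they could fit together. It is an illustration only, not a required design: the names `crawl`, `load_checkpoint`, `save_checkpoint`, `percent_done`, and the JSON checkpoint file are all assumptions, and `fetch` stands in for whatever function actually downloads and stores one product.

```python
import json
import logging
import os


def load_checkpoint(path):
    """Return the saved crawl state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    return {"position": 0, "gathered": 0}


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the old checkpoint,
    so a crash in the middle of a write cannot leave a half-written file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp_path, path)


def percent_done(state, total):
    """Percentage of the product list that has been processed so far."""
    return 100.0 * state["position"] / total if total else 0.0


def crawl(product_urls, fetch, checkpoint_path, error_log):
    """Process product_urls in order, starting from the saved position.

    fetch(url) is a hypothetical callable that downloads and stores one
    product and raises an exception when the product cannot be retrieved.
    """
    state = load_checkpoint(checkpoint_path)
    total = len(product_urls)  # number of products about to be crawled
    for index in range(state["position"], total):
        url = product_urls[index]
        try:
            fetch(url)
            state["gathered"] += 1
        except Exception as exc:
            # Log enough detail for the product to be added or fixed by hand.
            error_log.error("product failed: url=%s index=%d error=%r",
                            url, index, exc)
        state["position"] = index + 1
        save_checkpoint(checkpoint_path, state)
    return state
```

Saving the checkpoint after every product keeps the sketch simple. For a crawl of roughly 14 million products, writing it every few hundred products would likely be a better trade between speed and the amount of work repeated after a crash.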
## Platform
- Linux or Windows
- PHP, Python, or ASP
- MySQL or MSSQL
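As an illustration of the field order requested for the database (MainCat, SubCat, TriCat, Name, Description, Specs, ImageURL), the following Python sketch creates a table with those columns and inserts one row. Several things here are assumptions rather than part of the brief: the table name `products`, the column types, the helper names `open_store` and `save_product`, and the use of the standard-library `sqlite3` module as a stand-in so the example runs without a MySQL server. SubCat and TriCat are left nullable because some main categories are listed with 0 sub categories.

```python
import sqlite3

# Column order taken from the crawler requirements.
COLUMNS = ("MainCat", "SubCat", "TriCat", "Name",
           "Description", "Specs", "ImageURL")

# Assumed table name and column types; adjust for the real MySQL schema.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS products (
    MainCat     TEXT NOT NULL,
    SubCat      TEXT,
    TriCat      TEXT,
    Name        TEXT NOT NULL,
    Description TEXT,
    Specs       TEXT,
    ImageURL    TEXT
)
"""


def open_store(path=":memory:"):
    """Open the database (in-memory by default) and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(CREATE_TABLE)
    return conn


def save_product(conn, product):
    """Insert one product dict; fields missing from the dict are stored as NULL."""
    row = tuple(product.get(column) for column in COLUMNS)
    conn.execute("INSERT INTO products VALUES (?, ?, ?, ?, ?, ?, ?)", row)
    conn.commit()
```

A call such as `save_product(conn, {"MainCat": "Gifts", "Name": "Mug"})` stores a product that has no sub or tri category. With a MySQL driver the same parameterized-insert pattern applies, although the placeholder syntax differs between drivers.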