Modify Python's Scrapy for a specific use (JavaScript rendering using Splash/PhantomJS, matching some regexes, etc.)

Closed Posted 6 years ago Paid on delivery

Hi guys, this is a simple project. The architecture and Scrapy logic are already designed, and the detailed specification will be given later. Some important points:

1. Read seed URLs from a txt file, one URL per line.

2. Scrape webpage content to a depth of 2 levels (hops). JavaScript rendering is needed, using something like Splash or PhantomJS.

3. Convert the scraped content from both levels from HTML to text, and match it against 2 predefined regexes.

4. Output the results in CSV format (comma-separated). Other columns include the title of the root page, its keyword info, its description info, etc.

5. Because the output format is CSV, non-printable characters and special characters such as punctuation need to be filtered out. Because webpages use different encodings, the final output must be UTF-8, with all characters converted to lower case. Please handle this in the program.

6. It would be much appreciated if the program/script could be done within 1~2 days. As I do not live in a well-developed country and this project has a limited budget (10~20 USD), a lower quote is very much appreciated. The detailed specification will be given later.
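
To make points 1, 3, 4 and 5 more concrete, here is a rough sketch of the post-processing side using only the Python standard library. It is not the final design: the two regexes in PATTERNS, the column names, and all function names are placeholders I made up for illustration, since the real patterns come with the detailed specification. Crawling and JavaScript rendering (point 2) are left out; the sketch assumes the rendered HTML of each page is already available as a string.

```python
import csv
import re
import unicodedata
from html.parser import HTMLParser

# Placeholder patterns (assumption): the real two regexes are not in the posting.
PATTERNS = {
    "emails": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phones": re.compile(r"\+?\d[\d -]{7,}\d"),
}


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def read_seeds(path):
    """Point 1: one seed URL per line, blank lines ignored."""
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def html_to_text(markup):
    """Point 3: reduce HTML to whitespace-normalised plain text."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def clean_field(value):
    """Point 5: lower-case, then replace control/format characters and
    punctuation (Unicode categories C* and P*) with spaces."""
    kept = []
    for ch in value.lower():
        if unicodedata.category(ch)[0] in ("C", "P"):
            kept.append(" ")
        else:
            kept.append(ch)
    return " ".join("".join(kept).split())


def build_row(url, title, page_html):
    """One CSV row: URL, cleaned title, then one column per pattern.
    Matches are only lower-cased, because stripping punctuation would
    destroy things like '@' in an address."""
    text = html_to_text(page_html)
    row = [url, clean_field(title)]
    for pattern in PATTERNS.values():
        found = sorted({m.group(0).lower() for m in pattern.finditer(text)})
        row.append("|".join(found))
    return row


def write_rows(path, rows):
    """Point 4: comma-separated output, always written as UTF-8."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["url", "title", *PATTERNS])
        writer.writerows(rows)
```

Matching is done on the extracted text before any character filtering, and only the descriptive columns (such as the title) go through clean_field. Whether the matched values should also be filtered is something the detailed specification would need to settle.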

Regards

JavaScript Linux Python VPS Web Scraping

Project ID: #14058611

About the project

9 proposals Remote project Active 6 years ago

9 freelancers are bidding on average $75 for this job

mantislin

Hi sir, this is kimi and I am a scraping expert. I have done many scraping projects; please check my profile page and you will see: https://www.freelancer.com/u/mantislin.html Can you tell me...

$109 USD in 3 days
(195 Reviews)
7.1
schoudhary1553

Greetings sir, I am an expert freelancer for this job, and your 100% satisfaction is assured if you allow me to serve. Here is why you should pick me: a) I am an expert and have the same kind of ex...

$150 USD in 1 day
(18 Reviews)
5.2
stevegtdbz

Hello sir, I have completed many similar projects in the past. I mainly use Python + Selenium + PhantomJS to scrape data. I can provide a very powerful Python script using PhantomJS - multi-threading - proxy support -...

$25 USD in 2 days
(16 Reviews)
4.4
shaliniramadass

Hi, we are a 1000+ employee firm, charging $6 an hour. We can start with any technology immediately. Direct access to developers via Skype, Google Talk and hotline – 24/7 availability for all 1000+ resources. We can bet you that no...

$25 USD in 1 day
(0 Reviews)
0.0
techcrunch2

Hello Sir, we have gone through the details you have provided. We have already worked on a similar project and can deliver as you have mentioned, and we would be pleased to work with you on this to deliver the resu...

$28 USD in 6 days
(0 Reviews)
0.0
iqranabi

Hello, I am Iqra, a Data Entry/Data Processing Expert who knows the value of time, is very hard-working, and always delivers work on time. My motive is to make my employer happy without adding additional charges. ...

$22 USD in 3 days
(0 Reviews)
0.0
ishtiyaqlone

Hello, my name is ishtiyaq. I am a certified Python expert with 6+ years of experience in Python, and I have completed 100+ projects using Python. Expertise: Python, Django, Django REST Framework and many pyth...

$10 USD in 5 days
(0 Reviews)
0.0