Creation of a Blog Crawler

$250-750 USD

Cancelled
Posted over 15 years ago

Paid on delivery
Project Brief: Blog Crawler Tool

The aim is to create a tool that can analyse multiple blog URLs (supplied in a .txt document or a .csv/.xls spreadsheet), extract all the outgoing blog links, record them, and return the following information.

All outgoing links pointing to other blog URLs. It is vitally important that these are blogs and not normal websites. The tool must not simply record every URL: we are looking for the outgoing links in the section labelled "Favourite Blogs" or "Related Blogs", not the hyperlinks embedded within the post text as deep links. The tool will need to recognise this section and select hyperlinks accordingly. These links usually appear on the first page and are replicated on other pages within the blog.

The tool should also provide the following:

- Removal of duplicate URLs.
- Inclusion of homepage URLs only, as opposed to individual blog posts.
- Google PageRank of each blog URL found. There are restrictions on the number of requests that can be made per day for Google PageRank scores, so the tool will have to use different proxy addresses to work around this limit.
- Technorati Authority of each blog URL. There are restrictions on the number of requests that can be made per day for Technorati Authority scores, so the tool will have to use different proxy addresses to work around this limit.
- Blog Title. This is contained in the metadata of almost every blog. As mentioned earlier, you are aware of how to source this data, which needs to be added to the tool. If the title isn't available in the metadata, the tool should still be able to extract it.
- Blog Description. This is contained in the source code of almost every website. If it is not available in the source code, the tool should take it from the "About" section, which is very common in blogs.
- Blog Keywords. These are contained in the source code of almost every website, for SEO purposes.

From our earlier conversation: for the last point we would realistically need the title and description of the blog, which can normally be found on the blog as text. We also require the keywords so that the blog can be categorised by subject. This could be achieved in either of the following ways, or both: the tool records all the tags from all of the posts and removes the duplicates, or it picks out the meta tags.

Tool Process

1. Input the proxies into an appropriately titled .txt file.
2. Input multiple URLs into a .txt file.
3. The tool goes through each URL one by one, finds all of the URLs on it, and for each URL found populates an Excel spreadsheet in the following way:
Project ID: 297926
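
The brief's requirements for homepage-only URLs, duplicate removal, and title, description and keyword extraction could be sketched roughly as below. This is only an illustration in Python using the standard library and is not part of the original posting. All names (homepage, unique_homepages, MetaExtractor, extract_meta) are hypothetical. It assumes a blog's homepage is the root of its host, which will not hold for blogs hosted under a path such as example.com/blog/. It does not attempt the PageRank or Technorati lookups, the "About" section fallback, or detection of the "Favourite Blogs" section.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


def homepage(url):
    """Reduce an absolute URL to scheme + host (assumed to be the blog homepage)."""
    parts = urlsplit(url.strip())
    if not parts.scheme or not parts.netloc:
        return None  # not an absolute URL, so it is skipped
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}/"


def unique_homepages(urls):
    """Return homepage URLs in first-seen order with duplicates removed."""
    seen = {}
    for url in urls:
        home = homepage(url)
        if home is not None:
            seen.setdefault(home, None)  # dicts preserve insertion order
    return list(seen)


class MetaExtractor(HTMLParser):
    """Collect <title>, meta description and meta keywords from one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            content = (attr.get("content") or "").strip()
            if name == "description":
                self.description = content
            elif name == "keywords":
                # Keywords are conventionally comma-separated; drop empty entries.
                self.keywords = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_meta(html):
    """Parse an HTML string and return its title, description and keywords."""
    parser = MetaExtractor()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "description": parser.description,
        "keywords": parser.keywords,
    }
```

In use, unique_homepages would be given the links collected from a blog page, and extract_meta the downloaded HTML of each resulting homepage. Fetching pages and rotating proxies are left out of this sketch.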

About the project

10 proposals
Remote project
Active 16 yrs ago

10 freelancers are bidding on average $588 USD for this job
I can help you for a fair budget, thanks.
$750 USD in 0 day
4.9 (163 reviews)
6.4
Hello, we understand your project and we are able to do it. We have experience in crawling. Let's discuss more. Waiting for your reply, Raj
$750 USD in 15 days
4.7 (213 reviews)
5.4
Crawling experts are here. Please check pmb.
$700 USD in 5 days
5.0 (56 reviews)
5.1
Please check PM.
$500 USD in 10 days
5.0 (2 reviews)
1.3
I can provide the best solution to you within the given time frame. Thanks, Suresh
$300 USD in 7 days
0.0 (0 reviews)
0.0
I'm good at this and can show you some script examples that closely match your current requirements.
$480 USD in 7 days
5.0 (2 reviews)
0.0
I have already built many crawlers, from a basic site monitor to product crawlers. I can complete your Blog Crawler tool as well as possible. Thanks, D SENTHILKUMAR.
$750 USD in 7 days
0.0 (0 reviews)
0.0
Hi, we have a crawler that currently tracks a few hundred thousand blogs. Please check PM.
$750 USD in 30 days
0.0 (0 reviews)
0.0

About the client

London, United Kingdom
5.0
42
Member since Sep 13, 2007

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)