
Crawl, Edit and Publish

$500-5000 USD

Cancelled
Posted over 16 years ago

Paid on delivery
This project is about building a web application that collects information, edits it automatically, and republishes it wiki-style. The application should do the following:

1. COLLECTING
a. Crawl a pre-defined list of web sites (or use Google to list all pages from those sites). Example: www.do.se.
b. Save all pages that meet certain criteria. Criteria might include keywords or a specific "form", such as all single-word titles. Example: pages containing the words "it is" in the title.
c. Save the main headings and text, the original URL, and links to images and other media. Since some sites might use improper HTML structure, additional criteria may be needed to define more exactly what information to fetch, e.g. based on CSS styles or similar.
d. When possible, associate each page with keywords taken from its meta keywords and from, for instance, the original site structure. E.g. if the site has breadcrumbs, each part of the breadcrumbs could be saved as a keyword. Example: this page has breadcrumbs, and the word "Lagar" should be saved as a keyword: [login to view URL]

2. EDITING
a. Edit the title of each page automatically according to certain criteria. Other criteria might raise warnings for an editor to check. For instance, certain words could be removed from the title automatically, while titles containing too many words would trigger an alert to the editor.
b. Remove images and replace them with a URL to the original site.

3. PUBLISHING
a. Recreate each page as a wiki page. The page title is the one generated in step 2.
b. Merge multiple pages with the same title into one automatically.
c. Provide a search engine that searches titles and keywords (probably not text content).
d. The wiki should follow the standard model (like Wikipedia). All URLs should be natural-language.

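To make the collecting, editing, and publishing rules above more concrete, here is a minimal Python sketch of steps 1b, 2a, 3b, and 3d. It is only an illustration under stated assumptions: the `Page` record, the function names, and the rule values (`REQUIRED_TITLE_PHRASE`, `WORDS_TO_STRIP`, `MAX_TITLE_WORDS`) are hypothetical and not part of the posting, and crawling and HTML parsing are left out entirely.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical rule values; the posting does not specify them.
REQUIRED_TITLE_PHRASE = "it is"
WORDS_TO_STRIP = {"welcome", "home"}
MAX_TITLE_WORDS = 6


@dataclass
class Page:
    """Hypothetical record for one saved page (step 1c)."""
    title: str
    text: str
    urls: list = field(default_factory=list)
    keywords: set = field(default_factory=set)


def matches_criteria(page):
    """Step 1b: keep pages whose title contains the phrase as whole words."""
    pattern = r"\b" + re.escape(REQUIRED_TITLE_PHRASE) + r"\b"
    return re.search(pattern, page.title, re.IGNORECASE) is not None


def edit_title(title):
    """Step 2a: strip unwanted words and collect warnings for the editor."""
    kept = [w for w in title.split() if w.lower() not in WORDS_TO_STRIP]
    warnings = []
    if len(kept) > MAX_TITLE_WORDS:
        warnings.append("title has more than %d words" % MAX_TITLE_WORDS)
    return " ".join(kept), warnings


def slugify(title):
    """Step 3d: natural-language URL segment, spaces become underscores."""
    return re.sub(r"\s+", "_", title.strip())


def merge_by_title(pages):
    """Step 3b: merge pages that share a title, compared case-insensitively."""
    groups = defaultdict(list)
    for page in pages:
        groups[page.title.lower()].append(page)
    merged = []
    for group in groups.values():
        merged.append(Page(
            title=group[0].title,
            text="\n\n".join(p.text for p in group),
            urls=[u for p in group for u in p.urls],
            keywords=set().union(*(p.keywords for p in group)),
        ))
    return merged
```

In this sketch a merged page keeps every source URL, so each wiki page can still link back to the original sites, which is also what step 2b relies on for images. Whether merged text should be concatenated or reviewed by an editor is a design choice the posting leaves open.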
## Deliverables
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment: deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others, including desktop software or software the Buyer intends to distribute: a software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform
Open for suggestions.
Project ID: 3538367

About the project

5 proposals
Remote project
Active 16 yrs ago

5 freelancers are bidding on average $842 USD for this job
$2,125 USD in 14 days; 5.0 (180 reviews); 7.8
$425 USD in 14 days; 4.1 (27 reviews); 7.4
$595 USD in 14 days; 4.9 (55 reviews); 5.1
$425 USD in 14 days; 4.5 (12 reviews); 3.0
$637.50 USD in 14 days; 0.0 (3 reviews); 0.0

About the client

SOLNA, Sweden
5.0
234
Payment method verified
Member since Jun 2, 2007
