Bioinformatic data analysis (plain text file processing)
$30-250 USD
Awarded
Posted over 9 years ago
Paid on delivery
The Perl program must compare text files (also calling the external programs listed below) and produce output according to the specifications. This is a bioinformatics program, so some knowledge of the field is a benefit, though not a requirement.
Accepted dependencies: standard Perl modules (i.e. those found in every Debian-based distribution since 2010); "samtools", "bedtools".
Formats used: VCF, BED
Input:
0) real_bed: a BED file describing the "sure" regions
1) real_vcf: a VCF file with "sure" variants
2) whole_vcf: a VCF file with all variants found by an external software
3) target_bed: a BED file containing all the regions of the genome that could possibly be validated; any variant falling outside these boundaries must be discarded
4) covered_bed: a BED file describing which regions have been covered by the experiment that produced
Output: classification of (and statistics on) the variants resulting from the comparison of real_vcf and whole_vcf, also taking the BED files into account. Details attached.
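
To make the target_bed rule concrete, here is a minimal sketch of discarding variants that fall outside a set of BED regions. It is written in Python purely to pin down the logic; the requested program itself is Perl, and the function names (parse_bed, in_regions, filter_vcf) and the sample records are hypothetical. It assumes single-position variants and tab-separated input, and it relies on the coordinate conventions of the two formats: BED intervals are 0-based and half-open, while VCF POS is 1-based. The classification itself is defined in the attached details and is not attempted here.

```python
from bisect import bisect_right
from collections import defaultdict


def parse_bed(lines):
    """Return {chrom: sorted, merged list of (start, end)} from BED lines.

    BED coordinates are 0-based and half-open: [start, end).
    """
    regions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        regions[chrom].append((int(start), int(end)))
    merged = {}
    for chrom, ivs in regions.items():
        ivs.sort()
        out = [ivs[0]]
        for s, e in ivs[1:]:
            last_s, last_e = out[-1]
            if s <= last_e:  # overlapping or adjacent: extend the last interval
                out[-1] = (last_s, max(last_e, e))
            else:
                out.append((s, e))
        merged[chrom] = out
    return merged


def in_regions(regions, chrom, pos):
    """True if the 1-based VCF position `pos` lies inside a region of `chrom`."""
    ivs = regions.get(chrom, [])
    zero_based = pos - 1  # convert VCF POS to BED coordinates
    # Index of the last interval whose start is <= zero_based.
    i = bisect_right(ivs, (zero_based, float("inf"))) - 1
    return i >= 0 and ivs[i][0] <= zero_based < ivs[i][1]


def filter_vcf(vcf_lines, regions):
    """Yield VCF data lines whose POS falls inside `regions` (headers skipped)."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if in_regions(regions, fields[0], int(fields[1])):
            yield line.rstrip("\n")


# Hypothetical sample data, for illustration only.
target_bed = ["chr1\t100\t200", "chr1\t150\t300", "chr2\t0\t50"]
whole_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",  # 0-based 99: just before the region
    "chr1\t101\t.\tC\tT\t50\tPASS\t.",  # 0-based 100: first base inside
    "chr1\t300\t.\tG\tA\t50\tPASS\t.",  # 0-based 299: last base inside
    "chr1\t301\t.\tT\tC\t50\tPASS\t.",  # 0-based 300: just outside
    "chr3\t10\t.\tA\tC\t50\tPASS\t.",   # chromosome absent from target_bed
]
kept = list(filter_vcf(whole_vcf, parse_bed(target_bed)))
for record in kept:
    print(record)
```

With the permitted dependencies, this kind of filter would more likely be delegated to bedtools (its intersect subcommand takes the VCF as -a and the BED as -b); the sketch only spells out the off-by-one handling that such a call hides.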