Extract information from PDF files and load it into a database

Currently, I have to manually load information from approximately 1,000 PDF files per day (invoices, purchase orders, etc.). These files follow approximately 60 different templates and reside on a Windows Server in a folder tree like: "d:\Folder\Folder\ Folder\001\pdffile00[1...n].pdf"

I need to:

1. Connect to the Windows server and pull the files to a Linux server, preserving the directory structure.

2. Once the files are on the Linux server, extract the information I need (between 20 and 60 fields per file) and load it into a table in MariaDB.
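
As an illustration of step 2, here is a minimal sketch of template-driven field extraction. It assumes the text of a PDF has already been extracted into a String by a PDF library (not shown here). The class name, the three field patterns, and the column names are hypothetical placeholders; a real template would define between 20 and 60 such patterns, and each of the roughly 60 templates would need its own set.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class InvoiceFieldExtractor {

    // Hypothetical template: column name -> regex with exactly one capture group.
    static final Map<String, Pattern> TEMPLATE = new LinkedHashMap<>();
    static {
        TEMPLATE.put("invoice_number", Pattern.compile("Invoice No\\.?:\\s*(\\S+)"));
        TEMPLATE.put("invoice_date", Pattern.compile("Date:\\s*(\\d{4}-\\d{2}-\\d{2})"));
        TEMPLATE.put("total_amount", Pattern.compile("Total:\\s*([0-9]+(?:\\.[0-9]{2})?)"));
    }

    // Applies every pattern to the extracted text; a field that is not found maps to null.
    static Map<String, String> extract(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> entry : TEMPLATE.entrySet()) {
            Matcher matcher = entry.getValue().matcher(text);
            fields.put(entry.getKey(), matcher.find() ? matcher.group(1) : null);
        }
        return fields;
    }

    // Builds a parameterized INSERT statement; the values are meant to be bound
    // later through a JDBC PreparedStatement, never concatenated into the SQL.
    static String insertSql(String table, Map<String, String> fields) {
        String columns = String.join(", ", fields.keySet());
        String placeholders = fields.keySet().stream()
                .map(column -> "?")
                .collect(Collectors.joining(", "));
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + placeholders + ")";
    }
}
```

The statement returned by `insertSql` could then be passed to `Connection.prepareStatement` on a MariaDB connection, binding each extracted value in the same order as the columns.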

Other considerations:

1. Pulling the files could be done with a shell script.

2. To read the files, a Java application is preferred because we will later need to integrate it into OpenKM as an extension.
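
Since Java is preferred for the reading step, the "preserving directory structure" requirement could also be handled in Java rather than in a shell script. The following is a minimal sketch, assuming the Windows folder is already reachable from the Linux server as a local path (for example through a mounted network share, which is not shown). The class name `PdfTreeMirror` and both root paths are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PdfTreeMirror {

    // Copies every *.pdf under sourceRoot to the same relative location
    // under targetRoot and returns the number of files copied.
    static int mirror(Path sourceRoot, Path targetRoot) throws IOException {
        List<Path> pdfs;
        try (Stream<Path> walk = Files.walk(sourceRoot)) {
            pdfs = walk.filter(Files::isRegularFile)
                       .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".pdf"))
                       .collect(Collectors.toList());
        }
        for (Path pdf : pdfs) {
            // relativize() keeps the sub-folders (e.g. 001/pdffile001.pdf) intact.
            Path target = targetRoot.resolve(sourceRoot.relativize(pdf));
            Files.createDirectories(target.getParent());
            Files.copy(pdf, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return pdfs.size();
    }
}
```

A call such as `PdfTreeMirror.mirror(Paths.get("/mnt/winshare"), Paths.get("/data/pdf"))` would mirror the tree; both paths here are illustrative only.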

Please send your offer as an estimate in man-hours and a rate in $/hour.

Skills: Java, Linux, MariaDB, Shell Script, Windows Server


About the Employer:
( 6 reviews ) Bogotá, Colombia

Project ID: #17495769

9 freelancers are bidding on average $8/hour for this job


Hi friend, I have extensive experience in Java development, and I have worked on pulling files from an FTP server, parsing them, and loading the data into a database. I reviewed your requirements and they look good to me. I will do this aut...

$6 USD / hour
(35 Reviews)

Hello, I have done similar work recently. I read PDF files, extracted exam data, and calculated results for students. I also loaded all the data into a database, so I am confident about your work. But as you said there are aroun...

$11 USD / hour
(114 Reviews)

Using smbclient (the Samba client), it is possible to connect to a Windows Server from Linux; you just need to install smbclient. You can then access the Windows machine as follows: smbclient //MIKE-SERVER -U <y...

$5 USD / hour
(13 Reviews)

Hi, sir! I had a close look at your project. I am an experienced programmer and I'm sure I can complete your project as soon as possible. If you award this project to me, I'll complete it on time. I promise a high quality and pun...

$5 USD / hour
(17 Reviews)

Hi, we are a team of Java and Python developers who ensure on-time task completion with complete customer [login to view URL] find our portfolio below [login to view URL]

$7 USD / hour
(35 Reviews)

I have 8 years of experience in Java/J2EE Spring web application development, including Eclipse RCP plugin development. I am experienced in PDF extraction using the Java iText or PDFBox APIs. Spring Boot project we can do...

$15 USD / hour
(5 Reviews)

Expert in shell scripting, Python and Big Data...

$5 USD / hour
(0 Reviews)

I am already working on PDF editing in a Java Spring project using the PDFBox library. For the last 6 months I have been adding and modifying PDF annotations and extracting data from PDFs. Let me know so I can show you a demo for simple p...

$8 USD / hour
(0 Reviews)

Hi, I have been a Java developer for almost 5 years and I know shell scripting as well. Extracting records from PDFs is easy if the PDFs aren't locked. So tell me if you want this done. Have a nice time.

$11 USD / hour
(0 Reviews)