Hi,
I have a 312-page PDF in which each page contains a table. Each table has images and some text rows. I need the contents of that PDF extracted into a JSON file. I also need the script itself so that I can rerun it whenever the PDF is updated, so it must run reliably without bugs.
Folder structure:
/images
JSON structure:
[
  {
    id: ###,
    name: ###,
    image: "./images/###.png",
    column1: "",
    column2: "",
    date: ""
  },
  ...
]
If you can do this, I will provide you with the actual PDF file. A sample page is included in the attachments.
Note: The PDF is public data published by a government branch, so using it is both legal and ethical. No worries there.
Hi, I have worked with PDF libraries in several programming languages, and I can help you with your request. If this interests you, I'd be happy to discuss it further.
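As a rough illustration of the kind of script described in the request, here is a minimal Python sketch. It is not a finished solution and rests on several assumptions that would need checking against the real PDF: it uses the third-party pdfplumber library, it assumes the first row of each table is a header, that the columns appear in the order id, name, image, column1, column2, date, and that each data row has exactly one embedded image, matched top to bottom. The function names and the output file name `output.json` are placeholders, not anything from the original post. Images are saved by rendering the image's region of the page, not by extracting the original image stream.

```python
import json
from pathlib import Path

# Assumed column order of each table row; adjust after inspecting the real PDF.
FIELDS = ["id", "name", "image", "column1", "column2", "date"]


def clean(cell):
    """Collapse whitespace in a table cell; an empty cell (None) becomes ""."""
    return " ".join((cell or "").split())


def build_record(row, image_path):
    """Map one table row (a list of cell values) to the requested JSON shape."""
    cells = list(row)[: len(FIELDS)]
    cells += [None] * (len(FIELDS) - len(cells))  # pad short rows
    record = {field: clean(cell) for field, cell in zip(FIELDS, cells)}
    record["image"] = image_path  # the image cell holds a picture, not text
    return record


def extract(pdf_path, out_dir):
    """Read every page's table, save row images, and return the records."""
    import pdfplumber  # third-party dependency: pip install pdfplumber

    image_dir = Path(out_dir) / "images"
    image_dir.mkdir(parents=True, exist_ok=True)

    records = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            table = page.extract_table()
            if not table:
                continue
            rows = table[1:]  # assumption: the first row is a header
            # Assumption: one embedded image per data row, in top-to-bottom order.
            images = sorted(page.images, key=lambda im: im["top"])
            px0, ptop, px1, pbottom = page.bbox
            for row, im in zip(rows, images):
                record_id = clean(row[0])
                if not record_id:
                    continue
                # Clamp the image box to the page so crop() does not reject it.
                bbox = (max(im["x0"], px0), max(im["top"], ptop),
                        min(im["x1"], px1), min(im["bottom"], pbottom))
                target = image_dir / f"{record_id}.png"
                page.crop(bbox).to_image(resolution=150).save(str(target))
                records.append(build_record(row, f"./images/{record_id}.png"))
    return records


def write_json(records, out_dir, json_name="output.json"):
    """Write the records as a JSON array; json_name is a placeholder name."""
    path = Path(out_dir) / json_name
    path.write_text(json.dumps(records, indent=2, ensure_ascii=False),
                    encoding="utf-8")
    return path


# Example usage (paths are placeholders):
# records = extract("input.pdf", "out")
# write_json(records, "out")
```

If a row can have no image, or an image can span several rows, the simple top-to-bottom pairing above would misalign records, so the matching would need to compare each image's position against the row boundaries instead. The sample page should make it clear which case applies.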