Ancient RPG Program Assistance and Automated Data Entry Question.

Ed M.

Ars Tribunus Angusticlavius
9,259
I haven't posted in a while, but for technical questions of this nature, Ars always seems to be a helpful place. Long story short, a friend's institution has implemented a new digital record-keeping process for applicants; previously they used paper applications. It's all digital via the cloud and stored in nice, clean PDF format, with access to each data field entry if necessary. The fields contain the usual: Name, Address, etc.

The trouble is that the system the data needs to go into is a legacy AS/400, and the data from the paper apps was entered manually by several people (don't ask). The program used to key in the applicant's data is written in RPG and was probably written in the mid 70s. The requirement is that the data continue to reside on the AS/400, so the question is: what could be done to automate the process, instead of having people print out the completed PDF forms and enter the data manually into the old software? Is there a process that can be used to read the data fields from the PDF and dump them into the record fields in the software that's used? Thanks!
 

Mark086

Ars Tribunus Angusticlavius
10,595
I haven't touched an RPG system since the mid-90s and don't know what data formats it requires. (IBM seems to always make that extra-extra fun).

There are plenty of tools to extract data from PDFs, so what you should be looking to do is integrate one into your workflow, then figure out how to reformat its output to match your inputs.

Getting the data into Excel, and then extracting from Excel to csv or another format that is likely compatible would be a reasonable workflow.
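
As a rough illustration of the "reformat to CSV" step, here is a minimal Python sketch. It assumes the PDF fields have already been extracted into dictionaries; the column names and their order are hypothetical, since the thread does not say what the AS/400 side expects.

```python
import csv
import io

# Hypothetical column order; the real one depends on what the AS/400 import expects.
COLUMNS = ["name", "address", "city", "state", "zip"]


def applicants_to_csv(applicants):
    """Render a list of field dicts as CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=COLUMNS,
        extrasaction="ignore",  # silently skip fields that are not in COLUMNS
        lineterminator="\n",
    )
    writer.writeheader()
    for row in applicants:
        # Missing fields become empty strings; surrounding whitespace is trimmed.
        writer.writerow({col: str(row.get(col, "")).strip() for col in COLUMNS})
    return buf.getvalue()


if __name__ == "__main__":
    sample = [{"name": " Jane Doe ", "address": "1 Main St, Apt 2", "zip": "12345"}]
    print(applicants_to_csv(sample))
```

Using the `csv` module rather than joining strings by hand means values containing commas or quotes are escaped correctly.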
 

Apteris

Ars Tribunus Angusticlavius
8,938
Subscriptor
Mark086 said: "Getting the data into Excel, and then extracting from Excel to csv or another format that is likely compatible would be a reasonable workflow."
It's probably a good idea to reduce the number of intermediary steps as much as possible. As it seems that the OP will need to write some code in any case -- a tiny ETL pipeline, in fact -- he might be able to have all of the transformation logic in just one location.
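
To make the "one location" idea concrete, here is a hedged Python sketch of such a tiny pipeline: read the form fields, then emit one fixed-width record. It assumes the third-party `pypdf` package is installed and that the PDFs are fillable forms; the field names, widths, and the choice of a fixed-width record are all hypothetical, since the actual AS/400 record layout is not given in the thread.

```python
# Hypothetical record layout: (PDF field name, column width). Not from the thread.
LAYOUT = [("name", 30), ("address", 40), ("zip", 10)]


def extract_fields(pdf_path):
    """Return {field name: text value} for the text fields of a fillable PDF."""
    from pypdf import PdfReader  # third-party; imported lazily so the rest runs without it

    return PdfReader(pdf_path).get_form_text_fields()


def to_fixed_width(fields, layout=LAYOUT):
    """Format one applicant as a single fixed-width record, space padded."""
    parts = []
    for key, width in layout:
        value = (fields.get(key) or "").strip()  # unfilled fields may be None
        if len(value) > width:
            # Fail loudly instead of silently truncating applicant data.
            raise ValueError(f"{key!r} is {len(value)} chars, limit is {width}")
        parts.append(value.ljust(width))
    return "".join(parts)


def pdf_to_record(pdf_path):
    """Whole pipeline in one place: extract, then transform."""
    return to_fixed_width(extract_fields(pdf_path))
```

Keeping extraction and formatting as separate functions means the formatting rules can be tested with plain dictionaries, without needing a PDF on hand.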

Is there a canonical solution for extracting data out of interactive PDFs these days, or does one still need to hack through the search result thickets of "best PDF extractor ever"?