Information Architecture Blog Posts

Recently, I've researched drinking water quality in nearby rural areas, documenting historic "haves" and "have-nots" for an environmental equity study. Viewing government documents obtained via public information requests, I spotted column numbers between 1 and 80, just as we would have used in the 1960s and 1970s, before tape and disk storage arrived.

Deck of Cards

For the benefit of future application developers who may "inherit" similar legacy workflow methods, I will share what I can discern from those sparse numbers and other clues, and ways to pull past data into future systems.

Location data then


Starting with "you want to drill where?," the above clip shows "B3" in a small rectangle, and "B4" in another section below. Believe it or not, these are 80-column "IBM punch card" record formats, where, for instance, the local county would be punched between columns 8 and 21 (just long enough for the longest name in the set). Yes, Imperial measurements of miles, not kilometers, and for some odd reason the multi-part carbonless paper form has "hard-coded" the letters MI for miles in columns 77 and 78.
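For anyone inheriting records like these, here is a minimal sketch of reading a fixed-width card image by column number. The county columns (8 to 21) and the unit columns (77 and 78) come from the form described above, and the card ID in columns 1 and 2 follows the convention noted for the other cards. The names `Field`, `B3_LAYOUT`, and `parse_card`, and the sample county value, are made up for illustration.

```python
from typing import Dict, List, NamedTuple


class Field(NamedTuple):
    name: str
    first_col: int  # 1-based and inclusive, as printed on the paper form
    last_col: int   # 1-based and inclusive


# Hypothetical layout for card B3; only these three fields are known from the form.
B3_LAYOUT = [
    Field("card_id", 1, 2),
    Field("county", 8, 21),
    Field("distance_unit", 77, 78),
]


def parse_card(card: str, layout: List[Field]) -> Dict[str, str]:
    """Slice an 80-column card image into named, whitespace-trimmed fields."""
    padded = card.ljust(80)[:80]  # tolerate short lines, ignore anything past column 80
    return {f.name: padded[f.first_col - 1:f.last_col].strip() for f in layout}


# A made-up card image: "EXAMPLE" stands in for a real county name.
sample = "B3" + " " * 5 + "EXAMPLE".ljust(14) + " " * 55 + "MI" + " " * 2
record = parse_card(sample, B3_LAYOUT)
```

The form's column numbers are 1-based and inclusive, while Python slices are 0-based and end-exclusive, which is why the slice runs from `first_col - 1` to `last_col`.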

Location data now

A more recent form has nearly identical information fields, except the "paper" form is now a fillable PDF.



Orientation Then and Now

Besides the where-is-it question there is the in-reference-to-what question.


Distance from the nearest road can be entered as either feet or miles, and the cool compass rose is supposed to point in a direction. If you fill it in.

The newer data collection method is not that different.


I don't think the cardinal points look as nice as before; column 32 on the punch card is still with us, getting one character: N, S, E, or W.

Workflow: Driller

Since the work must be done only by a certified driller, the form includes their contact and license information. Not too many punch card columns here (76-81 for license). Beyond the margin!


Now, it's like this:


Nicely done calendar in the PDF. Some algorithm was inserted in the form that allows easier selection, less transcription, and probably deals with leap years correctly.

Card B2: Approximate Pumping Rate, in Gallons per Minute. This is before the well is drilled, so an educated guess. The government records your guess for posterity in columns 8 to 12; the card letter/number go into columns 1 and 2. Somewhere in time, a FORTRAN (or COBOL) program once crunched those data into some type of report. The paper hanging chad content lives on, somewhere in an SAP or Oracle system.

Card B4: Water sources (up to 3), no punches.

Quantity Needed

In classic American post-WW2 affluence, everyone (* with exceptions) was awarded a 100-gallon (about 380 liters) stipend as an expected privilege of suburbia. People who live on boats or other small footprints know this is absurd. We can't know why this question is necessary or important other than being yet another educated guess. I've seen the value 300 (so a 3-consumer space) in this field, for single residential use. For agriculture or mass residential (towns), you can enter up to 7 digits (columns 14 through 20, or 20-14+1).
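The column arithmetic is worth spelling out, since an inclusive range is an easy place for an off-by-one error. The sketch below assumes the quantity field holds an unsigned whole number with no decimal point or sign punch, which the form does not actually confirm; the function names are made up for illustration.

```python
def field_width(first_col: int, last_col: int) -> int:
    """Number of columns in an inclusive, 1-based column range."""
    if first_col < 1 or last_col < first_col:
        raise ValueError("columns must satisfy 1 <= first_col <= last_col")
    return last_col - first_col + 1


def max_value(first_col: int, last_col: int) -> int:
    """Largest unsigned whole number that fits in the field (assumed all digits)."""
    return 10 ** field_width(first_col, last_col) - 1


def fits(value: int, first_col: int, last_col: int) -> bool:
    """True if the value can be punched into the field without truncation."""
    return 0 <= value <= max_value(first_col, last_col)


width = field_width(14, 20)   # columns 14 through 20
largest = max_value(14, 20)   # biggest quantity the card could hold
```

A check like `fits` is the sort of implicit range constraint a replacement system inherits whether or not anyone wrote it down.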

The provenance of legacy data can be established if the original systems were documented and you can find the docs, or if a metadata chain of custody is visible. Definitely not blockchain tech here, just fill in the boxes.

What can be added to the historic record are errors and omissions found while reviewing the previous data archives. I've posted elsewhere using the key "blue pencil" as a nod to the days of paper copy editing before publication. Reversing the north and south values was one example, corrected by crossing out since getting a new form and starting over is frustrating. One land record change had to be recorded for posterity because numbers were reversed on the first try (so if you only looked there you'd get bad data).





Future Proofing

Okay, stop laughing now. We can't predict the future but can learn from the past. Suppose our new project assignment is to take the data collection and approval workflow into a mobile app, so QR codes and pin drops take the place of "Please Type."

Once you grasp the provenance of historic data points and collections, look at the implicit or explicit data constraints, whether that is a data domain or range, or a field constraint, for example:


enum Level {
/* reference: */
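As one illustration of a data domain constraint, here is a minimal sketch of the single-character compass field from column 32 of the punch card (N, S, E, or W) expressed as an enumeration. This is not the fragment above completed; the names `Compass` and `parse_direction` are hypothetical, and treating a blank column as "not filled in" is an assumption.

```python
from enum import Enum
from typing import Optional


class Compass(Enum):
    """The four values the form allows in the one-character direction field."""
    NORTH = "N"
    SOUTH = "S"
    EAST = "E"
    WEST = "W"


def parse_direction(raw: str) -> Optional[Compass]:
    """Map the raw column content to a Compass member.

    Returns None for a blank field (assumed to mean the compass rose was
    left empty) and raises ValueError for anything outside the domain.
    """
    code = raw.strip().upper()
    if not code:
        return None
    return Compass(code)  # Enum lookup by value raises ValueError if unknown
```

Rejecting out-of-domain values at entry is what would catch a stray character before it reaches the archive, while the blank case keeps "unknown" distinct from a wrong answer.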


The idea of using 2 digits for a year should have gone out of style at, say, the turn of the century. However, this anachronism continues on local government forms:







After collecting the 2 digits for the year, an algorithm stores the guessed 4-digit year for archival storage. Using a data collection mechanism like the PDF calendar form is an option, as is requiring 4-digit years (and doing a sanity check that someone doesn't select the year 0042, say).
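The form's actual guessing algorithm isn't documented here, so the following is only a sketch of one common approach: pick the century that keeps the year at or before a reference year. The assumption that a recorded date is never in the future, the 1900 lower bound in the usage lines, and both function names are mine, not the form's.

```python
def expand_two_digit_year(yy: int, reference_year: int) -> int:
    """Guess a 4-digit year from 2 digits, never later than reference_year.

    Assumes recorded dates are not in the future, so a value that would
    land after the reference year is pushed back one century.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    century = reference_year - reference_year % 100
    candidate = century + yy
    return candidate if candidate <= reference_year else candidate - 100


def is_plausible_year(year: int, earliest: int, latest: int) -> bool:
    """Sanity check for a 4-digit entry; the bounds are the caller's choice."""
    return earliest <= year <= latest


guessed = expand_two_digit_year(68, reference_year=2024)
too_old = is_plausible_year(42, earliest=1900, latest=2024)
```

Whatever rule is chosen, storing the original 2 digits alongside the guess keeps the provenance intact if the guess later turns out to be wrong.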

Hidden biases, not to mention oversights by developers, may be exposed in user testing, and the sooner the better. Areas to address include accessibility, multiple language support, low-or-no vision options, and environmental equity where past practices need attention.

My focus on gathering historic data is less about research than about prediction and a basis for action: mitigating climate change impacts on the double jeopardy of low-cost structures and infrastructure sited on literal lowlands, sinking into coastal waters in some cases. I've envisioned the flat page data as having multiple dimensions of data collection that can be woven together into stories using maps.

If you've reached this far, below is the SAP HANA hook for GIS data which you might be able to use after prototyping like I've done with other databases.


1 Comment
Developer Advocate

Fascinating post, thanks for sharing! What we see about the forms, with the different sections (B3, B4, etc) reminds me of some of the key reasons for the standard of fixed record length datasets on IBM mainframes, too, and all that entailed when defining them with JCL DD statements. Happy memories.

Additional comment: Before I saw (from what you wrote) that the more modern forms were PDF based, I had thought, from how they appeared (horrible), that they were Java Applets. Now _that's_ a technology I haven't thought about for a long while! I remember building a Java Applet for some Oil & Gas calculations, when I worked on the IS-Oil team at SAP in the 1990s. I didn't enjoy that task too much.