The Data is Already in a Relational Database
demarc

Program Conversion - Mainframes - Schema


A computing paradigm that is coming into common use is 'MVC', or Model, View, Controller.  The Model is the database structure: tables, columns, views, and the other elements that define data, metadata, or data relationships.  The View is the user presentation: what someone sees on a screen or printout.  The Controller is the business rules, the way data stored in the model is manipulated on it's way to the view, and the way it's validated and formatted when transmitted from the view to the model.

Once a system's objectives and value proposition have been identified, the next task in most business applications is to identify what data is to be maintained and how it will be structured.  If a prototype exists (i.e., the existing application) the schema has been pretty clearly laid out.  Migrating this to the 'new' computing platform has to be done before business rules have anything to operate on or presentation layers have anything to present.

Given that a mainframe data store exists, the initial scope of conversion efforts should be focused on migrating the schema.  For the following discussion it is assumed that the source database is hierchical (i.e., IMS), and the target is an instance of SQL Server.  If the data store is already in a relational database, some but not all of the work is already done.

Naming conventions in mainframes tended to run towards THIS-THAT-THE-OTHER-THING, in other words all caps with dash separators, with a legth limit of 32 characters.  The more common naming conventions in client/server systems is UpperLower, so the name would be translated to ThisThatTheOtherThing.  The length limit is probably closer to 64 characters, so that situations where abbreviation was forced by circumstances certain naming elements can be more descriptive.

It is common in some designs to have a 'type' field in one of the first colums, with the type indicating that a record is, for instance, a settings records, password record, customer master record, etc.  What becomes clear is that certain structures that should have been dedicated to specific roles ended up, over time, getting pressed into several kinds of service.  This means that what appears to be normalized isn't, and that the schema conversion is going to be more involved than simply replicating the table structure in SQL.  In hierarchical databases a 'master' record can have 'groups' that contain multiple repeating elements, each of these makes up a 'detail' record, which may in turn contain repeating elements: for example Invoice contains product Line Items which contain Serial Numbers.  In SQL this becomes an Invoice table, an InvoiceLineItem table, and an InvoiceLineItemSerialNumber table, or some similar construction.

If this table is storing information that is not invoice information and has several type codes, then these need to be split off into their own tables.  Some of these other 'types' may have repeating groups that are only populated within that type, therefore not all repeating groups are saved in one-to-many table relationships in all derived tables.  These 'other' tables all have to be given their respective names, and some information that is really configuration should be stored in the registry or in XML settings files.

In a large conversion, this phase should be carved out of the larger whole as a project in it's own right.  The scope of the project should be redefined to focus on the schema representation and ad-hoc report construction.  In the SQL Server case, this means hiring database designers, ETL experts, database adminstrators, and a programmer or two.  The scope of the system should be limited to production of demand reports using SSRS or Report Builder.  The server database should be refreshed from the mainframe host every night, and the reporting users given a 'full day' snapshot representing a complete daily business cycle. 

This pathway creates two substantial benefits: the first is that the database people know their part of the system completely, and make a good resource to programmers that show up later.  The second is that periodic revisions to the database schema don't force rewrites of all dependent applications code.  The requirement that ad-hoc reporting be part of the system insures that users are aware of both the table design and the database contents, because they're using it daily.  The users become 'auditors': they know if the data representations aren't right, and if this is due to structure or record contents.

For those that feel like their system is way overdue for an upgrade, please call 210-734-5575 for free initial consultation.

Or, eMail us at Info@ResourceLogic.net