In my last blog post, I talked about how my previous project started an MDMP Unicode conversion. At this point in the story, we had completed two Unicode conversions, but without much success in converting the data correctly.
With the help of Nils from SAP, the project team sat down and took a long, hard look at the problem and at where our pain points lay. We had enough data to establish that the biggest problems were the Financial Accounting tables affected by the Russian codepage issues, along with some of the client's custom tables.
The project team took a number of decisions from this analysis:
1. Another person needed to be responsible for the Unicode conversion language assignment, because I was becoming a Single Point of Failure - I understood the process better than anyone else on the team. This also relieved the pressure of one person carrying two heavily technical roles. (We actually ended up with two people, which I have to say is better.)

2. We would not attempt any archiving or data migration steps to try to correct the data, because we knew where we had problems. If we had done any data migrations, we would have run the risk of turning our known problems into unknown ones.

3. We would set only English as ambiguous.


4. We would execute two more dry runs.

5. We would use SUMG to repair data rather than the reprocessing logs from SPUM4, because the reprocessing logs were not as controllable as SUMG.

We had set the stage for another attempt, and this time we were confident we could get it right. So we ran the CUUC process and ended up with a system that was actually quite usable 🙂
There was still a lot to fix in SUMG, but we had accomplished a Unicode conversion and most of the data was intact.
This was for three main reasons:
1. I stopped trying to do too much at once; during the CUUC process I concentrated on executing the technical steps. I also stopped trying to optimise the Unicode process for speed: we were much more concerned with data quality, so we decided to take the hit on performance.
2. We put a lot of work into the vocabulary assignment for unknown words, and we delved deeply into the reprocessing scans. Using the reports um4_analyse_replog and um4_replog_stats, the language team and I worked hard to resolve collisions and unknown words (the first sketch after this list shows one way to prioritise that effort).

It is very important to note here that this process quite often does not yield a correct assignment for a word, especially with related languages such as Polish and Russian. The best anyone can hope for is the least wrong answer, one that does not create too many reprocessing log entries.


3. We made better use of table comparison tools to automate the comparison of data from pre- and post-conversion extracts (the second sketch below shows the general shape of that automation).
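
To give a feel for the collision work in point 2, here is a minimal sketch in Python (rather than anything SAP-specific) that ranks colliding words by how often they occur. The input format is entirely an assumption - SPUM4's real reprocessing data would first need to be exported and mapped onto these columns.

```python
import csv
from collections import defaultdict

def prioritise_collisions(path):
    """Rank words claimed by more than one language so the language
    team tackles the highest-impact assignments first."""
    langs = defaultdict(set)   # word -> candidate languages seen for it
    hits = defaultdict(int)    # word -> total occurrences in the scans

    # Hypothetical export: one row per occurrence of an unassigned word.
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            word = row["word"]
            langs[word].add(row["candidate_language"])
            hits[word] += 1

    # A collision is a word claimed by more than one candidate language.
    collisions = [w for w in langs if len(langs[w]) > 1]
    # Most frequent first: assigning these well removes the largest
    # number of future reprocessing-log entries.
    collisions.sort(key=lambda w: hits[w], reverse=True)
    return [(w, sorted(langs[w]), hits[w]) for w in collisions]

if __name__ == "__main__":
    for word, candidates, count in prioritise_collisions("replog_export.csv")[:20]:
        print(f"{count:>8}  {word}  candidates: {', '.join(candidates)}")
```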

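And for point 3, a sketch of the extract comparison idea. It assumes each table was exported to a flat file named TABLE.txt under pre/ and post/ directories, both written in the same export encoding so that correctly converted rows compare equal - all assumptions about tooling the project did not describe; we used dedicated table comparison tools rather than a script like this.

```python
import hashlib
from pathlib import Path

def digest(path: Path) -> tuple[int, str]:
    """Return (row count, order-independent checksum) for one extract."""
    total, rows = 0, 0
    for line in path.read_bytes().splitlines():
        rows += 1
        # Summing per-row hashes makes the checksum insensitive to row
        # order, so a different sort order after conversion is not flagged.
        total = (total + int.from_bytes(hashlib.md5(line).digest(), "big")) % (1 << 128)
    return rows, format(total, "032x")

def compare(pre_dir: str, post_dir: str) -> None:
    for pre in sorted(Path(pre_dir).glob("*.txt")):
        post = Path(post_dir) / pre.name
        if not post.exists():
            print(f"{pre.stem}: missing from post-conversion extract")
            continue
        (r1, h1), (r2, h2) = digest(pre), digest(post)
        if r1 != r2:
            print(f"{pre.stem}: row count changed {r1} -> {r2}")
        elif h1 != h2:
            print(f"{pre.stem}: same row count but contents differ")

if __name__ == "__main__":
    compare("pre", "post")
```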
Once we delivered the third system, the response from the business and the project team was one of greater confidence: we now had a process that worked, and we knew the data well enough to go for our final dry run before Production.

We set out very clear criteria for our final dry run: it had to meet strict conversion completeness standards as well as run to time without major incident.
The process started with a copy of Production, on which we executed a final round of language assignment, again bringing the total of unique words below 10,000. We then tackled the reprocessing logs; here we did not want any tables with over 100,000 entries in them, as this would produce too many reprocessing logs for SUMG to handle.
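
Those two thresholds are simple enough to gate automatically. The sketch below assumes the per-table reprocessing-log entry counts were exported to a two-column text file; the file layout and the limits-as-constants are illustrative, not the project's actual tooling.

```python
WORD_LIMIT = 10_000     # maximum unique unassigned words
TABLE_LIMIT = 100_000   # maximum reprocessing-log entries per table

def check_dry_run_criteria(counts_file: str, unique_words: int) -> bool:
    """Return True only if both dry-run entry criteria are met."""
    ok = unique_words < WORD_LIMIT
    if not ok:
        print(f"Too many unassigned words: {unique_words} >= {WORD_LIMIT}")
    with open(counts_file, encoding="utf-8") as f:
        for line in f:
            table, entries = line.split()
            if int(entries) > TABLE_LIMIT:
                print(f"{table}: {entries} entries is more than SUMG can sensibly handle")
                ok = False
    return ok

if __name__ == "__main__":
    check_dry_run_criteria("table_counts.txt", unique_words=9_400)
```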


Next we ran the CUUC process just as we would on Production, and it completed successfully. Extensive testing found that a few of the tables had failed their conversion threshold, but the team was confident that these could be fixed in Production.

At this point we were flying high, and the Steering Committee had given permission to start preparations for PRD, which I will address in my next blog post.