
Evaluating a BSM Solution: Measuring Effectiveness


In my first post, we talked about what is wrong with current solutions, followed by a post sharing my experience of making BSM happen (realizing/implementing it). Then I sidetracked for a post to share a really great research invention by folks at IBM and its relevance to BSM (Strategic Capability Network).

In this post, I intend to share insights from my experience of evaluating BSM/SQM solutions for clients to gauge the effectiveness and performance of the solution. I am sure most consultants on the ground have encountered this situation, when they were hired to evaluate someone else’s BSM solution and recommend changes to make it WORK!!

Measuring the effectiveness of a BSM solution is not easy to quantify, as it involves multiple factors that are not just statistical but also relate to organization structure, architectural implications, the rationale behind decisions, culture, process, usability analysis and the ecosystem of the company. Guess what: to do all of the aforementioned, I was given 4 weeks + 1 week for planning. The planning week was the most challenging, with debates on which factors/indicators to include and which ones to leave out. Eventually the following were the priorities: measure the usability, effectiveness, completeness (coverage) and accuracy.

After researching endlessly on how to accomplish this, WE came to an agreement on using the following approach to measure the holistic performance of the BSM solution:

Performance = (Complexity ^ Process) * Team * Tools  [1]

Let us take these terms one at a time. I have explained these factors with real examples and the lessons I learnt from these incidents:

>> Complexity:  Does an executive really care about the memory on a server being displayed on executive dashboards? Are the indicators really accurate and reliable? If yes, how much? These are questions that are very seldom measured. Complexity is also driven by the context and environment we deal with; for this we measured utilization, ease of information accessibility for stakeholders, number of influenced decisions per quarter, time to address issues (before vs. after) and some other subjective and quantitative indicators.
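As a rough illustration of the "before vs. after" comparison mentioned above, here is a minimal Python sketch. The numbers are made up purely for illustration and the helper name `percent_change` is hypothetical; the post does not say how the indicator was actually computed, so using the median is an assumption.

```python
from statistics import median


def percent_change(before, after):
    """Percent change in the median value from 'before' to 'after'.

    A negative result means the 'after' median is lower (issues are
    addressed faster).
    """
    median_before = median(before)
    median_after = median(after)
    return (median_after - median_before) / median_before * 100.0


# Hypothetical hours taken to address issues, sampled before and after
# the BSM solution was introduced (illustrative numbers only).
hours_before = [10.0, 6.0, 8.0, 12.0, 9.0]
hours_after = [5.0, 7.0, 6.0, 4.0, 8.0]

change = percent_change(hours_before, hours_after)
print(f"Median time to address issues changed by {change:.1f}%")
```

The median is used here instead of the mean so that a single long-running incident does not dominate the comparison; any other summary statistic could be swapped in.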

Real incident: While evaluating a BSM solution built by a great Service Assurance team, we found that the dashboards for the production support teams (of various silos) had fault management metrics which made no sense to the users. Of course, no one used them!! The only change we made to make this dashboard a hit was changing the metric terms and the status aggregation pattern (auto-population logic and SLA rules). In this case, accuracy and reliability concerns really contributed to the complexity for the users, who were too skeptical about using an interface which did not even use the language they understood to check on the applications they supported. This change was not a big development effort; it was only a matter of adapting to the environment and reducing the accidental complexity by streamlining the process of displaying domain-driven language.

Lesson learnt: Well-defined processes will reduce the planned and accidental complexity; measure the effectiveness together with the organization's awareness of how to use the solution.

>> Team:  How much information is easily accessible to the stakeholder? Is every category of stakeholder considered in the solution? Does everyone think this “Dashboard” is of any value, or do they prefer some other medium to achieve the same objective? In all the above cases, we need to adapt to the environment and put forward a balanced approach.

Real incident: In one environment where I was working on a solution, executives had imposed Netcool on an Operations team which was used to custom-built tools, and the situation was that of a RIOT!! Users complained for months that Netcool did not show the accurate information on device status which they used to get out of the old custom tool. Everyone in the Service Assurance team shooed them away 🙂 After talking to them, I realized that they had a point. The old tool did not just report after pinging the server; when the server came back up, it would also check sysUpTime and report whether the server had been unavailable due to a power outage or some other reason. The poor users did not know the logic or the details behind the homemade tool.
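To make the old tool's behaviour concrete, here is a minimal Python sketch of how such a check might work. It is only an assumption about the logic, not the actual homemade tool: the names `Outage` and `classify_outage` are hypothetical, and the sysUpTime value is passed in directly instead of being polled over SNMP. The one fact it relies on is that sysUpTime is reported in hundredths of a second.

```python
from dataclasses import dataclass

TICKS_PER_SECOND = 100  # sysUpTime is expressed in hundredths of a second


@dataclass
class Outage:
    first_failed_ping: float  # epoch seconds of the first missed ping
    first_ok_ping: float      # epoch seconds of the first successful ping after it


def classify_outage(outage: Outage, sys_uptime_ticks: int) -> str:
    """Guess why a server was unreachable, using sysUpTime read after recovery.

    If the server booted after the first failed ping, it restarted during
    the outage (for example, a power outage). Otherwise it stayed up the
    whole time and was merely unreachable (some other reason, such as a
    network problem).
    """
    uptime_seconds = sys_uptime_ticks / TICKS_PER_SECOND
    boot_time = outage.first_ok_ping - uptime_seconds
    if boot_time > outage.first_failed_ping:
        return "restarted"
    return "unreachable"


# Hypothetical outage: pings failed at t=1000s and succeeded again at t=1600s.
outage = Outage(first_failed_ping=1000.0, first_ok_ping=1600.0)

# Server reports 5 minutes of uptime: it booted at t=1300s, during the outage.
print(classify_outage(outage, 30_000))

# Server reports 1 day of uptime: it never went down, it was only unreachable.
print(classify_outage(outage, 8_640_000))
```

A ping-only check would report both cases identically as "down", which is the kind of detail the users were missing without being able to explain why.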

Lesson learnt: If they (users) are using it, there is a valid reason, look for it!! A hammer will take you only so far. Balance personalization with layering and tiering the solution so that everyone gets what (information) they need, the way they need it, and when they need it. Most importantly, BSM is not about changing the organization 180 degrees; it's about increasing productivity and reporting the information needed for making the best business decisions.

>> Tools:  Tools are not only critical to accomplishing tasks but are also related to overall organizational productivity. Caution: Imposition of tools is not BSM!! Personalization is the only way BSM can really be a successful offering. In my experience, implementations where a team selects what suits them best and communicates information upstream to the enterprise instance have been much more successful and more widely used. Ample experiences with tools are already out there, but the lesson that I learnt from it was that we should not look for silver bullets when evaluating tools. It is best left to the users to decide which tool they are comfortable with.

>> And finally, Performance: Although some of my friends will argue that performance is not a holistic term, we took an objective approach rather than a subjective one to ensure that WE had statistics to back our results. This helped us immensely!!

All in all, evaluating a BSM solution was much more challenging than building one because of the merging/conflicting visions and principles followed during the original implementation of the solution. I think this underscores the need for standards and guidelines for BSM solutions. (Remember: Only when X.733 was put in place did we know how to define events in a standardized way.) I am not lobbying for enforcement (via standards), but the industry really needs at least some vendor-neutral guidelines to retain the value, vision and capabilities of a Business Service Management solution.

References:

[1]  Grady Booch used this definition of performance in his famous speech at the 9th Annual Turing Lecture: