WYNTK on TBSM v4.2 Design
Design is generally where the most successful TBSM clients separate themselves from the average TBSM client (putting execution, politics and other internal factors aside for now).
The TBSM architecture phase provides the solid platform you build upon. The TBSM design phase defines exactly how the architecture and installed software will be leveraged to meet the value proposition and overall goals and objectives. For more on the former, see my WYNTK on TBSM v4.2 Architecture post.
Designing solutions built upon TBSM and the enabling BSM products and datasources happens on many levels and dimensions. I will introduce the high level design areas that should be considered in any TBSM deployment project.
Design Goals and Objectives
BSM Event Design: As previously mentioned in another WYNTK on TBSM v4.2, events are one of the key components of any TBSM solution. A well thought out design for how all incoming event sources will be incorporated, normalized and enriched to a defined BSM Event Standard is critical.
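To make the idea of normalizing and enriching to a BSM Event Standard concrete, here is a minimal Python sketch. It is not OMNIbus rules file syntax, and everything in it is an assumption for illustration: the BsmEvent fields, the source names monitor_a and monitor_b, the raw keys (sev, host, msg) and the lookup tables. The 0 to 5 severity scale is chosen to resemble the one OMNIbus uses.

```python
from dataclasses import dataclass

# Hypothetical BSM Event Standard: these field names are illustrative only,
# not the actual Netcool/OMNIbus schema.
@dataclass
class BsmEvent:
    source: str
    resource: str   # the managed thing the event is about
    category: str   # e.g. "Availability", "Performance"
    severity: int   # 0 (clear) to 5 (critical)
    summary: str

# Per-source mapping from raw severity labels to the standard scale (assumed values).
SEVERITY_MAP = {
    "monitor_a": {"OK": 0, "WARN": 3, "CRIT": 5},
    "monitor_b": {"normal": 0, "minor": 3, "major": 4, "fatal": 5},
}

# Enrichment lookup; in practice this might come from a CMDB or similar source.
RESOURCE_CATEGORY = {"web01": "Availability", "db01": "Performance"}

def normalize(source: str, raw: dict) -> BsmEvent:
    """Map one raw event from a known source onto the standard."""
    severity = SEVERITY_MAP[source].get(raw["sev"], 1)  # unknown label -> 1
    resource = raw["host"].strip().lower().split(".")[0]  # short, lowercase hostname
    return BsmEvent(
        source=source,
        resource=resource,
        category=RESOURCE_CATEGORY.get(resource, "Unclassified"),
        severity=severity,
        summary=raw.get("msg", "").strip(),
    )
```

The point of the sketch is that every source, whatever its native vocabulary, ends up with the same fields and the same severity scale before it ever touches a service model.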
Event Management Event Design: I’m torn here as to whether I recommend this, but I know many clients will want to leverage the included Netcool/WebTop for event management functions. This is an important design phase if you intend to have end users actively managing events. This could be the design for how tickets will be created from events, how events are enriched or suppressed during maintenance windows, or how you implement an event acknowledgement, ownership and transfer schema. Establishing an event schema for these activities is critical if you plan for event management scenarios using TBSM v4.2 (Netcool/WebTop).
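As one example of what this design has to pin down, here is a small Python sketch of maintenance window suppression. The MAINTENANCE calendar, the resource name and the suppressed flag are all hypothetical; the sketch only illustrates the decision, not how WebTop or OMNIbus would implement it.

```python
from datetime import datetime

# Hypothetical maintenance calendar: resource -> list of (start, end) windows.
MAINTENANCE = {
    "db01": [(datetime(2024, 1, 6, 22, 0), datetime(2024, 1, 7, 2, 0))],
}

def in_maintenance(resource, when):
    """True if 'when' falls inside any window for the resource (end is exclusive)."""
    return any(start <= when < end for start, end in MAINTENANCE.get(resource, []))

def tag_event(event, when):
    """Return a copy of the event flagged for suppression rather than dropping it."""
    tagged = dict(event)
    tagged["suppressed"] = in_maintenance(event["resource"], when)
    return tagged
```

Flagging instead of deleting is a design choice in this sketch: the event is still available for reporting, but views and service models can filter on the flag.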
Event Integration: With the world’s leading event management platform provided with TBSM v4.2 you will want to design the event integration requirements appropriate for your environment. Which probes, gateways or other custom means will be required to feed events into Netcool/OMNIbus? How will you align these to your event designs described above? How will you handle stateful and non-stateful events? Which events require sophisticated processing, lookups, conversions, suppression, etc.? Lots of effort should be placed here!
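The stateful versus non-stateful question is easier to reason about with a toy model. The Python sketch below is an assumption-laden illustration, not OMNIbus behavior: the EventStore class, the event keys (resource, alert_key, type, stateful) and the expiry policy are all made up for the example. Stateful problems are deduplicated and cleared by a matching resolution; non-stateful events never get a resolution, so they are aged out.

```python
class EventStore:
    """Toy in-memory event list; a stand-in for illustration, not OMNIbus itself."""

    def __init__(self, expiry_seconds):
        self.expiry_seconds = expiry_seconds
        self.active = {}  # (resource, alert_key) -> event record

    def ingest(self, event, now):
        key = (event["resource"], event["alert_key"])
        if event["stateful"] and event["type"] == "resolution":
            # A resolution clears the matching problem, if there is one.
            self.active.pop(key, None)
            return
        record = self.active.get(key)
        if record is None:
            self.active[key] = {**event, "count": 1, "last_seen": now}
        else:
            # Repeat of a known problem: deduplicate instead of adding a row.
            record["count"] += 1
            record["last_seen"] = now

    def expire(self, now):
        """Age out non-stateful events, which never receive a resolution."""
        stale = [
            key for key, record in self.active.items()
            if not record["stateful"]
            and now - record["last_seen"] >= self.expiry_seconds
        ]
        for key in stale:
            del self.active[key]
```

If your design cannot say which of these two paths each event source follows, the service models fed by those events will hold stale state.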
Datasource Integration Design: If you’re integrating with various databases, web services or XML sources, design specifications for these interfaces need to be thought through. How will you integrate? What’s the most efficient approach for getting the necessary data? What types of SQL queries may be required? Will you hit a full table or a view? If you’re going to pull from a web services interface, do you understand the various components such as the WSDL and structure for query and response? How will you throttle queries to not impact the other system? Will you need to change the caching on TBSM to accommodate the amount of data returned?
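For the throttling and caching questions, here is a minimal Python sketch of the idea: never hit the other system more often than an agreed interval, and serve cached results in between. ThrottledCache, its parameters and the fetch callable are hypothetical names for illustration; this is not how TBSM configures its own datasource caching.

```python
import time

class ThrottledCache:
    """Wrap an expensive query so it runs at most once per min_interval seconds."""

    def __init__(self, fetch, min_interval, clock=time.monotonic):
        self._fetch = fetch              # callable that runs the real query
        self._min_interval = min_interval
        self._clock = clock              # injectable so the behavior can be tested
        self._last_fetch = None
        self._value = None

    def get(self):
        now = self._clock()
        due = self._last_fetch is None or now - self._last_fetch >= self._min_interval
        if due:
            # Only here does the remote system see any load.
            self._value = self._fetch()
            self._last_fetch = now
        return self._value
```

The interval is the design decision: it is the trade-off between how fresh your data needs to be and how much load the source system owner will tolerate.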
AAA Design: As I talked about in a previous WYNTK on TBSM v4.2, the design for AAA is crucial towards the end state solution you desire. How will you structure access and authorization to the various TBSM components? What level of access control granularity is required? Will you have requirements for read-only users or groups? Will you have TBSM “superusers” that have a subset of roles allowing them to perform some TBSM administration? How will you authenticate your end users? Will you integrate with a corporate LDAP or Active Directory source? Do you know what information is required to establish these integrations?
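One way to work through these questions is to write the access model down as data before touching the product. The Python sketch below does that; the role names, group names and actions are hypothetical and are not the actual TBSM or TIP role names.

```python
# Hypothetical roles and the actions each one grants.
ROLE_PERMISSIONS = {
    "viewer": {"view_dashboard"},
    "operator": {"view_dashboard", "acknowledge_event"},
    "superuser": {"view_dashboard", "acknowledge_event", "edit_template"},
    "admin": {"view_dashboard", "acknowledge_event", "edit_template", "manage_users"},
}

# Hypothetical mapping from directory groups (e.g. LDAP groups) to roles.
GROUP_ROLES = {
    "executives": {"viewer"},
    "noc": {"operator"},
    "bsm_team": {"superuser"},
}

def permissions_for(groups):
    """Union of all actions granted by the roles attached to the user's groups."""
    allowed = set()
    for group in groups:
        for role in GROUP_ROLES.get(group, set()):
            allowed |= ROLE_PERMISSIONS[role]
    return allowed

def is_allowed(groups, action):
    return action in permissions_for(groups)
```

Here the "superuser" role is a strict subset of "admin", which matches the idea of delegating some administration without handing over everything.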
Behavior Modeling (component, application, transaction, process/activity, sub-service, service, etc.): I’ve talked about the concept of behavior modeling in many of my posts, specifically in the WYNTK on TBSM Design Patterns series. Far too many clients simply take the “get some color in the model” approach and map all incoming events by severity to the managed system name (hostname). This is the least useful approach for creating your template and service models. Instead, you need to design behavior models for your environment using your own custom designed template library. Think of this as the approach for holistically assessing the sum of all the parts for a specific “thing” in your environment. I typically start with these categories: Availability, Performance, Capacity, User Experience, State/Status, etc. There are dozens of categories that can be used here (I’ll write more) but you need to design the model for each component. What incoming status rules, metrics, KPIs, etc. will you use to SPECIFICALLY assess each category? DO NOT JUST MAP IN ALL EVENTS WITHOUT A PURPOSE AND DESIGN!
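Here is a minimal Python sketch of what "a rule per category" can look like, as opposed to mapping severity straight onto a hostname. It is not TBSM template syntax. The State values, the metric names and every threshold are assumptions picked for the example.

```python
from enum import IntEnum

class State(IntEnum):
    # Illustrative ordering only; higher means worse.
    GOOD = 0
    MARGINAL = 1
    BAD = 2

def threshold_rule(metric, marginal, bad):
    """Build a rule that maps one named metric to a State using two thresholds."""
    def rule(metrics):
        value = metrics.get(metric)
        if value is None:
            return State.GOOD  # no data is treated as good in this sketch
        if value >= bad:
            return State.BAD
        if value >= marginal:
            return State.MARGINAL
        return State.GOOD
    return rule

# Hypothetical behavior model for a web server: each category has its own rules.
WEB_SERVER_MODEL = {
    "Availability": [threshold_rule("failed_health_checks", 1, 3)],
    "Performance": [threshold_rule("response_ms", 500, 2000)],
    "Capacity": [threshold_rule("cpu_pct", 80, 95), threshold_rule("disk_pct", 85, 95)],
}

def assess(model, metrics):
    """Return the worst state per category, so each category can be read on its own."""
    return {
        category: max(rule(metrics) for rule in rules)
        for category, rules in model.items()
    }
```

With this shape, a full disk shows up as a Capacity problem and a slow page as a Performance problem, instead of both collapsing into one colored box.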
Template Model and Design: Building on the behavior modeling concept, how will you design the right library of templates for your environment? How will you identify the patterns of IT infrastructure and create representative template models that exhibit the correct propagation and aggregation behaviors? Think about the concept of generic templates and specific templates. How will you manage all Windows servers? How will you manage Windows 2003 and Windows 2008 servers? Will you do this all the same or will you have unique approaches for one over the other? State propagation and aggregation are critical and often poorly implemented features of TBSM. Do you know how to accurately model highly available, redundant or load balanced infrastructure? Do you know the behaviors of common components like Web, App and Database servers? Will you design for a containment model approach or an end-to-end flow model approach? Will you take a shallow approach or deep approach? How will you adopt the KISS approach? How will you make use of the autonomic capabilities that will greatly improve your TBSM administrative experience?
Service Decomposition and Modeling: Finding the right level of detail to empower your end user audiences to think, act and respond differently is key here. Will you simply decompose a service into its functional groupings or tiers or take a more detailed end-to-end flow approach? How will you assess the architectural implementations and accurately represent them using your template libraries? Focus on this area in your design with the right end state in mind. There’s no sense in creating thousands of service instances for every component in your environment if it makes the end solution too complex to use. This takes significant thought in conjunction with the previous two areas to create service models that enable the most value to be realized from the TBSM solution.
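To show why aggregation for redundant or load balanced infrastructure deserves real design, here is a small Python sketch comparing two aggregation rules over the same set of children. The state constants, function names and percentages are assumptions for illustration and are not TBSM's own rule definitions.

```python
# Illustrative states; higher means worse.
GOOD, MARGINAL, BAD = 0, 1, 2

def worst_of(child_states):
    """Parent takes the worst child state. Suits a chain with no redundancy."""
    return max(child_states, default=GOOD)

def percent_bad(child_states, marginal_pct, bad_pct):
    """Parent degrades by the share of bad children. Suits a load-balanced pool."""
    if not child_states:
        return GOOD
    pct = 100 * sum(1 for s in child_states if s == BAD) / len(child_states)
    if pct >= bad_pct:
        return BAD
    if pct >= marginal_pct:
        return MARGINAL
    return GOOD

# Four load-balanced web servers, one of them down.
pool = [GOOD, GOOD, GOOD, BAD]
print(worst_of(pool))             # the whole pool is reported as BAD
print(percent_bad(pool, 25, 75))  # the pool is reported as MARGINAL
```

Same children, two very different answers. Picking worst-of for a pool that still serves traffic on three of four nodes is exactly the kind of poorly implemented propagation that makes end users stop trusting the model.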
Scorecard Design: Scorecard design begins with understanding the message you’re trying to communicate to the end user audience. Why do you need to create the scorecard? What decisions, actions or questions do you want the information in the scorecard to prompt from the audience? Do you want someone to navigate from the scorecard? What conditional formatting will you apply? Will you override the state or status of other service instances within the scorecard? How will you handle aggregation and roll ups of information? Will you need to create “dummy instances” to better organize your instances for a more logical presentation? Will you need to put text (varchar/string) or special characters ($, %, :, sec, etc.) within the scorecards? What will your column naming scheme be to make the most efficient use of the scorecard real estate throughout all levels? How will you use the column sizing tool? What templates will be applicable to your scorecard? How will you handle gaps in the hierarchy?
Charting/Reporting Design: Similar to above, what are you trying to accomplish with your charts and reports? Don’t create charts and reports if they’re not there for a specific purpose for each audience. Don’t just create eye candy! When will you use TBSM Charts, TIP Charts or TCR Reports? Do you know the right application of pie, bar or line charts? Will you need to create sophisticated, parameter driven reports? Have you established your own guidelines for which one to use and when? How do your end users define real-time, near real-time, near term and long term historical? Do you need to establish a design guide for charting and reporting (look, feel, font, colors, logo)? What is your report distribution and scheduling design? Do you need to get PDFs sent out to your end users by 9 AM each morning? How will you perform scalability and performance testing? Do you need to establish an AAA schema for controlling access to certain charts or reports?
Custom Dashboard and Layout Design: Ahh, my specialty. There are way too many design guidelines to cover in this post, but this is the sum of all the parts. Each and every component of the custom dashboard and layout must be designed, and designed with a purpose. Controlling information overload is critical. How the content you develop will prompt action, decision making and navigation must be fully understood and designed for. Navigation and launch in context within and from the custom dashboard and layout must be designed. You must have a design plan for every component, every clickable option and every menu. How will you visualize complex data? Do you need to create a “walled garden”? Do you need to navigate from one custom dashboard to another to another? What view definitions and “dummy instances” do you need to create to support your custom dashboards and layouts? How will you integrate your AAA schema into the custom dashboard and layouts?
In Closing
No design alone is a guarantee that you’ll be able to successfully implement the solution. I’ve seen plenty of great design documents result in failed implementation and execution. KISS is a good rule to design by. Don’t design in unnecessary complexity. Don’t use a feature or capability “just because”. Get feedback from your peers and end users to be sure you’ve designed things that will meet their requirements. Design documents can be thought of as the “CYA” component of your TBSM v4.2 solution implementation. They’re great for keeping your project (and staff) organized and on track to ensure that all bases are covered. Far too often I’ve seen a lack of attention to detail derail the end user experience, with expectations going unmet because things were not thought through well enough in the beginning.
I’ll share some thoughts on implementing your designs next.