thoughts on business, service and technology operations and management

Interesting Links for October 20th

Links that I have found interesting for October 20th:

  • Integrien Files Four Patents for Real-Time Performance Analytics - extend Integrien’s four existing patent filings in the development of a new mathematical foundation for analyzing IT performance data to enable the prediction of performance and availability problems in complex IT systems.

    “Recently, we’ve begun to see claims from traditional systems management vendors of ‘predictive analytics.’ However, the underlying foundation of these technologies is based on standard statistical methods and techniques,” said Dr. Mazda Marvasti, CTO of Integrien. “Through our experience with live customer environments, we have learned that these standard methods are inapplicable to IT data and that a new foundation is required to make sense of such data. These patent filings represent the latest work in the development of this foundation, without which, real-time, predictive analytics for IT performance data is not possible.”

    * This is a great PR!! SMACK! Don't we have even one patent lying around that could put us in the predictive/proactive space?? *

  • Tideway’s CEO on the Green Data Center - the new key metric is “performance-per-watt,” and there is plenty of scope to improve it. The problem is the people, processes and systems required to drive this initiative forward are seriously lagging behind. A research program on best practices set up by the EPA has only 54 volunteers. Most data center administrators are nowhere close to knowing what’s running on which servers in their data centers. That makes adoption of more efficient hardware and co-locating software programs on a single box through virtualization risky, slow and arduous.

October 20, 2008   No Comments

Novell and Managed Objects: Vision, Convenience, Need or ?

The announcement of the definitive agreement for Novell to purchase Managed Objects has been on the street for almost a week now. I’ve held back partly because of a pressing work engagement, but also to try and really think about this some on my own. I’ll admit I do not know much about Novell, its recent acquisitions, vision or strategy. What crosses my mind about Novell is their legacy and recent entrance into the world of open source Linux with SUSE and virtualization with ZENworks stuff.

I’ve read through their website and see the very early beginnings of marketing and spin to play in the enterprise buzzword bingo game of datacenter automation, virtualization, Linux servers/desktops, compliance, security and basic systems management. From what I see they have a long way to go to fill in the gaps if one were to compare them to the mega product suites of the incumbent “big4” or “other6”.

Some questions that initially come to my mind are:

  • What’s the bigger strategy and vision?
  • Is this really a CMDB play with BSM as a nice to have?
  • Will MO become a Novell oriented solution?
  • Will the Novell name exclude MO from some clients and opportunities?
  • Will this help or hinder their green field competition against the “big4”?
  • Will they be relegated to only compete with the “tier 2” business service management providers (Quest Software, Compuware, ASG, etc)?
  • How will Novell fill out some of the key gaps in its portfolio to fully enable an end-to-end solution that can leverage MO’s architecture and data collection capabilities? (systems, network, application, service, transaction, user experience, etc.) I see a big need here, starting with basic systems, network and application management (open source opportunities?).
  • How will the MO target markets or customers change?
  • Will MO sales and engineering team experience be diluted by other product quotas and priorities? BSM takes a unique sales team and ideally a very consultative approach to sell the value proposition and overall lifecycle. With more products to sell, how will this impact the overall BSM message?

I hope to speak with some of my friends at MO in the near future as the dust settles to get a glimpse into the future vision and strategy. In any case, it’s all business and the MO products can and will continue to stand on their own and remain a viable option for many around the world. I look forward to developments here and within the field in competitive situations. There will no doubt be wins and losses for both of our companies!

Other notable press and discussion available here:

October 20, 2008   4 Comments

WYNTK on TBSM v4.2 Preparation: Architecture, Design, Implementation, Operationalization Part 2

WYNTK on TBSM v4.2 Design

This is generally where those who will be the most successful TBSM clients separate themselves from the average TBSM client. (Putting execution, politics and other internal stuff aside for now.)

The TBSM architecture phase is that component which is going to provide you with a solid platform to build upon. It’s the TBSM design phase that defines exactly how the architecture and installed software will be leveraged to meet the value proposition and overall goals and objectives. For my WYNTK on TBSM v4.2 Architecture, visit here.

Designing solutions built upon TBSM and the enabling BSM products and datasources happens on many levels and dimensions. I will introduce the high level design areas that should be considered in any TBSM deployment project.

Design Goals and Objectives

BSM Event Design: As previously mentioned in another WYNTK on TBSM v4.2, events are one of the key components of any TBSM solution. A well thought out design for how all incoming event sources will be incorporated, normalized and enriched to a defined BSM Event Standard is critical.
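To make the idea of a defined BSM Event Standard concrete, here is a minimal Python sketch of normalizing events from two sources onto one common shape. Everything in it is an assumption for illustration: the field names, the source names (monitor_a, monitor_b) and the severity vocabularies are hypothetical and are not a Netcool/OMNIbus or TBSM schema.

```python
from dataclasses import dataclass

# Hypothetical "BSM Event Standard": these field names are illustrative only,
# not an actual Netcool/OMNIbus or TBSM schema.
@dataclass(frozen=True)
class BsmEvent:
    source: str
    resource: str   # normalized managed resource name
    category: str   # e.g. "Availability", "Performance"
    severity: int   # 0 (clear) .. 5 (critical)
    summary: str

# Each source uses its own severity vocabulary (assumed values); map them
# all onto one numeric scale.
SEVERITY_MAP = {
    "monitor_a": {"ok": 0, "warn": 3, "crit": 5},
    "monitor_b": {"CLEAR": 0, "MINOR": 2, "MAJOR": 4, "FATAL": 5},
}

def normalize(source: str, raw: dict) -> BsmEvent:
    """Map a raw event dict from a known source onto the common standard."""
    severity = SEVERITY_MAP[source][raw["severity"]]
    resource = raw["host"].split(".")[0].lower()  # drop the domain, fold case
    category = raw.get("category", "Availability")
    return BsmEvent(source, resource, category, severity, raw["text"].strip())
```

The point of the sketch is only that every source ends up with the same fields and the same severity scale before any service model consumes it.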

Event Management Event Design: I’m torn here as to whether I recommend this, but I know many clients will want to leverage the included Netcool/WebTop for event management functions. This is an important design phase if you intend to have end users actively managing events. This could be the design for how tickets will be created from events, how events are enriched or suppressed during maintenance windows, or implementing an event acknowledgement, ownership, transfer schema. Establishing an event schema for these activities is critical if you plan for event management scenarios using TBSM v4.2 (Netcool/WebTop).
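As a rough illustration of the kind of rules such a schema has to pin down, the following Python sketch decides whether an event should open a ticket. It assumes a hypothetical maintenance calendar, an "owner" field that marks an event as already taken, and a severity threshold of 4. None of these names come from Netcool/WebTop or OMNIbus.

```python
from datetime import datetime

# Hypothetical maintenance calendar: resource -> list of (start, end) windows.
MAINTENANCE_WINDOWS = {
    "web01": [(datetime(2008, 10, 18, 22, 0), datetime(2008, 10, 19, 2, 0))],
}

def in_maintenance(resource, when):
    """True if `when` falls inside any maintenance window for the resource."""
    windows = MAINTENANCE_WINDOWS.get(resource, [])
    return any(start <= when < end for start, end in windows)

def should_open_ticket(event, min_severity=4):
    """Suppress during maintenance, skip owned events, ticket the rest by severity."""
    if in_maintenance(event["resource"], event["time"]):
        return False
    if event.get("owner"):  # someone has already taken ownership
        return False
    return event["severity"] >= min_severity
```

Writing the rules down this explicitly, in whatever form, is what lets you check them with the operations team before anyone builds them.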

Event Integration: With the world’s leading event management platform provided with TBSM v4.2 you will want to design the event integration requirements appropriate for your environment. Which probes, gateways or other custom means will be required to feed events into Netcool/OMNIbus? How will you align these to your event designs described above? How will you handle stateful and non-stateful events? Which events require sophisticated processing, lookups, conversions, suppression, etc.? Lots of effort should be placed here!

Datasource Integration Design: If you’re integrating with various databases, web services or XML sources, design specifications for these interfaces need to be thought through. How will you integrate? What’s the most efficient approach for getting the necessary data? What types of SQL queries may be required? Will you hit a full table or a view? If you’re going to pull from a web services interface, do you understand the various components such as the WSDL and structure for query and response? How will you throttle queries to not impact the other system? Will you need to change the caching on TBSM to accommodate the amount of data returned?
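The throttling and caching questions can be sketched in a few lines of Python. This is a generic wrapper and not TBSM's own caching mechanism: the class name, the fetch callable and the interval and TTL values are all assumptions made for illustration.

```python
import time

class ThrottledCachedSource:
    """Wrap a query function with a result cache and a minimum call interval."""

    def __init__(self, fetch, min_interval=5.0, ttl=60.0, clock=time.monotonic):
        self._fetch = fetch                # callable that hits the remote system
        self._min_interval = min_interval  # seconds between remote calls
        self._ttl = ttl                    # seconds a cached result stays fresh
        self._clock = clock
        self._cache = {}                   # query -> (timestamp, result)
        self._last_call = None

    def query(self, q):
        now = self._clock()
        cached = self._cache.get(q)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]               # fresh enough, no remote call
        too_soon = (self._last_call is not None
                    and now - self._last_call < self._min_interval)
        if too_soon:
            if cached is not None:
                return cached[1]           # stale, but spares the other system
            raise RuntimeError("throttled and no cached result available")
        result = self._fetch(q)
        self._last_call = now
        self._cache[q] = (now, result)
        return result
```

The design choice worth debating is what to return when a call is throttled: this sketch prefers a stale answer over hitting the source system again.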

AAA Design: As I talked about in a previous WYNTK on TBSM v4.2, the design for AAA is crucial towards the end state solution you desire. How will you structure access and authorization to the various TBSM components? What level of access control granularity is required? Will you have requirements for read-only users or groups? Will you have TBSM “superusers” that have a subset of roles allowing them to perform some TBSM administration? How will you authenticate your end users? Will you integrate with a corporate LDAP or Active Directory source? Do you know what information is required to establish these integrations?
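One way to reason about access granularity is to write the intended group-to-role mapping down before touching the product. The Python sketch below does that with entirely hypothetical directory groups, role names and permissions. It is a planning aid, not how TBSM or TIP stores roles.

```python
# Hypothetical roles and permissions, for design discussion only.
ROLE_PERMISSIONS = {
    "read_only": {"view_services", "view_scorecards"},
    "operator": {"view_services", "view_scorecards", "acknowledge_events"},
    "superuser": {"view_services", "view_scorecards", "acknowledge_events",
                  "edit_templates"},
}

# Hypothetical LDAP/Active Directory groups mapped to roles.
GROUP_ROLES = {
    "cn=noc-viewers": {"read_only"},
    "cn=noc-operators": {"operator"},
    "cn=bsm-admins": {"superuser"},
}

def allowed(groups, action):
    """True if any role granted through the user's groups permits the action."""
    roles = set().union(*(GROUP_ROLES.get(g, set()) for g in groups))
    return any(action in ROLE_PERMISSIONS[role] for role in roles)
```

For example, allowed(["cn=noc-viewers"], "acknowledge_events") is False under this mapping, which is exactly the kind of read-only requirement the questions above are meant to surface.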

Behavior Modeling (component, application, transaction, process/activity, sub-service, service, etc.): I’ve talked about the concept of behavior modeling in many of my posts, specifically in the WYNTK on TBSM Design Patterns series. Far too many clients simply take the “get some color in the model” approach and map all incoming events by severity to the managed system name (hostname). This is the least useful approach for creating your template and service models. Instead, you need to design behavior models for your environment using your own custom designed template library. Think of this as the approach for holistically assessing the sum of all the parts for a specific “thing” in your environment. I typically start with these categories: Availability, Performance, Capacity, User Experience, State/Status, etc. There are dozens of categories that can be used here (I’ll write more) but you need to design the model for each component. What incoming status rules, metrics, KPI, etc. will you use to SPECIFICALLY assess each category? DO NOT JUST MAP IN ALL EVENTS WITHOUT A PURPOSE AND DESIGN!
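Here is a small Python sketch of the difference between a designed behavior model and mapping everything by severity. The category names are the ones listed above. The event types, the 0 to 5 severity scale and the function name are assumptions for illustration and are not TBSM template syntax.

```python
# Hypothetical behavior model for a "web server" component. Each category is
# assessed only by the event types deliberately assigned to it.
WEB_SERVER_MODEL = {
    "Availability": {"process_down", "port_unreachable"},
    "Performance": {"response_time_high"},
    "Capacity": {"disk_full", "connection_pool_exhausted"},
}

def assess(model, events):
    """Return the worst severity per category; unclaimed events are ignored."""
    status = {category: 0 for category in model}
    for event in events:
        for category, event_types in model.items():
            if event["type"] in event_types:
                status[category] = max(status[category], event["severity"])
    return status
```

An event type that no category claims (a fan speed warning, say) does not color the model at all, which is the opposite of the "get some color in the model" approach.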

Template Model and Design: Building on the behavior modeling concept, how will you design the right library of templates for your environment? How will you identify the patterns of IT infrastructure and create representative template models that exhibit the correct propagation and aggregation behaviors? Think about the concept of generic templates and specific templates. How will you manage all Windows servers? How will you manage Windows 2003 and Windows 2008 servers? Will you do this all the same or will you have unique approaches for one over the other? State propagation and aggregation are critical and often poorly implemented features of TBSM. Do you know how to accurately model highly available, redundant or load balanced infrastructure? Do you know the behaviors of common components like Web, App and Database servers? Will you design for a containment model approach or an end-to-end flow model approach? Will you take a shallow approach or deep approach? How will you adopt the KISS approach? How will you make use of the autonomic capabilities that will greatly improve your TBSM administrative experience?
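To show why aggregation matters for redundant or load balanced infrastructure, here is a Python sketch comparing two roll-up rules. The state values and the 25%/75% thresholds are assumptions chosen for the example. The sketch illustrates the general technique of percentage-based aggregation and is not TBSM's rule syntax.

```python
GOOD, MARGINAL, BAD = 0, 3, 5   # assumed state values for illustration

def worst_child(states):
    """Naive roll-up: the parent takes the worst state of any child."""
    return max(states, default=GOOD)

def cluster_state(states, marginal_pct=25, bad_pct=75):
    """Percentage roll-up for a load balanced pool of equivalent members."""
    if not states:
        return GOOD
    pct_bad = 100 * sum(s == BAD for s in states) / len(states)
    if pct_bad >= bad_pct:
        return BAD
    if pct_bad >= marginal_pct:
        return MARGINAL
    return GOOD
```

With four web servers behind a load balancer and one of them down, worst_child reports the pool as BAD while cluster_state reports MARGINAL, which is closer to what the end users are actually experiencing.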

Service Decomposition and Modeling: Finding the right level of detail to empower your end user audiences to think, act and respond differently is key here. Will you simply decompose a service into its functional groupings or tiers or take a more detailed end-to-end flow approach? How will you assess the architectural implementations and accurately represent them using your template libraries? Focus on this area in your design with the right end state in mind. There’s no sense in creating thousands of service instances for every component in your environment if it makes the end solution too complex to use. This takes significant thought in conjunction with the previous two areas to create service models that enable the most value to be realized from the TBSM solution.

Scorecard Design: Scorecard design begins with understanding the message you’re trying to communicate to the end user audience. Why do you need to create the scorecard? What decisions, actions, questions do you want the audience to make based on the information you present in the scorecard? Do you want someone to navigate from the scorecard? What conditional formatting will you apply? Will you override the state or status of other service instances within the scorecard? How will you handle aggregation and roll ups of information? Will you need to create “dummy instances” to better organize your instances for a more logical presentation? Will you need to put text (varchar/string) or special characters ($, %, :, sec, etc) within the scorecards? What will your column naming scheme be to make the most efficient use of the scorecard real estate throughout all levels? How will you use the column sizing tool? What templates will be applicable to your scorecard? How will you handle gaps in the hierarchy?

Charting/Reporting Design: Similar to above, what are you trying to accomplish with your charts and reports? Don’t create charts and reports if they’re not there for a specific purpose for each audience. Don’t just create eye candy! When will you use TBSM Charts, TIP Charts or TCR Reports? Do you know the right application of pie, bar or line charts? Will you need to create sophisticated, parameter driven reports? Have you established your own guidelines for which one to use and when? How do your end users define real-time, near real-time, near term and long term historical? Do you need to establish a design guide for charting and reporting (look, feel, font, colors, logo)? What is your report distribution and scheduling design? Do you need to get PDFs sent out to your end users by 9AM each morning? How will you perform scalability and performance testing? Do you need to establish an AAA schema for controlling access to certain charts or reports?

Custom Dashboard and Layout Design: Ahh, my specialty. Way too many guidelines for design in this post but this is the sum of all the parts. Each and every component of the custom dashboard and layout must be designed and designed with a purpose. Controlling information overload is critical. Understanding how the content you develop will prompt action, decision making and navigation must be fully understood and designed for. Navigation and launch in context within and from the custom dashboard and layout must be designed. You must have a design plan for every component, every clickable option and every menu. How will you visualize complex data? Do you need to create a “walled garden”? Do you need to navigate from one custom dashboard to another to another? What view definitions and “dummy instances” do you need to create to support your custom dashboards and layouts? How will you integrate your AAA schema into the custom dashboard and layouts?

In Closing

No design alone is a guarantee that you’ll be able to successfully implement the solution. I’ve seen plenty of great design documents result in failed implementation and execution. KISS is a good rule to design by. Don’t design in unnecessary complexity. Don’t use a feature or capability “just because”. Get feedback from your peers and end users to be sure you’ve designed things that will meet their requirements. Design documents can be thought of as the “CYA” component of your TBSM v4.2 solution implementation. They’re great for keeping your project (and staff) organized and on track to ensure that all bases are covered. I’ve seen far too many cases where a lack of attention to detail derailed the end user experience and left expectations unmet because things were not thought through well enough in the beginning.

I’ll share some thoughts on implementing your designs next.

October 17, 2008   No Comments

Interesting Links for October 9th

Links that I have found interesting for October 9th:

  • QSFT Quest Software appoints CEO and executive chairman - Quest Software, a provider of enterprise systems management software products, has appointed Doug Garn as its new CEO. The firm also appointed Vinny Smith to the newly created position of executive chairman of the company's board of directors.

October 10, 2008   No Comments

Interesting Links for October 8th

Links that I have found interesting for October 8th:

  • Harris: Nimsoft is growing fast and bullish on its future - San Jose Mercury News - Gary Read, chief executive of Redwood City-based Nimsoft, professes to have the inside story: "It's because Goldman Sachs completed their investment in us that Warren decided to invest in them."
  • myDIALS Named Top 10 Company to Watch in 2009 by Managing Automation - myDIALS, which is pioneering a new standard in operational performance management, today announced that the company has been recognized as one of Managing Automation's "Companies to Watch in 2009." In the October 2008 issue, Managing Automation, a leading publication that focuses on technology solutions for progressive manufacturers, recognizes 10 technology companies that have demonstrated forward-thinking solutions to address manufacturers' pain points.
  • ITIL Process WIKI - Wiki for generic IT Service Management Process descriptions based on Ideas of ITIL V2, ITIL V3, ISO 20000 and experience.

    The Process descriptions are published under Creative Commons Licenses Attribution-Noncommercial-No Derivative Works 3.0 Unported.

  • Nimsoft Blogs » Woo Hoo - we're excited!!!!! - An InformationWeek product review ranked Nimsoft the “Best of the Best” among nine application performance management (APM) solutions. APM is a critical segment of systems management because it monitors the performance of the customer experience – in addition to core IT infrastructure performance. InformationWeek reviewers noted Nimsoft was “a real leader” in service level management for its “flexible and robust SLA reporting engine” and ability to “report SLA performance granularly … which many APM tools are unable to do.”
  • How to Succeed at Capacity Planning Without Really Trying : An Interview with Flickr's John Allspaw on His New Book | High Scalability - When I read statements about The Art of Capacity Planning like capacity planning is a term that to me means paying attention, All the information you need to make an educated forecast is in your historical metrics, and startups that are going to experience massive growth simply don't have time for anything but a 'steering by your wake' approach, I get the same sea change feeling I felt when the industry ran from waterfall design and embraced agile design. Current capacity planning is heavy. All up-front. Too analytical and too divorced from real life.

    Other capacity planning books assault you with models, math, and simulations. Who has the time? John has developed a common sense, low math approach to capacity planning that works using the system you already have. John's goal is to have you say: Oh, right, duh. That's common sense, not voodoo.

    Here's my email interview with John Allspaw on The Art of Capacity Planning. Enjoy.

  • Goldman Sachs leads $12 million investment in Nimsoft - Goldman Sachs isn't letting the current economic meltdown hinder its investment in systems management software maker Nimsoft. The financial services firm is among three contributors to pony up funds for Nimsoft’s $12 million second round of funding.

    "Despite the tough economic times, we were overwhelmed with interest from multiple prestigious investors," said Gary Read, president and CEO of Nimsoft, in a statement. "An investment from Goldman Sachs is further recognition of the strength of our business model, product performance and future business opportunity."

    *** Congrats Gary - you still owe me discussion on your BSM strategy! :-) ***

  • Zenoss Adds Network Management and Systems Management Veterans to Leadership Team From Sun Microsystems, IBM, BMC, CA, and Novell - MarketWatch - Zenoss, Inc., a leading provider of commercial open source application systems, and network management software, today announced the expansion of its executive team to add five seasoned professionals in systems management and enterprise software. These additions will help to continue the exponential customer growth and demand for Zenoss' enterprise-grade open source systems management solutions.

October 9, 2008   No Comments

Interesting Links for October 7th

Links that I have found interesting for October 7th:

  • IBM Tivoli Expanded to Include Predictive Analytics - IBM Tivoli has announced new capabilities for its event management software designed to enable customers to predict potential events such as network outages, online trading transaction failures, cell phone service disruptions or retail point-of-sale downtime.

    The IBM Tivoli offers new “predictive analytics” that take event management to the next level through trending and historical data to help customers be proactive instead of reactive when managing millions of potential events per day.

    These predictive capabilities are now available in recently released versions of IBM Tivoli Monitoring, IBM Tivoli Netcool/OMNIbus and IBM Tivoli Network Manager. The new version of Tivoli Business Service Manager – available later this fall – will also feature these new capabilities.

    *** Hmmm, something doesn't smell right here to me….***

  • People Over Process » myCMDB Demo - You might remember Managed Objects myCMDB announcement from a little while ago. In my view, myCMDB is trying to provide a layer on-top of CMDBs (Managed Objects own, BMC’s, and “custom”) that pulls features from the consumer, Web 2.0 world. As you go through the overview and then demo, you can pull out that the main idea with myCMDB is to get people using the CMDB more. There are other things, of course, like reports a-plenty and activity streams.
  • Aligning Business Process Management, Service-Oriented Architecture, and Lean Six Sigma for Real Business Results - This paper is intended to help companies that are leaders in their markets and are looking for new ways to differentiate themselves from their competitors. In this paper, we describe the key BPM, SOA, and LSS components, highlight the linkages between them, and summarize the results that leading firms have achieved. We outline the think big, start now steps that are needed to move your own initiative forward. In this paper, we also suggest ways to successfully avoid some of the barriers that have hampered others by focusing on the tools that deliver measurable results quickly.

October 8, 2008   No Comments

Interesting Links for October 7th

Links that I have found interesting for October 7th:

  • The IT-Finance Connection - It is axiomatic that IT and finance folks “speak different languages.” This is a label for something that is a lot deeper and pervasive.

    The bottom line is that companies are losing revenue and seeing their competitive position slip because these two groups don’t communicate well. Indeed, they often don’t try.

    The IT-Finance Connection covers these and myriad other issues from the perspective of the information that must flow between IT and Finance in order to make the wisest decisions. This information transfer is where the rubber meets the road.

  • The Value of Business Transaction Management - Business transaction management enables assessment and control of IT functions and equipment through the experiences of end users. The emerging category, says Motti Tal, the executive vice president of marketing, product and business development at OpTier, leads to higher customer satisfaction and reduces the number and severity of outages.

October 7, 2008   No Comments

WYNTK on TBSM v4.2 Preparation: Architecture, Design, Implementation, Operationalization Part 1

These are critical key areas for preparing for your TBSM v4.2 deployment. I’d strongly recommend that you think through the bigger picture here as you begin your TBSM v4.2 journey, especially if you’re investing in many other Tivoli products. The desired end state for most clients investing in TBSM v4.2 is to have the consolidated operations management platform with end-to-end visibility realized. You can’t get there if you’re not thinking about how the sum of all the decisions and parts will come together in the end.

Part 1: Architecture

There are many significant changes to the fundamental software architecture that require you to evaluate (or re-evaluate) your architecture in four core areas.

Software Architecture (Failover, Load Balancing, Standalone, Split Server)

Considerations for how you will deploy the software should be discussed as early as possible. How you decide to move forward probably has a lot to do with whether or not you position TBSM v4.2 as a critical business and IT application. Many clients choose not to implement a highly available architecture initially. There are certainly economic and total cost of ownership considerations for not implementing this in the near term. The architecture for failover requires more hardware investment up front. Fancy ‘warm or cold’ standby systems approaches for failover can’t be used here; you’ve got to deploy and configure the systems on day one or go through the potential headaches of re-configuration at a later date.

TBSM v4.2 introduces a new architecture model, in part to support the new Tivoli Integrated Portal (TIP) platform but also in part to support a front end GUI and back end processing server split for increased scalability and performance. The dashboard server and data server concept basically enables scaling out the front end, offloading presentation layer processing from the back end data (events, datasources). TIP incorporates many components such as the application server (eWAS), portal server framework (ISC), Tivoli Common Reporting (TCR), AAA, and associated data access/API layers. When other TIP enabled products such as Netcool/WebTop in TBSM v4.2 are installed, they are installed into the TIP server (dashboard server).

When to add more dashboard servers (TIP servers) is definitely a bit fuzzy at this time. There are certainly front end user load dynamics that come into play here, and you need a firm understanding of the user volume and what those users will be doing. I’m thinking about how to classify types of users versus how TBSM v4.2 will be used to gauge these decisions along lines similar to these:

  • Passive or Read Only Users
  • Light TBSM/Event Management (both servers, OMNIbus server)
  • Active TBSM/Event Management (both servers, OMNIbus server)
  • Light Charting/Reporting (dashboard server)
  • Heavy Charting/Reporting (dashboard server)
  • Light TIP Customization (dashboard server)
  • Heavy TIP customization (dashboard server)
  • Multiple TIP enabled applications installed (TBSM, WebTop, ITNM, etc) (dashboard servers)
  • etc.

When adding multiple front end dashboard servers one should always think about the end user and how they’ll be directed into the application. Most clients will front end multiple servers with a hardware server load balancer. TBSM v4.2 *may* include a piece of software called the IBM HTTP Server (from WebSphere) that is basically a software based server load balancer. If you choose neither approach and you have multiple front end dashboard servers you will need to educate your end users on how to move from one IP address to another should a server failure occur. Also note that TBSM v4.2 dashboard servers in a load balanced configuration (there’s really not a failover configuration here) require that an instance of DB2 v9 is available to support this load balanced architecture. This DB2 v9 instance is not managed by TBSM v4.2 in any way so be sure to invite your DB group to the planning meetings or be prepared to manage this DB2 v9 instance. (The DB2 v9 license entitlement has been verified, but AFAIK it’s not in the installation media. At this time, I do not have firm information that the IBM HTTP Server is being provided.)

My personal choice for future architectural improvements would be addressing TBSM v4.2 back end scalability improvements needed (supporting multiple Netcool/OMNIbus ObjectServers) or a hierarchical based model where multiple TBSM implementations could be aggregated to a master TBSM server (model aggregation, state and status across multiple TBSM servers and associated Netcool/OMNIbus ObjectServers). I think these are the last core architectural improvements needed apart from the much needed broad based expansion of ILOG use for all visualization and GUI components instead of Dojo Widgets and BIRT charts. A premium architecture upgrade would include some of the Cognos dashboard and GUI components like Decision Dashboard or Cognos Now operational BI platform. Hopefully these will come next year and not take three years to get done!

I think it’s probably a good starting point to assume a three server architecture as a minimum and doubling that when failover resiliency is required.
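The arithmetic is simple, but spelling it out helps in planning meetings. This Python sketch assumes the three servers are one dashboard server, one data server and one Netcool/OMNIbus ObjectServer. That split is my reading of the roles discussed above and is not an official sizing rule, and the function name is hypothetical.

```python
# Assumed reading of the three-server minimum: one host per role.
BASE_ROLES = ("dashboard server", "data server", "ObjectServer")

def planned_hosts(failover: bool) -> list[str]:
    """List the hosts to provision; failover doubles every role."""
    hosts = [f"{role} (primary)" for role in BASE_ROLES]
    if failover:
        hosts += [f"{role} (backup)" for role in BASE_ROLES]
    return hosts
```

So planned_hosts(False) lists three hosts and planned_hosts(True) lists six, before you add any extra dashboard servers for user load.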

Operating Systems Architecture (OS, 32b v 64b, IPv4 v IPv6)

There are certainly more choices now in the area of how you choose to deploy the TBSM v4.2 software on an operating system platform. Formal support for more OS’s such as Red Hat 5, SUSE 10, Windows 2008 and z/Linux is now available. I’m not very fresh on the internals of operating system mechanics but the 32b vs. 64b decision is certainly growing more important. I know that TBSM v4.2’s optimization for 64b platforms is still in its early stages, with plenty of room for improvement ahead. IPv4 and IPv6 decisions may be relevant in your environments though I have yet to work with a client who’s concerned here.

I can’t say one OS platform is better than another; IBM will support you within any supported platform for TBSM v4.2. Review the operating system patch level, driver and library requirements. There are some minimum update levels required in certain instances. I do EVERYTHING with Red Hat 4 or 5.

Systems Provisioning (CPU, Memory, Disk)

You will want to carefully review the system sizing recommendations. With the addition of the TIP component (based on eWAS + ISC) there has been an uptick in system resource requirements. Basically, throw as much CPU and memory as you can at TBSM v4.2. Core Netcool/OMNIbus has always been pretty efficient so I don’t expect much change there. You can’t go wrong starting with 4 CPU/8 GB RAM per dashboard and data server as a minimum. Disk requirements should not have changed much in terms of space. I continue to recommend as fast a disk as possible.

TBSM v4.2 should work within a virtual machine environment without any problems. The same CPU, memory and disk requirements hold true here. Some client systems administrator groups like to throttle the recommended requirements down and make application owners justify the CPU/Mem/Disk provisioning recommendations. If you’re in this boat, set application and user experience SLAs with your end users and use these as justification should TBSM v4.2 not perform in accordance with your end users’ requirements.
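If your provisioning requests do get throttled down, a quick check against the starting point suggested above (4 CPU/8 GB RAM per dashboard and data server) makes the gap visible. The Python sketch below uses a hypothetical inventory format. The thresholds are this post's rule of thumb, not IBM's published sizing.

```python
MIN_CPUS = 4     # rule-of-thumb minimum from this post, not an IBM figure
MIN_RAM_GB = 8

def undersized(hosts):
    """Return names of dashboard/data hosts below the suggested minimum."""
    return [
        h["name"] for h in hosts
        if h["role"] in {"dashboard", "data"}
        and (h["cpus"] < MIN_CPUS or h["ram_gb"] < MIN_RAM_GB)
    ]
```

The same check applies to physical and virtual hosts, since the requirements do not change inside a virtual machine.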

Integrations, Scenarios and Use Cases

Integrations should always be included in the high level architectural decisions and designs. TBSM v4.2’s many integration touch points include the event management integration layer into Netcool/OMNIbus utilizing the vast library of Netcool/OMNIbus probes and gateways.

Database integration for use within core TBSM v4.2 service models, charting and reporting should not be overlooked. How will you design access, authorization and authentication integration with a corporate LDAP, AD or other source? Are you using TADDM or other CMDB repositories? How will you integrate with them for building core TBSM v4.2 service models?

Integration with trouble ticketing systems using the Netcool/OMNIbus gateways or TBSM v4.2 Request Processor is a very common requirement. Event enrichment and advanced correlation performed by Netcool/Impact requires some thought and planning to ensure desired functions are provided and event processing in TBSM v4.2 occurs as expected. Lots of things to think of here as expected with Tivoli’s focus on integrations.

Where will event management be done? Is TBSM v4.2’s included Netcool/WebTop the best place to do this? Will you try to use TBSM v4.2 as an SLA platform? Do you really understand what SLA’s are (are not) in terms of TBSM? How about reporting? Do you envision hundreds of custom reports from across the enterprise served up by the TBSM v4.2 platform? Do you see console integration and consolidation as the main driver for your adoption of TBSM v4.2 and TIP? Do you have an environment with other non-Tivoli products?

The point here is that you’ve got to think through all of the intended operating scenarios here and plan out the architectural needs and decisions that must be made to support the desired operating models, scenarios and use cases. TBSM v4.2 is the start of a new platform for business service management for Tivoli. It probably should’ve been a dot zero release because of these significant changes introduced with the TIP architecture. We’ll certainly know more as time moves on and we get more clients on the platform and more experience under our belt, but there’s been a learning curve even for me here!

Next up, I’ll talk about the importance of the design phase.

October 6, 2008   1 Comment