dougmcclure.net — thoughts on business, service and technology operations and management in the digital transformation era

Interesting Links for November 27th

by delicious on November 28, 2009

in General

Links that I have found interesting for November 27th:

John Resig – Deep Tracing of Internet Explorer – After reading a recent post by Steve Souders concerning a free tool called dynaTrace Ajax, I was intrigued. It claimed to provide full tracing analysis of Internet Explorer 6-8 (including JavaScript, rendering, and network traffic). Giving it a try I was very impressed. I tested against a few web sites but got the most interesting results running against the JavaScript-heavy Gmail in Internet Explorer 8.
NASDAQ OMX to Provide Customers With Independent Latency Measurement Service by Correlix (Nasdaq:NDAQ) – The NASDAQ OMX Group (Nasdaq:NDAQ) announced today that it has selected Correlix Inc. to provide real-time latency insight to its customers. The service will allow NASDAQ OMX customers access to independent real-time latency measurement information for intraday and post-day analytics.
NASDAQ OMX will initially provide latency measurements for The NASDAQ Stock Market. Through the Correlix RaceTeam(TM) service subscription, NASDAQ Stock Market trading customers will be able to monitor and analyze real-time latency information associated with their orders, executions and market data. Further plans include expanding the service to make latency information available for other NASDAQ OMX markets around the world.
American Family Mutual Insurance Co. Selects Correlsense SharePath to Ensure Peak Performance and Availability of Its Online Insurance Quoting System – Correlsense, a leading provider of IT Reliability(TM) solutions through business transaction management (BTM), today announced that American Family Mutual Insurance (http://www.amfam.com/default.asp) has selected the Correlsense SharePath solution to ensure peak performance and availability of the insurance company's online insurance quoting system.
Connecting the computer dots | Business Features | Jerusalem Post – SharePath is so unique, Shacham says, that Correlsense has three patents on the technology. "It took us several years to build SharePath; we had to dig deep into the guts of computers and networks, both online and offline, to figure out a way to make these connections," he says. "And as a result, we are able to analyze 100 percent of the traffic in a computer or network, or on a Web site, and determine exactly what is causing certain behaviors."
Morning Round-up – IBM Software Analyst Connect 2009 – What I’m looking for here is how IBM is going to motivate the world’s enterprises to spend money on all these new geegaws and technologies. We seem to be at another point in IT innovation where the features and functionality available are far ahead of what companies are asking for (or know to ask for). The stuff of cloud-nut daily frothing like advanced development and automation, for example, are too uncontrolled, unknown, and new at this point for companies to quantify the risks and benefits of use. Companies like IBM (and all its peers) along with the associated communities need to help IT (and their companies) transition to using these new technologies without slicing off too many toes in the process.
Compuware Gomez Accelerates Diagnosis of Web Application Performance Issues Through New Solution – Compuware Corporation (Nasdaq:CPWR), today announced that Gomez, Compuware's Web performance division, released a new product, Gomez Active Data Center. It connects Gomez's "outside-in" end-user Web performance monitoring with "behind-the-firewall" infrastructure monitoring systems to provide an integrated view of application performance and IT service delivery across the Enterprise and the Internet. By aggregating Gomez Web application performance alerts with internal systems events, IT operations teams can more quickly identify internal systems causing business-impacting Web application performance issues.
Network and Security Operations Convergence – What do you call the new converged NOC and SOC?
It's called the Core Operations Center, not Network Operations or Security Operations.

We are a heavy Microsoft shop and we are leveraging SharePoint to provide us Web-based access to a single portal from anywhere, anytime, like a traditional NOC. We use AccelOps' integrated monitoring, analytics and reporting, for both security and network operations. Also its business service instrumentation can complement the Core Operations Center.
Original MARS Creators Set Out to Take Cisco Users Beyond CS-MARS – AccelOps' founders, who previously created Cisco's popular security monitoring appliance, offer a holistic monitoring approach that results in greater operational control, efficient incident response and compliance automation beyond that of current SIEMs (Security Information Event Management systems).
Functional Parity In The IT Service Management Support Tool Market « Stephen Mann’s Non-Blog – Beyond the delivery of core ITIL-based capabilities, however, many vendors have their own unique selling point or points to differentiate them from the pack. This is usually their own particular slant on, or flavour of, ITSM innovation. For instance, IBM takes a business view extending asset management to all business assets and supporting business service management, Service-now.com offers value for money and convenience through what it terms ‘modern Software-as-a-Service’, FrontRange offers additional ITSM efficiencies through telephony integration, and both CA and HP offer the benefits of true service portfolio management capabilities through their Project and Portfolio Management solutions.


Interesting Links for November 26th

by delicious on November 27, 2009

in General

Links that I have found interesting for November 26th:

Generating events in WebSphere Message Broker for transaction monitoring and auditing – This article shows you how to configure and generate monitoring events in a WebSphere Message Broker message flow. Monitoring events are very useful built-in features for transaction monitoring and auditing, and this article describes them in detail.
Data Center Asset Management, Inventory Control and Configuration Management Database Software for your IT data Center – Data Center Audit (DCA) is an Open Source web application designed for inventory control and tracking of IT data center hardware.
DCA is specifically targeted for small to medium size data center administrators because DCA's strength is in its simplicity, effectiveness, and ease of use.

No Agents! No Device Probing!

Web based. Access anywhere

Single view of equipment details

View available or in-use systems

View end-to-end connections

Reserve (check out) elements

Visual detailed view of each rack

Export views into text CSV's

Detailed log of device activity
Why Monolith? Things Change | A Monitoring Odyssey….. with Monolith Software – ** Go go Bill the Blogger! 🙂 **
I recently joined Monolith and since doing so I have received numerous calls and emails asking me; what is Monolith and why did you go to Monolith???

My experience in the Infrastructure management space has been at Micromuse, Voyence, and EMC. Each firm had a “best in breed” tool which was appropriate for the infrastructure management market at that point in time, but as David Mamet once titled a great movie, “Things Change”.

Over the last several years, I have had customers consistently asking me for end-to-end views of their infrastructure. In the past, the answer was to wheel in multiple products, and then propose a services engagement to stitch them together. In reality, we recommended they add more software, hardware, and administrators, effectively increasing the cost and complexity to address a simple requirement – end-to-end management.
Integrien Claims Record Growth – Irvine-based Integrien, a developer of IT analytics and performance measurement software, reported today that it had "record" quarterly bookings in Q3 of 2009. The privately held firm said its bookings were up 238% year-over-year. Actual financials were not disclosed by the company. The firm said it closed the largest deal in its history in the quarter from a financial firm.
PDQ: Pretty Damn Quick – PDQ (Pretty Damn Quick) is open source software associated with the books Analyzing Computer System Performance with Perl::PDQ (Springer 2005) and The Practical Performance Analyst (McGraw-Hill 1998, iUniverse.com Press 2000). The PDQ software package may be downloaded freely from this web site whether or not you own a copy of the book. PDQ uses queue-theoretic paradigms to represent all kinds of computer systems.
Nastel BTM Webinar Replay – In the blindingly fast world of financial services the difference between gains and losses is often measured in microseconds. Financial Service firms often stay on the bleeding edge of hardware and software in order to ensure ultra low-latency in their transactions. Transaction volumes are increasing and as they do, so does latency. Increased competition also necessitates moving towards an ultra low-latency strategy.
But…

Normal accident theory suggests that in complex, tightly coupled systems (like trading floors), accidents causing latency are inevitable. To defend against "normal" latency, you need visibility – 360° situational awareness of your trading environment.

Please join Charley Rich, VP of Marketing & Product Management at Nastel Technologies and featured guest Ellen Carney, Senior Analyst at Forrester Research as they discuss these issues and how to identify and resolve them.
Correlating Events to Recognize Problems | Heroix Blog – What’s The Problem?
Events can be misleading. Consider an example where several servers are behind a switch. We’ll further assume that we are monitoring the availability of the switch and the servers. When the switch goes down, what happens? A ton of notification is sent alerting everyone that all the servers are down, which is effectively true, but isn’t really the problem. Of course eventually the switch down alert comes in with all the server down messages. This is a simple example, where most good engineers will immediately diagnose the problem when they read the switch down alert, but a lot of messages were sent to notify you of the true problem. I always cringe when I know my boss is getting flooded with email that the sky is falling. Now, what if we use some logic in our notification that only sends out server down messages when the switch is OK, and suppresses all the server down messages when the switch goes down? That would be useful.
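The suppression logic described in the excerpt above can be sketched in a few lines of Python. This is a minimal illustration, not Heroix's implementation: the `Event` class, the `filter_notifications` function, the `parent_of` mapping, and the device names are all hypothetical, and the sketch assumes each device has at most one upstream parent and that events arrive as a single batch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    device: str  # name of the device that raised the event (hypothetical)
    status: str  # "down" or "up"


def filter_notifications(events, parent_of):
    """Return only the events worth notifying on.

    A "down" event is suppressed when the device's parent (for example,
    the switch in front of a server) is also down in the same batch.
    """
    down = {e.device for e in events if e.status == "down"}
    kept = []
    for e in events:
        parent = parent_of.get(e.device)  # None for devices with no parent
        if e.status == "down" and parent in down:
            continue  # symptom of the parent outage, not a separate problem
        kept.append(e)
    return kept


# Hypothetical topology: three servers behind one switch.
parent_of = {"web1": "switch1", "web2": "switch1", "db1": "switch1"}

events = [
    Event("web1", "down"),
    Event("web2", "down"),
    Event("db1", "down"),
    Event("switch1", "down"),
]

notifications = filter_notifications(events, parent_of)
print([e.device for e in notifications])
```

With the switch down, only the switch event survives the filter. If a server goes down while the switch is healthy, its event passes through unchanged.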
The keys to Effective SLAs | Heroix Blog – Service Level Agreements are usually the object of desire, fear, and uncertainty all at the same time. They can be such useful tools that it’s important to demystify them. SLAs are desirable because they provide accountability and timely feedback to managers. They are to be feared when they include factors beyond control or that are poorly aligned with reality. SLAs are commonly approached with a high degree of uncertainty about what to measure and how to report results as an effective tool for all parties. While the ingredients in SLAs are as varied as applications and service providers, all effective SLAs share a few critical characteristics.
Good and Bad SLAs

Let’s start by poking fun at what will be the worst example of an SLA you’ve ever heard of or that I’ve been a party to implementing. I should point out this happened long before I became part of the Heroix team.
The HP Universal CMDB SPARQL Adapter – ITSM is about making IT accountable to business. IT and business functions meet at the service interface where business functions define themselves in terms of the services they deliver to their customers. The IT function, in turn, must provide the infrastructural capabilities and resources necessary to support it. The ITIL proposes that the IT configuration should be explicitly modeled within a Configuration Management System (CMS). The HP Universal Configuration Management Database (UCMDB) is one component of a CMS; maintaining a comprehensive and up-to-date snapshot of all managed assets and their inter-relationships across the IT environment. The UCMDB is not particularly web-friendly; there is no easy way to access configuration data using a conventional browser. Another drawback is that configuration records do not correspond directly to the language of business. This report addresses the first of these issues; making the IT configuration navigable on the web.
After 20 years event and fault management gets a make-over and dramatically breaks away from ALL current market offerings – The old fashioned way was to track, locate and fix faults on the infrastructure. But this meant that they had already found a way in and required manpower to sort out.
Without a radical departure from legacy tools, there is the threat that, as systems become more sophisticated and dynamic, more and more potentially arresting demands will be placed on the infrastructure. For those relying on out-dated management tools this can lead to unprocessed events and missed alerts resulting in blind spots and silent failures that can cost a business millions…

The new approach: Prevention is better than cure
Event trending is clearly the path to detecting standard operating behavioral anomalies and is the most effective way to address potential faults and problems.

Business Logic + Configuration Management = Effective Event and Fault Management
RiverMuse Core is differentiated by its super-agility, yet still emulates the functionality found in popular legacy fault management systems.
RiverMuse appoints President and CEO Jean-Luc Valente – RiverMuse appoints Jean-Luc Valente as President and CEO. This announcement coincides with the company's roll out of next generation event management software. RiverMuse is the first – and only – commercial open source company of its kind in the event and fault management field and it is on course to set a new industry benchmark.
Mariner to Participate in Cisco’s TelcoTV Workshop – Mariner will demonstrate its xVu suite of IPTV service monitoring tools. These tools bring complete end-to-end monitoring of the access and home network to Cisco's IPTV solution by leveraging the innovative Visual Quality of Experience (VQE) platform. The collaboration follows the success of Mariner's full integration of xVu with Cisco's VQE platform.
RiverMuse – Formidable forces move in at RiverMuse with new board level appointments – 18th November 2009: Today, RiverMuse welcomes Rich Green to its Board of Directors as well as Con Blackett and Matt Asay to the company’s Advisory Board. In their respective roles they will provide counsel to the RiverMuse leadership team, advise on strategic initiatives according to their relative expertise, as well as assist in forging new industry relations. This elite group adds deep industry knowledge and leadership to RiverMuse and complements its premier line up.
JL Valente, CEO and President at RiverMuse, said: “For a company aiming as high as we are it was imperative to attract a crack team of world class business and technology leaders. I have no doubt that these three industry heavyweights, with their collective experience, will enrich RiverMuse and help propel us to the next level.”
Dead Man Walking (CIC) | A Monitoring Odyssey – One of the most interesting observations for me during our trip related to CIC. For those of you unfamiliar with CIC, it is the Cisco OEM of Micromuse Netcool. Cisco called it Cisco Info Center (i.e. CIC). Cisco started the OEM relationship with Micromuse back in December, 1997, and initially the partnership had a great deal of focus from Cisco. They were not just going to simply put their bridge logo on the interface, but were committed to actually adding their value to the product. They built a development team to put (or try to put) their influences into the product: the object server was called the info server; the probes were called mediators. I recall one of the areas being their “real time trapd mediator”. Cisco was adding sequence numbers to traps to validate that none were missed.
Knoa Delivers New End-User Experience Monitoring Solution for Virtualization, Cloud, SaaS – # Dynamic Benchmarking, which enables the IT organization to compare system performance before and after each change in the back-end infrastructure.
# Comprehensive Threshold Alerting, which allows IT organizations to create and manage alerts based upon established Service Level Agreements (SLA).
# Dynamic Base-lining, which allows the IT organization to monitor when any performance metric (response time, quality or utilization) varies from short or long-term trends. Dynamic base-lining directly attacks the difficult issue of ensuring minimal performance degradation for the thousands of transactions for which meaningful SLA thresholds have not yet been set.
# Advance Root Cause Analysis, which allows the IT Operations team to evaluate the impact of end-user behavior and desktop resources and conditions on any performance anomaly. This is available as an additional capability.


Interesting Links for November 25th

by delicious on November 26, 2009

in General

Links that I have found interesting for November 25th:

Psytechnics Introduces New Service Desk Feature – Service-level management company Psytechnics has come out with a new module, called Service Desk Manager, on its Experience Manager solution.
The new capability enables service desk staff at service providers to get first level call information on people that contact the desk so they are better able to help address their concerns. The Service Desk Manager also allows service providers to create workflow rules to ensure various problems are escalated to the appropriate divisions and people elsewhere in their organizations.
On Service Levels (Jon on Performance) – Payments systems and customer-facing systems require two additional measurements of service level: transaction success and response time. Both are extremely important.
Response time is a traditional measure of service level: i.e., 98% of the transactions must finish within 2 seconds. Note that this is a distribution, not an average. An average could be stated as “the average response time must be less than 2 seconds.” Averages are terrible specifications of response. For example, one could have three transactions taking 3 seconds each and three taking 1 second each. The average response in this case is 2 seconds. But this scenario wouldn’t meet the 98% criterion, since with this distribution (3 at 3 secs and 3 at 1 sec) only 50% of the transactions finish within 2 seconds.

Transaction success rates should also be tracked. In the real world of multi-stage transaction paths, it’s possible to interrupt a transaction somewhere in the path, deny it, and return it within the required response time.
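The arithmetic in the response-time example above can be checked with a short Python sketch. The function names and the 98% / 2-second target parameters are illustrative assumptions taken from the excerpt's example, not part of any monitoring product.

```python
from statistics import mean


def fraction_within(response_times, limit_seconds):
    """Fraction of transactions that finished within the limit."""
    on_time = sum(1 for t in response_times if t <= limit_seconds)
    return on_time / len(response_times)


def meets_target(response_times, limit_seconds, required_fraction):
    """True if at least required_fraction of transactions met the limit."""
    return fraction_within(response_times, limit_seconds) >= required_fraction


# The scenario from the text: three transactions at 3 s, three at 1 s.
samples = [3.0, 3.0, 3.0, 1.0, 1.0, 1.0]

average = mean(samples)                      # looks fine as an average
on_time = fraction_within(samples, 2.0)      # share finishing within 2 s
ok = meets_target(samples, 2.0, 0.98)        # the distribution-based target

print(f"average={average:.1f}s on_time={on_time:.0%} meets_98pct={ok}")
```

The average comes out at 2 seconds while only half of the transactions finish within the limit, which is why the target is better stated as a distribution than as an average.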
The Amber Point: Who’s Responsible for Sorting Out Failed Transactions? – When Transactions Fail, Which Group is Responsible for Sorting Things Out?
We got a variety of answers, as you’d expect. Not every organization handles failed transactions the same way. However, by far the most common answer was Application Support Groups. Operations was a distant second, followed closely by Business Units. Here are the results of our poll.

1) Application Support Group – 68%
2) Operations – 13%
3) Business Units – 12%
4) We just muddle thru – 7%

For companies that don’t have Business Transaction Management, it’s typically the Business Units who first hear about the issue—often from irate customers whose transactions did not complete properly. The Business Units then notify Operations and App Support (please note that I’m using the word “notify” here as a very gentle euphemism for the way they actually tell them about the problem).
BMC Atrium CMDB Visualization – Many companies struggle to foresee what impact an infrastructure failure will have on their services to a client. This is caused by an incomplete view of their Configuration Items (CIs) and, more importantly, the relationships between the CIs. In other words – if a CI needs to be configured, how will my service to the client be affected?
This is where Data Visualisation comes in. It allows for better Service Impact Management, because it provides a graphical presentation of your CIs and allows you to quickly and effectively build a service infrastructure.

Two of Tiberone’s senior consultants provide an online demo of how BMC Atrium acts as a Data Visualisation Tool, followed by a short presentation on data visualisation.
Service Level Dashboard 2.0 is available for SCOM 2007 – Service Level Dashboard 2.0 plugs into an existing SCOM installation to add new features. The premier feature is the service level objective (SLO). An SLO loosely aligns to the industry-standard term service level agreement (SLA). The critical difference is that an SLO within this tool does not enforce the SLA that you may have; it is used to configure service goals for the applications monitored.
Also, Service Level Dashboard 2.0 introduces a relatively quick turnaround, with reporting and visibility displayed at three minutes or less latency. SCOM reporting is generally very slow because there is a lot of data to aggregate and report when configured for large installations. Another new feature that the Service Level Dashboard introduces is the dashboard metrics to track measurable performance elements.
CA Upgrades System Management Products – CA has upgraded several management software products to better supervise virtualized software assets and has introduced CA Spectrum Service Assurance as a new product.
The management console of CA Spectrum Service Assurance r1.1 displays both the physical and virtual parts of the infrastructure, maps them, and helps identify trouble spots as they develop, said Stephen Elliot, VP of business unit strategy for CA's Systems and Network Management unit.

CA Spectrum Service Assurance offers visualization of system data in simple-to-read charts and diagrams and enhanced analytics for pinpointing causes of slowdowns or outages. The new product is available immediately and is priced starting at $175,000, with integration of up to five system management data sources, Elliot said.
How to Cut IT Energy Consumption Using Business Service Management
For years, IT has been under intense pressure to implement an expanding number of new business services that are critical to the enterprise. To do so, IT staffs have typically added hundreds or even thousands of servers over the years, which has resulted in skyrocketing IT power consumption and energy costs. Here, Knowledge Center contributor Chris Rixon explains how companies can cut IT energy consumption and achieve other green IT objectives by using an approach based on business service management.
BTM – the pain relief for CMDB? « Business Transaction Management Blog – I’ll put my head on a lance and state that Business Transaction Management (BTM) can add significant value to any CMDB project. When you start to monitor business transactions you start to acquire lots of key intelligence on how your business runs and maps to IT. You auto-discover transaction flows and the IT assets they interact with, all in real-time. It also gets better, you can store all this data historically so that you can report and compare business services and their CI’s before and after a change. You can even visualise how the business and IT asset dependencies change over time using transaction flow/topology diagrams as key evidence. When a change occurs on an IT asset you can instantly report whether this change had a positive or negative impact on your business services or transactions by reviewing related latency and SLA. I’m not claiming BTM is the answer to all CMDB pain but it solves some of the most basic and common challenges:
Open Source Zenoss Core Project Makes Cloud Monitoring and Community-Led Systems Management Innovations – What's New in Zenoss Core 2.5, according to the company: – Amazon Elastic Compute Cloud (EC2) Monitoring – New Amazon EC2 monitoring provides dynamic snapshots of both the individual and collective performance of the instances within an account. In addition, Zenoss Core now includes the ability to immediately track performance without user intervention and to drill deep on performance issues with "inside-the-instance" monitoring.
Knoa Delivers New End-User Experience Monitoring Solution for Virtualization, Cloud and SaaS – Knoa® Software, the leading provider of end-user experience and performance management software, today announced the availability of Knoa Virtual/Cloud Experience Manager (VCEM). Knoa VCEM is the first truly ‘off-the-shelf’ product that monitors and manages real end-user experience for enterprise applications that are running in virtualized environments, delivered via SaaS, or provisioned via Cloud Computing.
Knoa VCEM extends this concept to the worlds of SaaS, Virtualization and Cloud computing with a streamlined, turnkey application designed specifically to support the management needs of the IT Operations team changing the provisioning model for business critical applications like SAP, Oracle EBS, Oracle Siebel and Microsoft.
Survey Finds IT Departments Still Struggle with Application Performance Issues Despite Availability of Effective Management Tools – OpTier®, the leader in Business Transaction Management (BTM), today announced results of a survey that indicates that most organizations are experiencing significant IT issues including trouble resolving performance problems due to a lack of visibility into IT transactions, challenges collaborating between departments and poor alignment between business and IT priorities.
It found that 40 percent of organizations have experienced serious IT problems over the past year. While BTM has been proven as an effective technique for understanding complex architectures, 67 percent of respondents fail to track business transactions flowing through their systems. At the same time, 88 percent believe the number of transactions flowing through their IT infrastructure is increasing.
The Problem with SLA Monitoring in Virtualized Environments | Virtualization Journal – This affects ALL performance metrics that rely on the operating system clock time to keep track of time which includes system counters like CPU or I/O Utilization. Performance Management solutions therefore run into the problem that the monitored metrics are inaccurate and can lead to incorrect enforcement of SLAs or wrong assumptions about application performance.
This blog explains the time keeping problem, how it impacts Application Performance Management in virtualized environments and what can be done to solve this problem.
Performance Antipatterns in AJAX Applications Performance, Scalability and Architecture – Java and .NET Application Performance Management (dynaTrace Blog) – This post covers 3 major areas of JavaScript performance. Distributed Communication – a core part of AJAX – can help reduce transferred data content. At the same time communication between distributed systems is one of the biggest sources of performance problems. A second aspect is JavaScript based widgets that allow browsers to catch up with Rich Clients in terms of usability. Different JavaScript Frameworks offer nice visual effects implemented with heavy JavaScript using timers and DOM manipulations. However these effects can result in poor usability if, e.g., the expansion of a visual area takes several seconds. Last, but not least we will address the topic of memory leaks. Some memory problems are similar to those on the server-side.
Caveat Emptor « Business Transaction Management Blog – Dear reader: If you are one of the IT professionals who are evaluating BTM solutions and having a hard time telling them apart, start by checking which solution evolved from a single-tier monitor and which was built for BTM from day one. Compare the platform coverage matrices, and take the time to verify each vendor’s claimed abilities to auto-discover and show complex business flows with many different protocols inside and outside of the data center. One of the nicer aspects of BTM is that you should see its value right away, literally a few hours after you install. If a tool can only show you parts of your topology, or requires you to install several different products and integrate them together, then it is probably not a BTM solution. Buyer, beware! The slides of many vendors may look alike, but only a few have real BTM technology as well as the expertise required to get it implemented in a way that will deliver on the true promise of business transaction management.
Adventures in Open Source » Blog Archive » The Kindness of Strangers – Unfortunately, also like us, they seem to be the victim of code theft, or at least some very questionable licensing practices. A company called Firescope appears to have appropriated some of the Zabbix code. They have raised several million dollars in VC funding, led by a company called Technology Advisors, and it seems they have used this money to put a prettier front end to the Zabbix code (sound familiar?). Apparently efforts by Zabbix to get this issue resolved have been a failure.
Note: Please read the comments below. The CEO of Firescope has responded to these claims with more information. I am hoping that Alexei will respond when he can. When I get some time I plan to download the trial version of Firescope as well as Zabbix and see for myself if the similarities still exist, but until then I will withhold judgement and I ask that you do as well.
People Over Process » Contempory Root Cause Analysis – A short while ago, I did a webcast with RedMonk client AccelOps around the tried and true IT process of root cause analysis – hunting down what’s causing problems in IT. My part of the presentation spoke to how root cause analysis is affected by three new technologies: virtualization, multi-tier applications (OK, not so new), and cloud computing. AccelOps’ Scott Gordon spoke to how AccelOps All-in-One can help.
While my slides are above, if you’re interested you’ll probably want to check out the recording for the full run-down, including a pretty good AccelOps overview. You’ll have to fill out a form to see it, but it’s freely available.

Also, check out my review of AccelOps release from a few months back.

