
dougmcclure.net

thoughts on business, service and technology operations and management in a big data and analytics world

These are my links for March 3rd through March 25th:

  • Numenta Releases Grok for IT Analytics on AWS | Numenta – Grok anomaly detection leverages sophisticated machine intelligence algorithms to enable new insights into critical IT systems. Grok automatically learns complex patterns and then highlights unusual behavior. As software topologies and usage patterns change, Grok continuously learns and adapts, eliminating the need for frequent resetting of thresholds. Visualization of Grok output is displayed on a constantly updated mobile device, enabling IT professionals to assess the health of their systems anytime, anywhere. Using Grok, IT operators can better prevent business downtime while reducing false positives.

    Grok is the first commercial application of Numenta’s groundbreaking Cortical Learning Algorithm (CLA), biologically inspired algorithms for machine intelligence. The core CLA technology is ideal for large-scale analysis of continuously streaming datasets and excels at modeling and predicting patterns in data.

    “Grok provides an early warning system to IT professionals to give them real-time insights into their system performance,” said Numenta CEO Donna Dubinsky. “Grok anticipates problems before they happen, reduces false positives, and lowers engineering costs through automated modeling and continuous learning.”

    Grok features include:

    Monitoring of performance and health of AWS environments or other systems
    Automatic modeling to determine normal patterns
    Automatic identification and ranking of unusual patterns
    Continuous learning of new patterns as environments evolve – no need for manual threshold setting
    Notification to user when an anomaly occurs
    Output displayed graphically on an Android mobile device
    Simple setup via a web-based or command-line interface
    Support for AWS auto-scaling groups and logical clusters

  • LMAO if you don’t logstash | by Paul Czarkowski | @pczarkowski
  • Elasticsearch.org Kibana 3.0.0 GA Is Now Available! | Blog | Elasticsearch – Today is a big day for Elasticsearch and the Kibana team. After 5 milestone releases and over 1000 commits, we’re happy to announce the release of Kibana 3.0.0 GA. Over the last year, Kibana has moved from a simple interface to search logs to a fully featured, interactive analysis and dashboard system for any type of data. Every day, we’re incredibly inspired by the people who tell us they’ve solved major problems, optimized their existing deployments and found insights in places they never imagined.
  • Apache Solr vs ElasticSearch – the Feature Smackdown! – The Feature Smackdown
  • SiLK: enterprise-grade log analysis solution | LucidWorks | LucidWorks – LucidWorks Solr integration with LogStash and Kibana (SiLK) is an enterprise-grade log analysis solution that enables the ad-hoc search and analysis of billions of events and transactions across multiple applications, servers and devices.
  • Advanced Web Analytics for Big Data & Hadoop – Alpine Data Labs
  • wise.io | Machine Learning as a Service & Big Data Analytics – Our state-of-the-art machine learning technology reveals hidden value in your data. Our applications integrate seamlessly into your business.
  • Home | Skytree – Machine Learning on Big Data for Predictive Analytics – Machine Learning is the modern science of discovering patterns and making predictions from complex data.
  • UCI Machine Learning Repository – We currently maintain 273 data sets as a service to the machine learning community. You may view all data sets through our searchable interface. Our old web site is still available, for those who prefer the old format. For a general overview of the Repository, please visit our About page. For information about citing data sets in publications, please read our citation policy. If you wish to donate a data set, please consult our donation policy. For any other questions, feel free to contact the Repository librarians. We have also set up a mirror site for the Repository.
  • Log.io – Real-time log monitoring in your browser
  • doubaokun/node-ab – A command tool to test the performance of HTTP services.
  • zanchin/node-http-perf – Node HTTP Server Performance Tool
  • Uptime by fzaninotto – A remote monitoring application using Node.js, MongoDB, and Twitter Bootstrap.
  • Cloud Foundry and Logstash – Scott Frederick’s humble blog – The Cloud Foundry Loggregator component formats logs according to the syslog standard as defined in RFC5424. The logstash cookbook includes an example configuration for syslog consumption, but that configuration follows an older RFC3164 syslog standard.

    Here is a logstash configuration that works with RFC5424 output, with some additional changes for Cloud Foundry:

  • Using your historical data for analytic usage
  • Using R for Educational Research: An Introductory Workshop to Break the Learning Curve – R_intro_SERA_2012.pdf
  • Welcome to a Little Book of R for Time Series! — Time Series 0.2 documentation – This is a simple introduction to time series analysis using the R statistics software.
  • BestFirstRTutorial.pdf
  • Introducing R
  • Introduction to R Seminar – UCLA Institute for Digital Research and Education
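
The Cloud Foundry/Logstash link above stops short of the configuration it promises. As a rough sketch only — the port, the grok pattern and the assumption that `SYSLOG5424LINE` is present in your grok pattern set are mine, not Scott Frederick’s actual config — RFC5424 parsing in logstash can look like this:

```
input {
  tcp {
    port => 5140
    type => "syslog"
  }
}

filter {
  if [type] == "syslog" {
    # SYSLOG5424LINE ships with logstash's standard grok patterns and splits
    # out the RFC5424 header (pri, version, timestamp, host, app, procid,
    # msgid) plus the message body
    grok {
      match => ["message", "%{SYSLOG5424LINE}"]
    }
    # RFC5424 timestamps are ISO8601
    date {
      match => ["syslog5424_ts", "ISO8601"]
    }
  }
}
```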

These are my links for December 11th through March 3rd:

  • Output to Elasticsearch in Logstash format (Kibana-friendly) – In this post you’ll see how you can take your logs with rsyslog and ship them directly to Elasticsearch (running on your own servers, or the one behind Logsene’s Elasticsearch API) in a format that plays nicely with Logstash. So you can use Kibana to search, analyze and make pretty graphs out of them.

    This is especially useful when you have a lot of servers logging [a lot of data] to their syslog daemons and you want a way to search them quickly or do statistics on the logs. You can use rsyslog’s Elasticsearch output to get your logs into Elasticsearch, and Kibana to visualize them. The only challenge is to get your rsyslog configuration right, so your logs end up where Kibana is expecting them. And this is exactly what we’re doing here.

  • GraphLab Notebook | GraphLab – The power of GraphLab with the ease of Python, running in the Cloud.
  • Prelert Introduces Push Button Machine Learning in Anomaly Detective 3.1 – Prelert, the first vendor to package data science into downloadable applications for everyday users, today announced the release of Anomaly Detective 3.1, which introduces the ability to deploy powerful machine learning tools at the push of a button.

    Anomaly Detective is a deeply integrated app for Splunk Enterprise that helps identify and resolve performance and security issues, and their causes, as they develop. It provides a solution to one of the major problems inherent in working with Big Data – gaining valuable insights from otherwise overwhelming volumes of data in real-time.

  • DataLoop.io – Cloud Server Monitoring for DevOps & Operations Teams – Dataloop.IO is a new start-up in the IT Infrastructure Monitoring space, focused on building a new monitoring tool for DevOps/Operations teams that run Cloud services at scale.

    Our Cloud service significantly reduces the time required to setup and deploy your monitoring. It reduces the friction of writing and deploying new monitoring scripts so your team can ensure full coverage regardless of how quickly your environment is changing.

  • Predictions For 2014: Technology Monitoring | Forrester Blogs – Further development of pattern analytics to complement log-file analytics. For the last five years, log-file analytics has been a major focus area in the area of IT operational analytics. During 2014 we expect further development with pattern analytics or features that can make insights based on data in-stream or in-flight on the network.

    Re-emergence of business service management (BSM) features. Increasing technology innovation is leading to greater complexity in business service architecture. This means that any features that simplify the management of complex business services become a must. Hence why we predict the re-emergence of BSM features that will be more successful than previous attempts, as these new BSM approaches will have automated discovery and mapping of technology to business services.

  • Data Mining Map – An Introduction to Data Mining
  • Actian Analytics Platform™ | Accelerating Big Data 2.0™ | Actian – Actian transforms big data into business value for any organization – not just the privileged few. Our next generation Actian Analytics Platform™ software delivers extreme performance, scalability, and agility on off-the-shelf hardware, overcoming key technical and economic barriers to broad adoption of big data, delivering Big Data for the Rest of Us™.
  • Visual Intelligence for your web application – COSCALE – The COSCALE Application Performance Analyzer provides swift and accessible visual intelligence for your web-application through smart correlations of any application and infrastructure metric
  • Qubole | Big Data as a Service – Switch your data infrastructure to auto-pilot using our award-winning, auto-scaling Hadoop cluster, built-in data connectors and an intuitive graphical editor for all things Big Data
  • Altiscale Hadoop as a Service – Altiscale’s offering is ideally suited for today’s data science needs. Features for data science include permanent HDFS volumes, access to the latest tools, resource sharing without conflict, job-level monitoring and support, and pricing plans that eliminate unpleasant surprises.
  • Boundary Surpasses 400% YoY Growth in Processing of Massive IT Operations Performance Analytics in the Cloud – The Boundary service is processing an average of 1.5 trillion application and infrastructure performance metrics per day on behalf of its clients and has computed occasional daily bursts of over 2 trillion metrics.
  • To Log or Not to Log: Proven Best Practices for Instrumentation – Innovation Insights – To log or not to log? This is an age-old question for developers. Logging everything can be great because you have plenty of data to work from when you have a problem. But it’s not so great if you have to grep and inspect it all yourself. In my mind, developers should instead be thinking about logging the right events in the right format for the right consumer.
  • IT Operations Analytics (ITOA) Landscape – Say goodbye to years of chronic IT Operations pains. IT Operations Analytics (ITOA) is here, and gaining strong momentum. You Are a Leader – Seize The Opportunity.
  • Zoomdata – Next Generation Big Data Analytics

    - Built for the Big Data Revolution
    - Connected to the World in Real-Time
    - Designed for the Touch Generation
    - Fuses Data into a Single Experience
    - Easy & Powerful Interface

  • Enterprise Management Services: Enterprise Event Management – Trying to manage a modern IT environment without a consolidated view of operations is like trying to drive a car at 100 mph while looking at six different dashboards. The proliferation of development, build, and operations tools has made it increasingly difficult to stay in control of IT and reduce downtime. Too often, developers and administrators have been left with two unappealing alternatives: Either they have to try and write their own event consolidator or they struggle with legacy products from a different era. Boundary is the industry’s leading SaaS-based enterprise event management offering, enabling you to track and optimize your modern, rapidly changing application infrastructures. With Boundary, you can consolidate, standardize, prioritize, enrich and correlate events and notifications from hundreds of systems into a single console.
  • Cubism.js – Cubism.js is a D3 plugin for visualizing time series. Use Cubism to construct better realtime dashboards, pulling data from Graphite, Cube and other sources. Cubism is available under the Apache License on GitHub.
  • Crosslet – Crosslet is a free small (22k without dependencies) JavaScript widget for interactive visualisation and analysis of geostatistical datasets. You can also use it for visualizing and comparing multivariate datasets. It is a combination of three very powerful JavaScript libraries: Leaflet, an elegant and beautiful mapping solution, and Crossfilter, a library for exploring large multivariate datasets in the browser. D3, a data driven way of manipulating objects. Crosslet also supports TopoJSON, a GeoJSON extension that allows to present geometry in a highly compact way. Crosslet is written in CoffeeScript and uses less for styling.
  • Charts, Graphs and Images – CodeProject
  • dc.js – Dimensional Charting Javascript Library – dc.js is a javascript charting library with native crossfilter support and allowing highly efficient exploration on large multi-dimensional dataset (inspired by crossfilter's demo). It leverages d3 engine to render charts in css friendly svg format. Charts rendered using dc.js are naturally data driven and reactive therefore providing instant feedback on user's interaction. The main objective of this project is to provide an easy yet powerful javascript library which can be utilized to perform data visualization and analysis in browser as well as on mobile device.
  • SharePoint Development Lab by @avishnyakov » Go Cloud – A better logging for SharePoint Online/Office365/Azure apps – It seems that cloud based products and services have a significant impact on how we design, write, debug, trace and deliver our applications. The way we think about this is not the same anymore; there might be no need to have SharePoint on-premises and SharePoint Online/O365 might be a better choice, there might be no reason to host a web application on dedicated hardware/hosting provider, but Azure could bring more benefits. All these trends cannot be simply ignored, and it is a good thing to see how new services and offerings might be used in your applications.

To catch up, check out part 1, part 2 and part 3.

I wanted to get an up-to-date configuration out based on some recent work for our upcoming Pulse 2014 demo, making use of the latest versions of logstash (v1.3.3) and our SCALA v1.2.0 release. Nothing significantly different per se, but logstash’s configuration syntax and internal event flow/routing have changed significantly since v1.1.x.

I’ve included an example logstash v1.3.3 configuration file in my git repo here. It should be simple to follow the flow from inputs to filters to outputs. The use of tags and conditionals is key to controlling filter activation and output routing. It’s very powerful stuff!
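
The tag-and-conditional pattern looks roughly like the minimal sketch below. The ports, hostname and tag names are illustrative stand-ins, not the actual demo config:

```
input {
  tcp {
    port => 5529
    type => "netcool"
  }
}

filter {
  # Only run CSV parsing against events from the Netcool input
  if [type] == "netcool" {
    csv { separator => "," }
    mutate { add_tag => ["omnibus"] }
  }
}

output {
  # Route only tagged Netcool events on to SCALA; other event streams
  # can be routed elsewhere with their own conditionals
  if "omnibus" in [tags] {
    tcp {
      host => "scala.example.com"
      port => 5530
    }
  }
}
```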

I’ll get another post out this week with our next key component being the SCALA DSV pack to consume the events routed via logstash to SCALA.


These are my links for November 19th through December 11th:

  • The Netflix Tech Blog: Announcing Suro: Backbone of Netflix’s Data Pipeline – Suro, which we are proud to announce as our latest offering as part of the NetflixOSS family, serves as the backbone of our data pipeline. It consists of a producer client, a collector server, and plugin framework that allows events to be dynamically filtered and dispatched to multiple consumers.
  • Sensu | An open source monitoring framework – Designed for the Cloud The Cloud introduces new challenges to monitoring tools, Sensu was created with them in mind. Sensu will scale along with the infrastructure that it monitors.
  • datastack.io – data integration as a service – Collect data. Share insights. Kinda like Logstash or Heka, but without the pain.
  • Glassbeam Begins Where Splunk Ends – Going Beyond Operational Intelligence with IoT Logs | Glassbeam – Glassbeam SCALAR is a flexible, hyper scale cloud-based platform capable of organizing and analyzing complex log bundles including syslogs, support logs, time series data and unstructured data generated by machines and applications. By creating structure on the fly based on the data and its semantics, Glassbeam’s platform allows traditional BI tools to plug into this parsed multi-structured data so companies can leverage existing BI and analytics investments without having to recreate their reports and dashboards. By mining machine data for product and customer intelligence, Glassbeam goes beyond traditional log management tools to leverage this valuable data across the enterprise. With a focus on providing value to the business user, Glassbeam’s platform and applications enable users to reduce costs, increase revenues and accelerate product time to market. In fact, Enterprise Apps Today’s Drew Robb recognized this critical value proposition naming Glassbeam a hot Big Data startup for analytics, which is attracting interest from investors, partners and customers. Today’s acquisition serves to showcase a market that is heating up, and new requirements around data analytics. But this is only the start and Glassbeam deliberately picks up where Splunk ends. We remain committed to cutting through the clutter and providing a clear view of operational AND business analytics to users across the enterprise.
  • Splunk Buys Cloudmeter to Boost Operational Intelligence Portfolio – The acquisition of Cloudmeter rounds out Splunk's portfolio with a capability to analyze machine data from a wider range of sources. Financial terms of the deal were not disclosed. The transaction was funded with cash from Splunk's balance sheet, the company said. Indeed, the addition of Cloudmeter will enhance the ability of Splunk customers to analyze machine data directly from their networks and correlate it with other machine-generated data to gain insights across Splunk's core use cases in application and infrastructure management, IT operations, security and business analytics.
  • Netuitive Files for Ground-Breaking New Patent in IT Operations Analytics – Press Release – Digital Journal – The patent filing is led by Dr. Elizabeth A. Nichols, Chief Data Scientist for Netuitive, a quantitative analytics expert focused on extending Netuitive's portfolio of IT Operations Analytics (ITOA) solutions to new applications and services. "Netuitive is committed to delivering industry leading IT Operations Analytics that proactively address business performance," said Dr. Nichols. "In addition, Netuitive's research and development is actively focused on new algorithm initiatives that will further advance our abilities to monitor new managed elements associated with next-generation IT architecture and online business applications."
  • Legume for Logstash – Legume Web Interface for Logstash & Elasticsearch Legume is a zeroconfig web interface run entirely on the client side that allows to browse and search log messages in Elasticsearch indexed by Logstash.
  • Deploying an application to Liberty profile on Cloud Foundry | WASdev – As part of the partnership between Pivotal and IBM we have created the WebSphere Application Server Liberty Buildpack, which enables Cloud Foundry users to easily deploy apps on Liberty profile.
  • IBM’s project Neo takes aim at the data discovery and visualisation market – MWD’s Insights blog – Project Neo is IBM’s answer to data visualisation and discovery for business users. It promises to help those who don’t possess specialist skills or training in analytics, to visually interact with their data and surface interesting trends and patterns by using a more simplistic dashboard interface that helps and guides users in the analysis process. Whereas previous tool incarnations are often predisposed to using data models, scripting or require knowledge of a query language, Project Neo takes a different tack. It aims to bypass this approach by enabling users to ask questions in plain English against a raw dataset (including CSV or Excel files) and return results in the form of interactive visualisations.
  • Machine learning is way easier than it looks | Inside Intercom – Like all of the best frameworks we have for understanding our world, e.g. Newton’s Laws of Motion, Jobs to be Done, Supply & Demand — the best ideas and concepts in machine learning are simple. The majority of literature on machine learning, however, is riddled with complex notation, formulae and superfluous language. It puts walls up around fundamentally simple ideas.

    Let’s take a practical example. Say we wanted to include a “you might also like” section at the bottom of this post. How would we go about that?

  • Where Are My AWS Logs? – Logentries Blog – Over my time at Logentries, we’ve had users contact us about where to find their logs while they were setting up Logentries. As a result, we recently released a feature for Amazon Web Services called the AWS Connector, which automatically discovers your log files across your Linux EC2 instances, no matter how many instances you have. Finding your linux logs however may only be a first step in the process as AWS logs can be all over the map… so to speak…. So where are they located? Here’s where you can start to find some of these.
  • Responsive Log Management… Like Beauty, it’s in the Eye of the Bug-holder | – As a software engineer, I’m responsible for the code I write and responsible for what we ship. But designing, building, and deploying SaaS is a real challenge – it means software developers are now responsible for making sure the live system runs well too. This is a real challenge, but with Loggly I get real-time telemetry on how my code is running, how my systems are behaving – and how well our software meets the need of our customers.
  • Mahout Explained in 5 Minutes or Less – blog.credera.com – In the spectrum of big data tools, Apache Mahout is a machine-learning engine that fits into the data mining category of the big data landscape. It is one of the more interesting tools in the big data toolbox because it allows you to extract actionable tasks from a big data set. What do we mean by actionable tasks? Things such as purchase recommendations based on a similar customer’s buying habits, or determining whether a user comment is spam based on the word clusters it contains.
  • Change management using Evolven’s IT Operations Analytics – TechRepublic – Evolven is designed to track and report change across an array of operating systems, databases, servers, and more to help pinpoint inconsistencies. It can also assist you in preventing issues and determining root causes of problems. Evolven can be helpful with automation—to find out why things didn’t work as expected and what to do next—and can also alert you to suspicious or unauthorized changes in your environment.

    Human and technological policies go hand-in-hand to balance each other and ensure the best possible results. Whereas my last article on the subject referenced the human processes IT departments should follow during change management, I’ll now take a look at technology that can back those processes up by examining what Evolven does and what benefits it can bring

  • Fluentd vs Logstash – Jason Wilder’s Blog – Fluentd and Logstash are two open-source projects that focus on the problem of centralized logs. Both projects address the collection and transport aspect of centralized logging using different approaches.

    This post will walk through a sample deployment to see how each differs from the other. We’ll look at the dependencies, features, deployment architecture and potential issues. The point is not to figure out which one is the best, but rather to see which one would be a better fit for your environment.

  • astanway/crucible · GitHub – Crucible is a refinement and feedback suite for algorithm testing. It was designed to be used to create anomaly detection algorithms, but it is very simple and can probably be extended to work with your particular domain. It evolved out of a need to test and rapidly generate standardized feedback for iterating on anomaly detection algorithms.
  • Now in Public Beta – Log Search & Log Watch | The AppFirst Blog – The decision to open our new log applications to the public was not one taken lightly. Giving our customers the ability to search all of their log files for any keywords is quite taxing on our system, so we had to take several precautions. To ensure the reliability of our entire architecture, we decided to create a separate web server solely responsible for retrieving log data from our persistence storage HBase. By making this an isolated subsystem, we don’t run the risk of a potentially large query bogging everything else down as well.
  • Log Insight: Remote Syslog Architectures | VMware Cloud Management – VMware Blogs – When architecting a syslog solution, it is important to understand the requirements both from a business and a product perspective. I would like to discuss the different remote syslog architectures that are possible when using vCenter Log Insight.
  • Why We Need a New Model for Anomaly Detection: #1 | Metafor Software

    I’m not talking about anomaly detection in stable enterprise IT environments. Those are doing just fine. Those infrastructures have mature, tested procedures for rolling out software updates and implementing new applications on an infrequent basis (still running FORTRAN written in the 70s, on servers from the 70s, yeah, that’s a thing).

    I’m talking about anomaly detection in the cloud, where the number of virtual machines fluctuates as often as application roll outs. Current solutions for anomaly detection track dozens or even hundreds of metrics per server in an attempt to divine normal performance and spot anomalous behavior. An ideal solution would adapt itself to the quirks of each metric, to different application scenarios, and to machine re-configurations.

    This is a problem that lends itself to machine learning techniques, but it’s still an incredibly difficult problem to solve. Why?

  • Beyond The Pretty Charts – A Report From #devopsdays in Austin | Metafor Software – Don’t just look at timeline charts. We’ve fallen into the trap of looking at all the pretty charts as time series charts. When we do that, we end up missing some important characteristics. For example, a simple histogram of the data, instead of just a time chart, can tell you a lot about anomalies and distribution. Using different kinds of visualization is crucial to giving us a different aspect on our data.
  • Server Anomaly Detection | Predictive IT Analytics | Config Drift Monitoring | Metafor Software – Know about problems before your threshold based monitoring tool does. Get alerted to issues your thresholds will never catch.

    Metafor’s machine learning algorithms alert you to anomalous behavior in your servers, clusters, applications, and KPIs.


If you’d like to catch up, check out the first three posts in this tutorial, starting here.

The easiest way to get started is to develop a good understanding of the structure of your Netcool events. With a fairly default deployment, we know there are a number of standard alerts.status fields of interest, such as first and last occurrence, node, agent, alert group, alert key and manager. Nearly every customer I have ever worked with has extended their alerts.status schema to accommodate their various probe- and gateway-level integrations, as well as to support event enrichment, auto-ticketing, etc.

There’s definitely a level of maturity here that needs to be understood through brief analysis of your events via the AEL. Which slots are you populating with a high degree of completeness? Which ones help you understand the context of an event beyond the node name? Which ones are used to determine if the event is ever acted upon? Which ones will help you assess the event streams, ask questions and take actions on investigating event validity within your environment? Your goal is to ensure you have the best possible set of fields that will enable your event analysis, event analytics and most importantly the decisions, actions and next steps you will be able to take based upon your analysis.

One place you can get a complete snapshot of the alerts.status configuration is the ../omnibus/var/Tivoli_eif.NCOMS.alerts.status.def file. I used this to get the list of all the field names for easy copy and paste when building my socket gateway mapping file.
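
If you want to script that copy-and-paste step, something like the awk one-liner below works. Note the sample file contents here are a mocked-up approximation of the def file layout (one field name and type per line inside the parens), so adjust the awk to match your actual Tivoli_eif.NCOMS.alerts.status.def:

```shell
# Mocked-up stand-in for ../omnibus/var/Tivoli_eif.NCOMS.alerts.status.def;
# the real file is generated by OMNIbus
cat > alerts.status.def <<'EOF'
TABLE alerts.status
(
        Identifier      varchar(255)
        Node            varchar(64)
        LastOccurrence  time
        Severity        integer
)
EOF

# Print just the field names, prefixed with @ for pasting into socket.map:
# skip the TABLE header and the lone ( and ) lines, keep "name type" pairs
awk '$2 != "" && $1 !~ /[()]/ && $1 != "TABLE" {print "@" $1}' alerts.status.def
```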

With the fields of interest identified, download and install the Netcool/OMNIbus socket gateway in accordance with the install instructions in the docs. If you don’t already own the socket gateway, check with your sales rep. In most cases since you’re using it to route events from one C&SI product to another, there isn’t a charge. But, IANAL and T&C’s change with the wind so check. If you have a problem with this, ping me and I can suggest a number of other alternative approaches.

Once installed, the first configuration activity is to update the gateway’s socket.map file with the fields you’re interested in.

  • Make a backup copy of the original.
  • Remove the default fields you’re not interested in.
  • Add fields you are interested in.
  • Place the fields in a logical order.
  • NOTE: I’m placing the @Identifier first as the socket gateway inserts an event type (INSERT, UPDATE, DELETE) in front of each event it sends across so we don’t want that to mess up any other slot.

This is the socket map I used within our pretty default environment when sending events from ITM, APM/ITCAM, BSM, etc. For a bare-bones setup, the core alerts.status fields at the top of the map (identifier, occurrence times, node, summary, severity and the like) are probably good enough to get started.

CREATE MAPPING StatusMap
(
'' = '@Identifier',
'' = '@LastOccurrence' CONVERT TO DATE,
'' = '@FirstOccurrence' CONVERT TO DATE,
'' = '@Node',
'' = '@NodeAlias',
'' = '@Summary',
'' = '@Severity',
'' = '@Manager',
'' = '@Agent',
'' = '@AlertGroup',
'' = '@AlertKey',
'' = '@Type',
'' = '@Tally',
'' = '@Class',
'' = '@Grade',
'' = '@Location',
'' = '@ITMDisplayItem',
'' = '@ITMEventData',
'' = '@ITMTime',
'' = '@ITMHostname',
'' = '@ITMSitType',
'' = '@ITMThruNode',
'' = '@ITMSitGroup',
'' = '@ITMSitFullName',
'' = '@ITMApplLabel',
'' = '@ITMSitOrigin',
'' = '@CAM_Application_Name',
'' = '@CAM_Transaction_Name',
'' = '@CAM_SubTransaction_Name',
'' = '@CAM_Client_Name',
'' = '@CAM_Server_Name',
'' = '@CAM_Profile_Name',
'' = '@CAM_Response_Time',
'' = '@CAM_Percent_Available',
'' = '@CAM_Expected_Value',
'' = '@CAM_Actual_Value',
'' = '@CAM_Details',
'' = '@CAM_Total_Requests',
'' = '@BSMAccelerator_Service',
'' = '@BSMAccelerator_Function'
);

Next, we need to set up some simple filtering to control which event types we send across the gateway. The socket.reader.tblrep.def file defines what comes across the socket gateway and what filters to apply. Here are a couple of examples I’ve used.

This example sends only INSERTs and UPDATEs (not DELETEs, since those don’t carry the entire event structure) and filters out the internal TBSM events, which are Class 12000.

REPLICATE INSERTS, UPDATES FROM TABLE 'alerts.status'
USING MAP 'StatusMap'
FILTER WITH 'Class !=12000';

This example sends only INSERTs (again, no DELETEs) and filters out events with Severity 0, 1 or 2, so only Severity 3 and above come across.

REPLICATE INSERTS FROM TABLE 'alerts.status'
USING MAP 'StatusMap'
FILTER WITH 'Severity >=3';

I wasn’t able to get a more complex filter working for the additional filtering I would have liked, so these had to do.
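
For what it’s worth, if the filter string accepts compound SQL-style conditions, the two examples might combine into something like the following. Treat this as an untested sketch, not a configuration I verified:

```
REPLICATE INSERTS, UPDATES FROM TABLE 'alerts.status'
USING MAP 'StatusMap'
FILTER WITH 'Severity >= 3 AND Class != 12000';
```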

Next, the core socket gateway properties need to be configured. Edit the NCO_GATE.props file as follows.

#Update these based on your install preferences
MessageLevel : 'warn'
MessageLog : '$OMNIHOME/log/NCO_GATE.log'
Name : 'NCO_GATE'
PropsFile : '$OMNIHOME/etc/NCO_GATE.props'

#This will be the IP and Port for your logstash installation and the TCP Input you use
Gate.Socket.Host : '10.10.10.1'
Gate.Socket.Port : 1234

#These will create a comma separated (CSV) event format with fields wrapped in " ".
Gate.Socket.EndString : '"'
Gate.Socket.StartString : '"'
Gate.Socket.Separator : ','

#This sets First/Last Occurrence format to mimic ISO8601 format supported by SCALA
Gate.Socket.DateFormat : '%Y-%m-%dT%H:%M:%S%Z'
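To make the effect of the StartString/EndString/Separator settings concrete, here's a small Python sketch (my own illustration, not gateway code) of how string fields end up quoted and joined into the CSV lines shown later:

```python
# Mimics Gate.Socket.StartString ('"'), EndString ('"') and Separator (',')
# to show the CSV-style event line the gateway emits.

START, END, SEP = '"', '"', ','

def format_event(fields):
    """Wrap string fields in quotes, leave numbers bare, join with commas."""
    parts = []
    for f in fields:
        if isinstance(f, str):
            parts.append(START + f + END)
        else:
            parts.append(str(f))
    return SEP.join(parts)

line = format_event(["ITM_EJB_Containers", 20, 2, "TEMS"])
# line == '"ITM_EJB_Containers",20,2,"TEMS"'
```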

Here's how to start the socket gateway, for reference later. Note that the remote end of the TCP connection needs to be up first.

../omnibus/bin/nco_g_socket &

You can check that your gateway is running with the ps aux | grep nco_g command. To stop the gateway, kill the process.

Check the output file you created on the Logstash server to verify that you’ve captured some events from the gateway. If you see some there, we’re all set for our next activity to set up annotation and indexing of the events in SCALA v1103.


Now that I'm done with what felt like months of work for our big demo at IBM's IOD show last week, let me get this series done! Next up, we'll walk through using Logstash as the collection and mediation tool for streaming events in from Netcool/OMNIbus and getting them indexed within SCALA v1103. We're still using Logstash v1.1.13 here; we'll have support for Logstash v1.2.x in our next release very soon. NOTE: With SCALA v1103 now available, that's the version I'll reference moving forward.

To catch up, check out part 1 and part 2.

On a separate system if at all possible, prepare for installation of Logstash v1.1.13 and the SCALA Logstash toolkit.

  • Download Logstash v1.1.13 from here
  • Create a new directory for the Logstash environment. I generally create /opt/logstash.
  • Copy the SCALA Logstash Toolkit to this directory
  • Review the SCALA Logstash Toolkit installation steps
  • Explode the SCALA Logstash Toolkit
  • Copy the logstash-1.1.13-flatjar.jar package to this /opt/logstash/lstoolkit directory
  • Update the install configuration file install-scala-logstash.conf
  • Update the eif.conf file
  • Run the ./install-scala-logstash.sh script.

The lstoolkit directory contains the following files:

/opt/logstash/lstoolkit/
- LogstashLogAnalysis_v1.1.0.0.zip
- install-scala-logstash.conf
- startlogstash-scala.sh
- install-scala-logstash.sh
- logstash-1.1.13-flatjar.jar
- start-logstash.conf
- logstash/

/opt/logstash/lstoolkit/logstash/
- conf/
-- logstash-scala.conf
- outputs/
-- eif-10.10.10.1.conf
-- scala_custom_eif.rb
- unity/

Next, we need to make a few simple changes to the Logstash configuration file to get up and running. In this simple scenario, update it with something similar to this:

input
{
  # Create your TCP input which your Netcool/OMNIbus socket gateway will connect to
  tcp
  {
    type => "netcool"
    format => "plain"
    port => 1234
    data_timeout => -1
  }
} # End of inputs

filter
{
  # Use the mutate filter to set the hostname and log path to anything you want.
  # This is used in the SCALA LogSource definition.
  mutate
  {
    type => "netcool"
    replace => ["@source_host","MYOMNIBUSNAME","@source_path","Netcool"]
  }

  # Have some events you want to drop? I used the grep filter to drop some poorly
  # formatted events whose summary message included commas, which broke SCALA DSV processing.
  grep
  {
    type => "netcool"
    match => [ "@message", ".*WAS_YN_WebAppNoActivity_W.* | .*WAS_YN_WebAppActivity_H.*" ]
    negate => true
  }
} # End of filters

output
{
  # Create a simple output file of all your raw CSV-delimited events for future use, replay, etc.
  file
  {
    type => "netcool"
    message_format => "%{@message}"
    path => "/opt/logstash/raw-events-csv.log"
  }

  # Create one or more outputs to spray events to as many SCALA boxes as you'd like
  scala_custom_eif
  {
    eif_config => "logstash/outputs/eif-10.10.10.1.conf"
    debug_log => "/tmp/scala/scala-logstash-10.10.10.1.log"
    debug_level => "debug"
  }
} # End of outputs

Note: If you have multiple SCALA systems, you can spray events to each of them by having more than one output stanza for the scala_custom_eif plugin. Each one must have its own unique eif_config and debug_log configurations. I just put in the IP address of my end points to easily identify each one.
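The mutate and grep behavior above amounts to the following plain-Python sketch. This is my own approximation of the filter logic (the regex is simplified to drop the spaces around the alternation), not Logstash internals:

```python
import re

# Events whose @message matches either pattern are dropped (grep with negate => true).
DROP_PATTERN = re.compile(
    r".*WAS_YN_WebAppNoActivity_W.*|.*WAS_YN_WebAppActivity_H.*")

def apply_filters(event):
    """Return the filtered event, or None if it should be dropped."""
    if DROP_PATTERN.match(event["@message"]):
        return None
    # mutate/replace: force the source host and path used by the SCALA LogSource.
    event["@source_host"] = "MYOMNIBUSNAME"
    event["@source_path"] = "Netcool"
    return event

kept = apply_filters({"@message": 'INSERT: "ITM_EJB_Containers",...'})
dropped = apply_filters({"@message": "WAS_YN_WebAppActivity_H fired"})
```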

To start up Logstash, use the ./startlogstash-scala.sh script. You may wish to update it to send Logstash to the background on startup. To stop Logstash, use ps aux | grep logstash and kill the Logstash process.

When we complete the next series of tasks in Netcool/OMNIbus, we can peek at the output file we created via Logstash and see raw CSV events that resemble the example below. This is what's sent across the socket gateway.

INSERT: "WAS_YN_EJBConNoActivity_W:syswasslesNode01:syswassles:KYNS::ITM_EJB_Containers",
2013-09-27T13:46:44EDT,
2013-09-27T13:46:44EDT,
"syswasslesNode01:syswassles:KYNS",
"syswasslesNode01:syswassles:KYNS",
"WAS_YN_EJBConNoActivity_W[(Method_Invocation_Rate=0.000 ) ON syswasslesNode01:syswassles:KYNS (Method_Invocation_Rate=0 )]",
1,
"tivoli_eif probe on systbsmsles",
"ITM",
"ITM_EJB_Containers",
"WAS_YN_EJBConNoActivity_W",
20,
2,
6601,
1,
"",
"",
"~",
"09/27/2013 08:29:45.000",
"sysitm.poc.ibm.com",
"S",
"TEMS",
"",
"WAS_YN_EJBConNoActivity_W",
"",
"syswasslesNode01:syswassles:KYNS",
"",
"",
"",
"",
"",
"",
"",
"",
"",
"",
"",
0,
"",
""

This is the event passed in from the TCP input, through the filters, to the scala_custom_eif output:

D, [2013-09-27T13:46:42.601000 #21554] DEBUG -- : scala_custom_eif: Received event: #<LogStash::Event @data={
"@source"=>"tcp://10.10.10.1:52074/",
"@tags"=>[

],
"@fields"=>{

},
"@timestamp"=>"2013-09-27T17:46:42.588Z",
"@source_host"=>"s3systbsmsles",
"@source_path"=>"Netcool",
"@message"=>"INSERT: \"WAS_YN_EJBConNoActivity_W:syswasslesNode01:syswassles:KYNS::ITM_EJB_Containers\",2013-09-27T13:46:44EDT,2013-09-27T13:46:44EDT,\"syswasslesNode01:syswassles:KYNS\",\"syswasslesNode01:syswassles:KYNS\",\"WAS_YN_EJBConNoActivity_W[(Method_Invocation_Rate=0.000 ) ON syswasslesNode01:syswassles:KYNS (Method_Invocation_Rate=0 )]\",1,\"tivoli_eif probe on systbsmsles\",\"ITM\",\"ITM_EJB_Containers\",\"WAS_YN_EJBConNoActivity_W\",20,2,6601,1,\"\",\"\",\"~\",\"09/27/2013 08:29:45.000\",\"sysitm.poc.ibm.com\",\"S\",\"TEMS\",\"\",\"WAS_YN_EJBConNoActivity_W\",\"\",\"syswasslesNode01:syswassles:KYNS\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",0,\"\",\"\"\n",
"@type"=>"netcool"
}>

This is the event sent out of the scala_custom_eif output in the IBM Event Integration Framework (EIF) format, ready for consumption by the SCALA EIF Receiver.

D, [2013-09-27T13:46:42.602000 #21554] DEBUG -- : scala_custom_eif: Sending tec event: AllRecords;hostname='s3systbsmsles';RemoteHost='';text='INSERT: "WAS_YN_EJBConNoActivity_W:syswasslesNode01:syswassles:KYNS::ITM_EJB_Containers",
2013-09-27T13:46:44EDT,
2013-09-27T13:46:44EDT,
"syswasslesNode01:syswassles:KYNS",
"syswasslesNode01:syswassles:KYNS",
"WAS_YN_EJBConNoActivity_W[(Method_Invocation_Rate=0.000 ) ON syswasslesNode01:syswassles:KYNS (Method_Invocation_Rate=0 )]",
1,
"tivoli_eif probe on systbsmsles",
"ITM",
"ITM_EJB_Containers",
"WAS_YN_EJBConNoActivity_W",
20,
2,
6601,
1,
"",
"",
"~",
"09/27/2013 08:29:45.000",
"sysitm.poc.ibm.com",
"S",
"TEMS",
"",
"WAS_YN_EJBConNoActivity_W",
"",
"syswasslesNode01:syswassles:KYNS",
"",
"",
"",
"",
"",
"",
"",
"",
"",
"",
"",
0,
"",
""';logpath='Netcool';END

Logstash is far more powerful than what I've shown in this very simple example. I'd encourage you to investigate its capabilities further via its website, user group or IRC channel.

Up next, we’ll walk through the configuration of Netcool/OMNIbus and get our events flowing towards Logstash and SCALA.


Wish I was there to see this talk on how Loggly has evolved at the AWS re:Invent show! Very impressive scale numbers (EPS) for the logging geeks out there. Check out their use of tools like Kafka, Storm and ElasticSearch in this deck. This is definitely something anyone planning to build or buy "logging as a service" needs to review.


Jason Wilder published a nice overview and comparison of Logstash and Fluentd today, and it's well worth a read if you're looking for a tool to help with data (log, metric, event, etc.) collection, mediation and routing.

We chose to use Logstash as part of our integration and mediation toolkit for our IT Operations Analytics (ITOA) portfolio and appreciate the flexibility it offers.
