dougmcclure.net

thoughts on business, service and technology operations and management in the digital transformation era

If you’d like to catch up, check out the first three posts in this tutorial, starting here.

The easiest way to get started is by having a good understanding of the structure of your Netcool events. With a fairly default deployment we know there are a number of standard alerts.status fields of interest, such as first and last occurrence, node, agent, alert group, alert key and manager, to name a few. Nearly every customer I have ever worked with has extended their alerts.status schema to accommodate the various probe and gateway level integrations they have as well as to support event enrichment, auto-ticketing, etc.

There’s definitely a level of maturity here that needs to be understood through brief analysis of your events via the AEL. Which slots are you populating with a high degree of completeness? Which ones help you understand the context of an event beyond the node name? Which ones are used to determine if the event is ever acted upon? Which ones will help you assess the event streams, ask questions and take actions on investigating event validity within your environment? Your goal is to ensure you have the best possible set of fields that will enable your event analysis, event analytics and most importantly the decisions, actions and next steps you will be able to take based upon your analysis.

One place you can get a complete snapshot of the alerts.status configuration is the ../omnibus/var/Tivoli_eif.NCOMS.alerts.status.def file. I used this to get the list of all the field names for easy copy and paste when building my socket gateway mapping file.

With the fields of interest identified, download and install the Netcool/OMNIbus socket gateway in accordance with the install instructions in the docs. If you don’t already own the socket gateway, check with your sales rep. In most cases, since you’re using it to route events from one C&SI product to another, there isn’t a charge. But IANAL, and T&Cs change with the wind, so check. If you have a problem with this, ping me and I can suggest a number of alternative approaches.

Once installed, the first configuration activity is to update the gateway’s socket.map file with the fields you’re interested in.

  • Make a backup copy of the original.
  • Remove the default fields you’re not interested in.
  • Add fields you are interested in.
  • Place the fields in a logical order.
  • NOTE: I’m placing the @Identifier first as the socket gateway inserts an event type (INSERT, UPDATE, DELETE) in front of each event it sends across so we don’t want that to mess up any other slot.

This is the socket map I used in our pretty default environment when sending events from ITM, APM/ITCAM, BSM, etc. For a bare bones set up, the ones I’ve highlighted in bold are probably good enough to get started.

CREATE MAPPING StatusMap
(
'' = '@Identifier',
'' = '@LastOccurrence' CONVERT TO DATE,
'' = '@FirstOccurrence' CONVERT TO DATE,
'' = '@Node',

'' = '@NodeAlias',
'' = '@Summary',
'' = '@Severity',
'' = '@Manager',
'' = '@Agent',
'' = '@AlertGroup',
'' = '@AlertKey',
'' = '@Type',
'' = '@Tally',
'' = '@Class',
'' = '@Grade',

'' = '@Location',
'' = '@ITMDisplayItem',
'' = '@ITMEventData',
'' = '@ITMTime',
'' = '@ITMHostname',
'' = '@ITMSitType',
'' = '@ITMThruNode',
'' = '@ITMSitGroup',
'' = '@ITMSitFullName',
'' = '@ITMApplLabel',
'' = '@ITMSitOrigin',
'' = '@CAM_Application_Name',
'' = '@CAM_Transaction_Name',
'' = '@CAM_SubTransaction_Name',
'' = '@CAM_Client_Name',
'' = '@CAM_Server_Name',
'' = '@CAM_Profile_Name',
'' = '@CAM_Response_Time',
'' = '@CAM_Percent_Available',
'' = '@CAM_Expected_Value',
'' = '@CAM_Actual_Value',
'' = '@CAM_Details',
'' = '@CAM_Total_Requests',
'' = '@BSMAccelerator_Service',
'' = '@BSMAccelerator_Function'
);
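
If you have a long list of field names to paste in, a small script can generate the mapping for you. This is only a minimal Python sketch: the function name build_status_map and its parameters are my own invention for illustration, and it simply reproduces the shape of the map shown above (including CONVERT TO DATE on the two occurrence fields).

```python
def build_status_map(fields,
                     date_fields=("LastOccurrence", "FirstOccurrence"),
                     map_name="StatusMap"):
    """Render a CREATE MAPPING block shaped like the socket.map above.

    fields      -- alerts.status column names, in the order you want them sent
    date_fields -- columns that should get CONVERT TO DATE (assumption: only
                   the two occurrence timestamps, as in the map above)
    """
    entries = []
    for name in fields:
        entry = "'' = '@%s'" % name
        if name in date_fields:
            entry += " CONVERT TO DATE"
        entries.append(entry)
    # Every entry except the last is followed by a comma.
    return "CREATE MAPPING %s\n(\n%s\n);" % (map_name, ",\n".join(entries))
```

Calling print(build_status_map(["Identifier", "LastOccurrence", "FirstOccurrence", "Node"])) gives you a block you can paste into socket.map. Keep Identifier first in the list for the reason noted above.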

Next, we need to set up some simple filtering to control the event types we send across the gateway. The socket.reader.tblrep.def is used to define what comes across the socket gateway and what filters we might want to apply. Here are a couple examples I’ve used.

Only send INSERTS and UPDATES (not DELETES, as they don’t send across the entire event structure) and filter out all of the internal TBSM events, which are Class 12000.

REPLICATE INSERTS, UPDATES FROM TABLE 'alerts.status'
USING MAP 'StatusMap'
FILTER WITH 'Class !=12000';

Only send INSERTS and UPDATES (not DELETES, as they don’t send across the entire event structure) and filter out events with Severity 0, 1 and 2.

REPLICATE INSERTS, UPDATES FROM TABLE 'alerts.status'
USING MAP 'StatusMap'
FILTER WITH 'Severity >=3';

I couldn’t work out the syntax for a more complex filter, which I would have liked for finer-grained filtering, so these had to do.
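
Before committing to a filter, it can help to estimate how many events it would let through. The following is a minimal Python sketch, not gateway syntax: it assumes you already have some events available as dicts keyed by alerts.status field names (from an export, for example), and the function names are hypothetical.

```python
def passes_class_filter(event):
    """Python equivalent of the first filter: Class != 12000."""
    return int(event["Class"]) != 12000


def passes_severity_filter(event):
    """Python equivalent of the second filter: Severity >= 3."""
    return int(event["Severity"]) >= 3


def preview(events, predicate):
    """Return only the events the given filter would let across."""
    return [e for e in events if predicate(e)]
```

For example, len(preview(events, passes_severity_filter)) tells you how many of your sampled events would be sent with the Severity filter in place.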

Next, the core socket gateway properties need to be configured. Edit the NCO_GATE.props file as follows.

#Update these based on your install preferences
MessageLevel : 'warn'
MessageLog : '$OMNIHOME/log/NCO_GATE.log'
Name : 'NCO_GATE'
PropsFile : '$OMNIHOME/etc/NCO_GATE.props'

#This will be the IP and Port for your logstash installation and the TCP Input you use
Gate.Socket.Host : '10.10.10.1'
Gate.Socket.Port : 1234

#These will create a comma separated (CSV) event format with fields wrapped in " ".
Gate.Socket.EndString : '"'
Gate.Socket.StartString : '"'
Gate.Socket.Separator : ','

#This sets First/Last Occurrence format to mimic ISO8601 format supported by SCALA
Gate.Socket.DateFormat : '%Y-%m-%dT%H:%M:%S%Z'
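
To make the effect of these properties concrete, here is a minimal Python sketch of how one event line ends up looking. It is an illustration only, not the gateway's code: the function name format_row is made up, the time zone label is passed in explicitly rather than derived from %Z, and the rule that strings are wrapped while dates and integers are not is inferred from the sample event shown later in this series.

```python
from datetime import datetime


def format_row(op, values, start='"', end='"', sep=",", tz_label=""):
    """Build one line in the shape the props above produce.

    op       -- event type the gateway prefixes (INSERT, UPDATE, DELETE)
    values   -- field values in socket.map order
    tz_label -- zone abbreviation to append to dates (stands in for %Z)
    """
    parts = []
    for value in values:
        if isinstance(value, datetime):
            # Same layout as Gate.Socket.DateFormat, minus the %Z lookup.
            parts.append(value.strftime("%Y-%m-%dT%H:%M:%S") + tz_label)
        elif isinstance(value, str):
            # Gate.Socket.StartString / EndString wrap string fields.
            parts.append(start + value + end)
        else:
            parts.append(str(value))
    # Gate.Socket.Separator goes between fields.
    return "%s: %s" % (op, sep.join(parts))
```

Note that nothing here escapes a quote or a comma inside a field, which is exactly why a summary containing commas can cause trouble further down the line.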

Here’s how to start the socket gateway for reference later. We’ll need the remote end of the TCP connection to be started up first.

../omnibus/bin/nco_g_socket &

You can check that your gateway is running by running the ps aux | grep nco_g command. To stop the gateway, kill the process.
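
If Logstash isn't ready yet and you just want to confirm the gateway is sending something, any TCP listener on the configured host and port will do as the remote end. Below is a minimal Python sketch of such a stand-in. The function names are hypothetical, it handles a single connection, and it is meant for a quick smoke test only, not as a replacement for the Logstash TCP input.

```python
import socket


def open_listener(host="0.0.0.0", port=1234):
    """Open a listening TCP socket (port 1234 matches Gate.Socket.Port above)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def capture_one_connection(srv, sink):
    """Accept one connection and append each newline-terminated line to sink."""
    conn, _addr = srv.accept()
    with conn:
        buf = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # peer closed the connection
                break
            buf += chunk
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                sink.append(line.decode("utf-8", errors="replace"))
        if buf:  # keep a trailing partial line, if any
            sink.append(buf.decode("utf-8", errors="replace"))
```

Start the listener first, then the gateway; capture_one_connection(open_listener(), lines) returns once the gateway disconnects, leaving the received events in lines.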

Check the output file you created on the Logstash server to verify that you’ve captured some events from the gateway. If you see some there, we’re all set for our next activity to set up annotation and indexing of the events in SCALA v1103.

Now that I’m done with what felt like months of work for our big demo at IBM’s IOD show last week, let me get this series done! Next up we’ll walk through the use of Logstash to serve as the collection and mediation tool for streaming in events from Netcool/OMNIbus and getting them indexed within SCALA v1103. We’re still using Logstash v1.1.13 here. We’ll have support for Logstash v1.2.x in our next release very soon. NOTE: With SCALA v1103 now available, that will be what I mention moving forward.

To catch up, check out part 1 and part 2.

On a separate system if at all possible, prepare for installation of Logstash v1.1.13 and the SCALA Logstash toolkit.

  • Download Logstash v1.1.13 from here
  • Create a new directory for the Logstash environment. I generally create /opt/logstash.
  • Copy the SCALA Logstash Toolkit to this directory
  • Review the SCALA Logstash Toolkit installation steps
  • Explode the SCALA Logstash Toolkit
  • Copy the logstash-1.1.13-flatjar.jar package to this /opt/logstash/lstoolkit directory
  • Update the install configuration file install-scala-logstash.conf
  • Update the eif.conf file
  • Run the ./install-scala-logstash.sh script.

The lstoolkit directory contains the following files:

/opt/logstash/lstoolkit/
- LogstashLogAnalysis_v1.1.0.0.zip
- install-scala-logstash.conf
- startlogstash-scala.sh
- install-scala-logstash.sh
- logstash-1.1.13-flatjar.jar
- start-logstash.conf
- logstash/

/opt/logstash/lstoolkit/logstash/
- conf/
-- logstash-scala.conf
- outputs/
-- eif-10.10.10.1.conf
-- scala_custom_eif.rb
- unity/

Next, we need to make a few simple configurations in the Logstash configuration file to get us up and running. In this simple scenario, the following configuration file for Logstash should be updated with a configuration similar to this:

input
{
#Create your TCP input which your Netcool/OMNIbus socket gateway will connect to

tcp
{
type=> "netcool"
format=> "plain"
port=> 1234
data_timeout=> -1
}

} #End of Inputs

filter
{
#Use the Mutate filter to set the hostname and log path to anything you want. This is used in the SCALA LogSource definition.

mutate
{
type=> "netcool"
replace=>["@source_host","MYOMNIBUSNAME","@source_path","Netcool"]
}

#Have some events you want to drop out? I used the Grep filter type to filter out some poorly formatted events whose summary message included commas which broke SCALA DSV processing

grep
{
type=> "netcool"
match=>[ "@message",".*WAS_YN_WebAppNoActivity_W.* | .*WAS_YN_WebAppActivity_H.*" ]
negate=> true
}

} #End of Filters

output
{
#Create a simple output file of all your raw CSV delimited events for future use, replay, etc.

file
{
type=> "netcool"
message_format=> "%{@message}"
path=> "/opt/logstash/raw-events-csv.log"
}

#Create one or more outputs to spray events to as many SCALA boxes as you'd like

scala_custom_eif
{
eif_config=> "logstash/outputs/eif-10.10.10.1.conf"
debug_log=> "/tmp/scala/scala-logstash-10.10.10.1.log"
debug_level=> "debug"
}

} #End of Outputs

Note: If you have multiple SCALA systems, you can spray events to each of them by having more than one output stanza for the scala_custom_eif plugin. Each one must have its own unique eif_config and debug_log configurations. I just put in the IP address of my end points to easily identify each one.
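
With more than a couple of SCALA systems, writing those stanzas by hand gets tedious. Here is a minimal Python sketch that prints one scala_custom_eif stanza per end point, using the same IP-based naming for eif_config and debug_log as the configuration above. The function names are my own, and you still need to create each eif-<ip>.conf file yourself.

```python
def scala_output_stanza(ip, debug_level="debug"):
    """Render one scala_custom_eif output stanza for a single SCALA end point."""
    return (
        "scala_custom_eif\n"
        "{\n"
        'eif_config=> "logstash/outputs/eif-%s.conf"\n'
        'debug_log=> "/tmp/scala/scala-logstash-%s.log"\n'
        'debug_level=> "%s"\n'
        "}" % (ip, ip, debug_level)
    )


def scala_outputs(ips):
    """Render stanzas for several end points, separated by a blank line."""
    return "\n\n".join(scala_output_stanza(ip) for ip in ips)
```

print(scala_outputs(["10.10.10.1", "10.10.10.2"])) produces two stanzas to paste inside the output section, each with its own eif_config and debug_log.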

To start up Logstash, use the ./startlogstash-scala.sh script. You may wish to update this to send Logstash to the background when starting up. To stop Logstash, use ps aux | grep logstash and kill the Logstash process.

Once we complete the next series of tasks in Netcool/OMNIbus, we can peek at the output file we created via Logstash and see raw CSV events that resemble the example below. This is what’s sent across the socket gateway.

INSERT: "WAS_YN_EJBConNoActivity_W:syswasslesNode01:syswassles:KYNS::ITM_EJB_Containers",2013-09-27T13:46:44EDT,2013-09-27T13:46:44EDT,"syswasslesNode01:syswassles:KYNS","syswasslesNode01:syswassles:KYNS","WAS_YN_EJBConNoActivity_W[(Method_Invocation_Rate=0.000 ) ON syswasslesNode01:syswassles:KYNS (Method_Invocation_Rate=0 )]",1,"tivoli_eif probe on systbsmsles","ITM","ITM_EJB_Containers","WAS_YN_EJBConNoActivity_W",20,2,6601,1,"","","~","09/27/2013 08:29:45.000","sysitm.poc.ibm.com","S","TEMS","","WAS_YN_EJBConNoActivity_W","","syswasslesNode01:syswassles:KYNS","","","","","","","","","","","",0,"",""

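If you want to poke at these raw events yourself, they are easy to pull apart. The following is a minimal Python sketch, assuming the line layout shown above (an event type, a colon and a space, then the CSV fields in socket.map order). The function name and the shortened FIELDS list are mine; extend FIELDS to match your own map.

```python
import csv

# First few field names from the socket.map, in order (shortened for the example).
FIELDS = ["Identifier", "LastOccurrence", "FirstOccurrence",
          "Node", "NodeAlias", "Summary", "Severity"]


def parse_gateway_line(line, fields=FIELDS):
    """Split one gateway line into (event_type, {field: value}).

    Values are returned as strings. Fields beyond the end of `fields`
    are ignored because zip() stops at the shorter sequence.
    """
    op, sep, payload = line.rstrip("\r\n").partition(": ")
    if not sep:
        raise ValueError("no event type prefix found")
    # The csv module copes with quoted fields, including embedded commas.
    values = next(csv.reader([payload]))
    return op, dict(zip(fields, values))
```

Because @Identifier is first in the map, the event type prefix only ever has to be split off the front of the line, and the remaining fields line up with the map one for one.
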
This is the event passed in from the TCP Input and through the filters to the scala_custom_eif output:

D, [2013-09-27T13:46:42.601000 #21554] DEBUG -- : scala_custom_eif: Received event: #"tcp://10.10.10.1:52074/",
"@tags"=>[],
"@fields"=>{},
"@timestamp"=>"2013-09-27T17:46:42.588Z",
"@source_host"=>"s3systbsmsles",
"@source_path"=>"Netcool",
"@message"=>"INSERT: \"WAS_YN_EJBConNoActivity_W:syswasslesNode01:syswassles:KYNS::ITM_EJB_Containers\",2013-09-27T13:46:44EDT,2013-09-27T13:46:44EDT,\"syswasslesNode01:syswassles:KYNS\",\"syswasslesNode01:syswassles:KYNS\",\"WAS_YN_EJBConNoActivity_W[(Method_Invocation_Rate=0.000 ) ON syswasslesNode01:syswassles:KYNS (Method_Invocation_Rate=0 )]\",1,\"tivoli_eif probe on systbsmsles\",\"ITM\",\"ITM_EJB_Containers\",\"WAS_YN_EJBConNoActivity_W\",20,2,6601,1,\"\",\"\",\"~\",\"09/27/2013 08:29:45.000\",\"sysitm.poc.ibm.com\",\"S\",\"TEMS\",\"\",\"WAS_YN_EJBConNoActivity_W\",\"\",\"syswasslesNode01:syswassles:KYNS\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",0,\"\",\"\"\n",
"@type"=>"netcool"
}>

This is the event sent out of the scala_custom_eif output in the IBM Event Integration Framework (EIF) format fit for consumption by the SCALA EIF Receiver.

D, [2013-09-27T13:46:42.602000 #21554] DEBUG -- : scala_custom_eif: Sending tec event: AllRecords;hostname='s3systbsmsles';RemoteHost='';text='INSERT: "WAS_YN_EJBConNoActivity_W:syswasslesNode01:syswassles:KYNS::ITM_EJB_Containers",2013-09-27T13:46:44EDT,2013-09-27T13:46:44EDT,"syswasslesNode01:syswassles:KYNS","syswasslesNode01:syswassles:KYNS","WAS_YN_EJBConNoActivity_W[(Method_Invocation_Rate=0.000 ) ON syswasslesNode01:syswassles:KYNS (Method_Invocation_Rate=0 )]",1,"tivoli_eif probe on systbsmsles","ITM","ITM_EJB_Containers","WAS_YN_EJBConNoActivity_W",20,2,6601,1,"","","~","09/27/2013 08:29:45.000","sysitm.poc.ibm.com","S","TEMS","","WAS_YN_EJBConNoActivity_W","","syswasslesNode01:syswassles:KYNS","","","","","","","","","","","",0,"",""';logpath='Netcool';END

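To make the shape of that EIF string easier to see, here is a minimal Python sketch that assembles the same layout from its parts. It only mirrors the debug line above. The function name to_eif is hypothetical, and I have not reproduced whatever escaping the real scala_custom_eif plugin applies, so treat it as an illustration rather than the plugin's logic.

```python
def to_eif(text, hostname, logpath, event_class="AllRecords", remote_host=""):
    """Assemble an EIF-style string laid out like the debug output above.

    text     -- the raw event line (Logstash @message)
    hostname -- value set via @source_host in the mutate filter
    logpath  -- value set via @source_path in the mutate filter
    No escaping of embedded single quotes is attempted here.
    """
    return "%s;hostname='%s';RemoteHost='%s';text='%s';logpath='%s';END" % (
        event_class, hostname, remote_host, text, logpath)
```

The hostname and logpath slots carry whatever you set in the mutate filter, which is why those two values need to match the SCALA LogSource definition.
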
Logstash is far more powerful than what I’ve shown in this very simple example. I’d encourage you to investigate its capabilities further through the website, the user group or IRC.

Up next, we’ll walk through the configuration of Netcool/OMNIbus and get our events flowing towards Logstash and SCALA.

Wish I was there to see this talk on how Loggly has evolved at the AWS re:Invent show! Very impressive scale numbers (EPS) for logging geeks out there. Check out their use of tools like Kafka, Storm and ElasticSearch in this deck. This is definitely something anyone planning on building or buying “logging as a service” needs to review.
