Bookmarks for November 19th through December 11th

by delicious on December 11, 2013

in General

These are my links for November 19th through December 11th:

The Netflix Tech Blog: Announcing Suro: Backbone of Netflix’s Data Pipeline – Suro, which we are proud to announce as our latest offering as part of the NetflixOSS family, serves as the backbone of our data pipeline. It consists of a producer client, a collector server, and a plugin framework that allows events to be dynamically filtered and dispatched to multiple consumers.
Sensu | An open source monitoring framework – Designed for the Cloud. The Cloud introduces new challenges to monitoring tools; Sensu was created with them in mind. Sensu will scale along with the infrastructure that it monitors.
datastack.io – data integration as a service – Collect data. Share insights. Kind of like Logstash or Heka, but without the pain.
Glassbeam Begins Where Splunk Ends – Going Beyond Operational Intelligence with IoT Logs | Glassbeam – Glassbeam SCALAR is a flexible, hyper scale cloud-based platform capable of organizing and analyzing complex log bundles including syslogs, support logs, time series data and unstructured data generated by machines and applications. By creating structure on the fly based on the data and its semantics, Glassbeam’s platform allows traditional BI tools to plug into this parsed multi-structured data so companies can leverage existing BI and analytics investments without having to recreate their reports and dashboards. By mining machine data for product and customer intelligence, Glassbeam goes beyond traditional log management tools to leverage this valuable data across the enterprise. With a focus on providing value to the business user, Glassbeam’s platform and applications enable users to reduce costs, increase revenues and accelerate product time to market. In fact, Enterprise Apps Today’s Drew Robb recognized this critical value proposition naming Glassbeam a hot Big Data startup for analytics, which is attracting interest from investors, partners and customers. Today’s acquisition serves to showcase a market that is heating up, and new requirements around data analytics. But this is only the start and Glassbeam deliberately picks up where Splunk ends. We remain committed to cutting through the clutter and providing a clear view of operational AND business analytics to users across the enterprise.
Splunk Buys Cloudmeter to Boost Operational Intelligence Portfolio – The acquisition of Cloudmeter rounds out Splunk's portfolio with a capability to analyze machine data from a wider range of sources. Financial terms of the deal were not disclosed. The transaction was funded with cash from Splunk's balance sheet, the company said. Indeed, the addition of Cloudmeter will enhance the ability of Splunk customers to analyze machine data directly from their networks and correlate it with other machine-generated data to gain insights across Splunk's core use cases in application and infrastructure management, IT operations, security and business analytics.
Netuitive Files for Ground-Breaking New Patent in IT Operations Analytics – Press Release – Digital Journal – The patent filing is led by Dr. Elizabeth A. Nichols, Chief Data Scientist for Netuitive, a quantitative analytics expert focused on extending Netuitive's portfolio of IT Operations Analytics (ITOA) solutions to new applications and services. "Netuitive is committed to delivering industry leading IT Operations Analytics that proactively address business performance," said Dr. Nichols. "In addition, Netuitive's research and development is actively focused on new algorithm initiatives that will further advance our abilities to monitor new managed elements associated with next-generation IT architecture and online business applications."
Legume for Logstash – Legume Web Interface for Logstash & Elasticsearch. Legume is a zero-config web interface, run entirely on the client side, that lets you browse and search log messages in Elasticsearch indexed by Logstash.
Deploying an application to Liberty profile on Cloud Foundry | WASdev – As part of the partnership between Pivotal and IBM we have created the WebSphere Application Server Liberty Buildpack, which enables Cloud Foundry users to easily deploy apps on Liberty profile.
IBM’s project Neo takes aim at the data discovery and visualisation market – MWD’s Insights blog – Project Neo is IBM’s answer to data visualisation and discovery for business users. It promises to help those who don’t possess specialist skills or training in analytics, to visually interact with their data and surface interesting trends and patterns by using a more simplistic dashboard interface that helps and guides users in the analysis process. Whereas previous tool incarnations are often predisposed to using data models, scripting or require knowledge of a query language, Project Neo takes a different tack. It aims to bypass this approach by enabling users to ask questions in plain English against a raw dataset (including CSV or Excel files) and return results in the form of interactive visualisations.
Machine learning is way easier than it looks | Inside Intercom – Like all of the best frameworks we have for understanding our world, e.g. Newton’s Laws of Motion, Jobs to be Done, Supply & Demand — the best ideas and concepts in machine learning are simple. The majority of literature on machine learning, however, is riddled with complex notation, formulae and superfluous language. It puts walls up around fundamentally simple ideas.
Let’s take a practical example. Say we wanted to include a “you might also like” section at the bottom of this post. How would we go about that?
Where Are My AWS Logs? – Logentries Blog – Over my time at Logentries, we’ve had users contact us about where to find their logs while they were setting up Logentries. As a result, we recently released a feature for Amazon Web Services called the AWS Connector, which automatically discovers your log files across your Linux EC2 instances, no matter how many instances you have. Finding your Linux logs, however, may only be a first step in the process, as AWS logs can be all over the map, so to speak. So where are they located? Here’s where you can start to find some of these.
Responsive Log Management… Like Beauty, it’s in the Eye of the Bug-holder – As a software engineer, I’m responsible for the code I write and responsible for what we ship. But designing, building, and deploying SaaS is a real challenge – it means software developers are now responsible for making sure the live system runs well too. This is a real challenge, but with Loggly I get real-time telemetry on how my code is running, how my systems are behaving – and how well our software meets the needs of our customers.
Mahout Explained in 5 Minutes or Less – blog.credera.com – In the spectrum of big data tools, Apache Mahout is a machine-learning engine that fits into the data mining category of the big data landscape. It is one of the more interesting tools in the big data toolbox because it allows you to extract actionable tasks from a big data set. What do we mean by actionable tasks? Things such as purchase recommendations based on a similar customer’s buying habits, or determining whether a user comment is spam based on the word clusters it contains.
Change management using Evolven’s IT Operations Analytics – TechRepublic – Evolven is designed to track and report change across an array of operating systems, databases, servers, and more to help pinpoint inconsistencies. It can also assist you in preventing issues and determining root causes of problems. Evolven can be helpful with automation—to find out why things didn’t work as expected and what to do next—and can also alert you to suspicious or unauthorized changes in your environment.
Human and technological policies go hand-in-hand to balance each other and ensure the best possible results. Whereas my last article on the subject referenced the human processes IT departments should follow during change management, I’ll now take a look at technology that can back those processes up by examining what Evolven does and what benefits it can bring.
Fluentd vs Logstash – Jason Wilder’s Blog – Fluentd and Logstash are two open-source projects that focus on the problem of centralized logs. Both projects address the collection and transport aspect of centralized logging using different approaches.
This post will walk through a sample deployment to see how each differs from the other. We’ll look at the dependencies, features, deployment architecture and potential issues. The point is not to figure out which one is the best, but rather to see which one would be a better fit for your environment.
astanway/crucible · GitHub – Crucible is a refinement and feedback suite for algorithm testing. It was designed to be used to create anomaly detection algorithms, but it is very simple and can probably be extended to work with your particular domain. It evolved out of a need to test and rapidly generate standardized feedback for iterating on anomaly detection algorithms.
Now in Public Beta – Log Search & Log Watch | The AppFirst Blog – The decision to open our new log applications to the public was not one taken lightly. Giving our customers the ability to search all of their log files for any keywords is quite taxing on our system, so we had to take several precautions. To ensure the reliability of our entire architecture, we decided to create a separate web server solely responsible for retrieving log data from our persistent storage, HBase. By making this an isolated subsystem, we don’t run the risk of a potentially large query bogging everything else down as well.
Log Insight: Remote Syslog Architectures | VMware Cloud Management – VMware Blogs – When architecting a syslog solution, it is important to understand the requirements both from a business and a product perspective. I would like to discuss the different remote syslog architectures that are possible when using vCenter Log Insight.
Why We Need a New Model for Anomaly Detection: #1 | Metafor Software – I’m not talking about anomaly detection in stable enterprise IT environments. Those are doing just fine. Those infrastructures have mature, tested procedures for rolling out software updates and implementing new applications on an infrequent basis (still running FORTRAN written in the 70s, on servers from the 70s, yeah, that’s a thing).

I’m talking about anomaly detection in the cloud, where the number of virtual machines fluctuates as often as application rollouts. Current solutions for anomaly detection track dozens or even hundreds of metrics per server in an attempt to divine normal performance and spot anomalous behavior. An ideal solution would adapt itself to the quirks of each metric, to different application scenarios, and to machine re-configurations.

This is a problem that lends itself to machine learning techniques, but it’s still an incredibly difficult problem to solve. Why?
Beyond The Pretty Charts – A Report From #devopsdays in Austin | Metafor Software – Don’t just look at timeline charts. We’ve fallen into the trap of looking at all the pretty charts as time series charts. When we do that, we end up missing some important characteristics. For example, a simple histogram of the data, instead of just a time chart, can tell you a lot about anomalies and distribution. Using different kinds of visualization is crucial to giving us a different perspective on our data.
Server Anomaly Detection | Predictive IT Analytics | Config Drift Monitoring | Metafor Software – Know about problems before your threshold-based monitoring tool does. Get alerted to issues your thresholds will never catch.
Metafor’s machine learning algorithms alert you to anomalous behavior in your servers, clusters, applications, and KPIs.

Next post: Event Analysis using SmartCloud Analytics Log Analysis (SCALA) v1.2.0 – Example logstash v1.3.x Configuration

Previous post: Event Analysis using SmartCloud Analytics Log Analysis (SCALA) v1103 – Setting up Netcool/OMNIbus