Hindle Ottawa Blog - RarelyOnce

Rarely Once

You Rarely Do Something Just Once.

When you think that automating an activity is not worth it, think again. Even in a single repair you will test a number of things to see what is working and what is not. During the fix you will repeat some of these actions and checks while the fault is still present. You will test one final time after the fix is completed. Just after restart you may check closely to ensure it does not fail again. For days afterward you will likely be monitoring it regularly. If another failure occurs you will want to confirm that the earlier fix is not part of the problem. This is for one fix. Over time you will accumulate many faults and fixes that each deserve the same commitment to quality. This perspective encourages automating actions, because there is longer-term value in being able to do things over and over.

A software test is another example. A test of new functionality in a module is needed when it is first built. It is also needed for every regression test of the module after initial delivery. It is needed again when direct changes to the function are made later. In a typical product year this amounts to between 6 and 20 runs of that test across CI/CD, Integration and Pre-Production environments. If these tests have to be done by hand, they are a major time commitment for even one test. This level of testing is needed for the tens of thousands of capability points present within even the smallest of software modules. It is this cost model of software that demonstrates why automated tests should pre-date a module.
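The cost model above can be put into rough numbers. The sketch below is only an illustration: the 6 to 20 runs per year come from the text, while the 5 minutes per manual run and the 10,000 capability points are assumed figures, and the function name manual_test_hours is made up for this example.

```python
def manual_test_hours(runs_per_year, minutes_per_run, capability_points=1):
    """Hours per year spent repeating tests by hand."""
    return runs_per_year * minutes_per_run * capability_points / 60


# Assumed figures: 5 minutes per manual run, 10,000 capability points.
# Only the 6 and 20 runs per year are taken from the text.
for runs in (6, 20):
    hours = manual_test_hours(runs, minutes_per_run=5, capability_points=10_000)
    print(f"{runs} runs/year: {hours:,.0f} hours by hand")
```

Even at the low end of these assumed figures the manual effort runs to thousands of hours a year, which is the argument for writing the automated test before the module.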

Monitoring of all types is by definition something that is not done once. To monitor is to check something on a time profile appropriate for the object and the reaction environment. A daily monitor is infinitely more than none. Hourly is 24x more than daily. Every minute is 60x more than hourly. The right frequency weighs the importance of the object and the load that monitoring places on the available resources against the ability to react to any actionable states identified. Monitoring also highlights the problem of "Information Overload".
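The frequency multipliers can be checked with a few lines of arithmetic. This is a minimal sketch; the function name checks_per_period and the one-year window are choices made for this example, not anything prescribed in the text.

```python
from datetime import timedelta


def checks_per_period(interval, period=timedelta(days=365)):
    """Number of whole monitoring checks that fit into the period."""
    return period // interval


intervals = {
    "daily": timedelta(days=1),
    "hourly": timedelta(hours=1),
    "every minute": timedelta(minutes=1),
}

for name, interval in intervals.items():
    print(f"{name}: {checks_per_period(interval)} checks per year")
```

Each step up in frequency multiplies the number of results that something, or someone, has to react to, which is where the balance against resources and reaction ability comes in.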

Not only do we want to monitor one computer in our Computer Community, we should also be monitoring every computer and every other component of the whole Computer Community. This multiplies the number of nodes to watch and the number of types of nodes to be watched. While the type of watching is the same on each node of the same type, the fact that these are different nodes means that it is not done once.

Logs are a treasure trove of information about what is going on within your Computer Community. Scanning can span a large and increasing number of available logs. Traditionally we have thought about monitoring System, Application and Security logs. Now we can also see AD logs, network logs and PowerShell logs. All of these are examples of doing similar things over and over again: on different logs, on different machines and potentially multiple times per day, in order to effect real operational control over the Computer Community. Once you can repeatably find a problem, you can start to look at how to repeat that problem's remediation.
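A repeatable log scan can start very small. The sketch below assumes the logs have already been exported as plain text files; the function name scan_logs, the file names and the patterns are all hypothetical and stand in for whatever your own environment provides.

```python
import re
from pathlib import Path


def scan_logs(log_paths, patterns):
    """Return (path, line_number, line) for every line matching any pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    hits = []
    for path in log_paths:
        path = Path(path)
        if not path.is_file():
            continue  # a missing log is skipped rather than treated as fatal
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(rx.search(line) for rx in compiled):
                    hits.append((str(path), number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical exported log files and patterns, for illustration only.
    logs = ["system.log", "application.log", "security.log"]
    for path, number, line in scan_logs(logs, [r"\berror\b", r"\bfail(ed|ure)\b"]):
        print(f"{path}:{number}: {line}")
```

The same function can be pointed at a different list of logs, on a different machine, on whatever schedule suits the environment, so the scan is written once and run many times.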

The upshot is that whenever you do something once, think about when you will need, want or hope to have it available again. It likely will be needed again. Build your toolkit of these pieces incrementally, improve the pieces incrementally, expand the pieces incrementally. Because You Rarely Do Something Just Once.

Update 2021-02-21 rwh