💾 Archived View for capsule.adrianhesketh.com › 2014 › 09 › 01 › logging-maturity-level captured on 2024-12-17 at 09:38:00. Gemini links have been rewritten to link to archived content
capsule.adrianhesketh.com
Logging Maturity Level
Chaos
- You don't know whether a program has failed until a user tells you.
- You have no idea why the program failed.
- You don't even know whether the program is wrong or the user's expectations are wrong because the program or business process is completely undocumented.
- Your support team can't read the logs because they don't have access to the server.
Normal
- When a customer tells you that something isn't working, you can look at the logs and sometimes find out where it failed.
- Sometimes you have enough information in the logs to look in the database and find out why the program failed.
- Sometimes you even know which users were affected.
- The program filled up a server's disk because circular logging wasn't enabled and you caused an outage.
- You enabled circular logging but capped it at 10MB, which only holds about 60 seconds of logs, so you can't find anything useful anyway.
- Your support team can't read the logs because the logs are so big that their text editor crashes.
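The circular logging problem above comes down to arithmetic: the size cap, divided by how fast you write, is how much history you keep. Here is a minimal sketch using Python's standard `logging.handlers.RotatingFileHandler`. The names `retention_seconds` and `build_logger`, the logger name and the example figures are assumptions made for illustration, not anything from a real system.

```python
import logging
import logging.handlers


def retention_seconds(max_bytes: int, backup_count: int, bytes_per_second: float) -> float:
    """Estimate how many seconds of history a rotating log can hold.

    Assumes a steady write rate; the active file plus every backup counts.
    """
    total_bytes = max_bytes * (backup_count + 1)
    return total_bytes / bytes_per_second


def build_logger(path: str, max_bytes: int, backup_count: int) -> logging.Logger:
    """Create a logger whose file rolls over at max_bytes, keeping backup_count old files."""
    logger = logging.getLogger("app.rotating_example")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    # backup_count must be at least 1, otherwise RotatingFileHandler never rolls over.
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

If 10MB lasts 60 seconds, you are writing about 10MB a minute, so keeping a single day of logs needs roughly 14GB of files. Running the numbers before choosing the cap avoids both the full disk and the useless 60 second window.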
Useful
- Log messages are recorded at severity levels which match their actual severity, e.g. ERROR isn't used for DEBUG messages.
- Log messages contain enough information to investigate problems.
- Log messages are retained for long enough to be useful.
- When you get the time, you occasionally look at the logs and create a support ticket if you find an error or warning.
- Your support team can't read the logs because they can't read a stack trace and infer meaning from it, other than "it's broken".
- There are so many errors being logged that you wouldn't dare to get the system to email you in case you crash the email server.
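As a rough illustration of the first two points in this list, here is a sketch of log calls where the level matches the real severity and each message carries enough context to investigate. The order-processing scenario, the function names and the `user_id` and `order_id` fields are all hypothetical; only the standard `logging` calls are real.

```python
import logging


def make_logger(stream) -> logging.Logger:
    """Build a logger that writes level, context fields and message to the given stream."""
    logger = logging.getLogger("app.severity_example")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    # user_id and order_id are supplied per call through the `extra` argument.
    handler.setFormatter(
        logging.Formatter("%(levelname)s user=%(user_id)s order=%(order_id)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def process_order(logger: logging.Logger, user_id: str, order_id: str, stock: int) -> bool:
    """Pretend to process an order, logging each outcome at the level it deserves."""
    ctx = {"user_id": user_id, "order_id": order_id}
    # DEBUG: detail that only matters when you are diagnosing something.
    logger.debug("checking stock level: %d", stock, extra=ctx)
    if stock <= 0:
        # ERROR: the operation failed and someone was affected.
        logger.error("order failed: item out of stock", extra=ctx)
        return False
    if stock < 5:
        # WARNING: it worked, but something needs attention soon.
        logger.warning("order accepted but stock is low: %d left", stock, extra=ctx)
    # INFO: normal business events.
    logger.info("order accepted", extra=ctx)
    return True
```

Because every line names the user and the order, you know who was affected without going to the database first.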
Elite
- Logs are collected centrally and retained for a set period using something like logstash or a RabbitMQ log4net appender.
- You can produce a graph of error rates for applications over time.
- This is regularly reviewed and attempts made to reduce the number of errors.
- Your support team can read and understand the messages, because they make sense.
- An error being logged is actually cause for concern.
- Your support team contact customers who've experienced a problem to apologise and explain what's being done to resolve the issue before the customer has had a chance to raise a support ticket.
- You feel a deep sense of inner calm.
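Once logs are in one place, the error rate graph mentioned above is mostly a counting exercise. Here is a minimal sketch, assuming the central store can hand you records as (timestamp, application, level) tuples. That record shape and the function name `error_counts_per_hour` are assumptions for illustration; this is not the format logstash or log4net actually emit.

```python
from collections import Counter
from datetime import datetime


def error_counts_per_hour(records):
    """Count ERROR records per (application, hour) bucket.

    `records` is an iterable of (timestamp, application, level) tuples,
    a hypothetical shape chosen for this example.
    """
    counts = Counter()
    for timestamp, app, level in records:
        if level == "ERROR":
            # Truncate the timestamp to the start of its hour to form the bucket.
            hour = timestamp.replace(minute=0, second=0, microsecond=0)
            counts[(app, hour)] += 1
    return counts


sample = [
    (datetime(2014, 9, 1, 9, 5), "billing", "ERROR"),
    (datetime(2014, 9, 1, 9, 40), "billing", "ERROR"),
    (datetime(2014, 9, 1, 9, 41), "billing", "INFO"),
    (datetime(2014, 9, 1, 10, 2), "web", "ERROR"),
]
hourly = error_counts_per_hour(sample)
```

Plot the counts per application against the hour and you have the graph to review. The aim is for the line to trend towards zero, so that a single error is worth noticing.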