Home
larssb edited this page May 5, 2018 · 2 revisions
I (Lars Bengtsson) started developing HealOps in early 2017. It grew out of the idea that it had to be possible to make life easier for a DevOps engineer and for a member of the on-call team at work, by:
- Improving the information received when called while on on-call duty.
- Even better, developing a system that tries to heal IT component 'x'. If that succeeds, I can keep sleeping, unaware of the mishap.
  - It will still be possible to see that this happened, as the issue is logged.
- Having the system automatically contact the person on on-call duty, instead of relying on manual labor to do this, because with manual labor:
  - The person or persons alerting the on-call engineer often do not have the information, and sometimes not the skill set, needed to judge what component 'x' being in a bad state really means for system 'x'.
  - In my experience so far, this slows the mean time to response.
  - The number of incorrect call-ups is too high.
- Automating the monitoring, healing and alerting of IT services and their components.
- Making it possible to query the health of IT services and their components over time, so that decisions are more likely to be informed and based on data.
  - Presenting and visualizing that data via dashboard systems.
- Packaging the code needed to monitor and heal IT service 'x' and its components into clearly compartmentalized entities. That makes it possible to:
  - Deploy those packages easily.
  - Modularize the packages, which makes it easier to reuse them for different IT service monitoring and healing situations.
The above is the motivation for developing HealOps.
(This synopsis is the same as the one in the repo ReadMe.)