My take on on-call
On-call is a huge issue in every tech company and a large discussion topic in tech communities and fora. Most people hate it; others burn out or switch positions, jobs and careers just to avoid being on-call.
I was on-call myself in the past and eventually I hated it. I remember getting automated calls at 3:00am for at least a whole week because our services were down. I had to wake up, open my laptop, trace the logs, figure out whether it was a software or hardware (infrastructure) issue, restart services or the Docker agent on the affected hosts or our load balancer, check that everything was back to normal, close the laptop and get back to sleep after spending anywhere from 15 minutes to 2 hours solving this. Sometimes I even had to patch the code and push a temporary container to hold until the morning, when the rest of the team would properly solve it. Sometimes I couldn’t even sleep again due to the stress. The issues persisted for weeks and months, and alerts came in all day through every means of communication (emails, Slack alerts, calls, SMS, etc.). We were just a handful of people in the company, the money was good and I had to do it. Eventually the company grew a little bit more, we hired some good DevOps engineers, most services stabilized a little bit and I was removed from on-call. But I hated it. Everybody hated it. It was a big discussion topic back then and I complained a lot.
Fast forward some years to almost 6 months ago, when my wife was 40+ weeks pregnant. The way pregnancy works in Greece is that each woman is assigned a midwife by her doctor, who will take care of her until birth. This means anything from simple things like answering questions and doing seminars and 1-1 sessions to actually delivering the baby together with the doctor and supporting the mother for the next few weeks. This is a real 24/7 on-call for weeks until birth. I remember my wife calling her in the middle of the night, completely stressed, to ask simple questions. The midwife was always calm, trying to understand the situation and provide clear feedback and responses. On one occasion when something really worrying happened, she suggested meeting at the hospital at 4:00am to be sure that everything was fine with the baby. That night she woke up and left her house, 1 hour away from the hospital, to meet my wife and do the medical examination together. I cannot count how many times we called her in the last 2 days before birth. But I remember her calm tone and caring responses on every single phone call. I remember how she guided us through the birth night when the contractions started, staying awake all night to answer our calls and text messages until we decided to go to the hospital at 4:00am, where she joined us and then stayed almost half a day with my wife to successfully deliver the baby. I will remember her forever, and I also know how little she gets paid for this while taking care of multiple pregnant women in parallel. A continuous on-call.
In retrospect, I look back at my on-call whining and realize how privileged I am to have complained about this. Yes, I hated it, I lost some sleep, my wife lost some as well, and I also burnt out, but I was clicking some buttons, checking logs while half-asleep, doing some browsing while waiting for deploys and restarts, and getting back to sleep. I don’t even remember talking to anyone at those times, other than providing a report the next day. This amazing lady had to answer the phone, calm down a stressed woman, accompany her to the hospital a few times in the middle of the night, be there for her and eventually make her feel safe. People will say that it’s her job, and it’s true, but so is ours. Perhaps our company’s on-call processes are terrible, the expectations unrealistic, the services unstable and nobody cares, but still, we just click some buttons, and we are paid a lot for it compared to others. The only thing I regret is not asking to be excluded as soon as I burnt out. Our health obviously matters. But we don’t deal with other people, their emotions and their anxiety when those alerts come in. We just monitor some logs and restart some services; we don’t have people relying on us, our answers and every single word coming out of our mouths.
Let’s fix tech on-call as much as we can but also let’s not make a big deal about it.