Service Down

What to do when a service stops responding on a server.


  • Access the target node
  • Run ps -ef | grep <service name> (or pgrep -a <service name>) to check whether the service process is running
    • Does the service show up?
      • YES - goto unresponsive-service
      • NO - Start the service
        • Did it start?
          • YES - Validate no other issues & close ticket
          • NO - goto next step
  • Run top or htop to identify running processes
  • Is the box under heavy load?
    • YES - Record top processes & goto load-issue
    • NO - Restart the service
      • Was restart successful?
        • YES - Validate no other issues & close ticket
        • NO - Try restarting again, and if it still fails, contact the infra team for assistance
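
The decision tree above can be sketched in Python as a rough aid, not as an official tool. The function names (is_running, is_heavily_loaded, next_action) and the load threshold of 1.0 per CPU are assumptions made for illustration; the runbook itself does not define what counts as "heavy load". The process check assumes a Linux box where pgrep is installed.

```python
import os
import subprocess


def is_running(service: str) -> bool:
    """Return True if pgrep finds at least one process matching `service`."""
    # pgrep exits with 0 when a process name matches the pattern, 1 when none do.
    result = subprocess.run(
        ["pgrep", service],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def is_heavily_loaded(threshold_per_cpu: float = 1.0) -> bool:
    """Compare the 1-minute load average per CPU to an assumed threshold."""
    one_minute, _, _ = os.getloadavg()
    return one_minute / (os.cpu_count() or 1) > threshold_per_cpu


def next_action(running: bool, started: bool = False,
                heavy_load: bool = False, restarted: bool = False) -> str:
    """Map the answers to the runbook questions onto the next step."""
    if running:
        return "goto unresponsive-service"
    if started:
        return "validate no other issues and close ticket"
    if heavy_load:
        return "record top processes and goto load-issue"
    if restarted:
        return "validate no other issues and close ticket"
    return "retry restart, then contact the infra team"
```

For example, next_action(running=False, started=False, heavy_load=True) returns the load-issue step. Keeping the decision logic separate from the checks that touch the system makes it easy to review against the list above.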


