India Openings

NOC Engineer

Job ID: NOC-ETP-Pun-1193

Location: Pune

Job Title: NOC Engineer (5–6 Years of Experience)


Responsibilities

  • Monitor production systems, applications, and infrastructure for availability, performance, and stability.
  • Respond promptly to incidents, outages, and alerts; perform root cause analysis and drive resolution.
  • Work closely with SRE/DevOps teams to improve monitoring, alerting, and system reliability.
  • Create and maintain detailed runbooks, incident documentation, and SOPs.
  • Troubleshoot and resolve complex infrastructure and application-level issues across distributed systems.
  • Perform log analysis, basic scripting, and system diagnostics to support issue resolution.
  • Participate in 24×7 rotational shifts and on-call support as required.
  • Ensure SLAs and SLOs are met and help reduce MTTR (Mean Time to Resolution).
  • Maintain dashboards and observability tools to monitor key health indicators.
  • Contribute to automation efforts for incident response, routine checks, and reporting.
  • Collaborate with engineering and platform teams to support deployments, upgrades, and maintenance tasks.

Required Qualifications

  • 5–6 years of experience in NOC, SRE, or DevOps roles in production environments.
  • Solid understanding of monitoring and observability tools such as New Relic, Datadog, Grafana, Prometheus, or ELK.
  • Experience with cloud platforms (AWS, Azure, or GCP).
  • Familiarity with containers and container orchestration (Docker, Kubernetes) and infrastructure as code (Terraform).
  • Strong troubleshooting skills for network, infrastructure, and application issues.
  • Basic scripting experience in Bash, Python, or similar for automation tasks.
  • Understanding of CI/CD pipelines and DevOps best practices.
  • Good communication skills and the ability to work in high-pressure environments.
  • Willingness to work in a 24×7 support rotation and take ownership of issues.

Primary Skills

  • Monitoring & Observability (New Relic, Datadog, Grafana, ELK)
  • Infrastructure & Troubleshooting (Linux, Networking, Log Analysis)
  • Automation & Scripting (Bash, Python)
  • Cloud Platforms (AWS/Azure/GCP)
  • Terraform (basic to intermediate)