Helius Technologies Hiring for DevOps Engineer

Company: 

Helius Technologies

Location:

Hyderabad

Job Type:

Full Time

Salary:

4.5 LPA

Who can Apply:

Fresher

Experience:

Job Description:

Responsible for keeping all user-facing services and other production systems running smoothly.
Provides emergency response either by being on-call or by reacting to symptoms according to monitoring and escalation when needed.
Make monitoring and alerting trigger on symptoms rather than on outages.
Execute daily and monthly responsibilities.
Strives for automation either by coding it or by leading and influencing developers to build systems that are easy to run in production.
Measure the risk of introduced features to plan ahead and improve the infrastructure.
Proposes and drives architectural changes that affect the whole platform to solve scaling and performance problems.
Analyse existing Service Level Objectives (SLOs), and create and maintain new ones.
Identify architectural and application bottlenecks as they are observed, and engage in resolving them.
Troubleshoot, evaluate and resolve operational challenges contributing to defined SLOs.
Work with other engineering stakeholders on resolving larger architectural bottlenecks.
Work in close collaboration with software development teams to shape the future roadmap and establish strong operational readiness across teams.
Scale systems through automation, improving change velocity and reliability.
Leverage technical skills to partner with team members and be comfortable diving into a problem as needed.
Helps to develop other team members into senior levels and leaders in the team.
Facilitate and drive recovery calls for major incidents and coordinate with multiple teams to drive the resolution.
Communicate on major incidents and provide regular updates to the stakeholders.
Ensure preventive and detective measures for the applications are identified and implemented.
Automation of manual activities / processes for Production teams.
Identifies persistent or recurring problems and recommends creative solutions.
Communicates strongly, works well within a global team, and ensures proper handoff of incidents and their details.
Ensure incidents are escalated and facilitated to enable efficient and timely service restoration.
Drives Root Cause Analysis (RCA) with technology partners after incident resolution and facilitates RCA reviews.
Manages the identification and development of monitoring and of process and systemic improvements to increase the reliability of Production systems.
Automate processes and remove toil.
Build automation and Observability tools to detect, troubleshoot and recover systems faster and improve Production systems reliability and resiliency.
Build predictive tools and solutions using Machine Learning capabilities. Make best use of available logs and instrumentation.
Lead daily production operations including Ops and Infra.
Manage the ticketed query system and ensure a comprehensive database of queries and resolutions is kept up to date.
Maintain and update technical documents and procedures.
Identify and resolve technical issues in coordination with various teams.
Prepare maintenance plans and upgrade schedules for the organisation’s systems.
Develop reports for teams across the business.
Support project teams during implementation.
Manage incidents and coordinate with other technology teams on incident assignment and resolution; identify root causes and corrective and improvement actions, and track improvement actions until closure in compliance with the bank’s standards.

About Company:
