(352) FASTTEK | (352) 327-8835
FASTTEK GLOBAL powered by Fast Switch - Great Lakes
info@fasttek.com
Sholinganallur, Chennai

Operations Management Senior #1019279
Job Description:
  • Major Incident and Problem Manager: We are seeking a dynamic Incident and Problem Manager to join our Global Incident Management Team.
  • In this role, you will lead the Problem Management process to avoid repeat Incidents.
  • As the Global Incident and Problem Manager, you will be responsible for managing Problems arising from Critical and High Incidents, conducting Post Incident reviews, reporting Problem metrics, handling communications and awareness, and ensuring effective management and escalation of Major, Critical, and High Incidents.
  • You will also stay informed about ongoing Critical and High Incidents that could impact business operations and facilitate engagement, management, and timely escalation of all Incident Management related issues to the relevant parties.
  • The Problem Manager is responsible for leading and coordinating the end-to-end problem management process within the organization.
  • This includes identifying, analyzing, and resolving the root causes of recurring incidents and other IT-related issues to prevent future occurrences, improve service stability, and reduce overall IT costs.
  • The Problem Manager acts as a champion for proactive problem management and works collaboratively with various IT teams and stakeholders to ensure effective problem resolution.
 
Skills Required:
  • Problem Management Process Ownership: Develop, maintain, and improve the problem management process, ensuring it aligns with ITIL best practices and organizational needs.
  • Promote awareness and understanding of the problem management process across the IT organization.
  • Problem Identification and Logging: Identify and log potential problems based on incident trends, recurring issues, and proactive analysis of the IT environment.
  • Ensure accurate and complete problem records are created, including detailed descriptions of symptoms, impact, and potential root causes.
  • Problem Analysis and Root Cause Determination: Lead and facilitate root cause analysis (RCA) investigations, using appropriate methodologies (e.g., 5 Whys, Fishbone diagrams, Pareto analysis).
  • Collaborate with technical teams to gather data, analyze logs, and conduct experiments to identify the underlying causes of problems.
  • Document the RCA findings clearly and concisely, including the identified root cause, contributing factors, and proposed solutions.
  • Corrective Action Planning and Implementation: Develop and manage corrective action plans to address the identified root causes of problems.
  • Work with IT teams to implement corrective actions, ensuring they are effective in preventing future occurrences.
  • Track the progress of corrective actions and provide regular updates to stakeholders.
  • Problem Closure and Knowledge Management: Ensure that all problem records are properly closed after corrective actions have been implemented and verified.
 
Skills Preferred:
  • Contribute to the knowledge base by documenting problem resolutions, workarounds, and best practices.
  • Share knowledge with other IT teams to prevent similar problems from recurring.
  • Proactive Problem Management: Analyze incident trends and identify potential problems before they result in major outages.
  • Conduct proactive risk assessments and vulnerability scans to identify potential weaknesses in the IT environment.
  • Develop and implement preventative measures to mitigate identified risks.
  • Reporting and Communication: Generate regular reports on problem management activities, including the number of problems identified, root causes found, corrective actions implemented, and impact on service availability.
  • Communicate effectively with stakeholders about problem status, progress on corrective actions, and potential risks.
  • Collaboration and Teamwork: Work closely with incident management, change management, and other IT teams to ensure seamless integration of processes.
  • Build strong relationships with technical experts and business stakeholders to facilitate effective problem resolution.
  • Ensure post-event/incident follow-up actions are addressed.
  • Facilitate post-mortems and lessons learned meetings and publish findings as required.
  • Escalate and engage the right teams within external vendors for quicker resolution of incidents.
 
Experience Required:
  • 4 to 8 years of proven experience in problem management, root cause analysis, and corrective action planning.
  • Strong understanding of IT infrastructure, systems, and applications.
  • Familiarity with various IT monitoring and diagnostic tools.
  • Proficiency in data analysis and reporting.
  • The candidate must be willing to work regular and weekend shifts, as well as on North American holidays on a rotational basis.
 
Experience Preferred:
  • Excellent communication, interpersonal, and collaboration skills.
  • Ability to lead and facilitate meetings.
  • Strong organizational and time management skills.
  • Ability to work independently and as part of a team.
 
Education Required:
  • Bachelor's degree in computer science or equivalent
 
Education Preferred:
  • Bachelor's degree in computer science or equivalent
 
 
Additional Information:
  • The candidate must be willing to work regular shifts and on North American holidays on a rotational basis.