Major Incident and Problem Manager, Associate

Locations: Edinburgh, United Kingdom

Overview of Technology roles

At BlackRock, technology has always been at the core of what we do – and today, our technologists continue to shape the future of the industry with their innovative work. We are not only curious but also collaborative and eager to embrace experimentation as a means to solve complex challenges. Here you’ll find an environment that promotes working across teams, businesses, regions and specialties – and a firm committed to supporting your growth as a technologist through curated learning opportunities, tech-specific career paths, and access to experts and leaders around the world.

Job description

About this role

Team Overview

The Service Management team provides industry‑standard Incident, Problem and Change Management, alongside infrastructure operational support for Aladdin. We operate using modern engineering practices and tooling, including ServiceNow and AI‑enabled workflows, and measure outcomes through clear operational metrics.

Incident Management is responsible for restoring service during production incidents and driving scalable stability improvements across BlackRock and its Aladdin clients.

BlackRock operates a 24/7 Major Incident Management function supporting global clients across Europe, the Americas, Asia Pacific and India. This role is based in Edinburgh and is required to cover core European hours between 09:00 and 18:00, Monday to Sunday, with rotational weekend working.

Role

We are seeking an experienced Incident & Problem Manager (5+ years) with a strong passion for technical troubleshooting and the ability to lead multiple simultaneous incidents.

This role exists to deliver rapid time to detect and time to resolve, and to eliminate repeat incidents at a system level by operating an AI‑first incident delivery model. The Major Incident & Problem Manager is accountable for turning incidents into measurable stability improvements—particularly those caused by change—and for building an incident operating rhythm where AI handles correlation, classification and narrative generation by default, allowing humans to focus on decision quality, trade‑offs and prevention.

In complex distributed platforms, incidents are often slowed by manual triage, fragmented ownership and time‑consuming coordination. This role addresses those challenges by creating a decision‑centric incident response model, powered by AI‑driven signal correlation and automation‑first execution, ensuring that:

The right responders are engaged faster

The most likely causes are identified sooner

Mitigation decisions are taken with clearer risk framing

Communications remain accurate and timely

Repeat failures are systematically removed rather than documented

The role partners closely with Engineering and SRE / DevOps teams, leveraging automation, observability tooling and emerging AI‑driven insights. The successful candidate will have a DevOps mindset, troubleshoot actively, and both use and enhance AI and automation.

The role also includes participation in continuous improvement initiatives aimed at improving the stability, performance and resilience of the Aladdin platform, and enhancing Service Management services.

Key Responsibilities

1. Lead major incidents as a decision authority (P1–P4)

Lead end‑to‑end management of production incidents, including investigation, recovery execution and closure

Run incidents as a decision system, driving clarity on what is known, what is suspected and what action is taken next

Manage multiple simultaneous incidents while maintaining consistent prioritisation and escalation

2. Operate an AI‑first incident workflow (human‑validated, human‑overridden when required)

Triage and categorise incidents using AI‑driven classification, with human validation and override where appropriate

Drive AI‑automated ticket routing and apply risk‑based escalation judgement when automation is insufficient

Ensure incident timelines and summaries are produced to a high standard using AI‑generated artefacts, correcting them where required

3. Supervise automated remediation and agentic responders

Intervene to pause, override or redirect automated remediation and agentic responders when risk requires

Ensure automated remediation is safe, auditable and aligned with service ownership and operational readiness

4. Manage a robust Problem Management process to prevent incident recurrence

Ensure root causes and preventative actions are clearly captured and translated into an effective Problem Management process

Identify incident trends and repeat patterns, driving scalable remediation to reduce recurrence

Partner with Engineering and SRE / DevOps to embed learnings into automation, observability, runbooks and readiness controls

Design, build and actively maintain a Known Error Database that functions as a real‑time operational asset

Work with product teams to design, build and deliver a meaningful process for addressing repeat incidents

5. Deliver executive‑grade communications (AI‑drafted, human‑approved)

Validate, approve and issue regular communications that are concise, informative and appropriate for stakeholders

Ensure communications accurately reflect impact, mitigation progress, key risks and confidence‑based ETAs

6. Drive continuous service improvement and regulatory alignment

Drive process and tooling changes that support operational resilience and regulatory requirements, including DORA and GDPR, where applicable

Provide input and ownership for continual service improvement initiatives, with a primary focus on Agentic AI and its application to Incident Management

Required Experience and Capabilities (Must Have)

5+ years’ experience in Incident and Problem Management within a production environment supporting business‑critical platforms

Strong technical troubleshooting capability, with the ability to engage credibly with engineers during complex failures

Proven ability to lead multiple simultaneous incidents and drive structured recovery under pressure

DevOps mindset, with comfort using observability tooling, automation and operational engineering practices

Ability to produce clear, high‑quality communications suitable for senior stakeholders

Experience operating AI systems for triage, correlation and narrative generation, with sound judgement on when outputs require validation or override

Ability to translate repetitive incident activity into automation requirements and drive adoption with engineering partners

Advantages / Desirable Qualities

Experience working in or with FinTech or regulated environments

Knowledge of cloud platforms such as Azure and/or AWS, and understanding of IaaS / PaaS / SaaS service models

Experience with Microsoft Copilot and AI‑enabled productivity tooling

Programming capability (e.g. Python) to automate common tasks or prototype improvements

Familiarity with configuration management, deployment and orchestration tooling (e.g. Ansible)

Strong data analysis skills using tools such as Splunk, Grafana, Tableau, Excel and/or Power BI

Strong experience with ServiceNow and operational reporting

Our benefits

To help you stay energized, engaged and inspired, we offer a wide range of employee benefits including: retirement investment and tools designed to help you in building a sound financial future; access to education reimbursement; comprehensive resources to support your physical health and emotional well-being; family support programs; and Flexible Time Off (FTO) so you can relax, recharge and be there for the people you care about.

Our hybrid work model

BlackRock’s hybrid work model is designed to enable a culture of collaboration and apprenticeship that enriches the experience of our employees, while supporting flexibility for all. Employees are currently required to work at least 4 days in the office per week, with the flexibility to work from home 1 day a week. Some business groups may require more time in the office due to their roles and responsibilities. We remain focused on increasing the impactful moments that arise when we work together in person – aligned with our commitment to performance and innovation. As a new joiner, you can count on this hybrid model to accelerate your learning and onboarding experience here at BlackRock.

About BlackRock

At BlackRock, we are all connected by one mission: to help more and more people experience financial well-being. Our clients, and the people they serve, are saving for retirement, paying for their children’s educations, buying homes and starting businesses. Their investments also help to strengthen the global economy: support businesses small and large; finance infrastructure projects that connect and power cities; and facilitate innovations that drive progress.

This mission would not be possible without our smartest investment – the one we make in our employees. It’s why we’re dedicated to creating an environment where our colleagues feel welcomed, valued and supported with networks, benefits and development opportunities to help them thrive.

For additional information on BlackRock, see Twitter: @blackrock | LinkedIn: www.linkedin.com/company/blackrock

BlackRock is proud to be an Equal Opportunity Employer. We evaluate qualified applicants without regard to age, disability, race, religion, sex, sexual orientation and other protected characteristics at law.

Job Requisition #: R262318

BlackRock Principles

We look to hire people who will embody our BlackRock Principles:

We are a fiduciary to our clients.

This is the bedrock of our identity; it reflects our integrity and the unbiased advice we give our clients.

We are One BlackRock.

We work collaboratively to create the best outcomes for our clients, our firm and the communities where we operate.

We are passionate about performance.

We are relentless in innovating and finding better ways to serve our clients and improve our firm.

We take emotional ownership.

We have a deep sense of responsibility to our clients and to each other.

We are committed to a better future.

We are long-term thinkers, focused on helping people build a better tomorrow.

Career path

We recognize that our technologists benefit from a tailored approach to navigating and advancing their careers in the ways they envision. Our tech career paths are specifically built to support vertical and horizontal trajectories – including Enterprise Leadership (team manager) and Tech Leadership (individual contributor) ‘tracks’ as well as various other career moves.

Engineer I (Analyst)
Engineer II/III (Associate)
Senior Engineer I/II (Vice President)

Tech Leadership track: Lead Engineer (Vice President); Principal/Sr. Principal Engineer (Director); Managing Director

Enterprise Leadership track: Engineering Team Manager (Vice President); Engineering Team Director/Sr. Engineering Team Director; Managing Director

Benefits

We care about your overall well-being and design our benefits package to support you in various aspects of your life.

Financial well-being

We offer resources designed to help you build a sound financial future for you and your family, like retirement savings plans and tuition reimbursement.

Pay for performance

Our pay-for-performance philosophy includes a base salary and a discretionary annual bonus.

Physical well-being

Our healthcare plans and resources help you focus on your physical health, so you and your family can feel your best.

Emotional well-being

We support our people's mental health and emotional well-being by providing access to an Employee Assistance Program and a network of Mental Health Ambassadors.

Life management

You'll be able to focus on moments that are important to you with benefits designed to support life in and outside of work with Flexible Time Off, parental leave and more.