Who we are:
As North America’s oldest startup and Canada’s purpose-driven digital marketplace, The Bay is on a
high-growth mission to rewrite the rules of retail to help Canadians live a colorful life. If you believe in
the power of our iconic brand and thrive on problem-solving at scale, we want you to join our team.
At The Bay, smart, high-performing team members will challenge you to learn and grow every day. We
value ambitious work and great ideas grounded in data and insights. We're looking for talented
who love a fast-paced environment, embrace change, and are looking to make an impact with
We are building a digital-first company and brand for a diverse world and we need an inclusive team to
reach our potential. We strongly encourage applications from everyone to come and join a winning
team that supports diverse thinking and demonstrates innovation, energy, creativity, and vision every
You can learn more and view available positions in Bengaluru, by visiting
What This Position is All About
The Site Reliability Engineering Lead role assists in planning, monitoring, and controlling the day-to-day
operations and delivery aspects of the Site Reliability Engineering teams. The role assists in
managing team productivity and works to ensure the optimal health of The Bay eCommerce & CRM
platforms by overseeing platform performance, resilience, and stability. This role is also active in all
aspects of Site Reliability Engineering, including technical vision, telemetry and observation decisions,
automation strategy, solution delivery, and platform incident and problem management. This is a
leadership role with both technical and people leadership responsibilities. As such, this role participates
in short- and long-term systems planning, teams and organizational planning
Who You Are:
● Bachelor’s Degree in Computer Science or equivalent
● Azure/AWS, Microsoft, or Red Hat certifications and knowledge of ITIL/MOF practices
Common Roles And Responsibilities
● Be on a PagerDuty rotation to respond to performance, scalability, and availability incidents
● Run the production environment by monitoring availability and taking a holistic view of system
● Build and implement services to make IT and support better at their jobs
● Improve reliability, quality, and time-to-market of our suite of software solutions
● Measure and optimize system performance
● capabilities forward, getting ahead of customer needs, and innovating to continually improve
● Gather and analyze metrics from both operating systems and applications to assist in
performance tuning and fault finding
● Experience working in an agile development environment
● Participate in system design consulting, platform management, and capacity planning
● Balance feature development speed and reliability with well-defined service level
Required Skills And Qualifications
● 4+ years of experience working within DevOps or SRE teams.
● 2+ years of experience with any cloud platform
● Ability to program (structured and OO) with one or more high-level languages, such
● Site Reliability Engineering: Knowledge of the theories and methodologies of reliability
engineering; ability to design, develop and support various tools, services and applications to
maintain a reliable application environment.
● Performance Measurement and Tuning: Knowledge of system performance, testing and
programming; ability to monitor, measure, and optimize system performance and network
● CI/CD Pipeline: Knowledge of concepts, values and tools applied in building Continuous
Integration (CI), Continuous Delivery and Continuous Deployment (CD) pipelines; ability to design,
build, implement and maintain CI/CD pipelines to automate the software delivery
process (AWS, Git).
● Software Release Management: Knowledge of strategies, practices, and tools for managing
versions and distribution of software products and enhancements; ability to evaluate and
improve release management practices and tools.
● Application Maintenance: Knowledge of production applications; ability to monitor application
functions and resolve issues to maintain optimal conditions for system applications.
● Software Engineering: Knowledge of software engineering; ability to deliver new or enhanced
● Agile Development: Knowledge of agile methodologies and the agile development lifecycle;
ability to utilize formal agile methodologies, disciplines, practices and techniques for the
delivery of new and enhanced applications.
● Containers: Knowledge of the concepts, functions, and capabilities of container tools and techniques;
ability to effectively apply containers in various IT business environments.
● Must have experience in troubleshooting production incidents and liaising with vendors
● Identify bottlenecks and work on solutions either independently or collaboratively
with the engineering team
● Document every action so your findings turn into repeatable actions–and then into
● Salesforce Commerce Cloud and retail account experience is a plus
Thank you for your interest in The Bay. We look forward to reviewing your application.
The Bay provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability or genetics. In addition to federal law requirements, HBC complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.
The Bay welcomes all applicants for this position. Should you be individually selected to participate in an assessment or selection process, accommodations are available upon request in relation to the materials or processes to be used.