Site Reliability Engineer / Architect

Location: Hong Kong
Job Type: Permanent
Reference: 2980985
Salary: Annual
Srishtie Haripriya
Site Reliability Engineer/Architect: Hong Kong / Permanent
Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE is responsible for the availability and reliability of the platform and ensures it meets the requirements of both our internal and external users. We look for engineers who are self-starters, eager to learn, and able to work across business and technical teams to build and run a scalable production platform that can constantly evolve and disrupt the industry.
Responsibilities:
  1. Manage, monitor and operate the system platform to ensure all business functions run smoothly
  2. Automate the system operations to attain a high level of efficiency
  3. Partner with technical teams to improve services via rigorous testing and release procedures
  4. Identify pain points and provide recommendations to development teams to streamline the business process as much as possible
  5. Act as the communication contact between clients and the technical team for system issues and queries
  6. Drive the incident management process and support a blameless post-mortem culture
  7. Coordinate and implement platform and infrastructure upgrades and releases with technical and business teams to meet the overall schedule and SLAs
  8. Create and maintain operational documentation that reflects changes and upgrades
  9. Participate in system design consulting, platform management, and capacity planning

Requirements:
  1. University degree in Information Technology / Systems, Computer Science / Engineering or related disciplines
  2. Minimum 1-3 years of experience in system development, DevOps or SRE (candidates with more experience will be considered for the senior role)
  3. Proficiency in one or more of the following: C#, Angular, TypeScript, JavaScript, PowerShell, Docker, Kubernetes
  4. Experience with algorithms, data structures and software design
  5. Experience with Windows operating systems, Microsoft SQL Server administration and / or networking
  6. Good written and verbal communication skills for incident management and user queries
  7. Good team player, able to work with users and teams across different countries in the region
  8. Good problem-solving and analytical thinking skills to handle incidents and user queries
  9. Excellent time management skills and ability to work under pressure
  10. Attention to detail and ability to provide strategic insights on delivery and continuous improvement
  11. Self-starter, entrepreneurial attitude, hands-on solution provider
  12. Experience in financial market operations is a plus
  13. Experience with Azure Cloud Service and/or Azure DevOps is a plus
  14. Experience with UNIX operating system internals and/or shell scripting is a plus
  15. Fluent in English, Cantonese and/or Mandarin preferred