As a site reliability engineer, you will monitor the performance and availability of our production environments, especially at
the middleware layer. Whenever an incident or outage occurs, you will apply your outstanding
engineering skills and knowledge to resolve it.
Once it is resolved, you will need to analyze its cause with the help of developers and operators. You will need a solid
understanding of systems and the leadership to improve the system architecture as a permanent solution.
You will also be in charge of improving the observability of the systems, reducing toil through automation, and continuously
evolving the architecture to improve stability and performance.
・Troubleshooting incidents and outages and conducting postmortems.
・Improvement of monitoring, system architecture, and automation of operations to enhance availability, performance, and
・Collaboration with people who have diverse skill sets, e.g. application developers and consultants.
2+ years of experience in server-side programming in any language
Good communication skills in English
Experience with Ansible or other automation tools
• AWS Certified Solutions Architect or SysOps Administrator
• Certified Kubernetes Administrator or Application Developer
• LPIC-2 or higher
• Knowledge or experience of
Works Applications, Japan's largest ERP system vendor, is recruiting SRE talent (position at the Singapore branch)