About the Role
The Cloud Ops Engineer will support Amazon Web Services (AWS) and Linux/Windows environments. The Cloud Ops Engineer will be responsible for all aspects of the production lifecycle, maintenance, and administration, including but not limited to: infrastructure automation, continuous integration and deployment, product release and support, running a scalable production environment for hosting the ARCOS platform, maintaining application/database availability, and ensuring continuous 24x7 production uptime of our services. The Cloud Ops Engineer needs to be familiar with AWS, Apache, Tomcat, PostgreSQL, Oracle, Ansible, Jenkins, Jira, Confluence, and SaaS operations.

Design, develop, and maintain scalable AWS solutions and infrastructure, including but not limited to: EC2, RDS, S3, DynamoDB, ElastiCache, and Route 53.
Develop tooling and processes to automate the deployment of SaaS-based applications and their underlying operating systems and infrastructure.
Perform PostgreSQL and Oracle database administration, including maintenance, troubleshooting, tuning, optimization, installation, upgrades, backup/recovery, and data migration.
Partner with Engineering, Development, Quality Assurance, Professional Services, and Technical Support to ensure the success of the assigned product offerings and schedules.
Engage in Agile team practices such as daily standups, backlog refinement, release planning, and sprint planning.
Coordinate configuration changes, installs, and upgrades with the appropriate development teams and product owners while following company change control procedures.
Participate in capacity planning to determine future infrastructure needs.
Participate in 24x7 on-call responsibilities, maintaining the availability and performance of all customer-facing production services.
Triage and participate in the resolution of complex problems, including network connectivity issues, that span multiple tiers of application/infrastructure.
Implement monitoring and reporting cap