Job Title: AWS Cloud Platform Engineer
Location: Woodland Hills, CA (Onsite)
Job Overview
We are seeking an experienced AWS Cloud Platform Engineer to support enterprise cloud infrastructure and operations. The ideal candidate will design, provision, monitor, and optimize AWS-based infrastructure to ensure the high availability, security, and scalability of cloud environments.
This role involves platform engineering, infrastructure automation, cloud monitoring, incident management, and cost optimization while collaborating with DevOps, security, and development teams to maintain reliable and efficient cloud operations.
Key Responsibilities
Platform Engineering
- Provision, configure, and support application infrastructure in AWS cloud environments.
- Build and maintain golden OS images for standardized cloud deployments.
- Ensure reliability, stability, and recoverability of enterprise cloud infrastructure.
- Perform capacity planning to optimize server and application performance.
- Manage server storage, shared storage environments, and resource allocation.
- Perform hardware/software upgrades, OS patching, and system updates.
- Validate and implement firewall policies and security configurations.
- Manage backup and restoration processes with automated reporting.
- Implement Infrastructure as Code (IaC) using Terraform, with code managed in GitHub.
Cloud Operations
- Monitor cloud infrastructure health, availability, and performance across virtual machines, databases, containers, load balancers, and networks.
- Configure monitoring tools such as Amazon CloudWatch, Dynatrace, or other enterprise monitoring platforms.
- Implement alerting mechanisms to proactively identify and respond to system anomalies.
- Analyze logs, metrics, and system traces to identify performance issues.
Incident Management & Troubleshooting
- Act as the first line of support for cloud-related incidents.
- Troubleshoot infrastructure, networking, and application issues.
- Participate in on-call rotations for 24/7 support of critical systems.
- Conduct root cause analysis (RCA) and post-incident reviews (PIR) to prevent recurring issues.
Infrastructure Management
- Manage provisioning, deprovisioning, and scaling of AWS cloud resources.
- Perform routine maintenance tasks including patching, upgrades, and backups.
- Maintain consistent cloud configurations using Terraform-based infrastructure automation.
- Optimize infrastructure performance and resource utilization.
Automation & DevOps
- Develop automation scripts using Python, Bash, or PowerShell.
- Automate infrastructure provisioning and deployment workflows using Terraform and CI/CD pipelines.
- Integrate automation with DevOps pipelines for efficient deployments.
Security & Compliance
- Implement cloud security best practices, including IAM, encryption, and security groups.
- Monitor and respond to security vulnerabilities and threats.
- Ensure compliance with enterprise security policies and industry standards.
Cost Optimization (FinOps)
- Monitor and analyze AWS cloud spending to identify optimization opportunities.
- Implement cost-saving strategies such as rightsizing resources, purchasing Reserved Instances, and eliminating idle resources.
- Generate cost analysis reports and recommendations for management.
Cloud Governance
- Define and enforce cloud governance policies and standards.
- Maintain documentation, operational runbooks, and best practices for cloud infrastructure.
Collaboration
- Work closely with DevOps engineers, cloud architects, developers, and security teams.
- Communicate system status, incidents, and operational updates to both technical and non-technical stakeholders.
- Participate in cross-functional planning and infrastructure strategy discussions.
Required Skills & Qualifications
- 10+ years of experience in Cloud Infrastructure Engineering or Cloud Operations.
- Strong expertise in AWS Cloud Platform services.
- Hands-on experience with Infrastructure as Code (IaC) using Terraform, with code managed in GitHub.
- Experience with DevOps practices and CI/CD pipelines.
- Strong understanding of cloud security principles and compliance standards.
- Experience with cloud monitoring tools such as Amazon CloudWatch, Dynatrace, or other enterprise monitoring platforms.
- Proficiency in scripting languages such as Python, Bash, or PowerShell.
- Strong experience with cloud performance monitoring, logging, and cost optimization.
- Excellent problem-solving, troubleshooting, and communication skills.
Preferred Skills
- Experience with multi-cloud environments (AWS alongside Azure or GCP).
- Exposure to container platforms such as Docker or Kubernetes.
- Knowledge of enterprise cloud governance and FinOps practices.