ABOUT US:
Founded in 2016, DataZymes is a next-generation analytics and data science company driving technology- and digital-led innovation for our clients, helping them get more value from their data and analytics investments. Our platforms are built on best-of-breed technologies, protecting current investments while delivering greater value. As a premier partner for many Business Intelligence and Information Management companies, we also provide advisory and consulting services, helping clients make the right decisions and put together a long-term roadmap.
Our mission at DataZymes is to scale analytics and enable healthcare organizations to achieve non-linear, long-term, and sustainable growth. In a short span, we have built a high-performance team in focused practice areas, developed digital-enabled solutions, and are working with some marquee names in the US healthcare industry.
JOB LOCATION: Bangalore
QUALIFICATION REQUIRED: Bachelor’s or Master’s degree in Computer Science or Information Technology; experience with batch job scheduling and identifying data/job dependencies.
EXPERIENCE REQUIRED: 4–8 years of hands-on experience
EMPLOYMENT TYPE: Full-Time
Responsibilities:
- Design and implement scalable data pipelines with PySpark on AWS services such as S3, Glue, and EMR.
- Develop and maintain robust data transformation scripts using Python and SQL.
- Optimize data storage and retrieval using AWS database services like Redshift, RDS, and DynamoDB.
- Build and manage data warehousing layers tailored to specific business use cases.
- Apply strong ETL and data modeling skills to ensure efficient data flow and structure.
- Ensure high data quality, availability, and consistency to support analytics and reporting needs.
- Work closely with data analysts and business stakeholders to understand data requirements and deliver actionable insights.
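The "data transformation scripts using Python and SQL" responsibility above can be sketched with a minimal, self-contained example. This uses SQLite purely for illustration (the role's actual targets are Redshift, RDS, and DynamoDB), and the table and column names (`raw_sales`, `sales_by_region`) are hypothetical, not from any real schema:

```python
import sqlite3

# Minimal transform step: load raw rows, filter out bad records,
# and materialize an aggregated reporting table.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE raw_sales (region TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO raw_sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", None), ("west", 75.0)],
)

# Transformation: drop NULL amounts, aggregate per region.
cur.execute(
    """
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(amount) AS total
    FROM raw_sales
    WHERE amount IS NOT NULL
    GROUP BY region
    """
)

totals = dict(cur.execute("SELECT region, total FROM sales_by_region"))
print(totals)
```

The same filter-then-aggregate shape carries over directly to a Glue PySpark job or a Redshift SQL transform; only the engine changes.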
Required Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
- Hands-on experience with AWS data services such as S3, Glue Studio, Redshift, Athena, and EMR.
- Strong proficiency in SQL and Python for data processing.
- Experience with batch job scheduling and managing data dependencies.
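The last qualification, batch job scheduling with data dependencies, boils down to ordering jobs so each runs only after its upstream inputs exist. A minimal sketch using Python's standard-library `graphlib` (the job names here are illustrative; in practice an orchestrator such as Glue workflows or Step Functions manages this):

```python
from graphlib import TopologicalSorter

# Hypothetical batch jobs mapped to the jobs they depend on.
deps = {
    "load_raw": set(),
    "clean": {"load_raw"},
    "aggregate": {"clean"},
    "publish_report": {"aggregate", "clean"},
}

# static_order() yields a valid execution order respecting every dependency.
run_order = list(TopologicalSorter(deps).static_order())
print(run_order)
# ['load_raw', 'clean', 'aggregate', 'publish_report']
```

Identifying these edges correctly is the hard part of the job; the sort itself is mechanical.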
Preferred Skills:
- Expertise in data warehousing, ETL frameworks, and big data processing.
- Familiarity with the pharmaceutical domain is a plus.
- Experience with data lake architecture and schema evolution.
Note: This role is not DevOps-focused. Candidates with a background primarily in infrastructure automation, CI/CD pipelines, or system administration will not be a fit unless they have significant experience in data engineering and AWS data services.