We are seeking an Associate Data Engineer to support our data engineering team in developing data processing and analytics solutions using modern big data technologies.
Responsibilities
Assist in developing data processing pipelines using Python and basic Spark operations
Support ETL workflow development on AWS EMR clusters
Learn and apply Spark optimization techniques under guidance
Help maintain data exploration tools using Hue and other platforms
Support data scientists and analysts with data pipeline requirements
Participate in monitoring distributed computing environments
Qualifications
1-3 years of software development experience
Basic proficiency in Python and SQL
Understanding of database concepts and ETL fundamentals
Exposure to big data technologies and cloud platforms (AWS preferred)
Basic knowledge of Apache Spark or willingness to learn quickly
Familiarity with AWS services or cloud computing concepts
Interest in data processing and analytics
Strong problem-solving and learning abilities
Academic or project experience with data streaming technologies
Basic understanding of data security principles
Exposure to machine learning concepts through coursework or projects