**About the Job: Sr. Data Engineer (PySpark)**
**Senior Data Engineer**
We are seeking a skilled Data Engineer with expertise in Python, PySpark, and Apache Airflow to join our team. This role requires proficiency with cloud data platforms such as Amazon Redshift and Databricks, as well as strong SQL skills for data manipulation and querying.
**Responsibilities**:
- Design, develop, and maintain scalable data pipelines and workflows using Python and PySpark.
- Implement and optimize ETL processes within Apache Airflow for data ingestion, transformation, and loading.
- Utilize cloud data platforms such as Amazon Redshift and Databricks for data storage, processing, and analysis.
- Write efficient SQL queries and optimize database performance for large-scale datasets.
- Maintain codebase integrity through Git-based version control and continuous integration with GitLab CI.
- Follow Test-Driven Development (TDD) practices to ensure code reliability and maintainability.
**Requirements**:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 4 to 7 years of professional experience in data engineering or a related role.
- Proficiency in Python and PySpark for building data pipelines and processing large datasets.
- Hands-on experience with Apache Airflow for orchestrating complex workflows and scheduling tasks.
- Strong knowledge of AWS services such as Redshift, as well as Databricks, for data storage and processing.
- Advanced SQL skills for data manipulation, querying, and optimization.
- Experience with version control and CI/CD tooling, such as Git and GitLab CI, for managing codebase changes.
- Familiarity with Test-Driven Development (TDD) practices and writing unit tests for data pipelines.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills to work effectively within a team environment.
- Certification in AWS or related technologies is preferred but not required.
This position is open only to candidates residing in Costa Rica, Mexico, Argentina, and Brazil.
**English**: B2+ proficiency required.