- Experience with the data transformation tool DBT
- Designing and implementing complex data transformations using advanced DBT models, materializations, and configurations to streamline data workflows and improve performance.
- Optimizing and troubleshooting DBT pipelines at scale, ensuring that transformations run efficiently in production environments and handle large datasets reliably.
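As one illustration of the materializations and configurations mentioned above, a dbt model can be configured to run incrementally so that production runs only process new rows instead of rebuilding the whole table. The sketch below is a hypothetical dbt Python model (a feature of dbt 1.3+ on warehouses such as Snowflake, BigQuery, and Databricks); the model, column, and key names are invented for illustration, and the fragment runs inside dbt rather than standalone.

```python
# models/fct_orders.py -- hypothetical dbt Python model (dbt >= 1.3).
# dbt supplies the `dbt` and `session` objects at run time.
def model(dbt, session):
    # Incremental materialization with a merge on order_id: production runs
    # upsert new/changed rows instead of rebuilding the full table.
    dbt.config(materialized="incremental", unique_key="order_id")

    orders = dbt.ref("stg_orders")  # upstream staging model

    if dbt.is_incremental():
        # On incremental runs, restrict to recent rows; the merge on
        # order_id reconciles them with the existing target table.
        # (Cutoff shown as a literal for illustration only.)
        orders = orders.filter(orders["LOADED_AT"] >= "2024-01-01")

    return orders
```

The exact dataframe API inside the model depends on the warehouse adapter (e.g., Snowpark on Snowflake); this is a sketch of the shape, not a drop-in model.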
- Experience programming in Python
- Designing and implementing scalable, high-performance applications by leveraging Python's advanced libraries and frameworks (e.g., Pandas, FastAPI, asyncio), ensuring clean, modular, and maintainable code.
- Optimizing code for performance and memory usage through profiling and refactoring, ensuring efficient execution, particularly when handling large datasets or real-time processing tasks.
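A minimal, stdlib-only sketch of the profile-then-refactor loop described above (function and data names are illustrative): profile to find the hotspot, then replace the quadratic pass with a linear one that produces the same result.

```python
import cProfile
import pstats

def slow_duplicates(items):
    # O(n^2): compares every pair -- shows up as the hotspot under the profiler
    dupes = set()
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if a == b:
                dupes.add(a)
    return dupes

def fast_duplicates(items):
    # O(n) refactor: one pass with a seen-set, same result
    seen, dupes = set(), set()
    for x in items:
        (dupes if x in seen else seen).add(x)
    return dupes

if __name__ == "__main__":
    data = list(range(3000)) + list(range(500))
    profiler = cProfile.Profile()
    profiler.enable()
    result = slow_duplicates(data)
    profiler.disable()
    # Print the top hotspots by cumulative time to guide the refactor
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(3)
    assert result == fast_duplicates(data)
```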
- Experience with a data orchestration tool such as Dagster or Airflow
- Designing and orchestrating complex DAGs to manage dependencies, triggers, and retries for data workflows, ensuring reliable and efficient pipeline execution.
- Implementing monitoring and alerting for pipeline failures or performance bottlenecks, using observability tools integrated with Dagster/Airflow to maintain robust pipeline health.
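Orchestrators expose retries and failure alerting as configuration (e.g., Airflow's `retries` and `on_failure_callback` settings, or Dagster retry policies). The stdlib sketch below illustrates the underlying pattern only; all names are invented, and a real pipeline would configure this in the orchestrator rather than hand-roll it.

```python
import time

def run_with_retries(task, max_retries=3, delay_seconds=0.0, on_failure=None):
    """Run `task`, retrying on failure and alerting when retries are exhausted.

    Mirrors what orchestrators do with retry counts and failure callbacks.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_retries:
                if on_failure is not None:
                    on_failure(exc)  # e.g. page on-call, post to a channel
                raise
            time.sleep(delay_seconds)  # back off before the next attempt

# Illustrative flaky task: fails twice, then succeeds on the third attempt
attempts = {"n": 0}

def flaky_extract():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient source error")
    return "rows-loaded"

alerts = []
result = run_with_retries(flaky_extract, max_retries=3,
                          on_failure=alerts.append)
```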
- Experience with a data warehousing solution such as Snowflake, BigQuery, Redshift, or Databricks.
- Architecting and optimizing warehouse environments for performance, including designing partitioning strategies, clustering keys, and storage optimizations for cost-effective scaling.
- Implementing security and governance policies within the warehouse, including data encryption, access control, and audit logging to meet compliance and security best practices.
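To illustrate why the partitioning and clustering strategies above reduce cost: grouping rows by a partition key lets a filtered query scan only the matching partition instead of the full table (partition pruning). A toy in-memory sketch, with an invented schema and key:

```python
from collections import defaultdict
from datetime import date

def build_partitions(rows, key):
    # Group rows by the partition key, loosely analogous to how a warehouse
    # lays data out in partitions or micro-partitions
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return parts

rows = [
    {"event_date": date(2024, 1, 1), "amount": 10},
    {"event_date": date(2024, 1, 2), "amount": 20},
    {"event_date": date(2024, 1, 2), "amount": 5},
]

partitions = build_partitions(rows, "event_date")

# A query filtered on event_date = 2024-01-02 is pruned to one partition:
# it scans 2 rows here instead of all 3.
scanned = partitions[date(2024, 1, 2)]
```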
- Extensive data engineering experience, including building and managing data pipelines and ETL processes.
- Experience developing CI/CD pipelines for automated data infrastructure provisioning and application deployment.
- Experience in managing infrastructure across AWS, with a focus on performance and security.
Nice to Have
- Experience with Golang, Temporal, Monte Carlo.
- Knowledge of decentralized consensus mechanisms, including Proof-of-Work and Proof-of-Stake.
- Experience in developing custom Terraform modules for data infrastructure.