The client is looking for an ETL Engineer who will be responsible for building and maintaining ETL pipelines in SQL, Python, and R, and for ensuring the performance and operation of the data infrastructure, data products, and data APIs, as well as designing and implementing new solutions for the data team and the wider business. You'll provide subject matter expertise and keep up to date with emerging tools and technologies, proactively seeking improvements.
Responsibilities:
- Lead the way in defining pipelines and architecture for all ETL data projects
- Be hands-on, always developing, running and enhancing data pipelines
- Review and manage the end-to-end ETL lifecycle including process management, data modelling, data warehouse architecture, ETL pipeline development and testing
- Build objects in Azure to facilitate the flow of data across the business in the most efficient and sustainable way possible
- Identify areas where improvements can be made, whether that is with applications, architectures, or the processes we use
- Implement and maintain best practices for all software engineering
- Liaise and work closely with other departments to meet both their business and technical ETL/data manipulation requirements
- Coordinate and deliver Data Lake and Big Data tooling and integrations
- Provide technical leadership and mentor other ETL engineers
Requirements:
Essential:
- Hands-on experience with SQL, R, and Python, and with building ETL pipelines in Azure
- Experience and knowledge of Azure DevOps, MLOps, SSIS, Databricks, Data Factory, Synapse Analytics
- Extensive knowledge of Microsoft Azure, SQL, SSIS, NoSQL databases, and TDD
- Advanced skills in ETL pipelines, data warehousing and analytics frameworks, as well as PostgreSQL and MySQL
- A thought leader for data who can present clear insights to both technical and non-technical stakeholders to demonstrate the value of data
- Extensive experience identifying and solving problems in databases, data processes or data services as they occur
- Experience and understanding of Data APIs
- Experience of building infrastructure with ARM templates or equivalent
- Experience of full life-cycle software development, including Agile, Git, and CI/CD
- Experience operating, automating, and supporting complex ETL processes/pipelines, software products, and testing concepts, alongside container orchestration with AKS, App Service, Service Fabric, etc.
- Knowledge of optimisation techniques (profiling, indexing, routine maintenance, server configuration, file structures, etc.)
- Fault resolution across environments (Dev/Test, Pre-prod/Prod, QA, etc.)
- Knowledge of and experience with NumPy, pandas, Dask, Modin, Ray, the Tidyverse, and dplyr/dbplyr
- Experience building self-service tooling and workflows for Machine Learning and Analytics users
- Knowledge of, experience with, and well-founded opinions on specific tooling (Hadoop, Kafka, Spark) to support technical architecture choices
