Job description:
We are seeking a Principal Data Engineer for a client in the language learning sector.
This pivotal role requires a blend of technical prowess and leadership skills, ideally suited for a professional passionate about shaping the future of data engineering. The successful candidate will drive the development and optimization of the client's data architecture, ensuring the data systems are robust, scalable, and efficient.
Tasks will include:
Advanced Data Architecture Design: Design and implement scalable and reliable data architectures. Lead initiatives for data modeling, data warehousing, and data lake development, ensuring the architecture supports current and future business needs. Develop and maintain scalable, efficient data architectures, incorporating modern data stacks like AWS, Databricks, and Snowflake. Ensure these architectures support both current and future business analytics, AI, and machine learning initiatives
Advanced Data Engineering: Apply expertise in big data technologies, real-time data processing, and cloud-based systems to enhance our data capabilities. Implement data pipelines, ETL processes, and data storage solutions
Expertise in ML/AI Tooling: Utilize tools such as AWS SageMaker, Databricks MLflow, and other advanced ML/AI technologies to facilitate data processing and analysis. Implement and oversee machine learning pipelines and data science workflows
Team Mentorship and Collaboration: Act as a mentor to junior data engineers, fostering a culture of technical excellence. Collaborate with cross-functional teams, including data scientists, analysts, and IT professionals, to align data engineering efforts with organizational goals
Data Quality and Governance: Establish and maintain high standards for data quality and integrity. Implement data governance frameworks and ensure compliance with data privacy and security regulations
Performance Optimization: Monitor system performance, identify bottlenecks, and implement solutions to optimize data flow and storage
Experience: 8+ years of experience in data engineering with a demonstrated track record in designing and managing large-scale data systems. Experience in leading data engineering teams is essential
Technical Expertise: Proficiency in big data technologies (e.g., Hadoop, Spark), database management systems (e.g., SQL, NoSQL), cloud services (e.g., AWS, Azure, GCP), and programming languages (e.g., Python, Scala, Java). Advanced skills in AWS, Databricks, Snowflake, and other modern data tools. Experience with big data technologies, real-time data processing, and cloud-based systems
Leadership Skills: Strong leadership and team-building capabilities. Ability to mentor and develop technical teams
You have:
- Strong programming skills in Python and SQL
- MLOps experience (mandatory)
- Knowledge of distributed systems and the Spark distributed data processing engine
- Experience with ETL tools and pipelines that support data ingestion, processing, storage, and delivery, such as Airflow
- Understanding of data security and privacy regulations and how to ensure data quality, consistency, and accessibility
- Proficiency in designing, building, and maintaining data warehousing solutions based on Snowflake and the Databricks Lakehouse Platform
- Expert understanding of dimensional data modeling techniques
- Data governance and management skills, such as defining and enforcing data quality standards, data contracts, data lineage, and data access policies, as well as ensuring data security and compliance
- Expert knowledge of the AWS platform and cloud computing principles
- Strong infrastructure management skills, such as provisioning, configuring, and maintaining data servers, clusters, and networks, as well as automating and optimizing data workflows and processes
- Ability to design, plan, drive, and document major architectural changes and propose innovative solutions for data engineering problems
Start: Soon
Duration: 6 months + possible extension
Location: Remote
More info: tiina.hapuoja@witted.com
If you are interested, please get in touch and explain why you would be a good match for the project and which requirements you meet. If you lack any of the requirements, please say so in your application. Please note that experience leading teams in this field and MLOps experience are both required.