Data Engineer
Job description:
Do you want to help a client leverage data to transform the way people learn languages? This is your chance to join a diverse and supportive team while taking the lead in shaping the client's data architecture and designing innovative data models. The Data Engineering team is focused on creating simple yet powerful systems that process streaming and batch data to power the user's app experience and drive insightful analytics. You will specialize in handling streaming data from their apps, as well as ETL pipelines and data models that integrate with their core product features. These services are crucial for understanding how users interact with the app, tracking time spent, and counting activities as they happen.

Tasks include:
* Spearhead the development and optimisation of the data collector, focusing on event and real-time streaming use cases
* Lead the design and implementation of data models to support business needs at scale using dbt, Snowflake and the Databricks Lakehouse Platform
* Collaborate with data analysts and data scientists to deliver high-quality data solutions
* Build a data observability and monitoring solution for data streaming services
* Design, build, and maintain APIs and SDKs to enable self-service access to data products and services for internal teams
* Participate in knowledge-sharing sessions
* Mentor and coach other data engineers, fostering a culture of learning and growth

We'd like to see this experience from you:
* A computer science or related engineering degree and 5+ years of experience with Python, with a focus on building data pipelines
* In-depth SQL knowledge and extensive experience working with dbt
* Attention to detail, a strong sense for data, and a deep commitment to ensuring data quality
* Solid experience in dimensional modeling, data warehousing and Spark streaming
* Hands-on experience with cloud data warehouses, ideally Snowflake or the Databricks Lakehouse Platform
* Experience with AWS services (ECS, Lambda functions, S3, DynamoDB, Kinesis, etc.), operations and architecture
* Experience with Infrastructure as Code (preferably Terraform)
* Deep understanding of API development and experience building SDKs for internal tooling or third-party use
* Strong communication skills and eagerness to participate in cross-functional projects to support the development of our Data Products
* Ability to write clear documentation and debug data effectively

Nice to have:
* Experience in backend development and deployment
* Experience organizing knowledge-sharing sessions and mentoring other engineers
* Experience building customer-facing data observability and reporting systems with tools such as AWS QuickSight or Datadog
* A solid understanding of data governance principles and best practices
* Experience designing data architecture for a domain or a whole company

Start: End of May
Duration: 6 months, possibly longer
Location: Remote

If you are interested, please get in touch via Slack or tiina.hapuoja@witted.com (EN / FI)