Sr. Staff Data Engineer
The Data Management and BI Team is seeking a Sr. Staff Data Engineer to join our team. We build data pipelines to land, profile, and store multiple internal and external datasets. We also build applications that surface this data to support our business partners’ strategic decision-making. We're an AWS shop that uses open-source technologies including Scala, Python, Pandas, Spark, Hive, and Postgres, as well as BI tools such as Tableau and MicroStrategy. We work in a very agile environment, where product specifications are flexible and often change rapidly. We're seeking people who are comfortable with ambiguity, can drive towards clarity, and can define execution plans. While the key focus for this role is backend engineering, engineers with full-stack expertise will be especially considered.

Responsibilities:
- Designs, builds, and oversees the deployment and operation of technology architecture, solutions, and software to capture, manage, store, and utilize structured and unstructured data from internal and external sources.
- Contributes to the overall Data Product roadmap by working closely with our business partners to understand their challenges and develop analytical tools that help drive business decisions.
- Develops technical tools and programming that leverage artificial intelligence, machine learning, and big-data techniques to cleanse, organize, and transform data and to maintain, defend, and update data structures and integrity on an automated basis.
- Leverages prototyping methodologies to propose and design creative business solutions that exploit our broad toolset of technologies (Big Data, MicroStrategy, Tableau, Python, Spark, etc.).
- Creates and establishes design standards and assurance processes for software, systems, and applications development to ensure compatibility and operability of data connections, flows, and storage requirements.
- Reviews internal and external business and product requirements for data operations and activity, and suggests changes and upgrades to systems and storage to accommodate ongoing needs.
- Design, develop, and maintain CI/CD pipelines using GitHub Actions to automate deployment, testing, and monitoring of applications.
- Implement and manage serverless solutions (e.g., AWS Lambda, EMR Serverless, Kafka, SNS, SQS, Athena, etc.) as part of the application architecture.
- Implement infrastructure as code (IaC) practices using tools like Terraform, AWS CloudFormation, or similar to manage cloud infrastructure.
- Work with development teams to set up automated testing frameworks, ensuring high test coverage and code quality.
- Must understand the basics of relational data modeling and be able to clearly articulate the reasons for using non-relational systems in our architecture.
- Experience in desired but relevant experience in any of the following is acceptable: SingleStore, MySQL, Redshift, Athena, MSSQL Server, Oracle.
- Decent understanding of the digital media ad sales business and ad serving technologies, with experience working with ad serving transactional data logs or Nielsen demographic data.
- Educate and inform business partners on architecture, capabilities, best practices, and solutions to build out future enhancements.
- Assist in analyzing business requirements and source systems, and in understanding the underlying data sources, transformation requirements, data mapping, data model, and metadata for reporting solutions.
- Write easily understood documentation and architecture diagrams, and keep them up to date as code and frameworks change over time.