Data Acquisition Engineering - Data Infrastructure, AD/ADAS
Product & Technology AD/ADAS
Tokyo
hybrid
TEAM
Our Cloud and Data Engineering organization is working on accelerating autonomous driving by providing access to petabytes of data collected from our large fleet of autonomous and non-autonomous vehicles. Efficient, fast and cost-effective access to data at large scale is key to tackling the hardest problems in AD/ADAS, from developing the Machine Learning (ML) models for perception and prediction of human driving patterns, to increasing the sophistication of our validation and simulation by identifying rare and interesting real-world driving situations. The Lakehouse ingestion platform developed by the Data Acquisition Engineering team is a fundamental building block for developing and testing modern AD/ADAS products that will impact millions of customers.
The primary goal of our Lakehouse system and Fleetnik mobile app is to seamlessly ingest, enrich, and monitor fleet vehicle data as it flows into the cloud. We engineer large-scale data acquisition pipelines that process hundreds of terabytes daily from numerous global ingestion sites. Our pipelines and data acquisition products are built on industry-standard frameworks deployed on AWS, using Java, Golang, Python, and JavaScript. We believe strongly in automation and testing to ensure delivery of robust and correct systems. We are a distributed team, working in Japan, the UK and the US.
WHO ARE WE LOOKING FOR?
RESPONSIBILITIES
- Design, build, maintain, optimize and support large-scale, cloud-native data ingestion, data storage, data processing and data serving systems
- Understand the complex data requirements of modern ML development and tailor our data ecosystem to these needs
- Work closely with other Data Infrastructure engineers, Site Reliability engineers, ML Platform engineers, and Computer Vision and ML engineers on high-impact projects to create innovative solutions to problems in the self-driving space
- Mentor junior engineers in their day-to-day work and drive best practices across the organization
- Contribute to the long-term strategy for several of our systems and products
MINIMUM QUALIFICATIONS
- Bachelor’s degree in Computer Science, a related field, or equivalent practical experience
- 5+ years of experience with data structures/algorithms and professional software engineering in one or more programming languages (e.g., Python, Go, Java)
- 3+ years designing and building data-intensive, concurrent, scalable applications
- 3+ years of experience with data platforms, data pipelines, workflow orchestration, batch processing, and/or distributed databases
- Experience with cloud-based (e.g., AWS, GCP) microservice, event-driven, and distributed architectures
- Business-level proficiency in English speaking, reading and writing (e.g., technical documents, software documentation)
- Business-level proficiency in Japanese
NICE TO HAVES
- 2+ years writing testable, modular code in Python
- 2+ years writing testable, modular code in Golang
- 2+ years writing testable, modular code in Java
- Experience working as a Site Reliability Engineer
- Experience with Terraform, Docker, cloud-native technologies, networking and Kubernetes in production
- Experience designing, deploying, and maintaining multi-region and/or multi-cloud systems
- Experience working in a fast-paced environment, collaborating across teams and disciplines
- Experience with data governance, data privacy and security