Minimum qualifications:
- Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science, a related field, or equivalent practical experience.
- 5 years of experience in computer architecture, chip architecture, IP architecture, co-design, performance analysis, or hardware design.
- Experience in developing software systems in C++ or Python.
Preferred qualifications:
- Master's degree or PhD in Electrical Engineering, Computer Engineering or Computer Science, with an emphasis on Computer Architecture, or a related field.
- 8 years of experience in computer architecture, chip architecture, IP architecture, co-design, performance analysis, or hardware design.
- Experience in processor or accelerator design, and in mapping ML models to hardware architectures.
- Experience with deep learning frameworks including TensorFlow and PyTorch.
- Knowledge of Machine Learning market, technological and business trends, software ecosystem, and emerging applications.
- Knowledge of hardware/software stack for deep learning accelerators.
In this role, you will be at the forefront of advancing ML accelerator performance and efficiency, employing a comprehensive approach that spans compiler interactions, system modeling, power architecture, and host system integration. You will prototype new hardware features, such as instruction extensions and memory layouts, by leveraging existing compiler and runtime stacks, and develop transaction-level models for early performance estimation and workload simulation. A critical part of your work will be to optimize the accelerator design for maximum performance under strict power and thermal constraints; this includes evaluating novel power technologies and collaborating on thermal design. Furthermore, you will streamline host-accelerator interactions, minimize data transfer overheads, ensure seamless software integration across different operational modes like training and inference, and devise strategies to enhance overall ML hardware utilization. To achieve these goals, you will collaborate closely with specialized teams, including the XLA (Accelerated Linear Algebra) compiler, Platforms performance, package, and system design teams, to transition innovations to production and maintain a unified approach to modeling and system optimization.

The ML, Systems, & Cloud AI (MSCA) organization at Google designs, implements, and manages the hardware, software, machine learning, and systems infrastructure for all Google services (Search, YouTube, etc.) and Google Cloud. Our end users are Googlers, Cloud customers, and the billions of people who use Google services around the world.
We prioritize security, efficiency, and reliability across everything we do, from developing our latest TPUs to running a global network, as we drive toward shaping the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud's Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers.
The US base salary range for this full-time position is $156,000-$229,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.
Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.
- Create differentiated architectural innovations for Google's semiconductor Tensor Processing Unit (TPU) roadmap.
- Evaluate the power, performance, and cost of prospective architecture and subsystems.
- Collaborate with partners in Hardware Design, Software, Compiler, ML Model and Research teams for hardware/software co-design.
- Work on Machine Learning (ML) workload characterization and benchmarking.
- Develop architecture for differentiating features on next generation TPUs.
Job ID: 484395254
Originally Posted on: 7/8/2025