AI SME/Architect (w/ AWS)

Marvel Technologies Inc
Jersey City, New Jersey
Full Time

AI SME/Architect - HYBRID

Auburn Hills, MI - long term (3 days a week in the office)

The Role:

Our client is seeking an AI SME/Architect with AWS experience to join our team in Jersey City, NJ (onsite from day 1; hybrid, 3 days a week in the office).

Our challenge

We are seeking an exceptional hands-on Technical Lead to spearhead our enterprise GenAI engineering program. This is a unique opportunity for a seasoned technologist who combines deep AI/ML expertise with practical engineering skills to build and operationalize cutting-edge generative AI solutions. The candidate will lead the development of AI agents and platforms while remaining deeply involved in the technical implementation.

Responsibilities:

Technical Leadership & Development

Lead the design, development, and deployment of enterprise-scale GenAI solutions using a mix of custom-developed solutions and open-source platforms (Dify, OpenWebUI, etc.)
Architect and implement AI agents using Python frameworks including LlamaIndex and LangGraph
Drive hands-on development while providing technical guidance to the engineering team
Establish best practices for GenAI development, deployment, and operations

AI/ML Engineering

Design and implement LLM-based solutions with a deep understanding of model architectures, fine-tuning, and prompt engineering
Apply classical machine learning techniques where appropriate to complement GenAI solutions
Optimize AI pipelines for performance, cost, and scalability
Implement RAG (Retrieval Augmented Generation) patterns and vector databases

Context Engineering & Advanced RAG

Design and implement sophisticated context engineering strategies for optimal LLM performance
Build advanced RAG systems including multi-hop reasoning, hybrid search, and re-ranking mechanisms
Develop agentic RAG architectures where agents dynamically query, synthesize, and validate information
Implement context window optimization techniques and dynamic context selection strategies
Create self-improving RAG systems with feedback loops and quality assessment

LLM Optimization & Fine-tuning

Lead fine-tuning initiatives for domain-specific LLMs using techniques like LoRA, QLoRA, and full fine-tuning
Implement performance optimization strategies including quantization, pruning, and distillation
Design and execute benchmark suites to measure and improve model performance
Optimize inference latency and throughput for production workloads
Implement prompt optimization and few-shot learning strategies

Platform & Infrastructure

Design event-driven architectures for asynchronous AI processing at scale
Build and deploy containerized AI applications using Kubernetes on AWS
Implement AWS services (SageMaker, Bedrock, Lambda, EKS, SQS, SNS, etc.) for AI workloads
Establish CI/CD pipelines for AI model and application deployment

Security & Governance

Implement secure design principles for AI systems including data privacy and model security
Establish AI security frameworks covering prompt injection prevention, model access controls, and data governance
Ensure compliance with enterprise security standards and AI ethics guidelines
Design audit trails and monitoring for AI system behavior

Requirements:

AI/ML Expertise

Deep understanding of LLMs: Architecture, training, fine-tuning, and deployment strategies
Context engineering proficiency: Expert-level understanding of context window management, prompt engineering, and context optimization techniques
Advanced RAG implementation: Hands-on experience building sophisticated RAG systems with hybrid search, metadata filtering, and agentic capabilities
Fine-tuning expertise: Proven experience fine-tuning LLMs for specific domains using modern techniques (LoRA, PEFT, etc.)
Performance optimization: Track record of optimizing LLM inference for latency, throughput, and cost
Classical ML proficiency: Strong foundation in traditional machine learning algorithms and applications
Python mastery: Expert-level Python with extensive experience in ML libraries (PyTorch, TensorFlow, Pandas, NumPy)
GenAI frameworks: Hands-on experience with LlamaIndex, LangChain, LangGraph, or similar frameworks
Open-source GenAI platforms: Experience with Dify, OpenWebUI, or comparable platforms

Engineering Excellence

Cloud architecture: Proven experience designing and implementing AWS solutions using multiple services
Event-driven systems: Expertise in asynchronous, event-driven architectures for scalable AI processing
Containerization: Advanced knowledge of Docker, Kubernetes, and container orchestration
DevOps/MLOps: Experience with CI/CD, infrastructure as code, and ML model lifecycle management

Security & Enterprise Standards

Secure development: Strong understanding of secure coding practices and security design patterns
AI security: Knowledge of AI-specific security concerns (adversarial attacks, data poisoning, prompt injection)
Enterprise integration: Experience with enterprise authentication, authorization, and compliance requirements

Leadership & Communication

12+ years of hands-on technical experience, including at least 5 years in AI/ML
Proven track record of leading technical teams while remaining hands-on
Excellent communication skills to articulate complex technical concepts to diverse stakeholders
Experience working in enterprise environments with multiple stakeholders

Preferred Qualifications