AI SME/Architect (with AWS)

  • Marvel Technologies Inc
  • Jersey City, New Jersey
  • Full Time

AI SME/Architect - HYBRID

Auburn Hills, MI - long term (3 days a week in the office)

The Role:

Our client is seeking an AI SME/Architect with AWS experience to join our team in Jersey City, NJ (onsite from day 1; hybrid, 3 days a week in the office).

Our challenge

We are seeking an exceptional hands-on Technical Lead to spearhead our enterprise GenAI engineering program. This is a unique opportunity for a seasoned technologist who combines deep AI/ML expertise with practical engineering skills to build and operationalize cutting-edge generative AI solutions. The candidate will lead the development of AI agents and platforms while remaining deeply involved in the technical implementation.

Responsibilities:

Technical Leadership & Development

  • Lead the design, development, and deployment of enterprise-scale GenAI solutions using a mix of custom-developed solutions and open-source platforms (Dify, OpenWebUI, etc.)
  • Architect and implement AI agents using Python frameworks including LlamaIndex and LangGraph
  • Drive hands-on development while providing technical guidance to the engineering team
  • Establish best practices for GenAI development, deployment, and operations

AI/ML Engineering

  • Design and implement LLM-based solutions with deep understanding of model architectures, fine-tuning, and prompt engineering
  • Apply classical machine learning techniques where appropriate to complement GenAI solutions
  • Optimize AI pipelines for performance, cost, and scalability
  • Implement RAG (Retrieval-Augmented Generation) patterns and vector databases

Context Engineering & Advanced RAG

  • Design and implement sophisticated context engineering strategies for optimal LLM performance
  • Build advanced RAG systems including multi-hop reasoning, hybrid search, and re-ranking mechanisms
  • Develop agentic RAG architectures where agents dynamically query, synthesize, and validate information
  • Implement context window optimization techniques and dynamic context selection strategies
  • Create self-improving RAG systems with feedback loops and quality assessment

LLM Optimization & Fine-tuning

  • Lead fine-tuning initiatives for domain-specific LLMs using techniques like LoRA, QLoRA, and full fine-tuning
  • Implement performance optimization strategies including quantization, pruning, and distillation
  • Design and execute benchmark suites to measure and improve model performance
  • Optimize inference latency and throughput for production workloads
  • Implement prompt optimization and few-shot learning strategies

Platform & Infrastructure

  • Design event-driven architectures for asynchronous AI processing at scale
  • Build and deploy containerized AI applications using Kubernetes on AWS
  • Implement AWS services (SageMaker, Bedrock, Lambda, EKS, SQS, SNS, etc.) for AI workloads
  • Establish CI/CD pipelines for AI model and application deployment

Security & Governance

  • Implement secure design principles for AI systems including data privacy and model security
  • Establish AI security frameworks covering prompt injection prevention, model access controls, and data governance
  • Ensure compliance with enterprise security standards and AI ethics guidelines
  • Design audit trails and monitoring for AI system behavior

Requirements:

AI/ML Expertise

  • Deep understanding of LLMs: Architecture, training, fine-tuning, and deployment strategies
  • Context engineering proficiency: Expert-level understanding of context window management, prompt engineering, and context optimization techniques
  • Advanced RAG implementation: Hands-on experience building sophisticated RAG systems with hybrid search, metadata filtering, and agentic capabilities
  • Fine-tuning expertise: Proven experience fine-tuning LLMs for specific domains using modern techniques (LoRA, PEFT, etc.)
  • Performance optimization: Track record of optimizing LLM inference for latency, throughput, and cost
  • Classical ML proficiency: Strong foundation in traditional machine learning algorithms and applications
  • Python mastery: Expert-level Python with extensive experience in ML libraries (PyTorch, TensorFlow, Pandas, NumPy)
  • GenAI frameworks: Hands-on experience with LlamaIndex, LangChain, LangGraph, or similar frameworks
  • Open-source GenAI platforms: Experience with Dify, OpenWebUI, or comparable platforms

Engineering Excellence

  • Cloud architecture: Proven experience designing and implementing AWS solutions using multiple services
  • Event-driven systems: Expertise in asynchronous, event-driven architectures for scalable AI processing
  • Containerization: Advanced knowledge of Docker, Kubernetes, and container orchestration
  • DevOps/MLOps: Experience with CI/CD, infrastructure as code, and ML model lifecycle management

Security & Enterprise Standards

  • Secure development: Strong understanding of secure coding practices and security design patterns
  • AI security: Knowledge of AI-specific security concerns (adversarial attacks, data poisoning, prompt injection)
  • Enterprise integration: Experience with enterprise authentication, authorization, and compliance requirements

Leadership & Communication

  • 12+ years of hands-on technical experience with at least 5 years in AI/ML
  • Proven track record of leading technical teams while remaining hands-on
  • Excellent communication skills to articulate complex technical concepts to diverse stakeholders
  • Experience working in enterprise environments with multiple stakeholders

Preferred Qualifications

  • Experience with multi-agent systems and agent orchestration
  • Knowledge of vector databases (Qdrant, OpenSearch, pgvector)
  • Expertise in embedding models and semantic search optimization
  • Contributions to open-source AI/ML projects
  • Experience with model quantization and edge deployment
  • Knowledge of graph-based RAG and knowledge graph integration
  • Certifications in AWS, Kubernetes, or ML platforms

Please use the template format below when submitting profiles. Share only the details listed in the template, along with the resume.

Do not submit any personal documents along with the profile.

Please always reply on the same email thread and keep all points of contact (POCs) in CC.

Submission Template

Full Name

Contact Number

Email Address

Current Location

Work Authorization

LinkedIn

Expected Compensation

Job ID: 488693327
Originally Posted on: 8/8/2025
