Pre-screened and vetted in the Bay Area.
Senior AI/ML Engineer specializing in computer vision, NLP, and real-time forecasting
Mid-level AI/ML Engineer specializing in LLMs, RAG, and multimodal systems
Mid-level AI/ML Engineer specializing in Generative AI, RAG, and MLOps
“AI/LLM engineer with production experience at NVIDIA and Microsoft, including building a RAG-based enterprise knowledge assistant that improved accuracy by 42% and scaled to thousands of queries. Deep in inference optimization (TensorRT-LLM, Triton, quantization, speculative decoding) and MLOps/observability (Prometheus/Grafana, MLflow, LangSmith), plus orchestration with Kubeflow/Airflow across multi-cloud.”
Mid-level AI/ML Engineer specializing in LLM and enterprise generative AI
“ML/AI engineer focused on taking LLM systems from experimentation to reliable production, including enterprise copilot and RAG-based knowledge retrieval use cases. Stands out for combining data pipelines, model training, inference optimization, automated evaluation, and safety guardrails, with cited impact including 20% throughput gains and 30% less manual evaluation effort.”
Mid-level AI/ML Engineer specializing in LLMs, RAG, and distributed MLOps
Mid-level Machine Learning Engineer specializing in LLMs, RAG, and scalable GPU inference
Mid-level AI/ML Engineer specializing in LLMs, RAG, and production MLOps
Senior Machine Learning Engineer specializing in LLM inference and GPU infrastructure
Mid-level AI/ML Engineer specializing in FinTech risk and fraud systems
Mid-level AI/ML Engineer specializing in LLMs, RAG, and scalable inference
“Backend/retrieval-focused engineer with production experience at Perplexity building a large-scale real-time Q&A system using retrieval-augmented generation, emphasizing low-latency, high-quality answers through ranking, context optimization, and caching. Also has orchestration experience from both product-facing LLM pipelines and large-scale infrastructure workflows at Meta, and has partnered with non-technical stakeholders to align AI trade-offs with business goals.”
Mid-level AI/ML Engineer specializing in LLM fine-tuning, inference optimization, and AI safety
“AI/LLM engineer with production experience at NVIDIA, where they fine-tuned and deployed a financial-services chatbot and cut latency ~50% using TensorRT + NVIDIA Triton, scaling via Docker/Kubernetes. Also has consulting experience at Accenture delivering a predictive maintenance solution for a logistics network, bridging non-technical stakeholders with actionable dashboards.”
Mid-level AI/ML Engineer specializing in GPU inference and LLM platforms
“Built and deployed an LLM-powered platform that turns models into scalable REST/gRPC APIs, focusing on keeping GPU-backed inference fast and stable during traffic spikes. Experienced with AWS orchestration (EKS/ECS/Step Functions), safe model rollouts, and production-grade monitoring/testing for reliable AI agents and workflows.”
Mid-level AI/ML Engineer specializing in NLP, computer vision, and MLOps
Mid-level Full-Stack Java Engineer specializing in scalable microservices and real-time data systems
Mid-level Machine Learning Engineer specializing in LLMs and RAG systems
Mid-level AI/ML Engineer specializing in GPU-accelerated LLM and vision systems
Mid-level AI/ML Engineer specializing in LLM fine-tuning and RAG systems
Mid-level Machine Learning Engineer specializing in LLMs, RAG, and GPU-accelerated cloud systems
Mid-level AI/ML Engineer specializing in LLMs, ranking systems, and MLOps
Engineering Manager and ML/Data Architect specializing in scalable data platforms and personalization
“Hands-on engineering manager at a marketing company leading a highly senior, distributed team (10 direct reports) while personally coding ~60–70% and owning end-to-end architecture across three interconnected products. Built agentic CRM automation and a reinforcement-learning-driven distribution layer for channel spend/bidding, with a strong focus on scalable design and observability (Prometheus/APM/logging) enabling frequent releases and few production incidents.”
Mid-level AI/ML Engineer specializing in LLMs, RAG, and multimodal deep learning
“ML/LLM engineer who has built and productionized a large multimodal LLM pipeline end-to-end, fine-tuning a 20B+ parameter model with distributed/FSDP training and deploying on Kubernetes via Triton for ~5x throughput. Strong focus on reliability and safety (monitoring with SHAP, guardrails, A/B testing) with reported ~22% relevance lift and reduced harmful/incorrect outputs, plus experience orchestrating ETL/retraining workflows with Airflow across S3/Snowflake/RDS.”
Senior Generative AI Implementation Consultant specializing in RAG and agentic AI on cloud
“LLM/RAG practitioner who built an AWS-based enterprise document search and summarization platform with RBAC and scaled it to 10K+ users, solving relevance issues via contextual chunking and hybrid retrieval. Also designed agentic workflows for a telecom forecast-validation use case using sub-agents, tool APIs, and strict context management, and has proven pre-sales influence (supported a $300K manufacturing deal with a roadmap-driven pitch).”