Pre-screened and vetted.
Mid-level AI/ML Engineer specializing in Generative AI, RAG, and MLOps
“AI/LLM engineer with production experience at NVIDIA and Microsoft, including building a RAG-based enterprise knowledge assistant that improved accuracy by 42% and scaled to thousands of queries. Deep experience in inference optimization (TensorRT-LLM, Triton, quantization, speculative decoding) and MLOps/observability (Prometheus/Grafana, MLflow, LangSmith), plus orchestration with Kubeflow/Airflow across multiple clouds.”
Junior Machine Learning & Data Science professional specializing in LLMs and analytics
“Amazon internship experience building production GenAI analytics for the returns organization: a multi-agent LLM+RAG system that let analysts query multiple heterogeneous data sources in natural language without hand-written SQL. Also built and operationalized four Apache Airflow DAGs for large-scale ETL, emphasizing observability and freshness-aware metadata to keep outputs accurate and up to date.”
Mid-level Software Engineer specializing in distributed backend systems and cloud microservices
Mid-level Machine Learning Engineer specializing in LLMs, RAG, and scalable GPU inference
Principal Data Scientist specializing in ML, NLP, and forecasting for marketing and supply chain
Principal Machine Learning Scientist specializing in GenAI, LLMs, and RAG
Senior Python Developer specializing in AI/ML and cloud-native microservices
Staff Full-Stack Engineer specializing in data engineering and real-time event platforms
Senior Full-Stack Software Engineer specializing in Telehealth and FinTech
Mid-level Machine Learning Engineer specializing in generative AI, NLP, and MLOps
Mid-level AI/ML Engineer specializing in LLM training, RAG, and low-latency inference
Senior Machine Learning Engineer specializing in GenAI, NLP, and recommendation systems
Mid-level AI/ML Engineer specializing in LLMs, RAG, and scalable inference
“Backend/retrieval-focused engineer with production experience at Perplexity building a large-scale real-time Q&A system using retrieval-augmented generation, emphasizing low-latency, high-quality answers through ranking, context optimization, and caching. Also has orchestration experience from both product-facing LLM pipelines and large-scale infrastructure workflows at Meta, and has partnered with non-technical stakeholders to align AI trade-offs with business goals.”
Staff-level Machine Learning Engineer specializing in LLMs and MLOps for Financial Services
“Machine learning/NLP practitioner at J.P. Morgan who led development of a production RAG system and an entity resolution pipeline for complex financial data. Deep hands-on experience with embeddings (Sentence-BERT), vector search (FAISS/pgvector), LLM fine-tuning (LoRA/PEFT), and rigorous evaluation (human-in-the-loop + A/B testing) backed by strong MLOps on AWS (Docker/Kubernetes, MLflow, Prometheus/Datadog).”
Mid-level AI/ML Engineer specializing in LLM alignment, safety, and scalable inference
“Built and productionized an AWS-hosted, Kubernetes-orchestrated RAG assistant that enables natural-language Q&A over internal document repositories with grounded answers and citations. Demonstrates strong applied LLM engineering: hallucination mitigation, hybrid retrieval + re-ranking, and rigorous evaluation via benchmarks and A/B testing, plus real-world scaling of compute-heavy inference with dynamic batching and monitoring.”
Mid-level AI/ML Engineer specializing in GPU inference and LLM platforms
“Built and deployed an LLM-powered platform that turns models into scalable REST/gRPC APIs, focusing on keeping GPU-backed inference fast and stable during traffic spikes. Experienced with AWS orchestration (EKS/ECS/Step Functions), safe model rollouts, and production-grade monitoring/testing for reliable AI agents and workflows.”
Mid-level AI/ML Engineer specializing in fraud detection and clinical LLM assistants
“Built and deployed a production clinical support LLM assistant at Mayo Clinic using a LangChain-orchestrated RAG architecture (Llama 2/PaLM) over de-identified clinical records, integrating BigQuery with Pinecone for semantic retrieval. Focused on healthcare-critical reliability by reducing hallucinations through grounding, implementing HIPAA-aligned privacy controls (Cloud DLP, VPC Service Controls), and running structured evaluations with clinician feedback.”
Machine Learning Engineer Intern specializing in LLM reasoning, agents, and deployment
“AWS AI Lab engineer who deployed a production Chain-of-Thought analytical agent for tabular reasoning, emphasizing grounded tool-constrained workflows with schema-validated intermediate outputs. Built robust evaluation/logging with step-level observability to catch regressions across model versions, and has experience scaling distributed LLM training via Slurm + DeepSpeed/FSDP with checkpointing and failure recovery.”
Junior AI Engineer specializing in LLM systems, RAG, and full-stack automation
“Built and deployed an AI receptionist product for field-service businesses (HVAC/electrician), including real-time Jobber scheduling integrations and Twilio-based calling. Combines hands-on customer/operator shadowing with strong production engineering (queueing to handle API limits, rigorous testing/mocking, mirrored prod environment) and cross-layer troubleshooting, driving user adoption through review/override workflows.”