Vetted Data Preprocessing Professionals

Pre-screened and vetted.

Utkarsh Joshi

Screened

Senior Data Scientist specializing in ML, NLP, and GenAI analytics

Remote, US | 7y exp
University of Minnesota | University of Minnesota

Built and deployed an LLM-powered analytics assistant enabling business users to ask questions in plain English and receive validated Spark SQL executed in Databricks, with a Streamlit/Flask UI. Addressed strict client schema-privacy constraints by implementing a RAG strategy and ultimately leveraging AWS Bedrock and fine-tuned reference docs. Also has production ML pipeline experience using Docker + Airflow and AWS (S3/ECS/EC2) for financial classification models.

UC

Mid-level Machine Learning Engineer specializing in NLP, computer vision, and RAG systems

Atlanta, GA | 5y exp
Morgan Stanley | Kennesaw State University

Machine learning/NLP engineer who built a production-oriented retrieval-based AI system at Morgan Stanley for healthcare use cases, combining RAG over unstructured patient records with deep-learning medical image segmentation (U-Net/Mask R-CNN). Strong in end-to-end pipelines and MLOps (Spark/MongoDB, AWS SageMaker, CI/CD, monitoring, automated retraining) and in entity resolution/data quality validation for noisy clinical data.

Sahithi M

Screened

Mid-level GenAI/ML Engineer specializing in LLM applications and enterprise automation

5y exp
UnitedHealth Group | Rivier University

Built and shipped a production LLM-powered healthcare support agent at UnitedHealth Group, using LangChain + FAISS RAG on AWS SageMaker with CloudWatch monitoring and human-in-the-loop fallbacks for safety. Strong focus on reliability engineering (confidence gating, retries/timeouts, caching) and continuous evaluation loops; reported ~40% improvement in query resolution efficiency while reducing manual support workload.

Allan Farinas

Screened

Senior Full-Stack Software Engineer specializing in Python and AWS

West Covina, CA | 11y exp
CareRev | Cal Poly Pomona

Backend/data engineer who has built production Python microservices (FastAPI) and AWS-native platforms for event ingestion and analytics, combining ECS/Fargate + Lambda with CloudFormation-driven environments and strong secrets/IAM practices. Experienced modernizing legacy logic with parallel-run parity validation and safe phased cutovers, and has demonstrated measurable SQL tuning wins (20–30s down to 1–2s) plus incident ownership in Glue/Step Functions ETL pipelines.

Divyam Agrawal

Mid-level Machine Learning Engineer specializing in LLMs and NLP classification systems

Seattle, WA | 4y exp
Affinity Solutions | University of Washington

Internship experience building a production RAG+LLM pipeline to map messy card transaction descriptions to merchant brands, including a custom modified-ROUGE evaluation approach for weak/variant ground truth. Improved scalability and cost by moving from a managed LLM endpoint (e.g., Bedrock) to self-hosted vLLM, and orchestrated massive embedding backfills (5,000+ files, 10B+ rows) using an Airflow-triggered SQS + ECS worker architecture with robust retry/DLQ handling.

Hari Chandana Kasula

Entry Machine Learning Engineer specializing in NLP, computer vision, and recommender systems

New York, NY | 0y exp
Columbia University | Columbia University

Built and shipped an end-to-end podcast recommendation system exposed via a Flask API and React UI, explicitly balancing relevance, diversity (MMR), and safety constraints while meeting ~200ms latency targets. Also implemented a production-style RAG/information-extraction pipeline using web retrieval, spaCy NER, and fine-tuned SpanBERT with guardrails and evaluation loops (precision/recall/F1) to tune confidence thresholds and improve reliability.

Mihir Sahu

Screened

Intern software engineer specializing in AI, full-stack, and applied ML

Madison, WI | 1y exp
Capital One | University of Wisconsin–Madison

Backend/ML-focused engineer with experience spanning fintech, sales enablement, and medtech, including a Capital One capstone and a Singapore medtech startup internship. Stands out for owning end-to-end AI/backend systems, from a GenAI sales pitch platform that cut prep time by 50% to an ultrasound-guidance MVP for non-expert operators in a highly ambiguous domain.

MG

Mid-level Software Development Engineer specializing in cloud-native AI/ML systems

California, USA | 4y exp
ServiceNow | Cal State Long Beach

AI/ML-focused engineer with practical experience building RAG-based and multi-agent systems, including architectures for retrieval, reasoning, context processing, and response generation. Stands out for combining LLM productivity gains with disciplined software engineering practices like validation, monitoring, and reproducibility.

AK

Mid-level Machine Learning Engineer specializing in MLOps, NLP, and production ML systems

5y exp
Comcast | University of Central Missouri

Backend/founding-engineer-style builder who designed and evolved a near-real-time customer churn prediction platform (FastAPI + AWS SageMaker/Lambda + Redis + MLflow) to enable real-time retention actions, reporting ~18% churn reduction. Demonstrates strong production engineering in secure API design, incremental migrations with data integrity safeguards, and robustness improvements in async pipelines (idempotency, DLQs, retry visibility).

Sabrina Liu

Screened

Junior Robotics & ML Engineer specializing in robot learning and simulation

Ithaca, NY | 2y exp
Cornell Center for Teaching Innovation | Cornell University

Robotics engineer with a 2024 internship building an end-to-end software stack for an autonomous humanoid robot that follows natural-language audio commands to make coffee and deliver snacks, including perception (OpenCV), mapping, and ROS Navigation. Also contributing to a robotics foundation model effort by building data preprocessing pipelines using GroundingDINO and SAM2, and has multi-robot coordination experience with algorithms designed to handle real-world communication drops.

VM

Senior Data Scientist specializing in GenAI, LLMs and RAG

Dallas, TX | 5y exp
Texas Instruments | Trine University

Built and deployed a production LLM-powered RAG assistant for semiconductor manufacturing failure analysis, reducing engineer triage effort by grounding outputs in retrieved evidence and gating responses with SPC + ML signals (LSTM anomaly scores, XGBoost probabilities). Experienced with LangChain/LangGraph to ship reliable, observable multi-step agents with branching/fallback logic, and evaluates impact using both technical metrics and business KPIs like mean time to triage and downtime reduction.

Kangjie Lu

Screened

Intern Full-Stack Software Engineer specializing in data pipelines and AI/ML systems

Beijing, China | 1y exp
Shanghai Wanwu Zhiyun Industrial Technology Co., Ltd. | Carnegie Mellon University

Software engineer with experience building a Vue.js/TypeScript internal component library (with Jest testing standards) and improving JS runtime performance via profiling, code splitting, and lazy loading. Also led documentation and community support for a Python ML utility library, diagnosing metric-calculation bugs for imbalanced datasets and driving large reductions in support inquiries through targeted docs, tests, and rapid hotfixes in a startup environment.

SZ

Junior AI/Backend Software Engineer specializing in ML and scalable systems

Dallas, TX | 2y exp
PMG | University of Maryland, College Park

Backend engineer with strong AWS/CI/CD experience (multi-repo deployments, Lambda + core app, immutable ECR and image promotion) and a published master’s thesis building an ML framework for Solar PV energy prediction and CO2 reduction impact modeling using ensemble and meta-learning approaches benchmarked against SAM.

UK

Mid-level Generative AI Engineer specializing in LLM agents and RAG systems

4y exp
Capital One | Lindsey Wilson College

Built and deployed a production LLM/RAG knowledge assistant integrating internal docs, wikis, and ticket histories to reduce tribal-knowledge dependency and repetitive questions. Emphasizes reliability via grounding + a validation layer, and achieved major latency gains (>50%) through vector index optimization, caching, quantization, and selective re-validation. Comfortable orchestrating end-to-end LLM/data workflows with Airflow, Prefect, and Dagster, including monitoring and alerting.

SM

Mid-level Data Scientist specializing in NLP/LLMs, time series forecasting, and MLOps

New York, NY | 6y exp
Citigroup | Kent State University

Data/ML practitioner with hands-on experience building NLP systems from prototype to production: delivered a Twitter sentiment classifier with robust preprocessing, SVM modeling, and Power BI reporting, and built entity-resolution pipelines for messy multi-source customer data (reporting ~95% improvement in unique entity identification). Also implemented semantic linking/search using SBERT embeddings with FAISS vector retrieval and domain fine-tuning (reported ~15% precision lift), and applies production workflow best practices (Airflow/Prefect, Docker, Azure ML/Databricks, Great Expectations).

Abhinav Gupta

Screened

Junior Machine Learning Engineer specializing in LLMs and applied data science

2y exp
Esri | USC

Built and shipped multiple production AI systems, including Auto DocGen (LLM-generated OpenAPI docs kept in sync via AST diffs, schema-constrained generation, and CI/CD on Render) and a multimodal sign-language recognition pipeline at USC orchestrated with FastAPI, MediaPipe, and PyTorch. Also partnered with Esri’s non-technical community team to fine-tune an LLaMA-based spam classifier with a review UI, cutting moderation time by 70%.

Karan Variyambat

Mid-level Machine Learning Engineer/Researcher specializing in computer vision and multimodal AI

San Diego, CA | 3y exp
San Diego Supercomputer Center | UC San Diego

Developed a production wildfire smoke detection system where smoke is visually subtle and easily confused with fog/clouds; addressed this with a hybrid CNN+LSTM+ViT model and multimodal weather features to reduce false positives. Experienced running scalable, reproducible ML pipelines on shared GPU infrastructure using Slurm and Kubernetes-style batch jobs with checkpointing, retries, and rigorous error analysis.

Sri Harshitha Yannam

Junior Software Engineer specializing in AI/ML and cloud platforms

Austin, TX | 2y exp
Amazon | University of Wisconsin–Milwaukee

LLM/agent engineer who shipped a production "Memory Assistant" at HydroX AI, building a LangChain/LlamaIndex RAG memory pipeline on ChromaDB/FAISS with robust fallbacks (BERT/BART), prompt-injection mitigation, and 99.9% uptime monitoring. Also built a multi-step customer support agent using Rasa + OpenAI Assistants API with structured tool calling, guardrails, and human-in-the-loop escalation, and has experience hardening agents against messy ERP data via Pydantic validation, idempotency, and transactional outbox patterns.

Ishaan Nanal

Screened

Intern-level Software Engineer specializing in backend systems and AI/ML

Ithaca, NY | 1y exp
QuorAgra | Cornell University

Built and shipped an LLM-powered RAG research copilot used by 20+ users across biology, physics, and ML, cutting literature review from days to minutes. Strong focus on production reliability—iterated on chunking/retrieval/prompting, added validation and modular pipelines for debuggability, and is now containerizing and scaling the system with Docker and GCP.

Saisureshreddy Challa

Mid-level Data Scientist specializing in AI/ML, LLMs, and domain analytics

California, USA | 6y exp
BlackRock | Northeastern University

BlackRock AI/ML engineer who built and owned a production LLM document intelligence system for regulatory and investment analysis end-to-end. They combined RAG, multi-agent validation, strong evaluation/monitoring, and reusable Python services to process 50K+ documents, cut review time 40-50%, and improve decision accuracy by about 25%.

YV

Entry-level Software Developer specializing in full-stack web and machine learning

California, USA | 1y exp
Easley-Dunn Productions | USC

Early-career candidate with a thoughtful, engineering-first approach to AI-assisted development: they use AI to accelerate implementation while retaining human ownership of architecture and final code quality. They recently built a speech-to-text workflow using Groq Whisper and showed practical judgment by designing around imperfect transcription accuracy with checks and fallback handling.

SM

Mid-level AI/ML Engineer specializing in GenAI, NLP, and MLOps

Connecticut, USA | 5y exp
Pfizer | University of New Haven

Built and deployed an enterprise GenAI knowledge assistant over thousands of internal PDFs/reports using a RAG stack (GPT-4 + Hugging Face embeddings + vector DB) to reduce manual search and SME escalations. Uses LangGraph/LangChain to orchestrate modular agent workflows with relevance filtering and fallback handling, and applies rigorous evaluation (golden datasets, edge cases, A/B tests) with production monitoring metrics.
