Vetted Data Preprocessing Professionals

Pre-screened and vetted.

Data Preprocessing, Python, Docker, SQL, AWS, CI/CD

Rohith Sadanala

Screened

Mid-level Machine Learning Engineer specializing in Generative AI and MLOps

Missouri, USA · 3y exp

Airbnb · University of South Florida

“LLM/agent engineer who has shipped production RAG chatbots in sustainability-focused domains, including a packaging recommendation assistant that standardized messy user inputs and used Pinecone-backed retrieval over product/regulatory data. Experienced orchestrating end-to-end ML workflows with Airflow and AWS Step Functions/Lambda, emphasizing reliability (property-based testing, circuit breakers, OpenTelemetry) and measurable performance (latency/cost). Partnered closely with non-technical leadership to ship 3 weeks early, driving adoption by 150+ businesses and ~20% reported waste reduction.”

A/B Testing, Amazon Bedrock, Amazon EC2, Amazon EKS, Amazon RDS, Amazon S3, +154

Byron Pineda

Screened

Staff/Lead Data Scientist specializing in Generative AI, NLP/LLMs, and MLOps

Pascagoula, MS · 10y exp

Turing · Mississippi State University

“Lead Data Scientist (10+ years) with recent work in healthcare data: built production pipelines that unify EHR, genomics, and clinical notes using NLP (spaCy/BERT/BioBERT) and scalable Spark-based processing. Also led development of domain-specific LLM/NLP systems for chatbots and semantic search, deploying models via FastAPI/Flask and improving retrieval with FAISS-backed, fine-tuned clinical embeddings and RAG-style workflows.”

Python, R, SQL, Pandas, NumPy, Scikit-learn, +132

Sarath Dunga

Screened

Mid-level Full-Stack Developer specializing in cloud microservices and AI/ML integration

Remote, USA · 4y exp

eBay · Arizona State University

“Full-stack engineer (~3 years) with eBay production experience building and operating high-scale, event-driven Python microservices for order processing and AI-powered recommendations (Kafka/Redis/FastAPI on AWS with Prometheus/Grafana). Also delivered polished React+TypeScript analytics dashboards and designed high-concurrency PostgreSQL schemas with significant latency reductions. Recently built AI-agent orchestration and an interactive node-based requirements dashboard for Siemens Polarion via MCP servers, improving user interaction by ~17.8%+.”

Anomaly detection, Authentication, Authorization, AWS, AWS CodePipeline, AWS Lambda, +183

Hari Kiran Reddy Rommala

Screened

Mid-level Full-Stack Software Engineer specializing in cloud and data platforms

Boston, MA · 5y exp

Northeastern University · Penn State University

“Full-stack engineer with experience spanning Amazon IMDb and Northeastern’s NeuroJSON portal, combining consumer product work with complex scientific data applications. Built IMDb’s streaming providers feature—described as the company’s most impactful feature of 2023—and has hands-on experience with React/Angular, GraphQL, AWS, Python services, and production monitoring.”

React, TypeScript, SQL, PostgreSQL, Docker, Kubernetes, +283

Timothy Yeav

Screened

Senior AI/ML Engineer specializing in Generative AI and FinTech

Bronx, NY · 8y exp

Insitro · New York City College of Technology (CUNY)

“Built end-to-end LLM/RAG systems for biological data and scientific literature analysis in a drug discovery setting, helping researchers explore disease insights and treatment hypotheses faster. Combines applied GenAI product work with strong production engineering, including monitoring, retrieval optimization, reusable Python services, and scalable deployment on AWS/Kubeflow.”

Generative AI, LLaMA, GPT, Agentic AI, BERT, Transformers, +204

Rucha Visal

Screened

Mid-Level Software Development Engineer specializing in distributed systems and full-stack web apps

Seattle, USA · 4y exp

Amazon · University of North Carolina at Charlotte

“Software engineer who owned customer-facing, high-traffic systems end-to-end (TypeScript/React front end with a TypeScript backend), emphasizing safe velocity through feature flags, staged rollouts, observability, and rollback-ready incremental delivery. Reports shipping more frequently with fewer production incidents and faster recovery due to these guardrails.”

Java, Python, JavaScript, TypeScript, Go, C, +79

Sumanth Salluri

Screened

Mid-level Business Data Analyst specializing in Financial Services and Healthcare analytics

USA · 4y exp

Visa · George Mason University

“Full-stack engineer (~4 years) who has owned and shipped customer-facing SaaS onboarding and a role-based real-time analytics dashboard using TypeScript/React with a modular backend. Experienced in microservices with RabbitMQ and strong observability practices (correlation IDs, structured logging, queue metrics), and built an internal deployment tracker integrated with CI/CD that replaced manual spreadsheet/Slack processes.”

Python, SQL, R, HTML, CSS, JavaScript, +118

Nikita Vivek Kolhe

Screened

Junior Data & Machine Learning Engineer specializing in MLOps and NLP

Los Angeles, United States · 1y exp

WorkUp · USC

“ML/LLM practitioner with production experience building a healthcare review sentiment pipeline (RateMDs) using Hugging Face Transformers plus a LangChain+FAISS RAG layer for interactive querying. Also led orchestration-driven optimization of Nike’s Fusion ETL pipeline, improving runtime efficiency by 20%, and has experience translating ML outputs into Tableau dashboards for non-technical healthcare stakeholders (e.g., readmission risk).”

Python, SQL, C, C++, R, MATLAB, +90

Zufeshan Imran

Screened

Senior Machine Learning Engineer specializing in LLMs, RAG, and computer vision

San Diego, CA · 10y exp

SOTER AI · UC San Diego

“Built an "AskMyVideo" system that turns YouTube videos into queryable knowledge graphs by transcribing audio (Whisper), chunking and embedding content, and enabling traceable answers back to exact timestamps. Strong in entity resolution (rules + fuzzy matching + TF-IDF/cosine with PR-curve thresholding) and modern retrieval stacks (FAISS, hybrid dense/sparse, domain fine-tuning with ~12% precision gain), with a production mindset using Airflow/Prefect, Docker/FastAPI, and LangSmith/Prometheus/Grafana observability.”

Machine Learning, Deep Learning, Generative AI, Transformers, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), +120

Sriraksha Rao

Screened

Junior Software Engineer specializing in AI systems and distributed backend platforms

San Diego, CA · 3y exp

Relevance Labs · UC San Diego

“Built end-to-end AI features across both fitness and insurance domains, including a full-stack personalized workout recommendation system and a production RAG-based insurance QA assistant at Relevance Labs. Stands out for combining backend/distributed systems skills with practical LLM architecture, evaluation, and risk-aware human-in-the-loop design; notably reduced unnecessary LLM calls by 40% while improving latency and answer reliability.”

Python, C, C++, Go, Java, Rust, +119

Piyush Kautkar

Screened

Junior Software Engineer specializing in full-stack systems and distributed log analytics

Miami, FL · 1y exp

Neocis · Carnegie Mellon University

“CMU candidate with hands-on experience taking LLM concepts from research prototypes toward production-ready designs (structured outputs, guardrails, failure-scenario evaluation). Also partnered with sales/customer teams at Mazecare to drive adoption with Dontia Alliance (largest dental clinic chain in Singapore) and engaged Singapore government stakeholders, bridging clinical workflow needs with IT security/integration concerns.”

Agile, Analytics, Anomaly Detection, Authentication, AWS, C++, +190

Harsh Chaudhari

Screened

Intern Software Engineer specializing in ML/NLP and LLM applications

Boulder, CO · 0y exp

Splunk · University of Colorado Boulder

“Full-stack AI/LLM engineer who has deployed a production LLM backend (Mistral 14B) on GKE to auto-transform datasets and generate runnable ML training pipelines, addressing hallucinations, schema mismatch, latency, and burst scaling with caching/prompt compression and HPA. Also has internship experience (Splunk, BlackOffer) delivering data automation and 10+ Power BI dashboards for non-technical stakeholders with measurable efficiency gains.”

C++, Data Pipelines, Data Preprocessing, Docker, Embeddings, FAISS, +70

Rakesh Munaga

Screened

Mid-level Full-Stack Engineer specializing in AI and FinTech platforms

TX, USA · 4y exp

JPMorgan Chase · University of Texas at Arlington

“Full-stack engineer building real-time internal banking operations dashboards (Java/Spring Boot microservices + React/TypeScript) with Kafka-based streaming and post-launch performance optimizations. Also shipped a production internal AI support assistant using RAG (Confluence/PDF/support docs ingestion, embeddings + vector DB retrieval) with guardrails, evaluation loops, and observability to reduce hallucinations and prevent regressions.”

AI Agents, Amazon API Gateway, Amazon CloudWatch, Amazon EC2, Amazon RDS, Amazon S3, +132

Sirisha Maddikunta

Screened

Mid-level Generative AI Engineer specializing in enterprise LLM and healthcare AI solutions

O'Fallon, MO · 6y exp

Mastercard · University of Texas at Arlington

“Built and owned an end-to-end LLM-powered fraud investigation assistant that automated case summaries and risk analysis, cutting analyst investigation/documentation time by 40%. Stands out for translating RAG concepts into a production-grade internal platform with strong evaluation, monitoring, and reusable Python service architecture that improved both analyst trust and engineering velocity.”

Generative AI, Natural Language Processing, Computer Vision, Prompt Engineering, Retrieval-Augmented Generation, LoRA, +234

Bhargav Diyora

Screened

Mid-level Full-Stack Software Engineer specializing in FinTech microservices

California, USA · 4y exp

PayPal · California State University, Long Beach

“Robotics software engineer who has built end-to-end pipelines spanning backend/data processing through model interfaces and hardware integration. Has hands-on ROS2 experience building Python nodes and debugging real-time behavior via profiling, publish-rate tuning, and latency fixes, plus experience standardizing multi-robot communication with QoS adjustments. Uses Gazebo simulation and Docker/CI/CD to catch integration issues early and speed iteration.”

Java, JavaScript, TypeScript, Python, C#, SQL, +161

Ramyasri Veerapaneni

Screened

Mid-Level Full-Stack Developer specializing in FinTech

Remote, USA · 4y exp

Intuit · Mississippi State University

“Backend-heavy full-stack engineer with experience at Intuit (TurboTax Live) and Paytm payments, building and scaling Java/Spring Boot microservices for high-traffic transaction systems. Has hands-on wins improving peak-load performance using Redis/disk caching and Kafka event-driven patterns, plus React/Redux work for web app integration and strong monitoring practices with ELK.”

Apache Kafka, Apache Spark, API Design, AWS, C, C#, +83

Vidhi Upadhyay

Screened

Senior Software Engineer specializing in AI/ML, computer vision, and cloud-native systems

Remote · 8y exp

Saayam for All · Carnegie Mellon University

“Independently built a production-grade, containerized enterprise agentic AI platform (stateful orchestration + RAG) focused on real-world reliability—guardrails, citation-based outputs, reranking, query rewriting, and evaluation harnesses to reduce hallucinations. Hands-on with OpenAI SDK, CrewAI, and LangGraph, and has delivered AI solutions for non-technical NGO stakeholders via demos and practical POCs.”

Python, C++, SQL, MySQL, .NET, Generative AI, +150

Niyaz Nurbhasha

Screened

Mid-level Machine Learning Engineer specializing in computer vision and LLM pipelines

4y exp

BlueHalo · Duke University

“ML/LLM engineer who built production systems to speed up artist content-creation workflows, including a fine-tuned image captioning model paired with a RAG layer over image embeddings/captions to improve consistency across changing domains. Experienced orchestrating multi-tool agents with LangChain/LangGraph (planning + critic/reflection) and setting up practical monitoring (caption rejection rate) plus evaluation sets for tool-calling accuracy, output quality, and latency.”

Python, C++, SQL, JavaScript, TypeScript, PyTorch, +75

Akhil Chippalthurthy

Screened

Mid-level AI/ML Engineer specializing in NLP, Generative AI, and predictive analytics

New Jersey, USA · 5y exp

JPMorgan Chase · Stevens Institute of Technology

“GenAI/LLM engineer who architected and deployed a production RAG “research assistant” for JPMorgan Chase’s regulatory compliance team, focused on safety-critical behavior (mandatory citations, refusal when evidence is missing). Deep hands-on experience with LlamaIndex, Pinecone, Hugging Face embeddings, LangGraph agent workflows, and metric-driven evaluation (golden sets, TruLens), including a reported 28% relevancy lift via cross-encoder re-ranking.”

Python, R, SQL, Jupyter Notebook, LightGBM, XGBoost, +172

Shrey Mathur

Screened

Mid-level Machine Learning Engineer specializing in LLMs and AI products

Sunnyvale, CA · 6y exp

TCS · UCLA

“Applied ML/LLM engineer currently building AppleCare’s production chat recommender, owning the full lifecycle from transcript cleaning and fine-tuning through distributed deployment, monitoring, and iterative improvement. Their work delivered >10% copy-count improvement, 5% lower modification rate, 60% cost reduction, and $1.1M profitability in 2025, and they also created a reasoning-data generation approach that enabled a reasoning model and a judge model that cut eval time by over 99%.”

Data preprocessing, Deep Learning, LoRA, LangChain, Retrieval Augmented Generation, Hugging Face, +138

Tianai Shi

Screened

Intern Full-Stack Software Engineer specializing in test analytics platforms

La Jolla, CA · 2y exp

Nutanix · UC San Diego

“Software engineer intern at Nutanix who independently shipped and maintained an internal smoke-test/failure-analysis dashboard, integrating failure data from multiple upstream systems (e.g., Jira, Jenkins, CircleCI) via REST APIs. Also has prior data-science experience building Postgres-based asset management analytics with automated reporting and indexing for faster time-series retrieval.”

API Design, Asynchronous Processing, Backend Development, BERT, CI/CD, C, +94

Keerthana Tammina

Screened

Mid-level Data Scientist specializing in machine learning and generative AI

Saint Louis, MO · 5y exp

DoorDash · Saint Louis University

“ML/LLM engineer who has shipped a production transformer-based document understanding system on AWS, owning the full pipeline from domain fine-tuning to Dockerized CI/CD deployment. Demonstrates strong production rigor—latency optimization (distillation/quantization, async batching, autoscaling), orchestration with Airflow/Step Functions/Azure Data Factory, and monitoring/drift detection—plus experience translating ops stakeholder needs into adopted AI automation via dashboards.”

Agile, Amazon Redshift, Amazon S3, Amazon SageMaker, Anomaly Detection, Apache Hadoop, +157

Yuxi Yang

Screened

Intern Embedded Software Engineer specializing in autonomous driving and applied computer vision

1y exp

iFLYTEK · Johns Hopkins University

“Autonomous driving engineer from iFLYTEK who shipped 5+ middleware modules for vehicles across three models, with deep experience in reliability, IPC performance, and real-world system hardening. Stands out for translating flaky production behavior into measurable signals—resolving 30+ faults, cutting backlog 39%, improving latency 20%, and supporting 500+ hours of road testing with 99%+ reliability.”

Python, C++, PyTorch, Linux, Git, Statistical Analysis, +113
