Vetted Data Preprocessing Professionals

Pre-screened and vetted.

Data Preprocessing, Python, Docker, SQL, AWS, CI/CD

Fangjian Xiong

Screened

Junior Machine Learning Engineer specializing in NLP and biomedical entity extraction

Boston, MA · 2y exp

Northeastern University · Northeastern University

“Built and deployed a production LLM-powered biomedical knowledge extraction pipeline that processed millions of papers to identify tools/techniques and produce a unified knowledge graph via active learning NER (Prodigy + spaCy transformers) and entity linking (Bio-tools/Wikidata). Addressed hard NLP engineering challenges like WordPiece span-offset alignment and scaled inference over ~1.5M documents using batching/caching, containerized services, async workers, and orchestration with Prefect/Airflow.”

AI Agents, AWS, BigQuery, C#, C++, Data Preprocessing, +94

Nishad Kane

Screened

Mid-level Data Scientist & AI Engineer specializing in RAG, agentic AI, and production ML

5y exp

Xtrium AI · Arizona State University

“AI/data engineer who built a production LLM-powered schema drift detection system (LangChain/LangGraph) to catch semantic data changes before they break downstream analytics/ML. Deployed on AWS with Docker/S3 and implemented an LLM-as-a-judge evaluation framework to improve trust, reduce hallucinations, and control false positives/alert fatigue. Collaborated with non-technical risk/business analytics stakeholders at EY by delivering human-readable drift explanations that improved confidence in financial analytics dashboards.”

A/B Testing, Agentic AI, Amazon EC2, Amazon EKS, Amazon Redshift, Amazon S3, +104

Harini Vinu

Screened

Intern Software Engineer specializing in cloud, big data, and test automation

New York, United States · 1y exp

Qualitest · NYU

“Internship experience at Qualitest building and deploying an LLM-powered test automation system that reduced manual test creation and improved efficiency (~40%). Demonstrates strong production engineering for LLM systems (timeouts/retries/monitoring/caching, prompt optimization, batching) and has scaled workflows to 100+ concurrent jobs; also has orchestration experience with AWS Step Functions and Kubernetes.”

Amazon CloudWatch, Amazon DynamoDB, Amazon Kinesis, Amazon S3, Amazon SQS, Amazon API Gateway, +149

Murali Marupudi

Screened

Mid-level Backend Engineer specializing in Python microservices and scalable systems

Jersey City, NJ · 4y exp

BlackRock · Pace University

“Full-stack engineer with hands-on experience shipping both secure platform features and production AI systems. They combine React/TypeScript, Flask/Node.js, and PostgreSQL fundamentals with practical LLM and NLP implementation, including retrieval, schema-validated outputs, monitoring, and human-in-the-loop safeguards. Notable impact includes cutting manual review by 40% and reducing post-update error rates by over 20%.”

Python, JavaScript, SQL, PL/SQL, FastAPI, Django, +201

Rakesh Kolagani

Screened

Mid-level AI/ML Engineer specializing in MLOps and LLM-powered applications

Mountain View, CA · 5y exp

Intuit · University of Central Missouri

“AI/ML engineer with production experience building a RAG-based internal analytics assistant (Databricks + ADF ingestion, Pinecone vector store, LangChain orchestration) deployed via Docker on AWS SageMaker with CI/CD and MLflow. Strong focus on real-world constraints—latency/cost optimization (LoRA ~60% compute reduction), hallucination control with citation grounding, and enterprise security/governance. Previously at Intuit, delivered an interpretable churn prediction system (PySpark/Databricks, Airflow/Azure ML) that improved retention targeting ~12%.”

A/B Testing, Amazon S3, Apache Airflow, AWS Glue, AWS Lambda, AWS Step Functions, +126

Pooja Murigappa

Screened

Mid-level AI/ML Engineer specializing in NLP, Generative AI, and MLOps in Financial Services

Austin, TX · 5y exp

Charles Schwab · University of Central Missouri

“ML/LLM engineer at Charles Schwab who built a production loan-advisor chatbot integrated with internal knowledge and loan-calculator APIs, adding strict numeric validation to prevent rate hallucinations and optimizing context to control costs. Also runs ~40 Airflow DAGs orchestrating retraining/ETL/drift monitoring with an automated Snowflake→SageMaker→auto-deploy pipeline, and uses rigorous testing plus canary rollouts tied to business metrics and compliance constraints.”

Amazon DynamoDB, Apache Airflow, Apache Kafka, Apache Spark, AWS, AWS Glue, +183

Nandini Kalita

Screened

Senior Data Scientist / ML Engineer specializing in NLP, anomaly detection, and cloud ML platforms

Remote, CA · 10y exp

Emotionall · NMIMS University

“ML/NLP practitioner who built customer-feedback topic modeling (NMF + TF-IDF) to diagnose chatbot-to-agent handovers and drove product/ops changes that reduced operational costs by 20%. Also developed LSTM-based intent recognition using Word2Vec/GloVe embeddings for semantic linking, and deployed an LSTM autoencoder for fraud anomaly detection that cut false positives by 25% while capturing 15% more fraud in A/B testing.”

A/B Testing, Agile, Anomaly Detection, AWS, BigQuery, Bitbucket, +116

Ruijing Wang

Screened

Intern Data Scientist specializing in healthcare AI and experimentation

Boulder, CO · 1y exp

EchoPlus AI · Stevens Institute of Technology

“Human-AI Design Lab practitioner who productionized a wearable-health anomaly detection system by evolving a standalone autoencoder into a hybrid autoencoder + GPT-based approach, backed by PySpark ETL and MLOps on AWS SageMaker/MLflow. Also has applied LLM troubleshooting experience (fine-tuned FLAN-T5 summarization) and partnered with BI teams to run A/B tests and improve retention via feature stores and experimentation.”

Python, Pandas, Scikit-Learn, PyTorch, TensorFlow, SQL, +97

Saniya Shinde

Screened

Mid-level Data Scientist specializing in NLP, LLMs, and RAG systems

Washington, DC · 4y exp

World Bank · George Washington University

“Built and deployed a production-style vision-language pipeline that generates structured medical reports from chest X-rays using BioViLT embeddings, an image-text alignment module, and BiGPT fine-tuned with LoRA, delivered via Streamlit and hosted on AWS EC2. Also has experience presenting EDA findings, feature importance, and model performance to Ford managers while working with vehicle parts data at Bimcon.”

Python, SQL, R, C++, PyTorch, TensorFlow, +93

Saicharitha Yanamandala

Screened

Mid-Level Software Developer specializing in Java, Cloud, and Microservices

Chicago, IL · 6y exp

Capital One · Chicago State University

“Backend/Python engineer who owned an end-to-end FastAPI + AWS internal natural-language document Q&A system (Textract extraction, embeddings/vector DB, LLM integration) with strong focus on reliability and latency. Hands-on with Kubernetes + GitOps (Argo CD, Helm, rolling updates/auto-rollback) and built/optimized Kafka streaming pipelines using Prometheus/Grafana. Also supported a zero-downtime on-prem to cloud migration with parallel run and gradual traffic cutover.”

API Gateway, AWS, AWS CloudFormation, AWS Lambda, Angular, Bash, +265

Karan Javali

Screened

Mid-Level Full-Stack Software Engineer specializing in FinTech and cloud-native web platforms

Salt Lake City, Utah · 5y exp

Goldman Sachs · Arizona State University

“Software engineer with experience at Goldman Sachs and Arizona State University’s Learning Engineering Institute, shipping production backend systems including a vendor equities invoice-generation service designed for extensibility across multiple vendors. Built Django REST + PostgreSQL backends with JWT auth and Pytest coverage, and delivered data-heavy, responsive Angular dashboards; also has exposure to AWS EC2 deployments and GitLab CI/CD automation.”

Python, Java, JavaScript, TypeScript, SQL, Spring Boot, +93

Anthony Thanpoovong

Screened

Entry-level Software Engineer specializing in cloud, AI, and full-stack development

Toronto, Canada · 1y exp

TELUS · Toronto Metropolitan University

“Backend/AI engineer with hands-on experience building LLM-powered data products and AI platform workflows, including a project that turns tabular datasets into graphs, summaries, and chat-based insights with 1-2 second latency. Also contributed at TELUS to a Sovereign AI Factory self-serve onboarding platform tied to 100+ NVIDIA H200 GPUs, giving them an interesting mix of applied LLM, platform, and infrastructure exposure.”

Java, Python, Go, C, SQL, MySQL, +90

Venkata Surendra Kommineni

Screened

Senior AI/ML Engineer specializing in Generative AI and agentic systems

Texas, USA · 5y exp

Bank of America · Wichita State University

“Built and deployed an agentic RAG assistant in production to automate enterprise knowledge search and multi-step workflows with tool calling, tackling real-world issues like hallucinations, retrieval accuracy, and latency. Demonstrates strong LLMOps and orchestration depth (MLflow, Airflow, LangGraph/LangChain/LlamaIndex) plus a metrics-driven approach to agent testing/evaluation and cross-functional delivery with business stakeholders.”

Generative AI, Machine Learning, Agentic AI, Multi-Agent Systems, Retrieval-Augmented Generation, Prompt Engineering, +153

Ajitha Rachamanti

Screened

Mid-level Data Analyst specializing in healthcare and financial analytics

Texas, USA · 5y exp

Blue Cross Blue Shield · University of North Texas

“Healthcare analytics candidate with hands-on experience turning messy claims and clinical data into validated SQL/Python pipelines and Power BI dashboards. They have delivered measurable impact in revenue cycle operations, including 15-18% improvement in reimbursement accuracy and 40-45% reduction in manual reporting effort.”

SQL, Python, Pandas, NumPy, Power BI, Tableau, +120

Dhrumil Patel

Screened

Mid-level AI/ML Engineer specializing in Generative AI and NLP

Boston, MA · 5y exp

TD Bank

“Built an end-to-end GenAI underwriting copilot at TD Bank for complex financial documents, combining RoBERTa-based risk classification with Azure OpenAI RAG to deliver grounded, citation-based insights. Drove a 40-50% reduction in manual underwriting review time and created reusable FastAPI ML services that cut integration effort for other teams by 30-40%.”

Large Language Models, Retrieval-Augmented Generation, Prompt Engineering, LangChain, LangGraph, Natural Language Processing, +159
