“ML/NLP practitioner who built a retrieval-augmented generation (RAG) system for large financial and operational document sets using Sentence-Transformers (all-mpnet-base-v2) and a vector DB (e.g., Pinecone), with a strong focus on retrieval evaluation and chunking strategy optimization. Experienced in entity resolution (rules + embedding similarity with type-specific thresholds) and in productionizing scalable Python data workflows using Airflow/Dagster and Spark.”

Python, SQL, R, Pandas, NumPy, SciPy, +177 more

Sri Niyati Kompella

Screened

Senior Data Engineer specializing in cloud data platforms and ML pipelines

Atlanta, GA · 8y exp

Berkshire Hathaway · University of Alabama at Birmingham

“Data engineer focused on AWS-based enterprise data platforms, owning end-to-end pipelines from multi-source batch/stream ingestion (Glue/Kinesis/StreamSets/Airflow) through PySpark transformations into curated datasets for Redshift/Snowflake. Emphasizes production reliability with strong monitoring/observability and data quality gates, and reports ~30% performance improvement plus improved SLAs and latency after optimization.”

Amazon Athena, Amazon DynamoDB, Amazon EMR, Amazon EKS, Amazon Kinesis, Amazon Redshift, +138 more

Varshitha K

Screened

Mid-level Data Engineer specializing in cloud data platforms and lakehouse architectures

Lakewood, CO · 4y exp

First Bank · University of Central Missouri

“Data engineer in a banking context who has owned end-to-end Azure lakehouse pipelines ingesting financial/vendor data from APIs, Azure SQL, and flat files into Databricks/Delta (bronze-silver-gold). Emphasizes production reliability via schema-drift validation, data quality controls, monitoring/alerting, retries/checkpointing, and Spark/Delta performance tuning, with outputs served to BI/reporting teams (e.g., Tableau).”

Python, Scala, Java, C++, SQL, PL/SQL, +173 more

Sharanya Rao

Screened

Mid-level AI/ML Engineer specializing in NLP, LLMs, and RAG for finance and healthcare

Remote, USA · 3y exp

Ally Financial · University of Maryland, Baltimore County

“Built an AI lending assistant (RAG + DeBERTa) used by credit analysts to retrieve policies and past loan decisions, tackling real production issues like hallucinations, document quality, and sub-second latency. Deployed a modular, Dockerized AWS architecture (ECS/EMR + load balancer) with load testing, caching/precomputed embeddings, and CloudWatch monitoring, and used Airflow to automate scheduled data/embedding/vector DB refresh pipelines with retries and alerts.”

Python, PySpark, SQL, Pandas, NumPy, Scikit-learn, +133 more

Srilekha Pothula

Screened

Mid-level Data Engineer specializing in cloud data pipelines for healthcare and financial services

Bloomfield, CT · 4y exp

Cigna · Pace University

“Data engineer with ~4 years of experience (Cigna) building and operating Azure Data Factory pipelines for healthcare claims/member/provider data at 2–3M records/day. Emphasizes reliability and downstream safety via schema/data-quality validation, quarantine workflows, idempotent processing, and backfills; also improved runtime ~20% through SQL optimization and served curated datasets through versioned views and well-documented, analyst-friendly interfaces.”

Apache Airflow, Apache Kafka, Apache Spark, AWS, AWS Glue, AWS Lambda, +71 more

Agna Antony

Screened

Mid-level Data Engineer specializing in cloud-native healthcare and enterprise data platforms

Michigan, USA · 5y exp

MedStar Health · APJ Abdul Kalam Technological University

“Data Engineer (TCS) who owned an end-to-end CRM analytics pipeline for Bayer’s eSalesWeb integration, ingesting from Salesforce APIs/databases/S3 and serving analytics-ready datasets via PostgreSQL/S3 for Tableau. Drove measurable outcomes: ~60% reduction in manual data-quality effort, ~30% lower latency through SQL optimization, and ~35% improved stability via monitoring, retries, and idempotent processing.”

SDLC, Agile, Scrum, Kanban, Waterfall, DevOps, +124 more

Fernando Mosqueda

Screened

Senior AI/ML Engineer specializing in healthcare AI and MLOps

Mansfield, TX · 16y exp

McKesson · Sam Houston State University

“Healthcare AI engineer with hands-on ownership of production ML and LLM systems at McKesson, spanning clinical risk prediction and RAG-based documentation tools. Stands out for combining deep clinical-data experience, HIPAA-aware deployment practices, and measurable impact through reduced readmissions, clinician workflow gains, and 20% to 30% faster ML delivery for engineering teams.”

Python, R, JavaScript, Go, PyTorch, TensorFlow, +193 more

Harika M

Screened

Mid-level Full-Stack Developer specializing in FinTech and enterprise platforms

Plano, TX · 6y exp

PennyMac · University of Central Missouri

“Engineer with a pragmatic, production-focused approach to AI-assisted development, using tools like Copilot and ChatGPT to accelerate coding while maintaining strict validation for correctness, security, and performance. Particularly notable for building a multi-agent incident-resolution workflow for a financial platform, with specialized agents for log analysis, root cause identification, fix suggestions, and test generation.”

Java, C, C#, PL/SQL, JDBC, Multithreading, +193 more

Ankush Banthia

Screened

Senior Data & Platform Engineer specializing in cloud-native streaming and distributed systems

USA · 10y exp

JPMorgan Chase · New York Institute of Technology

“Financial data engineer who has built and operated high-volume batch + streaming pipelines (200–300 GB/day; 5–10k events/sec) using AWS, Spark/Delta, Airflow, Kafka, and Snowflake, with strong emphasis on data quality and reliability. Demonstrated measurable impact via 99.9% SLA adherence, major reductions in bad records/nulls, MTTR improvements, and significant latency/runtime/query performance gains; also built a distributed web-scraping system processing 5–10M records/day with anti-bot and schema-drift defenses.”

Team Building, Onboarding, Mentoring, Agile, Scrum, Jira, +150 more

Madhupal Singu

Screened

Mid-level Data Engineer specializing in multi-cloud data platforms for healthcare and finance

USA · 6y exp

Cigna · University of Cincinnati

“Data engineer with Cigna experience building and operating an end-to-end AWS-based healthcare claims pipeline processing ~2TB/day, using Glue/Kafka/PySpark/SQL into Redshift. Strong focus on data quality and reliability (schema validation, monitoring/alerting, retries/checkpointing/backfills), reporting improved accuracy (~99%) and reduced latency, plus experience serving real-time Kafka/Spark data to downstream analytics with documented data contracts.”

Python, Pandas, PySpark, SQL, Scala, Java, +88 more

Harshitha Mittapalli

Screened

Mid-Level Full-Stack Software Engineer specializing in cloud-native and GenAI solutions

Remote, USA · 5y exp

Capital One · University of North Carolina at Charlotte

“Built and shipped production RAG-based LLM agents automating multi-step document query workflows, emphasizing reliability via monitoring, retries, structured exception handling, and fallback retrieval (alternative embeddings/keyword search). Demonstrated measurable gains (18% latency improvement, 25% retrieval efficiency, 12% precision) and has experience integrating agents with messy tax and transaction data at RSM using validation/cleaning and idempotent design.”

Large Language Models (LLMs), LangChain, Retrieval-Augmented Generation (RAG), Prompt Engineering, Generative AI, Google Gemini, +90 more

Abhishek Gawali

Screened

Mid-level Data Engineer specializing in cloud ETL and real-time streaming

New York, NY · 6y exp

PNC · Rochester Institute of Technology

“Data engineer focused on AWS + Spark/Databricks pipelines, including an end-to-end nightly loan-data ingestion flow (~2.2M records) from Postgres/S3 through Glue and Databricks into a DWH with layered validation and alerting. Also built real-time streaming with Kafka + Spark Structured Streaming and a master’s project streaming Reddit data for sentiment analysis under ambiguous requirements and tight budget constraints.”

SDLC, Agile, Waterfall, Python, SQL, R, +105 more

Mohammad Sami

Screened

Mid-level Data Analyst specializing in financial services and fraud analytics

Beaverton, OR · 3y exp

Facteus, Inc · University of Tampa

“Analytics candidate currently at Facteus with hands-on experience turning messy transactional data into trusted reporting layers in Snowflake and Power BI. They combine SQL and Python automation with strong validation, performance tuning, and stakeholder-facing metric design, including cohort-based retention and segmentation work that improved trust and adoption of analytics.”

SQL, MySQL, PostgreSQL, Python, Pandas, NumPy, +72 more

Srijitha Katkuri

Screened

Mid-level Data Analyst specializing in healthcare and business intelligence

Michigan, USA · 4y exp

Banner Health · Trine University

“Healthcare analytics candidate with hands-on experience turning messy EHR, billing, and operational data into validated SQL datasets and automated Python/Airflow pipelines. They appear strongest in hospital KPI reporting—especially length of stay, readmissions, retention, and bed utilization—and have owned projects from metric definition through Power BI delivery and impact measurement.”

SQL, Python, Pandas, NumPy, Power BI, Tableau, +70 more

Alekya Battu

Screened

Mid-level Data Scientist specializing in machine learning, MLOps, and cloud analytics

USA · 5y exp

Wells Fargo · Wilmington University

“Senior data scientist with ~5 years’ experience building production ML/NLP systems in finance (Wells Fargo) and deep learning for sensor analytics in connected vehicles (Medtronic). Has delivered end-to-end platforms combining time-series forecasting with transformer-based NLP, including automated drift monitoring/retraining (MLflow + Airflow) and standardized Docker/CI/CD deployments; achieved a reported 22% precision improvement after domain fine-tuning.”

Python, SQL, R, Classification, XGBoost, Random Forest, +171 more

Bheema Sabilla

Screened

Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems

Remote, USA · 3y exp

Discover · University of South Dakota

“Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.”

Python, PySpark, SQL, Scala, Pandas, NumPy, +157 more

Hema Edavalapati

Screened

Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI

Florida, USA · 6y exp

LexisNexis · University of South Florida

“AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.”

SQL, SQL query optimization, Python, Pandas, NumPy, PySpark, +159 more

Somil Shah

Screened

Mid-level AI/ML Engineer specializing in generative AI, RAG platforms, and LLM agents

San Francisco, CA · 4y exp

INTERACT Animal Lab · Northeastern University

“AI/LLM engineer who has shipped 10+ production applications, including InvestIQ on GCP—a production-grade RAG due-diligence engine that ethically scrapes web/PDF sources, builds a ChromaDB knowledge base, and delivers analyst-style dashboards plus a citation-backed chat copilot. Deep focus on reliability (evidence-only answers, hard citations, refusal gating), retrieval tuning, and orchestration (Airflow/Cloud Composer), plus multi-agent systems (CrewAI with 7 specialized finance agents).”

AI Agents, API Development, Bash, BigQuery, Business Intelligence, ChromaDB, +136 more

Brian Mar

Screened

Senior Data Engineer specializing in data infrastructure and marketing/CRM analytics

San Mateo, CA · 8y exp

Full Circle Insights · UC Davis

“Salesforce-focused implementation/solutions engineer from Full Circle Insights who owned end-to-end campaign attribution and reporting deployments for multiple customers at once (3–5 concurrently), including sandbox testing, KPI monitoring, and rollback-safe migrations from legacy reporting. Also builds personal multi-agent workflows and uses Claude Code to rapidly scaffold data/analytics scripts like an advertising optimization parser over CSV/XLSX inputs.”

Data Engineering, Data Modeling, ETL, dbt, Snowflake, Apache Airflow, +85 more

Sudeep Govathoti

Screened

Mid-level Data Analyst/Data Engineer specializing in BI, ETL pipelines, and cloud analytics

4y exp

Verizon · Lindsey Wilson College

“Data engineer focused on marketing/web analytics and external API pipelines, handling ~10M records/week. Built Azure-based ingestion and PySpark transformations with rigorous data quality checks, then served curated datasets into Synapse/Redshift for Power BI. Also designed an Airflow-orchestrated crypto REST API pipeline with monitoring, retries/exponential backoff, schema-change detection, and backfill-friendly reprocessing.”

SQL, Python, R, PySpark, Pandas, Scikit-learn, +71 more

Varun Gattamaneni

Screened

Mid-level GenAI Engineer specializing in LLM fine-tuning, RAG, and MLOps

Glassboro, NJ · 5y exp

HCLTech · Rowan University

“Healthcare-focused LLM engineer who deployed a production triage and clinical knowledge retrieval assistant using RAG and LangGraph-orchestrated multi-agent workflows. Emphasizes clinical safety and compliance with robust hallucination controls, HIPAA/PHI protections (tokenization, encryption, audit logging, zero-retention), and human-in-the-loop escalation; reports a 75% latency reduction in a healthcare agent system.”

Python, Pandas, NumPy, R, SQL, Bash, +150 more

Venkatesh Sanaboina

Screened

Senior AI/ML Engineer specializing in Generative AI, LLMs, and MLOps

Tampa, FL · 9y exp

Verizon · Jawaharlal Nehru Technological University

“Telecom (Verizon) AI/ML practitioner who built a production multimodal system that ingests messy customer issue reports (calls, chats, emails, screenshots, videos) and turns them into confidence-scored incident summaries with reproducible steps and evidence links. Also built KPI/alarm-to-ticket correlation to rank likely root-cause domains (RAN/Core/Transport), cutting triage from hours to minutes and improving MTTR.”

A/B Testing, Agile, Amazon Redshift, Amazon S3, Amazon SageMaker, Anomaly Detection, +168 more
