Pre-screened and vetted candidates in the NYC metro area.
Mid-level Azure Data Engineer specializing in Databricks lakehouse and Spark pipelines
Principal Cloud Data Engineering Leader specializing in lakehouse and streaming platforms
Senior BI Analyst specializing in analytics automation and financial services
Entry-level Backend Engineer specializing in distributed systems
Mid-level Data Scientist specializing in credit risk and fraud analytics
Mid-level Data Engineer specializing in cloud ETL/ELT and analytics platforms
Mid-level Data Scientist specializing in AI/ML and Generative AI
Mid-level Data Scientist/Data Engineer specializing in ML, MLOps, and real-time data pipelines
Associate Director / Senior Data Engineer specializing in cloud ETL and marketing data pipelines
Senior Data Engineer specializing in Azure Lakehouse and LLM/ML data platforms
Mid-level Data Analyst specializing in banking and healthcare analytics
Mid-level Data Scientist specializing in healthcare and financial risk modeling
Mid-level Data Engineer specializing in cloud ETL, big data, and analytics
Senior Data Engineer specializing in multi-cloud lakehouse architectures and privacy/AI governance
Mid-level Data Engineer specializing in cloud ETL and real-time streaming
“Data engineer focused on AWS + Spark/Databricks pipelines, including an end-to-end nightly loan-data ingestion flow (~2.2M records) from Postgres/S3 through Glue and Databricks into a DWH with layered validation and alerting. Also built real-time streaming with Kafka + Spark Structured Streaming, and a master’s project streaming Reddit data for sentiment analysis, delivered under ambiguous requirements and tight budget constraints.”
Intern Data Scientist specializing in AI, analytics, and cloud data engineering
“Built a production multimodal LLM-based vendor risk assessment platform that ingests SOC reports and other documents, uses a strict RAG pipeline with grounded evidence (page/paragraph citations), and dramatically reduces analyst review time. Experienced with LangGraph/LangChain/AutoGen for stateful, fault-tolerant agent workflows, and emphasizes reliability (schema validation, guardrails) plus low-latency delivery (~1–2s) through hybrid retrieval, reranking, caching, and model tiering.”
Mid-level Data Engineer specializing in cloud ETL/ELT and lakehouse architecture
“Data engineer focused on sales/marketing analytics pipelines, owning ingestion from CRMs/ad platforms through warehouse serving and dashboards at hundreds of thousands of records per day. Built reliability-focused systems including dbt/SQL/Python data quality gates with alerting, a resilient web-scraping pipeline (retries/backoff, anti-bot tactics, schema-change detection, backfills), and a versioned internal REST API with caching and strong developer usability.”
Mid-level Data Engineer specializing in real-time streaming and cloud data platforms
“Data engineer with Wells Fargo experience owning an end-to-end lakehouse ETL pipeline on Databricks/Azure Data Factory, processing ~480GB daily and implementing robust data quality/reconciliation across 40+ tables to reach ~99.3% reliability. Strong in performance optimization (cut runtime 5.5h→3.8h), CI/CD and monitoring, and resilient external/API ingestion with retries, schema validation, and backfills.”
Mid-level Data Analyst specializing in business intelligence and cloud data platforms
“Healthcare analytics professional with TCS/Humana experience turning messy claims and eligibility data into reliable reporting assets using SQL and Python. They combine strong data engineering and analytics execution with stakeholder management, including automating monthly claims reporting to cut turnaround from half a day to under five minutes and driving a provider outreach effort that reduced claim rejection rates by about 20%.”
Mid-level Data Scientist specializing in LLMs, RAG, and document intelligence
“LLM/ML engineer who has shipped production systems in legal/financial-risk domains at Wolters Kluwer, including a hybrid OCR+deterministic+LLM extraction pipeline that structured UCC filings at massive scale and drove $6M+ in revenue. Also built LangGraph-based multi-agent “Deep Research” workflows with model routing, tool calls (MCP), persistence, and human-in-the-loop review, and partnered closely with policy writers to deliver LLM summarization that cut writing time by ~60%.”
Mid-level Data Scientist & AI/ML Engineer specializing in GenAI and cloud ML
“GenAI/LLM engineer who recently built a production compliance assistant at State Farm for KYC/AML and regulatory teams, using AWS Bedrock + LangChain with Textract/Lambda pipelines to extract fields, tag risk, and summarize long documents. Implemented RAG, strict structured outputs, and human-in-the-loop guardrails, and reports automating ~80% of documentation work while reducing review time by ~40%.”
Mid-level Data Analyst specializing in analytics, ETL, and cloud data platforms
“Data analyst with 4 years of experience spanning banking and retail/marketing analytics. Has hands-on experience building churn analytics pipelines in SQL and Python, optimizing large-query performance, and turning stakeholder-aligned metrics into recurring dashboards and business actions.”