Vetted Latency Optimization Professionals

“Built and shipped production RAG/agentic systems in high-stakes domains (biomedical and legal), including an enterprise biomedical document retrieval platform over ~10k scientific docs and a multilingual African-law assistant at the World Bank. Deep hands-on experience with LangChain/LangGraph/LlamaIndex and evaluation tooling (LLM-as-a-judge, safety/hallucination detection), with measurable gains in retrieval quality and hallucination reduction.”

Python, PyTorch, TensorFlow, Hugging Face Transformers, FastAPI, Django (+81 more)

Ruudra Patel

Screened

Junior Data Scientist specializing in ML, LLMs, and RAG applications

Atlanta, GA · 3y exp

Georgia State University

“University hackathon finalist (2nd place) who built CareerSpark, a production-style multi-agent career guidance app in 24 hours using a hierarchical debate architecture with a moderator/judge agent. Has startup internship experience at LiveSpheres AI using LangChain for multi-LLM orchestration, and demonstrates a structured approach to testing/evaluation (golden sets, integration sims, latency/accuracy KPIs) plus strong non-technical stakeholder communication.”

Python, SQL, R, Java, JavaScript, React (+112 more)

Ankita Mungalpara

Screened

Mid-level Data Scientist specializing in Generative AI and multimodal systems

Irving, TX · 5y exp

University of Massachusetts Dartmouth

“Recent J&J intern who built a conversational RAG agent and led a shift from a monolithic model to a modular RAG workflow, cutting response time from several days to under a second by tackling data fragmentation, context retention, and embedding/latency optimization. Also worked on a large (7B-parameter) multimodal VQA pipeline for healthcare research and stays current via NeurIPS/ICLR and open-source contributions.”

A/B Testing, Amazon Bedrock, Amazon EC2, Amazon RDS, Amazon Redshift, Amazon S3 (+154 more)

Akshay Khan

Screened

Mid-level Backend Software Developer specializing in cloud-native microservices

USA · 5y exp

American Express · University of Central Missouri

“Backend engineer with American Express experience maintaining an internal Python/Flask rewards simulation microservice used by product analysts and QA. Demonstrated strong performance and scalability work: moved batch simulations to Celery, added Redis caching to cut DynamoDB latency, and tuned Postgres/SQLAlchemy queries with EXPLAIN ANALYZE and composite indexes (bringing API responses under ~200ms by queueing jobs). Also has experience integrating ML via Flask-based model-serving APIs (scikit-learn/LightGBM packaged with joblib) and designing multi-tenant data isolation and tenant-specific configuration systems.”

Java, Python, JavaScript, SQL, Shell Scripting, Spring Boot (+165 more)

Ramya Sree Kanijam

Screened

Junior AI/ML Engineer specializing in RAG systems and cloud-native MLOps

Austin, TX · 2y exp

Upstart · Texas A&M University-Corpus Christi

“Built and shipped a production LLM-powered RAG system at Upstart enabling natural-language search across 50k+ scattered internal technical docs. Delivered sub-300ms p95 latency for ~50 active users with strong hallucination safeguards (retrieval-first, thresholds, citations) plus robust testing/monitoring and cost controls (prompt caching cutting API spend ~20%).”

Python, Java, Retrieval-Augmented Generation (RAG), LangChain, Prompt Engineering, Vector Search (+149 more)

Ojasmitha Pedirappagari

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and agentic platforms

Jersey City, NJ · 5y exp

Nurture Holdings · UC Santa Cruz

“Built and shipped a production RAG-based assistant that lets parents ask natural-language questions about their child’s learning progress, using pgvector retrieval (child-id filtered) and Redis caching to hit ~180ms latency. Implemented real-world guardrails and compliance (Llama Guard, COPPA, retrieval thresholds, fallbacks) with 99.5% uptime, and ran human-in-the-loop eval loops that improved satisfaction from 3.8 to 4.2 while serving 60k+ monthly users and reducing costs significantly.”

Python, SQL, C#, TypeScript, JavaScript, AWS (+83 more)

Bheema Sabilla

Screened

Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems

Remote, USA · 3y exp

Discover · University of South Dakota

“Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.”

Python, PySpark, SQL, Scala, Pandas, NumPy (+157 more)

Sasi Paila

Screened

Mid-level AI/ML Engineer specializing in Generative AI and production ML systems

PA, USA · 4y exp

BNY Mellon · Franklin University

“Built and deployed a production SecureAIChatBot (RAG-based) for secure internal information retrieval, using embeddings/vector search, GPT models, monitoring, and safety filters. Focused on real-world production challenges like latency and output consistency, applying caching, retrieval scoping, smaller models, and controlled prompting, and used LangChain to orchestrate the end-to-end workflow.”

CI/CD, Cross-Functional Collaboration, Data Analytics, Docker, Documentation, Embeddings (+56 more)

Chethan Thimapuram

Screened

Mid-level AI/ML Engineer specializing in LLM systems, RAG, and MLOps

5y exp

HCA Healthcare · University of South Florida

“Built a production, real-time clinical documentation system at HCA that converts doctor–patient conversations into structured clinical summaries using speech-to-text, LLM summarization, and RAG. Demonstrated measurable gains from medical-domain fine-tuning (clinical concept recall +18%, ROUGE-L 0.62 to 0.74) while meeting HIPAA constraints via PHI anonymization and encryption, and deployed via Docker/FastAPI with CI/CD and monitoring.”

Amazon CloudWatch, Apache Airflow, Apache Kafka, Apache Spark, AWS Glue, AWS IAM (+125 more)

Gowri Priya Gorla

Screened

Junior Robotics & Embedded Software Engineer specializing in Linux-based distributed robotic systems

Minnesota, United States · 3y exp

Quantronic Corporation · Concordia University

“Robotics software engineer focused on system-level C++/Linux stacks for multi-robot platforms, owning the communication layer and validation/testing infrastructure. Built Python simulation/replay and fault-injection tooling integrated with Docker + GitLab CI/CD, and debugged real-time localization issues by instrumenting IPC timing and refactoring multi-threaded pipelines for deterministic performance.”

C++, Python, Robotics, ROS 2, Linux, Distributed Systems (+70 more)
