Hello!
I am Logesh Kumar Umapathi, a Machine Learning Engineer at Blackbox.ai. My work focuses on building agentic systems and models that automate software development and improve developer productivity. My interests include code generation LLMs, synthetic data generation with LLMs, and alignment of code LLMs to human preferences.
Previously, I was a Lead Machine Learning Engineer at Saama Technologies, leading machine learning efforts for a product focused on accelerating clinical trials and the time to market of drugs.
I have been part of notable code LLM research, including StarCoder, SantaCoder, and the BigCode Evaluation Harness.
When I’m not in front of a computer screen, I love to speak at ML events. I also love reading books and photography.
Technologies
🤖 Machine Learning: Transformers, PyTorch, Scikit-Learn, Weights & Biases, DeepSpeed, TensorRT, vLLM, Text Generation Inference.
☁️ Cloud: AWS, GCP, Docker, Airflow.
Publications:
- Med-HALT: Medical Domain Hallucination Test for Large Language Models - CoNLL 2023 [Code] [Website]
- StarCoder: may the source be with you! - TMLR 2023
- MedMCQA: A large-scale multi-subject multi-choice dataset for medical domain question answering - ACL 2022
- SantaCoder: Don’t reach for the stars. - ICLR 2022
Notable Talks:
- Taming the Large language models – Efficient inference of Multi-billion parameter models - NLP Summit 2023 [Presentation]
- Bespoke LLMs: Building and Scaling customized large language models - Analytics Vidhya’s DataHack Summit 2023 [code]
- Unlocking reasoning and planning abilities in Large language models - CONF42 2023
- Decoding state-of-the-art NLP models - Analytics Vidhya’s DataHack Summit 2019
- Get your feet wet with ML - Google Developer Group CBE DevFest 2019 [code]
Open source projects:
- BigCode Evaluation Harness: A framework for the evaluation of autoregressive code generation language models. #pytorch #CodeGeneration #LLM
- Mutate: A library to synthesize text datasets using Large Language Models (LLMs). #pytorch #transformers #LLM
- Keras Scaffolding: A scaffolding for Keras and TensorFlow with some callbacks, metrics, and logs built in.