I'm a senior Machine Learning Engineer with 4 years of experience building scalable, high-performance production ML systems that support a wide range of business needs. I'm actively engaged in every stage of developing ML systems, from data analysis, ideation, and experimentation to deployment and maintenance. I have in-depth experience in Machine Learning, with a particular emphasis on Natural Language Processing (NLP) and Large Language Models (LLMs).
I currently work in the Customer Support domain, where I lead the exploration and application of ML, NLP, and LLMs. My day-to-day responsibilities include (but are not limited to) the following:
- Collaborate with stakeholders and TPMs and analyze data to develop hypotheses for solving business problems, then validate those hypotheses through A/B tests.
- Frame business problems as ML problems and create suitable metrics for ML models and business problems.
- Build PoCs and prototypes to validate the technical feasibility of new features and ideas.
- Create design docs for architecture and technical decisions and guide the technical implementation.
- Collaborate with the platform and data platform teams to set up and maintain data pipelines that satisfy our team's evolving needs.
- Build and deploy production-grade microservices in Python and Go with suitable capacity planning, logging, error handling, distributed tracing, monitors with actionable alerts, and auto-scaling.
- Write design docs for running A/B tests and perform post-test analyses to determine the impact of ML features on the business metrics.
- Maintain, enhance, and retrain ML models for features running in production.
- Handle incidents and take part in the on-call rotation for the ML and backend microservices owned by my team.
- Evaluate technical assignments and conduct interviews for hiring mid-career, new-grad, and intern ML engineers.
The tech stack I use to carry out my day-to-day responsibilities:
- Data analysis: SQL, BigQuery
- ML experimentation and model training: Jupyter Notebooks, PyTorch, Hugging Face Transformers, Azure OpenAI, Kubeflow Pipelines, MLflow
- Model deployment: TorchServe, Kubernetes
- Microservice development: Python, Go, gRPC, Datadog, PagerDuty, Sentry, Spinnaker, Docker