Work at Glance, InMobi
Rapid Prototyping & Deployment
So MSFT just open-sourced a new model? OpenAI has a new version of the GPT API? Someone just cracked video generation & speech synthesis?
I navigate the constantly changing AI landscape and make the most of it through a series of proofs of concept, some of which reach production as viable business solutions.
Production ML Platforms for Scale
For the last 4 years at InMobi & Glance, part of my responsibility has been developing, maintaining & improving the ML Platforms through changing requirements, team sizes, state of the art & clouds.
The most recent iteration of our ML Platform supports
- exploration & training on top of TBs of data via PySpark
- serving models in production at a scale of 20+ million predictions per second & ~80k QPS
- all of this built on top of GCP's Dataproc, GKE, Vertex Endpoints & Kubeflow Pipelines
This ML Platform has been built to be as cost-effective as possible without sacrificing robustness or developer productivity. Right now, I am leading the ML Infra Cost Savings Initiative at Glance.
More on my ML Platforms or MLOps work & philosophy can be read here.
Content Creation Tool via AI
As a group of 3 people, we built & scaled a tool that became the de facto content creation platform at Glance. Tech I worked on - ML, backend, frontend, infra.
This tool automates content creation (in the form of images, text, videos) using CLIP-based image & video search, image & video processing techniques, content ranking & moderation, generative AI with GPT-3 & SD, speech synthesis & speech transcription.
Along with the tech, I led the
- analytics & data-driven feature improvements
- KTs & stakeholder conversations with consumers
which I later realized is what a Product Manager does.
Program Manager, Product Manager, Plumber
Over time, my role on the Data Sciences team at Glance has made me the
- Product Manager (ideating, managing stakeholders, delegating work)
- Program Manager (all things concerning timelines, migration activities, planning)
- Plumber (solving Data Sciences problems for the engineers & Engineering problems for the Data Scientists)
for the 40-member DS team.
I also had to bring engineering practices into the Data Sciences workflow, what we now call MLOps - monitoring, CI/CD, on-calls, GitHub.
To truly build platforms you need a deep sense of what the pain points are - and I was privileged to start my career as a data engineer doing ML modelling.
I am well versed in the art of SQL, big data querying (Presto, Spark SQL, Pig, Hive), my absolute favourite PySpark (which I frequently write about) & scheduling frameworks like Airflow and Kubeflow.
Processing TBs of data cost-efficiently (or under compute constraints) while still keeping it as quick as possible is an expertise of mine.
Analytics & Modelling on Tabular Data
I am well versed in analysing TBs of data, getting insights from a series of group-bys & aggregations, creating charts and tables in Excel / Sheets & communicating the insights in a crisp manner.
I believe that, with the right analysis (and without outcome bias), you can truly disrupt whatever you are analysing.
I never really had as deep a bond with modelling as I had with platforms, analysis & engineering work; it was always a secondary point - and I blame XGBoost for this. In my first two years of modelling, I never truly got a major performance improvement from applying different algorithms to the same features. It is just that a few algorithms are clear favourites to outperform. But creating features for a model, which requires forming loads of hypotheses & analysing the dataset to validate them - that is always the most fun part for me.
(the above opinion is for a specific set of problem statements)