Sheng Gao
Machine learning, AI, Data science, ML product architecture, LLM (RAG, Agent, vectorDB)
About
- 20+ years in deep learning, ML, speech recognition, multimedia analysis, NLP; 40+ publications.
- 5+ years in e-commerce & media industries solving business problems with ML.
- End-to-end ML product development: data flow, modeling, optimization, API deployment.
- Tech stack: PyTorch, Python, SQL, Keras, TensorFlow, C/C++/C#, Databricks, Scala, PySpark, MATLAB, LangChain, FastAPI, Airflow, Redis.
Experience
ATTIX / Senior LLM Developer Apr 2024 – May 2025
- Developed PilotAI, an agentic fintech product built on OpenAI, Anthropic, Llama, and other LLMs.
- Built real-time APIs for portfolio strategies, stock analysis, user profiling, and SEC/news processing.
Lazada / VP Nov 2019 – Jan 2024, Singapore
- Led AIGC projects: mannequin fashion image transformation and virtual avatar generation (Stable Diffusion and LoRA).
- Improved product content matching (AUC +10%) with ResNet, ViT, multilingual CLIP, and MoCo.
- Built product match, seller allocation, smart voucher, and multilingual classification systems.
MediaCorp / Lead Data Scientist (AVP) Jul 2018 – Nov 2019, Singapore
- Led a team of 5 to deliver CTR optimizers (+50%), audience profiling, and traffic forecasting models.
I2R, A*STAR / Senior Scientist Jan 2003 – Jul 2018, Singapore
- Led research in audio search, NLP, multimedia retrieval, sentiment analysis, and image tagging.
- Published 30+ papers; productized music search (10M+ songs) for Baidu with 97% precision.
NUS / Research Fellow Jan 2002 – Dec 2002
- Worked on speech recognition, text categorization, and retrieval.
ATR, Japan / Invited Researcher Jun 2001 – Dec 2001
- Researched acoustic modeling for large-vocabulary spontaneous Chinese speech.
Education
- Ph.D., Institute of Automation, CAS – Apr 1998 – Jun 2001
Context-dependent acoustic modeling
- M.Eng., BUPT – Sep 1993 – May 1996
Isolated Chinese digit recognition
- B.E., Northwestern Polytechnical University – Sep 1989 – Jul 1993
Awards
- Lazada Forward Tech Star (FY22)
- ImageCLEF 2009 – 1st (text), 3rd (mixed modality)
- ImageCLEF 2007 – 2nd (auto), 3rd (overall)
- TRECVID 2005 – 3rd (video search)
- Dean Scholarship, CAS (2001); Best Student Paper (ISCSLP 2000)
- Huawei, Motorola, and Haiying Scholarships