Resume

Machine learning, Artificial intelligence, Data science, Machine learning model developer, ML product architecture

  • >20 years of R&D experience in deep learning, machine learning, speech recognition, multimedia content analysis (text, audio, and video), and natural language processing. >40 papers published in top journals and conferences.
  • >5 years of industry experience in e-commerce and media, developing deep learning and machine learning models to solve business problems.
  • Hands-on machine learning (ML) product developer: data logic flow and architecture design, ML model development and optimization, and API service deployment.
  • Programming: PyTorch, Python, SQL, Keras, TensorFlow, C, C++, C#, Databricks, Scala, PySpark, MATLAB

Lazada / VP

Nov 2019 – Present, Singapore

  • Lead development of AI-generated content (AIGC), focusing on mannequin fashion image transformation and virtual avatar icon generation, based on Stable Diffusion and LoRA-based style models, with API and real-time service deployment on Aliyun cloud.
  • Lead development of a state-of-the-art deep learning toolkit to enhance the representation of e-commerce product content (image and text), using models such as ResNet, ViT, multilingual CLIP, and MoCo; improved product matching performance (AUC >+10%), with applications in recommendation item diversity and image governance.
  • Develop an image quality governance pipeline and white-background image generation to support seller operations and downstream applications such as search and ads.
  • Lead development of a high-performance (accuracy >90%) end-to-end multimodal product matching system that matches millions of products and supports business operations such as product pricing, assortment, and various data products.
  • Lead development of a BERT-based multilingual prediction system for L1 categories (~40) and leaf categories (>2000).
  • Lead development of a product item-seller allocation system that optimally allocates products to specified sellers under predefined business constraints.
  • Lead development of a seller mapping system that uses seller information to identify whether two seller shops, one on the Lazada platform and one on a competitor platform, belong to the same seller.
  • Lead development of a smart voucher system (daily voucher and campaign voucher channels) for personalized voucher distribution to improve voucher collection rate and user click rate.

MediaCorp / Lead Data Scientist (AVP)

Jul 2018 – Nov 2019, Singapore

  • Leadership: Led a team of 5 data scientists developing in-house data science projects.
  • Led development of a digital optimizer to improve CTRs of ad units (preliminary A/B test: ad unit CTR +50%).
  • Led development of audience profiling and segmentation, including prediction of age, gender, and ethnicity, and news categorization (IAB tags) for Chinese, English, Tamil, and Malay.
  • Led development of a forecasting model to predict view traffic for news articles, broadcast content, and video content.
  • Led development of a device graph system to identify whether users on different devices and browsers are the same person.

Institute for Infocomm Research, A*STAR / Senior Scientist

Jan 2003 – Jul 2018, Singapore

~15 years of research and development on machine learning algorithms and systems in the following areas:

  • Audio search and fingerprinting. Led development of a music search product (>10M songs, precision >97%; feature extraction, database indexing and search), launched in the Baidu Music app.
  • Multimedia content analysis and information retrieval, including audio, video/image, and text.
  • Image content understanding and automatic image tagging.
  • Natural language processing, such as opinion mining (sentiment analysis) from user-generated content (UGC), text summarization, and text normalization.
  • Led a team participating in the ad-hoc photographic image retrieval task at ImageCLEF 2007. Our system placed 2nd in the automatic run category among 20 participants with 475 submitted runs.
  • >30 papers published in top conferences and journals.

National University of Singapore / Research Fellow

Jan 2002 – Dec 2002, Singapore

Research on speech recognition and text categorization and retrieval

Advanced Telecommunications Research Institute International / Invited Researcher

Jun 2001 – Dec 2001, Japan

Research on acoustic modeling for large-vocabulary speech recognition, particularly for spontaneous Chinese speech, supported by a Japanese government fund.

Institute of Automation, Chinese Academy of Sciences / Research Assistant

Apr 1998 – Jun 2001, Beijing, China

  • Research on context-dependent acoustic modeling, n-gram language models, and search engines (one-pass and multi-pass) for Chinese LVCSR.
  • Developed real-time ASR systems and APIs, and was in charge of commercializing the ASR system in collaboration with a Chinese company.
  • Received the best student paper award at the International Symposium on Chinese Spoken Language Processing, 2000, for a framework for Mandarin LVCSR based on a one-pass decoder.

Chinese University of Hong Kong / Research Assistant

Jun 1999 – Dec 1999, Hong Kong

Comparative study of Mandarin and Cantonese from the perspective of decision-tree-based context-dependent acoustic modeling.

Institute of Automation, Chinese Academy of Sciences / Ph.D.

Apr 1998 – Jun 2001, Beijing, China

Dissertation topic: Context-dependent acoustic modeling and the search strategy for large-vocabulary speech recognition with class-triphone modeling (written in Chinese).

Beijing University of Posts and Telecommunications / Master of Engineering

Sep 1993 – May 1996, Beijing, China

Thesis topic: Isolated Chinese digital speech recognition (written in Chinese)

Northwestern Polytechnical University / Bachelor of EE

Sep 1989 – Jul 1993, Xi’an, China

Awards

  • Lazada Forward Tech Star, FY22
  • Ranked 1st in the text modality and 3rd in the mixed text-and-image modality in the benchmark evaluation on ad-hoc photographic image retrieval at ImageCLEF’09.
  • Ranked 2nd for text-based image search and 3rd for mixed-modality (text and image) search in the benchmark evaluation on ad-hoc photographic image retrieval at ImageCLEF’08.
  • Ranked 2nd (automatic runs) and 3rd (all types of runs) among 20 participants in the benchmark evaluation on ad-hoc photographic image retrieval at ImageCLEF’07.
  • Ranked 3rd in the video search task and 13th in the high-level feature extraction task in the benchmark evaluation of TREC Video Information Retrieval 2005.
  • Dean Scholarship of the Chinese Academy of Sciences, 2001.
  • Best student paper award at the International Symposium on Chinese Spoken Language Processing, 2000, P.R. China.
  • Huawei Scholarship, sponsored by Shenzhen Huawei Tech. Co. Ltd., at the Institute of Automation, 2000.
  • Motorola Scholarship, sponsored by Motorola Inc., at Beijing University of Posts and Telecommunications, 1995.
  • Haiying Scholarship (First Prize) at Northwestern Polytechnical University, 1992.

Publications

  1. Sheng Gao & Haizhou Li, “Octave-dependent Probabilistic Latent Semantic Analysis to Chorus Detection of Popular Song”, ACM Multimedia 2015 (Short paper).
  2. Sheng Gao & Haizhou Li, “Popular Song Summarization Using Chorus Section Detection from Audio Signal”, MMSP 2015.
  3. Linhong Zhu, Sheng Gao, Sinno Jialin Pan, Haizhou Li, Dingxiong Deng and Cyrus Shahabi,  “The Pareto Principle Is Everywhere: Finding Informative Sentences for Opinion Summarization Through Leader Detection”. In Recommendation and Search in Social Networks. Ozgur Ulusoy, Abdullah Uz Tansel and Erol Arkun (eds.), Lecture Notes on Social Networks (LNSN) Series, Springer, 2014.
  4. Zhizheng Wu, Sheng Gao, Eng Siong Chng and Haizhou Li, “A study on replay attack and anti-spoofing for text-dependent speaker verification”, In the proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2014, Dec.9-12.
  5. Linhong Zhu, Sheng Gao, Sinno Jialin Pan, Haizhou Li, Dingxiong Deng, “Graph-based informative-sentence selection for opinion summarization”, Proc. of The 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (short paper; 15% acceptance rate).
  6. Sheng Gao and Haizhou Li, “A cross-domain adaptation method for sentiment classification using probabilistic latent analysis”, Proc. of CIKM’11 (Short paper).
  7. Sheng Gao and Haizhou Li, “Effective large scale text retrieval via learning risk-minimization and dependency-embedded model”, Proc. of MMM’10.
  8. Sheng Gao and Joo-Hwee Lim, “Selecting representative and distinctive descriptors for efficient landmark recognition”, Proc. of ICIP, 2009.
  9. Sheng Gao, Jean-Pierre Chevallet & Joo-Hwee Lim, “Rich representation and ranking for photographic image retrieval in ImageCLEF 2007”, Proceedings of International Workshop on Multimedia Signal Processing, Queensland, Australia, Oct.8-10, 2008.
  10. Sheng Gao and Qibin Sun, “Exploiting generalized discriminative multiple instance learning for multimedia semantic concept detection”, Pattern Recognition, 41(10), pp. 3214-3223, 2008.
  11. Sheng Gao, Joo-Hwee Lim and Qibin Sun, “An integrated statistical model for multimedia evidence combination”, ACM Multimedia, Sept. 24-29, 2007.
  12. Sheng Gao, Xinglei Zhu and Qibin Sun, “Exploiting concept association to boost multimedia concept detection”, International conference on Acoustics, Speech, and Signal Processing, 2007.
  13. Sheng Gao, Joo-Hwee Lim & Qibin Sun, “Hidden maximum entropy approach for visual concept modeling”, International conference on Multimedia & Expo, July 2-5, 2007.
  14. Sheng Gao, Joo-Hwee Lim & Qibin Sun, “Propagating image-level part statistics to enhance object detection”, International conference on Image Processing, 2007.
  15. Sheng Gao and Qibin Sun, “Improving Semantic Concept Detection through Optimizing Ranking Function”, IEEE Trans. on Multimedia, Nov. 2007.
  16. Sheng Gao, Wen Wu, Chin-Hui Lee, and Tat-Seng Chua, “A maximal figure-of-merit (MFoM) learning approach to robust classifier design for text categorization”, ACM Trans. on Information Systems, Volume 24, Issue 4, pp.190-218, April, 2006.
  17. Sheng Gao, De-hong Wang & Chin-Hui Lee, “Automatic image annotation through multi-topic text categorization”, International conference on Acoustics, Speech, and Signal Processing, 2006.
  18. Sheng Gao & Qi-Bin Sun, “Classifier optimization for multimedia semantic concept detection”, International conference on Multimedia & Expo, July 9-12, 2006.
  19. Sheng Gao & Qi-Bin Sun, “A generalized discriminative multiple instance learning for multimedia semantic concept detection”, International conference on Image Processing, Oct. 8-11, 2006.
  20. Sheng Gao, Chin-Hui Lee & Joo Hwee Lim, “An ensemble classifier learning approach to ROC optimization”, International conference on Pattern Recognition, 2006.
  21. Kai Chen, Sheng Gao, Yongwei Zhu & Qibin Sun, “Music genre classification using text categorization method”, International Workshop on Multimedia Signal Processing, Victoria, B.C., Canada, 2006.
  22. Shi Rui, Tat-seng Chua, Chin-Hui Lee & Sheng Gao, “Bayesian Learning of Hierarchical Multinomial Mixture Models of Concepts for Automatic Image Annotation”, International Conference on Image and Video Retrieval, July 13-15, 2006.
  23. T.-S. Chua, S.-Y. Neo, Y. Zheng, H.-K. Goh, Y. Xiao, M. Zhao, S. Tang, S. Gao, X. Zhu, L. Chaisorn, Q.B. Sun, “TRECVID 2006 by NUS-I2R”, Proceedings of TRECVID workshop, 2006.
  24. Sheng Gao and Yong-wei Zhu, “A HMM-embedded unsupervised learning to musical event detection”, International conference on Multimedia & Expo, July 6-8, 2005.
  25. Sheng Gao, Bin Ma, Haizhou Li and Chin-Hui Lee, “A text categorization approach to automatic language identification”, 9th European conference on Speech, Communication and Technology, Sept.4-8, 2005.
  26. De-hong Wang, Sheng Gao, Qi Tian, Wing-Kin Sung, “Discriminative fusion approach for automatic image annotation”, International workshop on Multimedia Signal Processing, Oct.30-Nov.2, 2005.
  27. Kai Chen, Sheng Gao, Peiqi Chai and Qi-bin Sun, “Music identification based on embedded HMM”, International workshop on Multimedia Signal Processing, Oct.30-Nov.2, 2005.
  28. Kai Chen, Sheng Gao, Yong-wei Zhu and Qi-bin Sun, “Popular song and lyrics synchronization and its application to music information retrieval”, 12th annual Multimedia Computing and Networking, Jan.18-19, 2006.
  29. Yong-wei Zhu and Sheng Gao, “Extracting vocal melody from Karaoke music audio”, International conference on Multimedia & Expo, July 6-8, 2005.
  30. T.-S. Chua, S.-Y. Neo, H.-K. Goh, M. Zhao, Y. Xiao, G. Wang, S. Gao, K. Chen, Q.-B. Sun & Q. Tian, “TRECVID 2005 by NUS PRIS”, Proceedings of TRECVID workshop, 2005.
  31. Sheng Gao, Wen Wu, Chin-Hui Lee, and Tat-Seng Chua, “A MFoM Learning Approach to Robust Multiclass Multi-Label Text Categorization”, International conference on Machine Learning, 2004. (Acceptance rate: 33%; the top-ranked conference in the machine learning community)
  32. Sheng Gao, Chin-Hui Lee and Qi Tian, “Indexing with Musical Events and Its Application to Content-Based Music Identification”, International conference on Pattern Recognition, 2004.
  33. Sheng Gao, Chin-Hui Lee and Yongwei Zhu, “An unsupervised learning approach to musical event detection”, International conference on Multimedia & Expo, 2004.
  34. Sheng Gao and Chin-Hui Lee, “An adaptive learning approach to music tempo and beat analysis”, International conference on Acoustics, Speech, and Signal Processing, 2004.
  35. De-Hong Wang, Qi Tian, Sheng Gao & Wing-Kin Sung, “News sports video shot classification with play field and motion features”, International conference on Image Processing, 2004
  36. Sheng Gao, Wen Wu, Chin-Hui Lee, and Tat-Seng Chua, “A maximal figure-of-merit learning approach to text categorization” (Research paper), 26th annual international ACM SIGIR conference, 2003. (Acceptance rate: 17%; the top-ranked conference in the information retrieval community)
  37. Sheng Gao, Namunu Chinthaka Maddage, Chin-Hui Lee, “A hidden Markov model based approach to music segmentation and identification”, 4th international conference on Information, Communication & Signal Processing, 4th IEEE Pacific-RIM conference on Multimedia, 2003.
  38. Xianfeng Yang, Qi Tian & Sheng Gao, “Video clip representation and recognition using composite shot models”, 4th international conference on Information, Communication & Signal Processing, 4th IEEE Pacific-RIM conference on Multimedia, 2003.
  39. Sheng Gao, Chin-Hui Lee, “A Discriminative Decision Tree Learning Approach to Acoustic Modeling”, 8th European conference on Speech, Communication and Technology, 2003.
  40. Sheng Gao, Jin-song Zhang, Satoshi Nakamura, Chin-hui Lee, Tat-seng Chua, “Weighted graph based decision tree optimization for high accuracy acoustic modeling”, International conference on Spoken Language Processing, 2002.
  41. Sheng Gao, Bo Xu, Hong Zhang, Bing Zhao, Chengrong Li and Taiyi Huang, “Update of Progress of Sinohear: Advanced Mandarin LVCSR System At NLPR”, International conference on Spoken Language Processing, 2000.
  42. Sheng Gao, Bo Xu and Taiyi Huang, “A New Framework for Mandarin LVCSR Based on One-pass Decoder”, International Symposium on Chinese Spoken Language Processing, 2000.
  43. Sheng Gao, Bo Xu, Tan Lee and Taiyi Huang, “A Comparative Study Between Cantonese and Mandarin: A View From Speech Recognition Engine Portability”, Multi-lingual Speech Communication Workshop, ATR, Oct 11-Oct 13, 2000, Japan.
  44. Sheng Gao, Tan Lee, Y.W. Wong, Bo Xu, P.C. Ching and Taiyi Huang, “Acoustic Modeling For Chinese Speech Recognition: A Comparative Study Of Mandarin And Cantonese”, International conference on Acoustics, Speech, and Signal Processing, 2000.
  45. Sheng Gao, Bo Xu, and Taiyi Huang, “Triphone models of Mandarin speech based on decision tree”, Acta Acustica, Vol.25, No.6, pp.504-509, Nov. 2000 (written in Chinese).
  46. Sheng Gao, Bo Xu and Taiyi Huang, “Class-triphone acoustic modeling based on decision tree for Mandarin continuous speech recognition”, International symposium on Chinese Spoken Language Processing, 1998.
  47. Bo Xu, Sheng Gao, Yang Cao, Hua Wu and Taiyi Huang, “Integrating tone information in continuous Mandarin recognition”, International symposium on Signal Processing and Intelligent System, Guangzhou, P.R.China, 1999.

Email: goseng123@msn.com

LinkedIn: linkedin.com/in/dlsheng/

GitHub: github.com/aigaosheng

Google Scholar: scholar.google.com/citations?hl=en&authuser=1&user=pNVyMb0AAAAJ