About

A researcher and developer in AI, machine learning, data science, and e-commerce

Education
Ph.D.

Institute of Automation, Chinese Academy of Sciences, Beijing, China

Dissertation topic: Context-dependent acoustic modeling and the search strategy for LVCSR with class-triphone modeling

Master

Beijing University of Posts and Telecommunications, Department of Telecommunications, Beijing, China

Bachelor

Northwestern Polytechnical University, Department of Electronic Engineering, Xi’an, China

Honors/Awards

  • Ranked 1st in the text modality and 3rd in the mixed text-and-image modality in the ad-hoc photographic image retrieval benchmark evaluation at ImageClef’09
  • Ranked 2nd in text-based image search and 3rd in mixed-modality (text and image) search in the ad-hoc photographic image retrieval benchmark evaluation at ImageClef’08
  • Ranked 2nd (automatic runs) and 3rd (all run types) among 20 participants in the ad-hoc photographic image retrieval benchmark evaluation at ImageClef’07
  • Ranked 3rd in the video search task and 13th in the high-level feature extraction task in the TREC Video Information Retrieval benchmark evaluation
  • Dean Scholarship of the Chinese Academy of Sciences, 2001
  • Best Student Paper Award at the International Symposium on Chinese Spoken Language Processing, 2000, P.R. China
  • Huawei Scholarship, sponsored by Shenzhen Huawei Tech. Co. Ltd., at the Institute of Automation, 2000
  • Motorola Scholarship, sponsored by Motorola Inc., at Beijing University of Posts and Telecommunications, 1995
  • Haiying Scholarship (First Prize) at Northwestern Polytechnical University, 1992

Areas of expertise

  • More than 20 years of R&D experience in machine learning, speech recognition, multimedia content analysis (text, audio, and video), and natural language processing.
  • Design and optimization of robust, efficient production system architectures.
  • Development of machine learning production systems in the e-commerce and media domains.
  • Seasoned data scientist and developer.

Tools

Keras, TensorFlow, PyTorch, Python, C++, C#, SQL, Databricks, Scala

Work experience

VP, Lazada (Nov 2019 – present)
  • Develop a competitive intelligence production system that models and analyzes products across different platforms using deep learning text and image models. Design an efficient, robust architecture and implement the end-to-end production system for large-scale product matching across various business scenarios. Support seller operations and help business partners achieve their business targets.
  • Develop a multilingual level-1 category prediction system and a leaf category prediction system (>2000 leaf categories) to support SEA countries
  • Develop a gap allocation system that optimally allocates gap products to specified sellers under business constraints
  • Develop a seller mapping system based on seller information
  • Develop a personalized voucher recommendation system for daily and campaign use to increase the voucher collection rate and user click rate
Lead data scientist, MediaCorp (Jul 2018 – Nov 2019)
  • Led the data science team in developing in-house prediction models (such as age, gender, and ethnicity prediction) and news recommendation using a vendor toolkit at a media publisher.
  • Developed news categorization (IAB tags) supporting Chinese, English, Tamil, and Malay.
  • Developed an ML model to forecast view traffic for news articles, broadcast, and video content.
  • Developed a digital optimizer based on user behavior logs to improve ad CTR.
Senior scientist, Institute for Infocom Research, A*STAR (Jan 2003 – Jul 2018)

15 years of researching and developing ML algorithms and systems, and publishing papers, in the following areas:

  • Multimedia content analysis and information retrieval
  • Automatic image annotation, multimedia semantic concept detection and generic object recognition
  • Opinion mining (sentiment analysis) from user generated content (UGC)
  • Audio search and fingerprinting. Music search system (written in C++; feature extraction, database indexing and search) deployed by a collaborating top internet company in China.
  • Text summarization and text normalization
  • More than 30 papers published in top conferences and journals.
  • Led a team participating in the ad-hoc photographic image retrieval task at ImageClef 2007. Our system ranked 2nd in the automatic run category among 20 participants with 475 submitted runs.
Research fellow, National University of Singapore (Jan 2002 – Dec 2002)

Worked on speech recognition and text categorization

Invited researcher, Advanced Telecommunications Research Institute International, Japan (Jun 2001 – Dec 2001)

Worked on acoustic modeling for speech recognition

My hometown, university, and place of residence