digital optimiser | AI, Tech & Life

Ads targeting

Posted on September 27, 2021 by sheng gao

As publisher platforms, broadcast companies create content to attract users, and they earn money from ad operations or content subscriptions. Because a media company is business driven and does not have enough manpower to develop its own ad technology (user tracking, ad placement, ad optimization, and so on), it uses a third-party service such as Google DoubleClick. With Google's technology, the media company gets real-time data about how users react to the ads displayed to them, e.g. which ad unit was displayed to a user (an impression) and whether the user clicked the ad. From this click information, in-house data scientists can develop a machine learning model (a lookalike model) to predict the probability that a user will click an ad. This makes ads targeting possible, i.e. showing ads to a precisely tailored audience, which can improve the click-through rate (CTR) and drive traffic.

Big internet companies such as Google, Facebook, and Baidu offer ads targeting products. However, media companies tend to prefer building in-house technology, because of data privacy concerns and because they do not want to depend too heavily on third-party services. In-house technology also lets them easily customize the ads targeting model to support niche business requirements and respond quickly to business needs.

How to build an ads targeting model?

Firstly, the problem is formulated as follows: given an ad-user pair, predict whether the user will click the ad or not. From a machine learning point of view, it is a binary classification problem.

Secondly, collect data to prepare the training samples for learning an ML model. Collect the ad-user pairs that have already been displayed on the platform. If you use Google DoubleClick, you can get the real-time impression log data. From these logs, you know which ad unit was shown to the audience and whether the audience clicked the ad. If the user clicked the ad, the impression is positive (1); otherwise, it is negative (0). Thus, each ad-user log pair is tagged as 1 or 0.
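The labeling step above can be sketched in a few lines of Python. This is only an illustration: the `Impression` record, its field names, and the sample IDs are hypothetical, and a real DoubleClick export will use its own schema.

```python
from collections import namedtuple

# Hypothetical log record; field names are assumptions, not the DoubleClick schema.
Impression = namedtuple("Impression", ["impression_id", "user_id", "ad_unit"])


def label_impressions(impressions, clicked_ids):
    """Tag each impression as positive (1) if it was clicked, else negative (0)."""
    clicked = set(clicked_ids)
    return [
        (imp.user_id, imp.ad_unit, 1 if imp.impression_id in clicked else 0)
        for imp in impressions
    ]


def click_through_rate(samples):
    """CTR = number of clicked impressions / number of impressions."""
    if not samples:
        return 0.0
    return sum(label for _, _, label in samples) / len(samples)


# Toy data: four impressions, one of which (i2) was clicked.
impressions = [
    Impression("i1", "u1", "banner_top"),
    Impression("i2", "u2", "banner_top"),
    Impression("i3", "u1", "sidebar"),
    Impression("i4", "u3", "banner_top"),
]
samples = label_impressions(impressions, clicked_ids=["i2"])
ctr = click_through_rate(samples)
```

Each tuple in `samples` is one training sample (user, ad unit, label), and `ctr` gives the baseline click-through rate to compare against after targeting.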

Thirdly, represent the ad-user pair as a feature vector. A lookalike model tries to find potential audience members whose behavior is similar to that of the users already known to click. Thus, the ad information can be ignored, and only the user needs to be represented as a vector. This vector characterizes the user's historical behavior on the platform along various dimensions. For how to use features to represent a user, please refer to your-browsing-behavior-expose-your-gender-age-ethnicity.

Lastly, you can train any supervised machine learning model to do the prediction. In my case, a simple weighted linear classifier works well; it is roughly built from the mean of the positive samples, the mean of the negative samples, plus discriminative information relative to a background model. A/B tests on some ad units show promising results.
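The post does not give the exact form of this classifier, so the following is only one possible reading, not the model actually used. It assumes the background model is the mean feature vector over all users, and that the weights are the weighted difference between how far the positive mean and the negative mean sit from that background. The function names, the `pos_weight` and `neg_weight` parameters, and the toy vectors are all hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def fit_mean_classifier(positives, negatives, background,
                        pos_weight=1.0, neg_weight=1.0):
    """Build linear weights from class means measured against a background mean.

    Assumption: weight_j = pos_weight * (mu_pos_j - mu_bg_j)
                         - neg_weight * (mu_neg_j - mu_bg_j)
    """
    mu_pos = mean_vector(positives)
    mu_neg = mean_vector(negatives)
    mu_bg = mean_vector(background)
    weights = [
        pos_weight * (p - b) - neg_weight * (n - b)
        for p, n, b in zip(mu_pos, mu_neg, mu_bg)
    ]
    return weights, mu_bg


def score(user_vector, weights, mu_bg):
    """Linear score of a user after centering on the background mean."""
    return sum(w * (x - b) for w, x, b in zip(weights, user_vector, mu_bg))


# Toy user vectors: clickers load on feature 0, non-clickers on feature 1.
positives = [[1.0, 0.0], [0.8, 0.2]]
negatives = [[0.0, 1.0], [0.2, 0.8]]
background = positives + negatives

weights, mu_bg = fit_mean_classifier(positives, negatives, background)
clicker_like = score([0.9, 0.1], weights, mu_bg)
non_clicker_like = score([0.1, 0.9], weights, mu_bg)
```

A user resembling the known clickers gets a positive score and a user resembling the non-clickers gets a negative one. The score is not a calibrated probability; it is only meant for ranking users.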

After the model is learned, we can rank the audience by the predicted probability of clicking the ad and select the top N for ads targeting.
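The ranking step is simple enough to show directly. In this minimal sketch, `scores` is a hypothetical mapping from user ID to the model's predicted click score, and the values are made up for illustration.

```python
import heapq


def top_n_audience(scores, n):
    """Return the n user IDs with the highest predicted click score, best first."""
    best = heapq.nlargest(n, scores.items(), key=lambda item: item[1])
    return [user_id for user_id, _ in best]


# Hypothetical model outputs for four users.
scores = {"u1": 0.12, "u2": 0.57, "u3": 0.03, "u4": 0.31}
targeted = top_n_audience(scores, n=2)
```

Here `targeted` holds the two users most likely to click, and this list is what would be handed to the ad server as the target segment. `heapq.nlargest` avoids sorting the whole audience when N is much smaller than the number of users.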

Notes on Data science in media industry

Posted on September 1, 2021 by sheng gao

After about 16 years working at the Institute for Infocomm Research, A*STAR, Singapore (https://www.a-star.edu.sg/i2r) as a research scientist, I realized I needed to make a change in my life and learn how industry exploits machine learning and pattern recognition technology to build products and solve business problems. In 2019, I had the chance to join a media company to lead a data science team. In the following posts, I will summarize my learning journey of building data products and machine learning models to solve business problems. The main topics include:

User profile (data product)
User personal information related prediction
- Gender prediction
- Age prediction
- Race prediction
Media content (news reading, video view) related prediction
- Traffic prediction
  - How to know content popularity in advance of publishing, for efficient resource planning
- Auto content tagging: label news content using the IAB (https://www.iab.com/videos/iab-there/) tag set, a business-related set of semantic tags.
Digital optimizer
- Personalized advertisement targeting
- How to improve advertisement performance (CTR, click-through rate) based on third-party data collection toolkits (e.g. Adobe cookie, Google ads performance data)

Data science exploits machine learning and pattern recognition technology to solve business problems in industry. Every company is different in terms of industry, operating environment, available data, and business problems. Data science is a data-driven approach to solving business problems, so the most important stages are 1) transforming the business problem into a machine learning problem (discussing with the business owner to understand their requirements), and 2) collecting the correct data to build the ML model.


AI, Tech & Life

Trickling streams converge into great rivers
