Problem Statement
This blog post presents an advertising system for an e-commerce website that helps customers find products matching their interests.
Design
Ad Creation
Advertisers create ads for their products and attach targeting configs, which include query-based targeting, user-based targeting, etc. Query-based targeting shows ads to users based on their search terms. Advertisers can also configure negative targeting, which hides their ads for specified search queries. User-based targeting factors in user-specific attributes such as age, gender, browsing history, etc. Ads created by advertisers go through moderation to ensure the content complies with policy. After moderation, ads are processed by an offline workflow that builds the search index.
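To make the targeting rules concrete, here is a minimal sketch of an eligibility check. The `Ad` fields, the `is_eligible` function, and the whitespace tokenization are all assumptions made for illustration; a real system would likely use richer matching and more user attributes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Ad:
    # Hypothetical targeting config attached to an ad.
    ad_id: str
    keywords: set                                       # query-based targeting
    negative_keywords: set = field(default_factory=set)  # negative targeting
    min_age: Optional[int] = None                        # user-based targeting
    max_age: Optional[int] = None


def is_eligible(ad: Ad, query: str, user_age: int) -> bool:
    """Return True if the ad may be shown for this query and user."""
    terms = set(query.lower().split())
    # Negative targeting wins: any negative keyword in the query hides the ad.
    if terms & ad.negative_keywords:
        return False
    # Query-based targeting: at least one targeted keyword must appear.
    if not (terms & ad.keywords):
        return False
    # User-based targeting: optional age bounds.
    if ad.min_age is not None and user_age < ad.min_age:
        return False
    if ad.max_age is not None and user_age > ad.max_age:
        return False
    return True
```

For example, an ad targeting "shoes" with the negative keyword "kids" would be eligible for the query "running shoes" but hidden for "kids shoes".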
Ad Serving Workflow
The ad serving system serves ads based on the search query and user context. It first calls the retrieval system with the search query to fetch relevant ads, then invokes the prediction system to rank them. The ad prediction system predicts the probability of user engagement with each ad (for example, a click) from features of the ad, advertiser, user, and context, and produces a ranking score for each ad. An auction then determines the winning ads, the order in which they are displayed, and the price advertisers pay for each impression or click. The auction considers not only the ranking score and bid but also ad quality, advertiser budget, and so on. One important component of the ad serving system is pacing. Advertisers usually configure a daily budget for their ads. To avoid spending the entire budget on auctions early in the day, pacing spreads the spend evenly over the day by dynamically adjusting the bid. This can also help the advertiser reduce the cost per click and improve return on investment. User actions on the displayed ads are recorded through impression logging, which is an important data source for training the model used by the ad prediction system.
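As an illustration of pacing, the sketch below adjusts a bid multiplier depending on whether spend is ahead of or behind an even-spend schedule. This is only one simple approach, not necessarily the one used in production; the function names, the 10% step, and the multiplier bounds are assumptions chosen for the example.

```python
def update_pacing_multiplier(multiplier, spent, daily_budget, elapsed_fraction,
                             step=0.1, floor=0.1, ceiling=1.0):
    """Nudge the bid multiplier toward an even spend over the day.

    elapsed_fraction is the share of the day that has passed (0.0 to 1.0).
    """
    # Spend we would expect by now if the budget were spent evenly.
    target_spend = daily_budget * elapsed_fraction
    if spent > target_spend:
        multiplier *= 1.0 - step   # ahead of schedule: bid lower
    else:
        multiplier *= 1.0 + step   # on or behind schedule: bid higher
    # Keep the multiplier within the assumed bounds.
    return max(floor, min(ceiling, multiplier))


def paced_bid(base_bid, multiplier):
    """Bid actually submitted to the auction."""
    return base_bid * multiplier
```

With a daily budget of 100, if 60 has been spent halfway through the day, a multiplier of 1.0 drops to 0.9, so a base bid of 2.0 enters the auction at roughly 1.8.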
Ad Retrieval System
For a large-scale e-commerce platform, there can be millions of active ads to serve. To reduce the number of ads that need to be ranked and to keep latency low, an in-memory search engine is used to retrieve an initial set of ads based on the search terms. This initial set can still be large, and running inference on all of it in the ad prediction system would be slow and expensive. To narrow it down further, a lightweight model such as logistic regression is used to select the top k ads.
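The two retrieval stages can be sketched as follows. A plain inverted index stands in for the in-memory search engine, and the logistic regression weights and features are made-up placeholders; every name here is hypothetical.

```python
import heapq
import math
from collections import defaultdict


def build_index(ads):
    """Build an inverted index: keyword -> set of ad ids."""
    index = defaultdict(set)
    for ad_id, keywords in ads.items():
        for keyword in keywords:
            index[keyword.lower()].add(ad_id)
    return index


def retrieve(index, query):
    """Stage 1: fetch every ad matching at least one search term."""
    candidates = set()
    for term in query.lower().split():
        candidates |= index.get(term, set())
    return candidates


def lr_score(features, weights, bias):
    """Logistic regression: sigmoid of a weighted sum of features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def top_k(candidates, features_by_ad, weights, bias, k):
    """Stage 2: keep the k candidates with the highest cheap-model score."""
    return heapq.nlargest(
        k, candidates,
        key=lambda ad_id: lr_score(features_by_ad[ad_id], weights, bias),
    )
```

Only the ads returned by `top_k` would then be passed on to the more expensive prediction model.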
Ad Prediction System
The ad prediction system produces a precise ranking of the ads selected by the ad retrieval system. Raw ad features such as product type, title, click-through rate, etc. are embedded into feature vectors, which are fed into the model for inference. Since ad features change throughout the day (for example, a customer clicking on an ad increases its click-through rate), the model needs to be refreshed periodically to capture the changes.
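As a rough sketch of turning raw ad features into a vector, the example below uses feature hashing for the categorical and text features and appends the click-through rate as a numeric feature. Feature hashing is a simple stand-in for learned embeddings, and the function names and bucket count are assumptions for illustration.

```python
import hashlib


def hash_bucket(value, num_buckets):
    """Map a string to a stable bucket index (same result across runs)."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def featurize(product_type, title, clicks, impressions, num_buckets=16):
    """Build a fixed-length feature vector from raw ad features."""
    vec = [0.0] * (num_buckets + 1)
    # Categorical feature: product type.
    vec[hash_bucket("type=" + product_type, num_buckets)] += 1.0
    # Text feature: one count per title token.
    for token in title.lower().split():
        vec[hash_bucket("title=" + token, num_buckets)] += 1.0
    # Numeric feature: click-through rate, stored in the last slot.
    vec[num_buckets] = clicks / impressions if impressions else 0.0
    return vec
```

Calling `featurize` again after new clicks and impressions are logged yields an updated click-through rate in the last slot, which is the kind of change a periodic refresh is meant to pick up.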