Methods of data collection
Predicting consumer behavior matters in two main situations: targeted advertising and recommendations.
Take Google Ads as an example: a company that wants targeted advertising submits its ad to Google Ads. Google Ads has access to the data of Google's users, builds consumer profiles, and targets the profiles that could be interested in the ad.
Another example is AndChill, a bot that recommends films to users based on their preferences. The user tells the bot which film they like, and AndChill suggests another film they might like. We explain how it works in more detail later.
To do targeted advertising or recommendations, a company needs data. If the company wants to target its existing clients, it already holds their data in each client's account and can analyze it itself. If the company wants to reach new customers, it has to find data elsewhere. Most often, the company partners with data providers that profile users in order to predict consumer behavior. Data providers such as Acxiom or Epsilon partner with groups like Facebook and Google, collect their users' data, and analyze it to help other companies do targeted advertising or recommendations.
To predict consumer behavior, data analysts have access to various types of data:
- explicit data: all the data the user knowingly gives to websites, for example a name, an email address, the last purchase on a shopping website, or the music videos added to a playlist
- implicit data: all the data the user gives without being aware of it, for example the moment a video is skipped or the moment the music is paused
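The difference between the two types of data can be illustrated with a small record structure. The following is only a minimal sketch in Python: every class and field name (ExplicitData, ImplicitEvent, UserProfile, skip_rate) is hypothetical and chosen for illustration, and the values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExplicitData:
    """Data the user knowingly gives to a website (hypothetical fields)."""
    name: str
    email: str
    playlist: List[str] = field(default_factory=list)


@dataclass
class ImplicitEvent:
    """One behavioural signal recorded without the user typing anything."""
    item_id: str
    action: str              # for example "skip" or "pause"
    position_seconds: float  # moment in the video or track when it happened


@dataclass
class UserProfile:
    explicit: ExplicitData
    implicit: List[ImplicitEvent] = field(default_factory=list)

    def skip_rate(self) -> float:
        """Share of recorded events that are skips (0.0 if there are none)."""
        if not self.implicit:
            return 0.0
        skips = sum(1 for event in self.implicit if event.action == "skip")
        return skips / len(self.implicit)


# Invented example values.
profile = UserProfile(
    explicit=ExplicitData("Alice", "alice@example.com", ["video_42"]),
    implicit=[
        ImplicitEvent("video_7", "skip", 12.0),
        ImplicitEvent("video_42", "pause", 95.5),
    ],
)
print(profile.skip_rate())
```

A model can then combine both sides of the profile, for instance the playlist (explicit) and the skip rate (implicit).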
To develop an accurate model, developers need both types of data. Today, new approaches to prediction and data collection are emerging, such as the Internet of Things (with connected objects) and deep learning.
Recommendation algorithms

The idea of recommendation algorithms is to find different clusters of users and items through a mathematical approach.
First, we collect the data as a matrix (the first picture, on the far left).
Second, we analyze the data by lowering the dimension of this original matrix, for two reasons:
- it takes less space in computer memory
- it can be visualized, since we can choose to keep only the first k columns of each matrix
Then we use the MATLAB function svd to decompose the matrix into three matrices (the picture at the bottom right), where the first one describes the items and the third one describes the users.
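The decomposition and the reduction to the first k columns can be sketched as follows. This is a minimal sketch written in Python with NumPy's np.linalg.svd rather than MATLAB. The ratings matrix is invented for illustration, and its layout (rows are films, columns are users) is an assumption consistent with the first matrix describing items and the third describing users.

```python
import numpy as np

# Hypothetical ratings matrix (invented values): rows are films, columns are users.
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
    [5.0, 3.0, 0.0, 0.0],
])

# Decompose into three factors: ratings = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

k = 2                 # number of dimensions kept
U_k = U[:, :k]        # item descriptions: one row of k values per film
s_k = s[:k]           # the k largest singular values
Vt_k = Vt[:k, :]      # user descriptions: one column of k values per user

# Rank-k approximation of the original matrix.
approx = U_k @ np.diag(s_k) @ Vt_k

# Relative error introduced by keeping only k dimensions (Frobenius norm).
relative_error = np.linalg.norm(ratings - approx) / np.linalg.norm(ratings)
print("relative error with k =", k, ":", relative_error)
```

With k = 2, each film and each user is described by two numbers, which is what makes a two-dimensional plot possible.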
We can then plot them in a low-dimensional coordinate system, where the different clusters are easy to find.

Finally, we can use the model to make a prediction for a new user. Given the existing data, a short calculation gives the coordinates that describe the new user's preferences. We then find the existing user who is nearest to the new user and recommend the films that this nearest user has seen.
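The prediction step for a new user can be sketched as follows. This is a minimal sketch in Python with NumPy under several assumptions that are not stated above: the ratings matrix and the new user's ratings are invented, a rating of 0 is taken to mean "not seen", the new user is projected with the standard formula v = S^-1 U^T a, and "nearest" is measured with the Euclidean distance.

```python
import numpy as np

# Hypothetical ratings matrix (invented values): rows are films, columns are users.
# A rating of 0 is assumed to mean that the user has not seen the film.
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
    [5.0, 3.0, 0.0, 0.0],
])

k = 2
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
U_k, s_k = U[:, :k], s[:k]
user_coords = Vt[:k, :].T   # one row of k coordinates per existing user

# Ratings given by the new user to the same five films (invented values).
new_user = np.array([5.0, 0.0, 0.0, 0.0, 4.0])

# Project the new user into the same k-dimensional space.
# Since ratings = U S V^T, every existing user column a satisfies v = S^-1 U^T a,
# so the same formula is applied to the new column.
new_coords = (U_k.T @ new_user) / s_k

# Find the nearest existing user (Euclidean distance in the reduced space).
distances = np.linalg.norm(user_coords - new_coords, axis=1)
nearest = int(np.argmin(distances))

# Recommend the films the nearest user has seen and the new user has not.
seen_by_nearest = ratings[:, nearest] > 0
unseen_by_new = new_user == 0
recommended = np.flatnonzero(seen_by_nearest & unseen_by_new).tolist()

print("nearest user:", nearest)
print("recommended film indices:", recommended)
```

Other distance measures, such as cosine similarity, could be used in place of the Euclidean distance, and they may select a different nearest user.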