Academic Paper

From Data Analysis to Intent-Based Recommendation: An Industrial Case Study in the Video Domain
Document Type
Periodical
Source
IEEE Access. 10:14779-14796, 2022
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Recommender systems
Task analysis
Metadata
Cleaning
Buildings
Over-the-top media services
Noise measurement
Industrial data
intent-based
session-based
user behavior
Language
ISSN
2169-3536
Abstract
This work presents a comprehensive study, from an industrial perspective, of the process between the collection of raw data and the generation of next-item recommendations in the domain of Video-on-Demand (VoD). Most research papers analyze recommender systems on already-processed datasets and therefore do not face the challenges that occur naturally in industry, e.g., processing raw interaction logs to create datasets for testing. This paper describes the whole process between data collection and recommendation, including cleaning, processing, feature engineering, and session inference, together with the challenges posed by a dataset provided by an industrial player in the domain. Then, several intent-based recommendation techniques are compared on the new dataset in the next-item recommendation task, studying the impact of factors such as session length and the number of previous sessions available for a user. The results show that taking advantage of the sequential data available in the dataset benefits recommendation quality, since deep learning algorithms for session-aware recommendation are consistently the most accurate recommenders. Lastly, a summary of the different challenges in the VoD domain is provided, discussing the best algorithmic solutions found and proposing future research directions based on the results obtained.