APTISI Transactions on Technopreneurship (ATT), Vol. No. July 2025, pp. 306-317
E-ISSN: 2656-8888 | P-ISSN: 2655-8807

AI-Driven Personalized Movie Recommendations: A Content and Sentiment-Aware Model for Streaming and Digital Entrepreneurship

Vijay Shelake1*, Scott Fernandes2, Sarthak Shrungare3
1,2,3 Fr. Conceicao Rodrigues College of Engineering, University of Mumbai, India
1 vijaysnew12@gmail.com, 2 scottfernandes3586@gmail.com, 3 sarthakshrungare46@gmail.
*Corresponding Author

Article Info
Article history: Submission November 19, 2024; Revised January 25, 2025; Accepted April 10, 2025; Published April 14, 2025
Keywords: Sentiment Analysis, Content, Movie, Streaming

ABSTRACT
In an era marked by the digital consumption of media, the landscape of movie recommendation is undergoing a profound transformation. Traditional recommendation methods, which rely on collaborative filtering and user reviews, are being supplanted by more sophisticated content-based approaches. The evolution of Artificial Intelligence (AI) has given rise to a new generation of recommendation systems, characterized by their ability to process and analyze vast amounts of content metadata to provide tailored suggestions. This research presents an AI-driven personalized movie recommendation model for streaming and digital entrepreneurship, leveraging data analytics and Natural Language Processing (NLP) techniques to enhance user experience. The model integrates sentiment analysis and cosine similarity to recommend similar movies, offering personalized recommendations across multiple streaming platforms, thus improving user satisfaction, engagement, and content discovery. By utilizing AI-driven algorithms, this model contributes to digital entrepreneurship by enhancing content personalization and improving user retention in the competitive streaming industry.

This is an open access article under the CC BY 4.0 license.
DOI: https://doi.org/10.34306/att.
License: https://creativecommons.org/licenses/by/4.0/
© Authors retain all copyrights
Journal homepage: https://att. id/index. php/att

INTRODUCTION
The proliferation of streaming platforms has resulted in an explosion of available content, making it essential to implement sophisticated recommendation systems to enhance the user experience. Content-based recommendation systems, which leverage user preferences and content metadata, have emerged as effective solutions to this challenge. These systems analyze a wide range of data points, such as genres, actors, directors, and plot keywords, to generate personalized movie recommendations. This approach minimizes the need for extensive user input and enhances the discovery of relevant content, ultimately boosting user satisfaction and engagement. By prioritizing responsible data handling and ensuring user privacy, these systems can protect user data while delivering high-quality recommendations. Our research model demonstrates how to utilize the vast number of movies available online to recommend films that users are likely to enjoy, improving efficiency and the overall user experience. This makes it easier for users to find better content to watch.

The research presents an AI-driven personalized movie recommendation model that integrates both content and sentiment analysis, which aligns with the ongoing advancements in streaming platforms and digital entrepreneurship. By integrating AI technologies, the model addresses the challenges of content overload while enhancing the digital entertainment landscape for both users and service providers.

In terms of its contribution to the Sustainable Development Goals (SDGs), the AI-driven personalized movie recommendation system has a significant impact on SDG 9: Industry, Innovation, and Infrastructure.
By developing scalable and efficient recommendation systems, this approach fosters digital entrepreneurship and the growth of the streaming industry. The integration of Artificial Intelligence (AI) enhances technological advancements and creates a more inclusive and accessible entertainment ecosystem. Additionally, the model's focus on improving user experience through personalized content directly contributes to SDG 10: Reduced Inequality, as it enables diverse user groups to discover content that resonates with their unique preferences. Furthermore, the emphasis on user privacy and data security aligns with SDG 16: Peace, Justice, and Strong Institutions, ensuring that user data is handled responsibly while providing a secure and trustworthy platform.

RELATED WORK
Various research studies have explored different techniques for generating personalized movie recommendations. One approach leverages deep learning in a multimodal movie recommendation system, integrating user preferences and movie features to achieve improved accuracy compared to traditional methods. Another method suggests combining collaborative filtering and content-based filtering to enhance recommendation relevance by analyzing user behaviors alongside movie attributes. Additionally, a review of deep learning architectures for recommendation systems highlights their effectiveness in capturing complex user data patterns, consistently outperforming traditional techniques. Machine learning algorithms, such as decision trees and support vector machines, have been investigated for their strengths in real-world recommendation applications. A personalized system combining LSTM networks and CNNs has also been introduced to boost accuracy by utilizing temporal dynamics and visual features. Furthermore, a clustering-based system segments users and movies, which improves prediction accuracy for diverse groups.
Reddy focuses on genre correlation in content-based recommendations, demonstrating how understanding genre relationships enhances recommendation relevance. The cold-start problem has been addressed by proposing methods to improve suggestions for new users or items, utilizing historical data and social influences. Item-based and user-based collaborative filtering have been compared, providing insights into their strengths and weaknesses. A privacy-preserving recommendation model using SVD has been introduced to deliver accurate recommendations while maintaining user privacy. Movie recommendations have been improved by integrating sentiment analysis to capture user preferences, addressing the challenge of processing large datasets. A combination of content-based techniques with emerging technologies has been used to improve recommendation accuracy. Recommendations have been enhanced by incorporating emotion-based analysis to better align with user sentiment, while Euclidean distance has been employed to measure similarity between users and items, emphasizing the challenge of maintaining recommendation accuracy across large datasets.

Table 1. Comparison of Methods

| Focus | Limitations |
|---|---|
| Hybrid recommendation techniques | If one method predominates over the other, the hybrid model may struggle to balance contributions of content-based and collaborative filtering, resulting in suboptimal suggestions. |
| Clustering-Based Methods | As the number of users and movies rises, clustering-based approaches may become inefficient, resulting in less accurate predictions for diverse datasets. |
| Content-Based Recommendation Techniques | This work may not delve deeply into practical implementation challenges or integration with contemporary AI methods. |
| Collaborative Filtering Methods | This research does not fully address the advantages of content-based systems in personalized recommendations. |
| Privacy-Preserved Recommendations using SVD | The privacy-preserving strategy with SVD improves user confidentiality but reduces intricate user-item interactions, potentially lowering recommendation accuracy. |
| Emotion-Based Movie Recommendation | A key limitation of movie recommendation systems based on emotions is the difficulty in accurately detecting and interpreting emotions, which can vary widely among users, leading to misaligned recommendations. |

Building upon the previous research, our model integrates sentiment analysis with content-based filtering, offering a more nuanced approach to movie recommendations. Unlike traditional methods, which may struggle with cold-start issues or limited user input, our AI-driven system leverages a comprehensive set of metadata and sentiment-driven insights to offer more relevant and personalized recommendations. By combining user reviews, movie attributes, and sentiment analysis, the model aims for better alignment with user preferences, enhancing recommendation accuracy. Furthermore, our approach addresses the computational complexity found in large-scale datasets by utilizing efficient algorithms such as cosine similarity and advanced NLP techniques. In light of the limitations summarized in Table 1, integrating sentiment analysis with content-based filtering enhances recommendation quality, offering a scalable solution that improves user engagement and satisfaction. This methodology not only improves the user experience but also has the potential to scale with growing data, offering real-time, highly personalized content recommendations across diverse streaming platforms.

METHODOLOGY
This work has two primary goals. The first is to recommend films to users based on their tastes, achieved through the application of machine learning techniques. The second is to perform sentiment prediction on movie reviews using Natural Language Processing (NLP).

Movie Recommendation
The first part of the work generates movie recommendations using content-based filtering to enhance the user experience, as shown in Figure 1.

Figure 1. Movie Recommendation Process

• Data Collection: Collect a diverse dataset of movies, including metadata such as genres, actors, directors, and plot summaries.
• Feature Extraction: Analyze the metadata to extract relevant features for each movie. This includes text processing techniques like tokenization, stemming, and vectorization.
• Building User Profiles: Develop user profiles based on their viewing history and preferences. This involves mapping user preferences to movie features.
• Similarity Calculation: Use cosine similarity to measure the similarity between user profiles and movie features.
• Recommendation Generation: Generate a list of recommended movies for each user based on the highest similarity scores.
• Evaluation of the Model: Test the recommendation system on a subset of the dataset to assess its accuracy and relevance. Metrics such as precision and recall are used.

Figure 1 summarizes the structured approach used to enhance user experience through personalized movie recommendations. By employing a content-based filtering method, the system tailors recommendations to individual user preferences, utilizing a variety of data points such as genres, actors, and directors. The key steps, including data collection, feature extraction, user profiling, and similarity calculation, work in tandem to ensure that the recommendations are aligned with users' specific tastes.
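
The profiling, similarity, and ranking steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the four-movie catalogue, its tags, and the helper names `build_profile` and `recommend` are hypothetical, and the user profile is assumed to be the summed tag counts of the movies a user liked.

```python
import math
from collections import Counter

# Hypothetical toy catalogue: each movie is described by a bag of metadata tags.
MOVIES = {
    "Movie A": ["action", "space", "director_x"],
    "Movie B": ["action", "space", "actor_y"],
    "Movie C": ["romance", "drama", "actor_y"],
    "Movie D": ["drama", "romance", "director_z"],
}

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def build_profile(liked_titles):
    """Assumed profile: summed tag counts of the movies the user liked."""
    profile = Counter()
    for title in liked_titles:
        profile.update(MOVIES[title])
    return profile

def recommend(liked_titles, top_n=2):
    """Rank unseen movies by cosine similarity to the user profile."""
    profile = build_profile(liked_titles)
    scores = {
        title: cosine(profile, Counter(tags))
        for title, tags in MOVIES.items()
        if title not in liked_titles
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

recommendations = recommend(["Movie A"])
```

With a user who liked only "Movie A", "Movie B" ranks first because it shares two of three tags with the profile, while the two drama titles share none.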
This process not only streamlines the recommendation flow but also enhances its accuracy, making it more relevant to the user. The model's ability to leverage data and compute similarity between user profiles and movie attributes helps keep the suggestions meaningful, increasing user satisfaction and engagement with the platform. Additionally, evaluating the system's performance using standard metrics supports continuous refinement. Thus, the method described in this section lays a strong foundation for personalized recommendations, improving the overall user experience in streaming platforms.

Data Collection and Data Preprocessing
The TMDB dataset used in this work is a comprehensive resource widely utilized in developing movie recommendation systems. It contains detailed metadata on a vast collection of movies, including attributes such as genres, cast, crew, plot descriptions, and user-generated tags. This rich dataset facilitates the implementation of content-based filtering techniques by providing ample features for calculating similarities between movies.

Stemming
Stemming is a natural language processing method that strips words of their suffixes and returns them to their base or root form. This procedure aids text normalisation, which can enhance the effectiveness of a number of text processing tasks, including text mining and information retrieval. One of the most widely used stemming methods is the Porter Stemmer algorithm, which iteratively removes common suffixes according to a set of predetermined rules. For example, "running" and "runs" both reduce to the stem "run"; irregular forms such as "ran" are not covered by suffix rules and require lemmatization instead.

Count Vectorization
Count vectorization converts text input into a numerical representation in natural language processing. To achieve this, a matrix is built, with rows representing documents and columns representing distinct terms from the whole corpus.
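
As a rough illustration of stemming followed by count vectorization, the sketch below builds a small document-term matrix in plain Python. It rests on several assumptions: `toy_stem` is a deliberately simplified suffix stripper and not the Porter algorithm, the three plot snippets are hypothetical, and tokenization is a plain whitespace split.

```python
from collections import Counter

def toy_stem(word):
    """Strip one common suffix. A simplified stand-in, NOT the Porter algorithm."""
    for suffix in ("ing", "ed", "s"):
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= 3:
            # Undo consonant doubling left by "-ing"/"-ed" ("runn" -> "run"),
            # but keep doubled l, s, z ("thrill" stays "thrill").
            if suffix != "s" and stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    return word

def count_vectorize(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    tokenized = [[toy_stem(w) for w in doc.lower().split()] for doc in docs]
    vocab = sorted({term for doc in tokenized for term in doc})
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix

# Hypothetical plot snippets standing in for movie metadata.
docs = [
    "hero running from danger",
    "hero runs and fights",
    "thrilling fights",
]
vocab, matrix = count_vectorize(docs)
```

Because "running" and "runs" share the stem "run", the first two rows both have a 1 in the "run" column, which is what lets a later similarity measure treat them as related.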
Each cell of the matrix holds the number of times a given term appears in a given document. This procedure converts textual input into a structured format that machine learning algorithms can process easily.

Generate Recommendations
This phase of the proposed work aims to provide the best movie recommendations to users by using content-based filtering. It generates results using techniques like cosine similarity.

Cosine Similarity
Cosine similarity is a metric that compares two movies according to their content features and is used in movie recommendation systems. In content-based filtering, movies are represented as vectors in a multidimensional space, where each dimension relates to a particular property such as plot keywords, actors, or genre. By calculating the cosine of the angle between these vectors, cosine similarity compares the characteristics of the movies without taking their magnitudes into account. By employing cosine similarity, the recommendation engine can find and propose the films most comparable to those a user liked, which increases the relevance of its suggestions. The cosine similarity between two vectors r1 and r2 is given by:

\[ \mathrm{CosineSimilarity}(r_1, r_2) = \frac{r_1 \cdot r_2}{\lVert r_1 \rVert \, \lVert r_2 \rVert} \]

where r1 · r2 represents the dot product of r1 and r2, and ‖r1‖ and ‖r2‖ denote the magnitudes (or norms) of r1 and r2, respectively.

Sentiment Analysis
The second part of the work involves sentiment prediction on movie reviews, providing users with a better understanding of the overall reception of the movie.

Figure 2. Sentiment Analysis of Reviews

• Data Collection: Collect data that captures sentiments from reviews to predict whether they are positive or negative.
• Text Preprocessing: Clean the review text and extract relevant features. This includes text processing techniques like tokenization, stemming, and vectorization.
• Classification: The Random Forest algorithm is used to classify the review texts for sentiment classification.
• Sentiment Prediction: The conditional probability of each feature given a class (sentiment category) is estimated. For sentiment analysis, these classes are typically 'positive' and 'negative' sentiments.
• Evaluation of the Model: Test the prediction system on a subset of the dataset to assess its accuracy and relevance.

Based on Figure 2, the integration of sentiment analysis into movie recommendation systems enhances the overall quality of recommendations by considering user feedback in the form of reviews. By analyzing reviews, the system can predict whether a movie is positively or negatively received, thus enabling more accurate and personalized recommendations. Text preprocessing and feature extraction prepare the review data for sentiment classification. Using algorithms like Multinomial Naive Bayes for sentiment prediction allows the model to classify sentiments effectively, making it possible to align movie suggestions with the user's emotional response to previous content. Overall, this approach not only improves the relevance of recommendations but also provides a more holistic understanding of user preferences, leading to a better user experience.

Figure 3. Distribution of Reviews

The dataset for this research includes movie reviews categorized by sentiment, with Figure 3 showing a balanced distribution between positive and negative sentiments. Each sentiment category contains approximately 25,000 reviews. This balance is advantageous for training sentiment analysis models, as it exposes the model equally to positive and negative sentiments, minimizes class bias, and supports more reliable and applicable results.
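
The "Sentiment Prediction" step, which scores each class by the conditional probability of the observed features, can be illustrated with a small multinomial Naive Bayes classifier written from scratch. This is a sketch of one of the compared model families, not the Random Forest configuration evaluated in this work; the four training reviews, the whitespace tokenization, and the function names `train_nb` and `predict_nb` are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_nb(labelled_reviews):
    """Collect class counts, per-class word counts, and the vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled_reviews:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict_nb(text, model):
    """Pick the class maximizing log P(class) + sum of log P(word | class)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical, balanced toy training set.
train = [
    ("a wonderful moving film", "positive"),
    ("great acting and a wonderful story", "positive"),
    ("a dull boring film", "negative"),
    ("terrible acting and a boring story", "negative"),
]
model = train_nb(train)
prediction = predict_nb("wonderful story", model)
```

Keeping the toy training set balanced mirrors the balanced review distribution described above, so the class priors are equal and the decision depends only on the word likelihoods.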
• TF-IDF Vectorization: A key method in the sentiment analysis of movie reviews is vectorization using TF-IDF (Term Frequency-Inverse Document Frequency). By assessing a word's significance in a document in relation to a corpus, it converts textual data into numerical vectors. Term Frequency (TF) measures how frequently a word appears in a document, while Inverse Document Frequency (IDF) lessens the weight of common terms that appear in numerous documents. TF-IDF vectorization combines these measures to identify important terms that help interpret sentiment. By using the prevalence of such terms to discriminate between positive and negative reviews, this technique improves the accuracy of sentiment analysis models.

To determine the emotional tone of a text and classify it as positive or negative, various NLP algorithms can be used. This research compares different classification algorithms, such as Random Forest, Logistic Regression, and Naive Bayes, to find the best model for sentiment analysis. The evaluation focuses on achieving a good balance between precision and recall, as well as overall accuracy, to ensure reliable sentiment prediction.

• Precision-Recall:

Figure 4. Precision-Recall Curves

The graphs in Figure 4 compare the Precision-Recall curves of various classification algorithms. These curves provide valuable insights into the performance of each algorithm, helping us determine the most suitable one for this task. The charts cover several machine learning models, including Random Forest, Decision Tree, Bernoulli Naive Bayes, Multinomial Naive Bayes, Gaussian Naive Bayes, and Logistic Regression. The curves illustrate how each model trades off precision and recall; the optimal model has a curve toward the top-right corner, which indicates both high precision and high recall.
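
To make the precision-recall trade-off concrete, the following sketch computes the points of such a curve by sweeping a decision threshold over classifier scores. The labels and scores are hypothetical values chosen for illustration, and `precision_recall_points` is a helper written for this example rather than part of the evaluated system.

```python
def precision_recall_points(y_true, scores):
    """Return (threshold, precision, recall) for each distinct score,
    treating every review scored at or above the threshold as positive."""
    total_pos = sum(y_true)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, y_true) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, y_true) if p and y == 0)
        precision = tp / (tp + fp)   # at least one prediction is positive
        recall = tp / total_pos
        points.append((threshold, precision, recall))
    return points

# Hypothetical labels (1 = positive review) and classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = precision_recall_points(y_true, scores)
```

Lowering the threshold from 0.9 to 0.2 raises recall from 1/3 to 1 while precision falls from 1 to 0.5, which is the pattern the curves in Figure 4 visualize for each model.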
Random Forest achieves higher precision over a broad range of recall values, maintaining a comparatively better balance throughout the plot. Compared with the other models, this indicates that it preserves precision while capturing more true positives. Logistic Regression, despite its consistency, loses precision as recall increases.

Figure 5. Comparison of Naive Bayes Algorithms

Figure 5 compares the performance of various Naive Bayes classification models, showing that the Gaussian Naive Bayes model achieves the highest accuracy among the evaluated models. This indicates that the Gaussian Naive Bayes algorithm is the most effective Naive Bayes variant for the given dataset. Its advantage is likely due to its ability to model continuous data distributions efficiently, making it a reliable choice when dealing with such data types. While other models, such as Multinomial Naive Bayes and Bernoulli Naive Bayes, provide good performance, they do not reach the accuracy level of the Gaussian model, suggesting that Gaussian Naive Bayes is better suited to the specific characteristics of the dataset.

Figure 6. Comparison of Other Classification Algorithms

Figure 6 compares several other classification algorithms, where Logistic Regression achieves the highest accuracy among the models evaluated. This high performance is attributed to its ability to model linear relationships effectively and its robustness in high-dimensional spaces, which makes it a reliable option for predictive tasks. However, when considering the Precision-Recall curves, Random Forest emerges as the superior algorithm. Although Logistic Regression has the highest accuracy, its Precision-Recall curve indicates a significant drop in precision at higher recall levels, signaling less consistent performance. In contrast,
Random Forest maintains a better balance between precision and recall while still achieving high accuracy, making it a more reliable choice for tasks requiring consistent performance across various metrics.

Random Forest is an extension of the bagging method that uses both bagging and feature randomness to generate a forest of decision trees with low correlation among them. The algorithm conducts sentiment analysis on movie reviews by building multiple decision trees; for each review, it aggregates the predictions from all trees and determines the sentiment by majority vote:

\[ \hat{y} = \operatorname{mode}\{T_1(x), T_2(x), \ldots, T_n(x)\} \]

In this formula, the predicted sentiment (denoted ŷ) is the mode (most frequent value) of the predictions made by multiple decision trees. Each T_i(x) is the prediction of the i-th decision tree for a given input x. The final prediction ŷ therefore aggregates the individual predictions T_1(x), T_2(x), ..., T_n(x) from n decision trees and selects the value that occurs most frequently. This method is a key aspect of ensemble learning, where multiple models are combined to improve the robustness and accuracy of the prediction.

RESULT AND DISCUSSION
The figures illustrate the step-by-step workflow of our recommendation engine, starting with data collection and preprocessing to extract meaningful features like genres, cast, and keywords. These features are used to build user profiles and calculate similarities using methods like cosine similarity, generating tailored recommendations. Sentiment analysis further refines the suggestions, supporting continuous personalization.

Figure 7. Movie Recommendations

Figure 7 depicts the recommended movies, which have high similarity to the selected movie in terms of genre, overview, and cast.

Figure 8. Sentiment Analysis

Figure 8 illustrates the sentiment analysis performed on the reviews, indicating the predicted sentiments. The analysis categorizes each review's sentiment as either positive or negative.

Figure 9. Confusion Matrix

Figure 9 presents a confusion matrix evaluating our sentiment analysis model. The matrix shows 3168 true negatives (TN) and 3203 true positives (TP), indicating correct classifications of negative and positive reviews, respectively. Misclassifications include 511 false positives (FP) and 618 false negatives (FN). These counts correspond to an accuracy of about 84.9% (6,371 of 7,500 reviews), a precision of about 86.2%, and a recall of about 83.8% for the positive class. The high values along the diagonal cells reflect the model's strong performance in sentiment prediction. However, the off-diagonal values suggest areas for improvement in minimizing incorrect predictions.

Thus, this research presents a sentiment- and content-aware movie recommendation system that uses data analytics and natural language processing to improve customer satisfaction. It offers individualized movie recommendations by substituting content-based methods for conventional collaborative filtering, utilizing metadata such as genres, storyline synopses, and reviews. By taking user opinion into account, sentiment analysis provides a better understanding of the overall quality of the movie, thereby supporting relevance and engagement. Modern streaming platforms with large content collections can use the system because of its scalability.

Our research supports digital entrepreneurship by enhancing personalized movie recommendations on streaming services through the use of sentiment analysis, machine learning, and natural language processing (NLP). Exploring its impact on user engagement, content discovery, and targeted advertising would further highlight its practical significance in AI-powered digital business strategies.
Its uses include scalability for big datasets, data-driven insights for content curation, and improved user experience. By incorporating this approach into their current systems, streaming platforms may enhance recommendations, mitigate the cold-start issue, and provide real-time, customized suggestions, which will benefit consumers and services alike. In addition to providing opportunities for IT companies and startups to create sophisticated recommendation systems, this innovation aids the digital transformation of the entertainment sector. Similar technologies are already being used by Netflix, Amazon Prime Video, Spotify, YouTube, Hulu, and Disney+ to improve their recommendation engines, demonstrating the potential for broad acceptance and additional innovation.

MANAGERIAL IMPLICATIONS
The implementation of AI-driven personalized movie recommendation systems offers significant strategic advantages for streaming platforms and digital entrepreneurs. By integrating content-based filtering with sentiment analysis, platforms can enhance user satisfaction and engagement, leading to stronger customer loyalty and retention. This combination enables the system to provide more relevant recommendations aligned with individual preferences, reducing the need for extensive user input. Managers should focus on refining recommendation algorithms to cater to evolving user behaviors, offering personalized content that enhances the overall user experience. Additionally, leveraging big data and machine learning techniques like cosine similarity helps managers optimize content curation, marketing strategies, and partnerships, so that platforms continue to meet the dynamic preferences of their user base.

Furthermore, AI-driven recommendation systems provide a competitive edge by allowing platforms to differentiate themselves in a crowded market.
By improving the accuracy and relevance of movie suggestions, platforms can stay ahead of competitors while enhancing content discovery for users. The system also enables more targeted advertising and efficient monetization, as managers can leverage insights into user preferences to personalize ads and optimize marketing budgets. Because the system scales to large datasets, it is adaptable for growing platforms. However, as privacy concerns increase, it is essential for managers to ensure that the recommendation systems comply with data protection regulations, fostering trust and safeguarding the platform's reputation. By focusing on user personalization, AI integration, and privacy, managers can not only boost customer engagement but also secure a sustainable competitive advantage in the digital entertainment industry.

CONCLUSION
In this research paper, we introduce an automated movie recommendation system designed to address the challenges encountered by users in selecting movies. Our system aims to streamline the recommendation process and enhance user experience. The system utilizes NLP for recommending movies to users. By employing NLP techniques, our system identifies and recommends the movies that are most suitable according to user preferences and past behaviour, accompanied by a sentiment analysis to predict the sentiments of the reviews. Our system is a robust movie recommender that leverages NLP techniques to provide accurate recommendations and enhance the user experience on streaming platforms.

To enhance performance, future research might concentrate on creating a hybrid model that combines collaborative and content-based filtering. While collaborative filtering makes use of common user tendencies, content-based filtering tailors suggestions according to item qualities.
By integrating their advantages, a hybrid strategy can improve relevance and variety while addressing drawbacks like overspecialization and cold starts. Its efficacy could be assessed by testing on various datasets and evaluating measures like precision and recall. Moreover, integrating advanced similarity measures and classification models into our movie recommendation system would make it well suited to prompt, customized recommendation in the era of big data.

DECLARATIONS
About Authors
Vijay Shelake (VS)
Scott Fernandes (SF)
Sarthak Shrungare (SS)
https://orcid.org/0000-0003-2739-4110

Author Contributions
Conceptualization: VS. Methodology: SF. Software: SS. Validation: VS and SF. Formal Analysis: SS and VS. Investigation: SF. Resources: SS. Data Curation: VS. Writing Original Draft Preparation: SF and SS. Writing Review and Editing: VS and SF. Visualization: SS. All authors, VS, SF, and SS, have read and agreed to the published version of the manuscript.

Data Availability Statement
The data presented in this research are available on request from the corresponding author.

Funding
The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interest
The authors declare that they have no conflicts of interest, known competing financial interests, or personal relationships that could have influenced the work reported in this paper.

REFERENCES