International Journal of Electrical and Computer Engineering (IJECE)
Vol. No. June 2013, pp.
ISSN: 2088-8708

Goal-based Hybrid Filtering for User-to-user Personalized Recommendation

Muhammad Waseem Chughtai*, Ali Selamat*, Imran Ghani*
*Department of Software Engineering, Faculty of Computing, Universiti Teknologi Malaysia (UTM), Skudai, 81310, Johor Darul Takzim, Malaysia

Article Info
Article history: Received Jan 10, 2013. Revised Mar 28, 2013. Accepted May 20, 2013.
Keyword: Goal-based, Recommender system, Content-based filtering, Collaborative filtering, k-Nearest Neighbor, Hybrid filtering, Cold-start, Similarity

ABSTRACT
Recommendation systems are gaining great importance with e-Learning and multimedia on the internet. However, they fail in some situations, such as the new-user profile (cold-start) problem. To overcome this issue, we propose a novel goal-based hybrid approach for user-to-user personalized similarity recommendation and present its performance accuracy. This work also helps to improve collaborative filtering by using k-nearest neighbor as neighborhood collaborative filtering (NCF), and content-based filtering as content-based collaborative filtering (CBCF). The purpose of combining k-nn with recommendation approaches is to increase the accuracy of relevant recommendations and to reduce the new-user profile (cold-start) problem. The proposed goal-based approach, associated with nearest neighbors, compares personalized profile preferences and obtains the similarities between users. The paper discusses the research architecture, the working of the proposed goal-based approach, its experimental steps and initial results.

Copyright © 2013 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author: Imran Ghani, Department of Software Engineering, Faculty of Computing, Universiti Teknologi Malaysia (UTM), Skudai, 81310, Johor Darul Takzim, Malaysia. Email: imran@utm.

INTRODUCTION
Recommender systems are proposed as a promising solution to deal with these issues.
Two main techniques are used in hybrid approach recommender systems: content-based filtering techniques, in which the user is recommended items similar to those the user preferred in the past, and collaborative filtering techniques, in which the user is recommended items that other users with similar tastes liked in the past. Each technique has some limitations when taken individually, such as data sparsity, new-item profile (cold-start) and new-user profile (cold-start). Cold-start (new-user profile) is an issue in hybrid personalized recommendation systems. In cold-start, the system gives poor recommendations, which damages the filtering accuracy of the recommended learning content. In recent research, two types of profiles are used in e-Learning recommender systems: the user-profile and the item-profile. The proposed research work focuses on improving the user-profile cold-start. The user-profile cold-start problem occurs when the user is new in the recommender system environment and does not have related information in the personalized profile. Users should have interests, required goals, ratings/gradings, likes/dislikes, learning content details, etc., in the profile. Otherwise, the system is unable to identify the user's required goals/interests and cannot recommend the item/content most closely related to the user's interest.

In addressing the new-item profile (cold-start), Li and B. Kim used item-based collaborative filtering. The clustering results merge the content information into the collaborative filtering in order to solve the cold-start problem. However, the approach of Li and B. Kim ignores the demographic information of users, which can be helpful to improve the prediction results.

Journal homepage: http://iaesjournal.com/online/index.php/IJECE

The new-user profile (cold-start) occurs in both content-based and collaborative filtering recommender systems.
For example, Imran Ghani et al. proposed a domain-based filtering approach which works on a content-based algorithm to overcome the new-user profile (cold-start). Spiegel et al., Gunawardana and Meek, and P. Melville et al. proposed works that use both content-based and collaborative filtering to address the new-user profile (cold-start) issue. The authors incorporated the rating data as well as the content information in a unified model. They demonstrated that mixing features of users and items achieves good accuracy. Nevertheless, none of them exploited all the user and item features, such as the address and rating time, and they did not indicate how their approach deals with the cold-start issue.

Researchers nowadays recommend the use of more than two approaches (partially using machine learning) to reduce the effects of the new-user profile (cold-start). One of them, Pazzani, proposed a framework that combines collaborative, content-based and demographic filtering, and uses information from HTML pages to gather the demographic information of users. The weak point is that the author tested his approach on a dataset with a small number of users and items, which cannot guarantee the efficiency of the proposed system on a huge dataset. Moreover, Pazzani did not explain how the model is built. Lee et al. proposed a collaborative filtering recommender system combined with the SOM neural network. They categorized the users based on their demographic information, and clustered the users in each category according to their item preferences using the SOM neural network. Jahrer et al. used several approaches, such as SVD (Singular Value Decomposition), Neighborhood Based Approaches, Restricted Boltzmann Machine, Asymmetric Factor Model and Global Effects, to build recommender systems. The authors show that linearly combining these algorithms increases the accuracy of prediction.
However, the use of all these models leads to a significant increase in training time complexity and to data sparsity issues. Other works on hybrid recommender systems can be found in the literature, where the researchers proposed different filtering recommendation techniques in order to provide more relevant predictions/recommendations and to overcome or reduce the limitations of each technique.

In this paper we propose a novel hybrid approach to build a recommender system. Our approach combines neighborhood collaborative filtering (NCF), using a k-nearest neighbor network, and content-based collaborative filtering (CBCF), using collaborative filtering techniques, to improve the new-user cold-start profile content filtering accuracy. This method allows better coverage and addresses the cold-start profile filtering accuracy issue. The new-profile attributes (gender, age, occupation, etc.) are used for clustering the similarities between users' personalized profile preferences. The method groups the users into categories using a nearest neighbor technique. Each category holds users sharing similar profile characteristics. For a new user, the technique recommends items using only the cluster to which this user belongs. In the same way, the combination of user-profile content characteristics and the content-based filtering technique helps to solve the problem of new-user profiling by adding details in the system.

The contributions of this study are: (1) establishing the new and existing user profile similarities using neighborhood collaborative filtering (NCF); (2) recommending required goals to the user using content-based collaborative filtering (CBCF); and (3) a goal-based hybrid approach that automatically improves the performance of recommender systems under the new-learner cold-start profile.

RESEARCH ARCHITECTURE AND METHODOLOGY
The challenge of this novel goal-based hybrid recommender system is improving the new-user cold-start profile content filtering accuracy.
Figure 1 shows the architectural model of our proposed hybrid filtering recommender system. In this section we discuss our proposed architecture.

[Figure 1 blocks: Data Collection; Users Profile Collection; Dataset Content Collection; List of Users Goals; List of Users Profiles; Goal-based hybrid filtering; Neighborhood Collaborative Filtering (KNN CF); Content-based Collaborative Filtering (CF CBF); Recommendation; Recommended Item/Content]
Figure 1. Architectural model of goal-based filtering hybrid recommender system

IJECE Vol. No. June 2013: 329-336, ISSN: 2088-8708

In user profile collection, the proposed hybrid approach incorporates the users' profile content. First, the system collects a set of users' profiles $U = \{u_1, u_2, u_3, \ldots, u_n\}$ and computes each individual user profile together with the other users' profiles collaboratively, to overcome the new-user profile cold-start problem. The dataset content collection collects the items/contents $I = \{i_1, i_2, i_3, \ldots, i_m\}$ from the provided dataset. The item/content collection is used as an $n \times m$ matrix, where $n$ is the total number of users and $m$ is the total number of items/contents. The list of users' recommended goals collects the users' existing recommended items, sorted per user. Here user $u$ belongs to the set $U$ ($u \in U$) and item $i$ belongs to the set $I$ ($i \in I$), so the user $u$ has past-recommended items $i$, and $R$ is the set of ratings given to each item by the user. The ratings are sorted integer values, $R = \{1, 2, 3, 4, 5\}$. To describe the process of recommending users' profiles, the subset $u$ is considered as new users and the subset $v$ as active users, so that $u$ and $v$ are represented as $u = \{u_1, u_2, u_3, \ldots, u_n \mid u \in U\}$ and $v = \{v_1, v_2, v_3, \ldots, v_n \mid v \in U\}$. The similarity equations of this section are used to compute the similarity matrix between user $u$ (new user) and user $v$ (active user), where $n$ is the number of users in both subsets. Each element of this similarity matrix is computed using the similarity equation.
Several machine learning and data mining approaches are used by researchers in the domain of recommender systems. We used the machine learning technique k-nearest neighbor with collaborative filtering to improve the similarity performance of users' profile preferences. In our approach, the profile attributes of the user (new-user) u are compared with the personalized profile preferences and content ratings on item i of the nearest-neighbor user v. This comparison is calculated based on the like/dislike votes/ratings of user v on content/item i, combined with the users that voted/rated the item i similarly. Such an approach depends strongly on the number of nearest neighbors that rated the item i, and recommends the profile of the similarly rating/voting user v to the user (new-user) u. The nearest-neighbor (k-nn) query object may directly access the next neighbor object and filter the required object. The k-nn's of the user (new-user) u are computed using a similarity equation, which is among the most used similarity measures in hybrid filtering approaches for recommendation systems.

Let us assume u and v are two users in the hybrid recommender system, and let $I$ denote the set of all relevant items of users u and v. In our method, the users u and v are both treated as vectors $\vec{u}$ and $\vec{v}$, where $\vec{u} \cdot \vec{v}$ denotes the dot-product between the vectors, and the norms $\|\vec{u}\|$ and $\|\vec{v}\|$ are calculated using personalized profile preferences. So the similarity between users u and v is computed as follows:

$$\mathrm{sim}(u,v) = \frac{\vec{u} \cdot \vec{v}}{\|\vec{u}\|\,\|\vec{v}\|} = \frac{\sum_{i \in I} r_{u,i}\, r_{v,i}}{\sqrt{\sum_{i \in I} r_{u,i}^{2}}\;\sqrt{\sum_{i \in I} r_{v,i}^{2}}}$$

where $r_{u,i}$ and $r_{v,i}$ are the ratings of users u and v on item i.

The content-based collaborative filtering (CBCF) finds the set of items previously rated by the new-user u and, using a similarity measure, selects the items that are similar to the item i used as the required goal of the new-user u. As the dataset, we used the 'MovieLens' dataset, with movie features that characterize the rated movies, such as genre, country and date.
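The neighbor selection described above can be illustrated with a minimal sketch, assuming each user is a plain list of ratings with 0 standing for "not rated". The function names (`cosine_similarity`, `k_nearest_neighbors`) and the toy ratings are hypothetical and are not taken from the paper's implementation.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length rating vectors (0 = unrated)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a user without ratings has no direction to compare
    return dot / (norm_u * norm_v)

def k_nearest_neighbors(target, others, k):
    """Return the k (user_id, similarity) pairs most similar to `target`."""
    scored = [(uid, cosine_similarity(target, vec)) for uid, vec in others.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy ratings on four items (hypothetical values, not MovieLens data).
active_users = {
    "v1": [5, 3, 0, 1],
    "v2": [4, 0, 0, 1],
    "v3": [1, 1, 0, 5],
}
new_user = [5, 4, 0, 1]
neighbors = k_nearest_neighbors(new_user, active_users, k=2)
print(neighbors)
```

With these toy values the two users whose tastes point in the same direction as the new user (`v1`, then `v2`) are kept, and `v3` is dropped.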
Combining content-based and collaborative filtering offers quick fixes that can be included in this feature. These quick fixes make it possible to return quality results in a minimum time period. We used this hybrid filtering within the content-based filtering to compare the new and existing users' profiles collaboratively. The similarities and differences in rating scale between users are measured in the rating prediction equation. The recommended item rating is computed from the users' profiles, the correlated users' profiles and the correlation similarity measure as follows:

$$P_{u,i} = \bar{r}_u + \frac{\sum_{v \in N(u)} \mathrm{sim}(u,v)\,\bigl(r_{v,i} - \bar{r}_v\bigr)}{\sum_{v \in N(u)} \lvert \mathrm{sim}(u,v) \rvert}$$

In this equation, $P_{u,i}$ is the recommended content rating of the new-user u on the item i. Here $\bar{r}_u$ is the mean content rating given by the new-user u, and $\mathrm{sim}(u,v)$ is the collaborative profiles correlation similarity between the new-user u and the user v (active user) in the collaborative neighborhood $N(u)$; $r_{v,i}$ is the rating of user v on item i and $\bar{r}_v$ is the mean rating of user v. It also analyses the personalized profile preference similarities along the columns of the user-to-user similarity matrix.

A goal, in recommender systems, is an identification of the requirements and achievements of the products/items required by the user. One definition of a goal is: "a goal specifies the objectives that a client may have when he consults a web service". This research paper uses the term goal as a common vocabulary for requesting services: requesters select defined goals to express their required items/products, and services link to existing goals. In goal-based filtering, the new-user profiling characteristics play an important role in identifying the categories of users that like certain kinds of items or have similar required goals and recommendations. Combining neighborhood collaborative filtering (NCF) and content-based collaborative filtering (CBCF) means that user-profiling characteristics can be used to overcome the limitations of both content-based and collaborative filtering.
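A mean-centred, similarity-weighted rating prediction of the kind described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the common neighborhood-based form (the new user's mean rating plus the weighted deviations of the neighbors), the names `mean_rating` and `predict_rating` are hypothetical, and the fallback value 3.0 for a user without ratings is simply the midpoint of the 1 to 5 scale.

```python
def mean_rating(ratings, default=3.0):
    """Mean of a user's ratings; `ratings` maps item id -> rating.

    `default` (assumed: midpoint of the 1-5 scale) is used for an empty profile.
    """
    if not ratings:
        return default
    return sum(ratings.values()) / len(ratings)

def predict_rating(new_user_ratings, neighbors, item):
    """Predict the new user's rating of `item`.

    neighbors: list of (similarity, ratings_dict) pairs for active users.
    """
    base = mean_rating(new_user_ratings)
    weighted_deviation = 0.0
    total_weight = 0.0
    for sim, ratings in neighbors:
        if item not in ratings:
            continue  # this neighbor cannot vote on the item
        weighted_deviation += sim * (ratings[item] - mean_rating(ratings))
        total_weight += abs(sim)
    if total_weight == 0:
        return base  # no neighbor rated the item: fall back to the user's mean
    return base + weighted_deviation / total_weight

# Toy data (hypothetical): two active users with similarities 0.9 and 0.5.
new_user = {"m1": 4, "m2": 2}
neighbors = [
    (0.9, {"m1": 5, "m2": 3, "m3": 4}),
    (0.5, {"m1": 2, "m2": 2, "m3": 5}),
]
prediction = predict_rating(new_user, neighbors, "m3")
print(round(prediction, 3))
```

Subtracting each neighbor's own mean is what compensates for differences in rating scale: a user who rates everything high contributes only the amount by which the item exceeds that user's usual rating.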
We measure the similarities of profiling characteristics/contents between users as follows:

$$\mathrm{sim}(u,v) = \frac{\sum_{i \in I} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^{2}}\;\sqrt{\sum_{i \in I} (r_{v,i} - \bar{r}_v)^{2}}}$$

In this equation, $\bar{r}_u$ and $\bar{r}_v$ are the averages of the profile characteristics/contents of users u and v, and $r_{u,i}$ and $r_{v,i}$ are their values on item i. In our proposed model, we employed this equation to compute the similarity between users u and v. The users u and items i are represented by an $n \times m$ similarity matrix. Thus, the similarity is computed along the rows of the matrix. The k-nn is used to select the nearest, most similar users. In a recommendation, a recommender system works to identify the user's goals in order to enhance the user's interest, reduce boredom and promote clarity in achieving the required item/product. Figure 3 defines the working of the goal-based hybrid filtering recommender system with the similarities between two users' profiles. It shows the operational working of the similarity and prediction equations.

[Figure 3 blocks: Input data (n x m matrix; Item 1, Item 2, ..., Item i, ..., Item m); Goal-based hybrid filtering (Equations); Recommendations (Learning contents for User); Output]
Figure 3. Operational model of goal-based hybrid approach

In goal-based filtering, we used the profiling characteristics/contents of the existing users v (gender, occupation, zip code, ratings, etc.) and matched them with the profile characteristics/contents of the new-user u. These user profiling characteristics/contents are available in the 'MovieLens' dataset. This helps to create categories of users that share/like the same content and have the same profile characteristics/content. The k-nn is used to create these categories using collaborative filtering. The comparison of the profile characteristics and rating content of a new-user u on item i with the existing users' profile characteristics and rating contents is computed using only the category to which the new-user u belongs. The accuracy of new-user profile content filtering depends on the number of nearest neighbors of the new-user u.
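The idea of matching a new user's profile characteristics against existing users, and recommending only from that group, can be sketched as below. This is an illustration under assumptions, not the paper's procedure: profiles are compared with a simple matching coefficient (the fraction of equal attributes) instead of the correlation equation, and the names `profile_similarity`, `nearest_profiles`, `recommend_for_new_user` and the threshold `min_rating` are hypothetical.

```python
from collections import Counter

def profile_similarity(p, q):
    """Fraction of shared profile attributes on which two users agree."""
    keys = p.keys() & q.keys()
    if not keys:
        return 0.0
    return sum(p[k] == q[k] for k in keys) / len(keys)

def nearest_profiles(new_profile, existing, k):
    """Ids of the k existing users whose profiles best match the new user."""
    ranked = sorted(
        existing,
        key=lambda uid: profile_similarity(new_profile, existing[uid]),
        reverse=True,
    )
    return ranked[:k]

def recommend_for_new_user(new_profile, existing, ratings, k, min_rating=4):
    """Items liked (rating >= min_rating) by the k nearest profiles, most votes first."""
    votes = Counter()
    for uid in nearest_profiles(new_profile, existing, k):
        for item, rating in ratings.get(uid, {}).items():
            if rating >= min_rating:
                votes[item] += 1
    return [item for item, _ in votes.most_common()]

# Toy profiles and ratings (hypothetical values in the style of MovieLens attributes).
existing = {
    "a": {"gender": "M", "age_group": "18-25", "occupation": "Student"},
    "b": {"gender": "F", "age_group": "18-25", "occupation": "Student"},
    "c": {"gender": "M", "age_group": "36-45", "occupation": "Lawyer"},
}
ratings = {"a": {"m1": 5, "m2": 2}, "b": {"m1": 4, "m3": 5}, "c": {"m4": 5}}
new_profile = {"gender": "M", "age_group": "18-25", "occupation": "Student"}

group = nearest_profiles(new_profile, existing, k=2)
recommended = recommend_for_new_user(new_profile, existing, ratings, k=2)
print(group, recommended)
```

Because only profile attributes are needed, this step works for a user with no rating history, which is the cold-start case the section addresses.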
In order to make good recommendations, the recommender system first learns the user's preferences and tastes based on the user's recommended learning content ratings. In the new-user profile scenario, traditional collaborative filtering fails to make recommendations for these new users because there are no or few recommended ratings. The goal-based hybrid approach, with the incorporation of users' profile content filtering, can improve the new-user cold-start profile content filtering accuracy of a recommender system. To do so, Table 1 describes the necessary steps:

Table 1. Operational steps of goal-based filtering
Step 1  Detect the new users' nearest neighbors' profiles collaboratively using the KNN algorithm with the collaborative filtering approach.
Step 2  Compare the new users' profiling content with existing users' profiles that share the same profile characteristics.
Step 3  Establish the new users' and existing users' profile similarities using the similarity equation in neighborhood collaborative filtering (NCF).
Step 4  Based on the ratings of the nearest neighbor users' recommended content, compute the recommendations for the new-user u on the item i using the rating prediction equation.
Step 5  Output recommendations based on the user's requirements/goals.

The 'MovieLens' database provides a set of predefined characterizations, using 5-fold validation analysis, based on (gender, occupation, user) and (genre, movie) with their ratings information. In the experimental setup, the user data is categorized in user groups (age, gender, occupation, etc.) that are available in the 'MovieLens' dataset. Figure 4 shows the details of the categorized user data.

Gender: 1 Male; 2 Female
Occupation: 1 Administrator; 2 Artist; 3 Doctor; 4 Educator; 5 Engineer; 6 Entertainment; 7 Executive; 8 Healthcare; 9 Homemaker; 10 Lawyer; 11 Librarian; 12 Marketing; 13 Programmer; 14 Retired; 15 Salesman; 16 Scientist; 17 Student; 18 Technician; 19 Writer; 20 Other
Age: 1 <18; 2 18-25; 3 26-35; 4 36-45; 5 and 6 (labels missing)
Figure 4. 'MovieLens' dataset user profile data

There are 19 movie genres/categories. A movie can belong to more than one genre/category. Here each user rates a minimum of 20 and a maximum of 50 movies using integer values $r \in \{1, 2, 3, 4, 5\}$, where 1 identifies the lowest rating r and 5 indicates the highest rating r against items i. The experiment was performed on the 'MovieLens' dataset. The dataset comes with five predefined splits, in which each user rated 20% of movies, and contains 943 users and 1682 movies with 100,000 ratings. Each user rates approximately 20 to 50 movies. The ratings are on a numeric five-point scale with values 1, 2, 3, 4 and 5. We note that not all movies are rated by all users.

EVALUATION AND RESULTS
In this phase, we evaluated the new hybrid filtering recommendation results using two well-known evaluation metrics, namely Precision (P) and Recall (R). This evaluation helps to indicate that the proposed approach improves performance on the new-user cold-start profiling issue in hybrid filtering recommender systems. In this section, we present the results of the goal-based hybrid filtering approach for recommender systems. We performed our experiment on the 'MovieLens' dataset, which contains 943 users and 1682 movies with 100,000 ratings (as discussed earlier). To get the initial results of our new hybrid approach on the new-user cold-start profile content filtering accuracy problem, we randomly selected the data of 500 users, out of the 943 users in 'MovieLens', as the training set. The procedures for calculating the precision and recall are as follows:

$$\mathrm{Precision} = \frac{|\text{relevant items} \cap \text{recommended items}|}{|\text{recommended items}|}, \qquad \mathrm{Recall} = \frac{|\text{relevant items} \cap \text{recommended items}|}{|\text{relevant items}|}$$

This experiment divides the users' data into two portions, a training dataset and a testing dataset. The training dataset has been sorted based on user id, and the users' testing dataset has been constructed with their personal profiles and rating history. There are 19 movie genres/categories. A binary value (0 or 1)
was used, where 1 indicated that a movie belongs to a specific category and 0 indicated that it does not. For the experimental results, we randomly selected 500 users as training data and the rest of the users as testing data. The results show that our hybrid approach outperforms the other techniques on two well-known evaluation metrics, namely Precision (P) and Recall (R).

Table 2. Results (%) of Precision and Recall Comparison
Method                                                Precision (P)   Recall (R)
Content Based Recommender System (CB-RS)              42.58           46.84
Time-context Based Collaborative Filtering (TB-CF)    25.87           33.22
Content-based collaborative filtering (CBCF)          57.80           46.98
Nearest neighbor collaborative filtering (NCF)        55.41           52.
Proposed hybrid filtering (Goal-based)                64.60           54.91

Table 2 reports the preliminary results obtained with the precision and recall equations for the evaluation of the proposed work. These preliminary results show the performance of the neighborhood collaborative filtering (NCF), content-based collaborative filtering (CBCF) and the proposed goal-based hybrid filtering approaches. Table 2 also compares the results of previous research with the above two filtering approaches for improving the new-user profile (cold-start).

[Figure 5 bar groups: (CB-RS), (TB-CF), CBCF, NCF, Goal-based; series: Precision (%), Recall (%)]
Figure 5. Stacked bar graph of Precision (%) and Recall (%) results

In summary, our approach provides promising results at this initial stage of the research. We conducted an experimental study on the improvement of traditional approaches (content-based and collaborative) as a hybrid, which has three stages. The first stage compared the traditional Content Based Recommender System (CB-RS), as discussed by Kant and K. Bharadwaj; the researchers state that CB-RS is based on reclusive traditional content-based recommendation methods aimed at dealing with item features. This approach achieves (precision: 42.58%, recall: 46.84%), as reported by the same authors.
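For reference, the precision and recall percentages compared in this section follow the usual set-based definitions, which can be sketched as below. This is a minimal illustration: the function name `precision_recall` and the toy item ids are hypothetical, and treating "relevant" as the set of items the test user actually liked is an assumption about the evaluation protocol.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one user's recommendation list."""
    recommended = set(recommended)
    relevant = set(relevant)
    hits = len(recommended & relevant)  # recommended items that are relevant
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Toy example (hypothetical item ids): 2 of 4 recommendations are relevant,
# and 2 of the 3 relevant items were found.
p, r = precision_recall(["m1", "m2", "m3", "m4"], ["m1", "m3", "m5"])
print(f"precision={p:.2%} recall={r:.2%}")
```

Averaging these per-user values over the test users gives percentages of the kind reported in Table 2.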
To improve the cold-start in the traditional content-based method, we combined collaborative filtering with it and named the result content-based collaborative filtering (CBCF). This combination enhanced the precision ("recommendation accuracy"), as revealed in the result (precision: 57.80%, recall: 46.98%). The second stage covered the comparison with Time-context Based Collaborative Filtering (TB-CF); as described by its authors, TB-CF outperforms the traditional user-based collaborative recommender systems, with results of (precision: 25.87%, recall: 33.22%). To improve the cold-start in traditional collaborative filtering, we combined it with k-nearest neighbor and named the result neighborhood collaborative filtering (NCF). This combination (precision: 55.41%, recall: 52.) improved the recall ("filtering accuracy") over the TB-CF, CB-RS and CBCF approaches, and the precision ("recommendation accuracy") over the TB-CF and CB-RS approaches. The third and last stage combined the two improved approaches, CBCF and NCF, into the goal-based hybrid filtering approach. Using the proposed equations, the results (precision: 64.60%, recall: 54.91%) show a further improvement in precision ("recommendation accuracy") and recall ("filtering accuracy"). The purpose of combining the two improved approaches is to use all the possibilities for reducing the new-user profile (cold-start) issue and to increase the efficiency of the recommendation approach with minimum code complexity. Figure 5 presents a graph of the precision (%) and recall (%) results of all approaches for the new-user profile (cold-start).

CONCLUSION
In this paper we introduced an efficient hybrid approach for recommender systems. A hybrid approach in recommender systems is basically the combination of content-based and collaborative filtering. We proposed a novel goal-based recommendation approach which uses content-based and collaborative filtering, but in a different way.
We used the machine learning technique k-nearest neighbor (k-nn). The idea of using machine learning with collaborative filtering in recommender systems is to enable the proposed approach to address the new-user profile cold-start issue more efficiently and accurately than traditional hybrid recommendation approaches. To achieve this, we combined k-nearest neighbor (k-nn), collaborative filtering (CF) and content-based filtering (CBF) techniques. This work also helps to improve collaborative filtering by using k-nearest neighbor as neighborhood collaborative filtering (NCF), and content-based filtering as content-based collaborative filtering (CBCF). By combining these approaches we proposed a novel goal-based filtering approach for recommender systems. The goal-based filtering approach is incorporated in our hybrid approach to improve the new-user cold-start profile issue. For the evaluation of the results we used the precision and recall metrics. Our work indicates that user-to-user personalized recommendation can improve the recommendation accuracy of e-learning recommender systems in terms of the new-user profile (cold-start).

Our work suggests interesting future extensions and directions. First, we will improve our novel goal-based hybrid approach. Second, we will increase the experimental data to test the efficiency of our approach on a larger dataset.

ACKNOWLEDGEMENT
This work is supported by the Research Management Center, Universiti Teknologi Malaysia (UTM), Johor Campus, under Vote Project Number: 4D046. The project is led by Dr. Imran Ghani, Senior Lecturer, Software Engineering Department, Faculty of Computer Science & Information Systems, Universiti Teknologi Malaysia (UTM), Skudai, 81310, Johor Darul Takzim, Malaysia.

REFERENCES