International Journal of Electrical and Computer Engineering (IJECE) Vol. No. February 2019, pp. ISSN: 2088-8708. DOI: 10. 11591/ijece. Content-aware resource allocation model for IPTV delivery networks Suliman M. Fati1. Putra Sumari2. Wou Onn Choo3 1 College of Computers and Information Sciences. Prince Sultan University. Saudi Arabia 2 School of Computer Sciences. Universiti Sains Malaysia. Malaysia 3 Faulty of Information Technology and Science. INTI International University. Malaysia Article Info ABSTRACT Article history: Nowadays, with the evolution of digital video broadcasting, as well as, the advent of high speed broadband networks, a new era of TV services has emerged known as IPTV. IPTV is a system that employs the high speed broadband networks to deliver TV services to the subscribers. From the service provider viewpoint, the challenge in IPTV systems is how to build delivery networks that exploits the resources efficiently and reduces the service cost, as well. However, designing such delivery networks affected by many factors including choosing the suitable network architecture, load balancing, resources waste, and cost reduction. Furthermore. IPTV contents characteristics, particularly. size, popularity, and interactivity play an important role in balancing the load and avoiding the resources waste for delivery networks. In this paper, we investigate the problem of resource allocation for IPTV delivery networks over the recent architecture, peerservice area architecture. The Genetic Algorithm as an optimization tool has been used to find the optimal provisioning parameters including storage, bandwidth, and CPU consumption. The experiments have been conducted on two data sets with different popularity distributions. The experiments have been conducted on two popularity distributions. The experimental results showed the impact of content status on the resource allocation process. Received Apr 9, 2018 Revised Jul 23, 2018 Accepted Aug 11, 2018 Keywords: Content status Content-aware Hybrid genetic algorithm IPTV delivery networks Resource allocation Copyright A 2019 Institute of Advanced Engineering and Science. All rights reserved. Corresponding Author: Suliman M. Fati. College of Computers and Information Sciences. Prince Sultan University. Riyadh 12435. Saudi Arabia. Email: smfati@yahoo. INTRODUCTION IPTV, as a promising technology, is growing rapidly in terms of subscribers and revenue to become in the near future the standard means to deliver home and business entertainment contents . Thus. Telecommunication companies have entered a hectic competition to increase their customer base and profit by delivering the IPTV services . , . On the other hand, the key concern of service providers in this hectic competition is how to provide high quality service with a minimum cost. Consequently, the cost reduction is one of the service providersAo worries. Such cost reduction can be achieved by controlling the resources allocation process . Resource Allocation is an important issue in designing the shared delivery network. Resource allocation determines the locations and the amount of resources to minimize the cost under certain constraints. In the shared delivery networks, many service providers may share resources and compete to deliver good quality services. For that, these resources should be shared in a more effective manner . However. IPTV systems suffer from the sudden peak workloads. Thus, allocating a large amount of resources to cope with the sudden workload leads to low resources utilization at non-peak hours. On the other hand, non-considering the peak workload leads to user Journal homepage: http://iaescore. com/journals/index. php/IJECE A ISSN: 2088-8708 dissatisfaction during the peak hours. For that, pay-as-you-go scheme is suggested by some delivery network Amazon CloudFront is a global example of this scheme. In this scheme, the service provider has to pay for the used resources only without any upfront, commit, or service contract. Particularly, the fluctuation in the workload . the number of incoming requests targeting the hosted content. is a serious issue threatens IPTV systems. Such workload fluctuation issue figures out the content status in IPTV system. Gaber and others . , . , . suggested a contentsAo workload estimation model that estimates the contentsAo status according to their characteristics. Of these characteristics are huge size . in Mega Bytes or Giga Byte. , interactivity, and rapidly changing effective period . Such distinct characteristics make the design of IPTV delivery networks challengeable. On the other hand, ignoring the status of diverse contents leads to load imbalance and waste of resources . Therefore, estimating the contentAos status according to these characteristics is a crucial factor in IPTV systems. Content status modeling refers as estimating the portion of concurrent requests that target the content according to its Such modeling helps a lot in handling both the load balancing and resource allocation based on the anticipated load of contents . By means of such contentsAo status estimation model, the service provider can control the resources allocation according to the content status and the usersAo demand to reduce the hosting cost . without affecting the usersAo satisfaction. This integration allows the service provider to adjust the required resources according to the necessitated replication scheme. For that. Laoutaris et al. argue that replica placement problem and resource allocation problem should be solved jointly due to the dependency between them. Moreover, considering the contentsAo status during the integration gives the service providers the possibility to control the resources and content distribution according to the demand fluctuation, as well as, enables the delivery network to absorb the sudden load fluctuation. Therefore, this paper concerns about how to build content-aware resource allocation model that considers the content status to reduce the waste of resources. In this paper. Content-Aware Resource Allocation Model (CARAM) is proposed. CARAM model considers the contentsAo status including contentsAo load and contentsAo popularity distribution to decide the required resources. CARAM model allocates the resources according to the replication strategy. Thus, it can save the resources in terms of storage, bandwidth, and CPU. For this purpose, the resource allocation problem and the replica placement problem are integrated to allocate the resources depending on the necessitated replication scheme. The content-aware resource allocation problem is studied and formulated mathematically as a constrained binary optimization problem. Next, this problem is solved using Hybrid Genetic Algorithm aiming to evaluate the effect of content status on the content-aware resource allocation. For this purpose, two popularity distributions are considered through this paper. These popularity distributions are ZipfAos distribution and the exponential distributions, which follow the normal density function . The rest of this paper is organized as follows. Section 2 is dedicated for background and related Section 3 presents the system model and problem formulation. Section 4 introduces the proposed hybrid Genetic algorithm to solve this problem. Section 5 is dedicated for experimental results and Finally. Section 6 presents the conclusion and future work. BACKGROUND AND RELATED WORKS Although the high benefits of IPTV delivery networks to both service provider and Content consumer, designing efficient delivery networks is not a trivial task and has potential problems, which must be solved. IPTV delivery networks composition. Content distribution and management, resource allocation, and request redirection are the main classes of issues that must be considered during building the delivery IPTV delivery networks composition includes delivery network placement, determining the serversAo specifications, and interaction among servers. Content distribution and management includes content selection and distribution based on demand and cache/replica management. Resource allocation chooses the type and size of resources that are required for each surrogate server. Request redirection includes the techniques to distribute the requests among servers in a manner to maintain the load balanced and avoid the traffic congestion . , . Content placement is an important point in the delivery networks for multimedia streaming. In large multimedia systems . IPTV), the effective content placement is highly required to cope with load imbalance, insufficient bandwidth, large delay, and packet loss . In the context of content placement, a considerable number of works are proposed to cope with the growing demand on these contents. On the other hand. Replica placement Problem, as a global strategy, is widely investigated in the literature review to improve the overall performance and/or minimize the allocation cost . In the literature, the Replica Placement problem is formally expressed as a problem definition . ost functio. , which moves Int J Elec & Comp Eng. Vol. No. February 2019 : 369 - 385 Int J Elec & Comp Eng ISSN: 2088-8708 toward an overall goal . erformance or cost improvemen. Such problems typically require heuristic algorithms to find approximate solutions within a feasible time, due to that they are NP-complete . ,[ . In the context of problem definition, the researchers investigate the content replication problem from two perspectives. The first perspective is to improve the quality of service . The second perspective is to minimize the delivery cost . , . , and . These works can also be classified according to the solution proposed. These solutions can be classified into heuristic solutions and evolutionary solutions. The heuristic solutions include the approximation algorithms and the greedy algorithms. In contrast, the evolutionary algorithms include Genetic, ant colony, and particle swarm algorithms. Because of the Replica Placement Problem is NP-Complete . , they typically require metaheuristic algorithms to find approximate solutions within a feasible time . Employing such meta-heuristic algorithms help in gaining the maximum benefit of replication that is crucial for the success of the IPTV However, the aforementioned works, which exploit heuristic algorithms, fail to produce an optimal effective placement strategy . , . Thus, due to the fruitful application of evolutionary algorithms to solve a variety of optimization problems, many attempts are recorded in the direction of using these algorithms in solving replica placement problem. Khan and Ahmad . compared the heuristic algorithms with evolutionary algorithms (Genetic Algorithm as an exampl. They introduced a unified cost model that captures the minimization of the total object transfer cost in the system, which in turn, leads to effective utilization of storage space, replica consistency, and fault-tolerance. The study shows that Genetic Algorithm outperforms the heuristic algorithms . greedy algorithm. in terms of solution quality. Although. Evolutionary algorithms produce better solutions as close as possible to the optimal solutions, the heuristic algorithms are faster than evolutionary algorithms in the running time. These findings also confirmed by the authors in . , . Resource allocation is an important process in designing IPTV delivery networks. This process consists of determining the location, the type, and the amount of networking resources to deploy. The aim of the resource allocation problem is to minimize the cost while certain constraints are respected . Thus, the optimization strategies are a good option to solve such problems . In the literature review, the resource allocation problem has been addressed using heuristic solutions . Wauters et al. address the resource allocation problem of determining the equipment required for building transport network for VoD services. They focused on determining the number of ports of each server, as well as, the number of multiplexers and switch ports at each node. In their study, the decentralized network is divided into regional sub networks with a ring topology. Next, a single server is installed on each regional sub network. Similarly. Kitjongthawonkul and Ko . examined the resource allocation problems in a VOD network to find the optimal locations of the video servers. They considered the usersAo demand to select the location of those servers. Moreover, the authors addressed the allocation problem using the dynamic programming technique. The objective of Kitjongthawonkul and Ko is to minimize the total operating cost, including transmission cost, storage cost, and server installation cost. Allocating the resources to build an overlay network over multiple substrate networks is proposed by Houidi et al. and Pandey et al. Houidi proposed a dynamic overlay network creation model according to the load and the available resources. The proposed model composed of request splitting algorithm and embedding algorithm. The request splitting algorithm aims at helping the overlay network provider to distribute the incoming requests among the available substrate network providers. The authors proposed two requests splitting algorithms. max-flow min-cut algorithm and linear programming algorithm. Based on the splitting algorithm, embedding algorithm is proposed to assign virtual nodes and links into each substrate network provider simultaneously. In this work, the authors formulated the overlay network embedding process to be solved by means of mixed integer program. The aim of the proposed embedding algorithm is to increase the request acceptance rate and decrease the leasing cost of infrastructure. Pandey et al. focused on determining the optimum network deployment strategy for VoD services in a heterogeneous networking environment. The deployment decisions can be made considering the following parameters: the required number of VoD servers, the minimum physical distance of server from the community and the bandwidth capacity requirement. The authorsAo aim is to determine the minimum capacity requirements that maintain the minimum requirement of Quality of Experience (QoE) to be met. For that, they estimated three QoE metrics: the server waiting time, the minimum one-way delay, and the access network bandwidth consumption. Such estimation is done based on the analysis of user requirements and network configurations. Afterwards, they formulated the server location problem and path selection problem jointly as an Integer Linear Programming (ILP) problem. The authors solved this problem by means of ILP techniques, as well as, heuristics. One of the flaws in most proposed works to solve the resource allocation and/or replica placement problem is that they consider the problems of resource allocation or replica placement problem Solving these two problems independently produces suboptimal solutions. this is due to the Content-aware resource allocation model for IPTV delivery networks (Suliman M. Fat. A ISSN: 2088-8708 direct effect of replicas locations on the amount of required resources and vice versa . , . The main advantage of combining replica placement and resource allocation in one joint problem is that the resources will be utilized efficiently and as necessary. Once these resources become no longer required, they will be released to be available for other users who require it or able to pay for it. For that, integrating the replica placement problem with the resource allocation is suggested firstly by . Laoutaris et al. argue that the replica placement problem and the resource allocation problem should be solved jointly to avoid a suboptimal solution. They proposed a realistic framework to build optimal resource/ content allocation model for a hierarchical architecture that minimizes the total cost. The cost here is represented by the distance between the storage location and the end user. The framework of Lautaris took into consideration the load balance and request peering among servers as constraints. In this framework. Genetic Algorithm is exploited to solve the resource allocation problem. Nguyen et al. addressed the problem of overly network provisioning for multimedia contents to build a cost-effective virtual delivery network. Their framework aims to build an overly delivery network over a substrate network. They formulated the provisioning problem as a combination between content replication and resource allocation. Next, they solved this provisioning problem using a Lagrangian relaxation heuristic algorithm. Similarly. Nakaniwa et al. introduced a mathematical model to join the content replication with server placement for the hierarchical architecture. Their aim was to maximize the system reliability of delivery network. They introduced an integer programming technique to maximize the reliability of the This technique finds the optimal scheme for both content and resource allocation problems that maximize the reliability of all the paths from each user to each file location. They applied their model on a real delivery network for internet service provider in Japan called "B Bit-Japan" with subject to delay and storage and/or transmission cost constraints. On the other side. Aldana Diaz and Huh . argue that allocating the IPTV resources including storage and bandwidth requires creating a cost function based on contents management and distribution. They proposed three cost-based policies, particularly, local hosting, external hosting, and partial hosting policies for 3rd party content service on a distributed architecture. In their detailed cost function, they considered the storage and bandwidth requirements that maintain the least level of Service Level Agreement (SLA). The results of them showed that partial hosting of contents is the most suitable technique wherein some of the contents are hosted locally while the others can be requested upon the demand. Such partial hosting may maintain the trade-off between the storage cost and bandwidth cost. However, this work lacks for real constraints on which contents should be hosted, and, which contents should be requested upon Li and Wu . proposed a heuristic algorithm that finds the optimal number of popular contents and the optimal number of servers to store these contents. Their model ignores the contents' replication and the load balancing. Replicating the contents according to their status helps the service provider to decide on the required resources to allocate. Thus, considering the content status is an important point to achieve optimal resource allocation scheme depending on the replica placement. Therefore, in this paper, the content-aware resource allocation model is proposed aiming at investigating the impact of content status on resource allocation. CARAM FRAMEWOK In traditional delivery networks, as CDN, the service provider has to allocate the contents according to long-term service contract with a certain commitment. This pricing scheme hinders the service provider to cope with the sudden workload at the peak busy times. On the other side, this scheme leads to waste the resources due to the low utilization at the non- peak busy times. For that, pay-as-you-go scheme is suggested by the delivery network providers. Amazon CloudFront is a global example for pay-as-you-go scheme. which, the service provider has to pay for the used resources only without any upfront, commit, or service Particularly, the service provider has to control the resources allocation according to the demand to minimize the hosting cost . without affecting the user satisfaction. Therefore, the aim of Content-Aware Resource Allocation Model (CARAM) is to decide on the resources amount and the content replication scheme according to the demand so as to achieve a minimum cost subject to given constraints. The cost, here, is expressed in terms of the amount of payment to the network provider to host the contents. Thus, allocating the storage and bandwidth resources can be performed in conjunction with replica placement. The main advantage of combining replica placement and resource allocation in one joint problem is that the resources will be utilized efficiently and as necessary . Once these resources become no longer required, they will be released to be available for other users who require it or able to pay for it. Content-Aware Resource Allocation problem focuses on determining the server locations in each service area, along with, the processing power, storage space, and bandwidth capacity at each location. Int J Elec & Comp Eng. Vol. No. February 2019 : 369 - 385 Int J Elec & Comp Eng ISSN: 2088-8708 To achieve an optimal solution for this problem, the service provider has to obtain the size and popularity distributions of contents in each service area. Consequently, the replication degree, the popularity distribution, and the expected load of contents along with their requirement in terms of processing power, storage, and bandwidth must be estimated clearly. Replicating the contents according to their requirement helps the service provider to decide the required resources to allocate. Thus, this interdependency emphasizes the need to address resources allocation and replica placement jointly. Content-aware Resource Allocation as an optimization process should take into account the available resources and their cost, as well as, the contents and their characteristics as follows. Available resources: In the context of shared hosting environment, each resource provider advertises Aufor hireAy resources with information of resource type, location, and cost. The processing power, storage space, and bandwidth capacity of servers are also provided. In such environment, the service provider can choose different resources, which belong to different resources providers. Contents requirement: each video in IPTV systems requires a storage space, processing power, and bandwidth capacity proportional to its popularity. Thus, replicating the contents according to their popularity distribution gives the optimal replication scheme, which in turn optimizes the resource allocation process. The information about the available resources, as well as, the requirements is gathered from the resource providers and service providers respectively. The service provider is responsible for doing the allocation process based on the requirements. Upon completion the allocation process, the service provider sends the process results to the resource provider to create the topology. In case of the existence of resource broker, the resource broker can complete the whole process according to the given information from both The flow of Content-Aware Resource Allocation Model (CARAM) is depicted in Figure 1. Figure 1. CARAM framework As depicted in Figure 1, in CARAM model, the information on both the content popularity distribution and the demand of each video are gathered from the service provider. This information on the resources from each physical network is provided by the resources providers. Then, the desired optimization tool is employed to compute the optimal topology including the replication pattern. Lastly, the resultant topology is passed to the resources providers or the resource broker to create the delivery network. CONTENT-AWARE RESOURCE ALLOCATION PROBLEM FORMULATION Content-aware Resource Allocation problem over the peer-service area architecture can be modeled as a set of inter-connected service areas. Each service area contains a set of servers, which can be physical servers or virtual servers. Each server is expressed by its processing power, storage space, and bandwidth Content-aware resource allocation model for IPTV delivery networks (Suliman M. Fat. A ISSN: 2088-8708 This paper considers a delivery network comprising of A interconnected service areas with ycEyayca potential locations in each service area. Each potential location is with known storage capacity ycoycycoycaycu , processing power ycycycoycaycu , bandwidth ycaycycoycaycu , and Load capacity yaycycoycaycu . Moreover, each potential location advertises the cost of storage unit ycIyc , bandwidth unit yuyc , and CPU consumption unit yuyc . Let there be a set of Video contents N. The size of each object is denoted by ycoycn and measured in Megabytes (MB). Each object i is associated with its processing power ycycn , bandwidth ycaycn , expected load yaycn , and popularity value pi OO . Each object has a possibility to be replicated among the service area or requested from another service area based on its popularity and the trade-off between the hosting cost and redirection cost. In the cost estimation stage of Content-Aware Resource Allocation Model (CARAM), it is necessary to distinguish between two terms server initiation cost . aya ) and replication pattern cost . aycI ). Server initiation cost . cyc ) In CARAM, the service provider has the ability to lease the resources from different physical network providers. Each physical network provider has own pricing plan for leasing the resources. The pricing plan includes the cost of configuring the server at the first time, deploying software, and maintaining the servers including the backup and security . For instance, in the virtual networks, the leasing cost CL for the virtual network is the cost of initiating and deploying the candidate locations, which are selected as a virtual server in all service areas, as in . The cost yayc represents the leasing cost of a server j in the service area a. ycEya yaya = Ocyayca=1 Ocyc=1yca ycycayc O yayc Furthermore, ycaj is a binary indicator to indicate whether the potential location is selected ycycyca = 1 or not ycycyca = 1. To facilitate the communication among the service areas, the service areas are connected via inter-service areas links. These links enable the dispatcher in any service area to redirect the requests targeting the unavailable contents to other service areas. The cost of these links can be obtained from the redirection process in the replication pattern as will be explained in the next sub-section. Replication pattern cost . cyc ) The contents replication represents a challenge in CARAM. In this step, the content replication cost must be envisaged. The replica placement problem should be extended to include the cost of other resources. The resources cost includes the cost of storage, processing power, and bandwidth for hosting the popular Besides, the cost of contents redirection is included. Redirection cost determines the cost of requesting unpopular content from other service areas in case of not replicating it due to the resources The following equation explains the replication pattern cost based on the above illustration. yaycI = OcycaOOya OcycOOycEyayca OcycnOOycA ycuycnycyca . coycn ycIyc ycycn yuyc ycaycn yuyc ) OcycaOOya OcycOOycEyayca OcycnOOycA ycyycn . Oe ycuycnycyca ) ycoycn yayaAycIycn The first part of . depicts the hosting cost of contents inside the service area including the storage cost mi Sj , the processing power consumption cost ci yc , and the bandwidth cost bi j . On the other hand, the second part shows the transmission cost of the streams that have to be redirected for non-existent contents inside the service area. To estimate the transmission cost, the number of streams is calculated firstly by dividing the size of redirected object ycoycn by the bit rate yayaAycI. Next, the bandwidth unit cost of the service area yuyca is provided and multiplied by the number of streams for redirected objects to calculate the redirection The inclusion of popularity value ycyycn of redirected object in the second part of this equation is important to control which contents should be redirected. Only the low popular contents can be redirected due to their low redirection cost. Upon completion of CARAM process, the optimal topology can be identified through the following Servers and ResourcesAe A server is defined through its location, the allocated computation power, storage space and bandwidth. The serverAos bandwidth is allocated for delivering the contents to the end-users. Thus, it is dimensioned based on the aggregated amount of requests targeting the contents stored in this Replication Pattern- The contents are replicated according to their popularity and expected load such as the popular contents can be replicated in the service area while, the low popular contents can be redirected to the other service area. For that, the replication pattern determines the replicated contents, the replication degree, and the redirected contents, as well. Int J Elec & Comp Eng. Vol. No. February 2019 : 369 - 385 Int J Elec & Comp Eng ISSN: 2088-8708 Decision variables In fact, there are two binary decision variables are introduced to formulate the content-aware resource allocation problem. The first binary variable ycUyc a is employed to decide which potential location must be belonged within the optimal topology while, the second one ycUycn ja to govern the replication pattern. ycnyce ycycnycyce yc ycnycu ycyceycycycnycayce ycaycyceyca yca ycnyc ycyceycoyceycaycyceycc ycayc yca ycyceycycyceyc ycuycEayceycycycnycyce . 1 ycnyce ycaycuycuycyceycuyc ycn ycnyc ycyceycyycoycnycaycaycyceycc ycnycuycycu yca ycyceycycyceyc yc ycnycu ycyceycycycnycayce ycaycyceyca yca ycuycnycyca = { ycuycEayceycycycnycyce . ycycayc = { In fact, the two binary decision variables are related to each other and affected by each other. For instance, these two variables determine the potential locations that should be selected according to the minimum cost of the replication scheme, at the same time. they decide whether the contents can be replicated in a potential location or not. Such relation supports the argument that the network topology, resources dimensioning, and content replication should be conducted mutually due to the mutual interaction between The proposed optimization model The resources allocation problem as a constrained binary optimization model can be formulated mathematically as explained in . Minimize: yaycN = yaya yaycI Subject to OcycnOOycA ycoycn ycuycnycyca O ycoycycoycaycu OAyc OO ycEyayca . OAyca OO ya . OcycnOOycA ycycn ycuycnycyca O ycycycoycaycu OAyc OO ycEyayca . OAyca OO ya . OcycnOOycA ycaycn ycuycnycyca O ycaycycoycaycu OAyc OO ycEyayca . OAyca OO ya . OcycnOOycA yaycn ycuycnycyca O yaycycoycaycu OAyc OO ycEyayca . OAyca OO ya . OcycOOycEyayca ycuycnycyca O ycE OAycn OO ycA . OAyca OO ya . OcycOOycEyayca ycuycnycyca = ycyycn O ycE OAycn OO ycA . OAyca OO ya . ycuycnycyca OO . OAycn OO ycA. OAyc OO ycEyayca . OAyca OO ya . ycycyca OO . OAyc OO ycEyayca . OAyca OO ya . ycuycnycyca = 0 OAycn OO ycA ycuycuyc ycaycoycoycuycaycaycyceycc ycnycu yc ycnycuycycnyccyce yca ycuycnycyca O ycycyca yceycuyc ycaycoyco ycn, yc OO yca OcycEya yc=1 ycycyca = ycE In this model, the cost function is illustrated in . as the sum of server initiation cost and replication Constraints . Ae. enforce the capacity of servers in terms of storage, processing power, bandwidth, and workload, respectively. These constraints . Ae. state that the total storage space, processing power, bandwidth, and workload of all contents replicated at server j should not exceed this server capacity. The number of copies for content i inside a service area has to be less than or equal number of selected potential servers P in that service area as depicted in equation . In addition, the number of copies for content i should be bounded by its popularity value ycEycn as depicted in equation . The integrality and nonnegativity constraints are presented in equations . The contents should not be replicated on a potential server unless this potential location is selected as a server according to equation . In other words, the contents will not replicate on a site j unless this site is selected as a server. Finally, the number of servers that Content-aware resource allocation model for IPTV delivery networks (Suliman M. Fat. A ISSN: 2088-8708 should be selected in any service area should be bounded by a predefined number P as in . where P is a predefined number of locations in the service area. The complexity of the proposed optimization model The proposed optimization model for Content-Aware Resource Allocation problem is a combinatorial optimization problem and belongs to NP-complete class. We can prove the completeness of this problem by considering the whole network consists of a single service area with a single server, which allocates all the objects. Moreover, we assume that the resources cost, except the storage cost, equals zero. According to these assumptions, the proposed optimization problem will be written mathematically as. Minimize: yaycN = OcycnOOycA ycoycn ycIyc Subject to OcycnOOycA ycoycn ycuycnycyca O ycoyc OAyc OO ycEyayca . OAyca OO ya . The simplified form of this problem is similar to the well-known knapsack problem, which is an NP-complete problem . For clarity, the knapsack problem tries to minimize/maximize a certain quantity with respect to some constraints. For instance, the knapsack problem aims at maximizing the obtained profit or minimizing the cost without exceeding the knapsack capacity. Similarly. CARA problem aims to minimize the allocation cost with respect to some enforced constraints. HYBRID GENETIC ALGORITHM FOR SOLVING CARA PROBLEM (HGA_CARA) There is no algorithm with a polynomial time complexity can solve the NP-complete combinatorial optimization problems. On the other side, applying the evolutionary algorithms is a good choice to solve such Thus, we integrate the Genetic Algorithm with two heuristic repair algorithms to solve CARA Such integration reduces the search space and improves the ability of the proposed model to produce optimal solutions efficiently. Heuristic repair algorithms The problem of producing infeasible chromosome can be solved by incorporating problem-specific For CARA problem, we have proposed two heuristic repair methods. The first one, called Restricted Search Operator (RSO), is to choose a certain number of servers in each service area based on the The second one, called Heuristic Replica Repair (HRR) algorithm is proposed to control the replica placement constraints. Restricted search operator (RSO) In many combinatorial optimization problems, there is a need to deal with chromosomes that hold a specified number of ones. for example the problem of selecting a subset of candidate locations to find the optimal solution, as in CARA problem. In this case, the individuals, which have more or less than the specified number of ones, will be rejected. Thus. Restricted Search Operator (RSO) is proposed to deal with this constraint. According to . RSO was firstly proposed by Salcedo-Sanz in 2004. next it is considered an extra operator added to the conventional GA. The idea of RSO is that the individual generated from crossover and mutation operators ycu may hold ycy ones. These ones may be different from the desired number ycA, thus, if ycy < ycA the RSO adds ycA Oe ycE ones in randomly chosen positions. On the other side, if ycy > ycA the RSO selects ycE Oe ycA ones randomly to be removed from the binary vector. The following Pseudo-code describes the Restricted Search Operator. Figure 2. Restricted search operator . Int J Elec & Comp Eng. Vol. No. February 2019 : 369 - 385 Int J Elec & Comp Eng ISSN: 2088-8708 In CARA problem, the restricted search algorithm is adapted to choose a certain number of servers where each server is represented as a string of binary values instead of a set of genes. In other words, the restricted search operator in the resource allocation case will work on a set of bulks instead of a set of genes. each bulk in this case consists of a set of sequential genes. As will be explained through this paper, each potential location yc in the service area yca consists of a set of sequential genes starting from the gene number (. ca Oe . O ycEyayca . c Oe . ) O ycA 1 and ending with the gene number (. ca Oe . O ycEyayca . c Oe . ) O ycA ycA. The restricted search operator represents the constraint 16 as follows: depending on the predefined number of locations ycE, the restricted search operator will choose randomly P bulks in each service area to be After that, the rest bulks in the service area will be change to zero. Thus, if the value of all cells for any location equal zero, this location will not be selected to be server. After applying the restricted search operator (RSO), which determines the number of selected servers. HRR algorithm checks each chromosome by counting the number of replicas for each video in the selected servers that are produced from RSO process. Hybrid genetic algorithm for CARA problem (HGA_CARA) To solve CARA problem, the genetic algorithm is hybridized with two repair algorithms. In this hybrid model, the objective function is to minimize the hosting cost . with respect to the listed In CARA problem, there are a set of candidate locations to construct servers in each service area. The aim is to select a subset of these locations to replicate the contents over them based on predefined For this problem, the binary encoding for the chromosomes is exploited. In such binary encoding, the chromosome is represented as a string of ones and zeros. This binary string represents the candidate solutions, in which each cell indicates for the replica position in the service area. The total length of the chromosome is A O PLa O N where A represents the number of service areas. PLa indicates to the number of candidate locations in the service area, and N denotes the number of contents. For instance, if the delivery network comprising of 2 service areas with two potential locations for each and the number of contents equals to 5, then the total length of chromosome equals to 20 genes, as depicted in Figure 3. Each cell in the chromosome holds a binary value xija OO . that indicates whether or not an object i is replicated at the server j inside the service area a. The position of any gene can be identified according to the following . cUycnycyca ) = (. ca Oe . O ycEyayca . c Oe . ) O ycA ycn Figure 3. Chromosome representation for CARA problem For example, the gene 13 holds a replica for the 3rd content into the first server in the second service area yca according to Figure 3. We can find this gene using equation . cu3,1,2 ) = (. Oe . O 2 . Oe . ) O 5 3 = 10 3 = 13. The potential location yc in the service area yca holds the genes from (. ca Oe . O ycEyayca . c Oe . ) O ycA 1 to (. ca Oe . O ycEyayca . c Oe . ) O ycA N. For instance, if there are 20 contents to be replicated over a delivery network with 5 service areas, each area contains 4 servers. The total length of the CARA problem chromosome will be 4*5*20 = 400 genes. The third server in the second service area will be represented by the genes from the position 121 to the position 140. The initial population of this problem is generated randomly by selecting a random binary number for each gene in the chromosome. During the evolution process, the roulette wheel selection technique is implemented to select two chromosomes at each time. The uniform crossover operator with a standard crossover probability is applied on the selected pair. After that, the reproduced offspring are mutated, with an extremely small mutation probability, to ensure the diversity in the search space. During the reproduction process, the new offspring may lie in the region of unfeasible solutions. therefore, two heuristic repair algorithms are used to repair this offspring to be obeyed by the proposed constraints as depicted in Figure 4. Content-aware resource allocation model for IPTV delivery networks (Suliman M. Fat. A ISSN: 2088-8708 Figure 4. Hybrid GA for CARA problem (HGA_CARA) THE EXPERIMENTAL RESULTS To evaluate the performance of the proposed content-aware resource allocation model (CARAM), an experiment using the proposed evolutionary approach HGA_CARA is conducted. Here, the numerical results are reported. The experiments have been conducted on a system model of 40 servers divided among 10 service areas and 100 video contents. The predefined number of servers P equals 2. This means that the model has to choose two potential locations among four locations to become servers. These potential locations are associated with the cost of storage, processing power, and bandwidth resources. Moreover, the video contents have to be replicated on the selected servers in each service area during the experiment. In this section, the effect of content status on the resource allocation process is investigated. The effect of content status is studied using two popularity distributions. The aim of using two popularity distributions in this experiment is to study the impact of the change in the popularity distributions on resource allocation process. This impact is measured in terms of storage. CPU consumption, bandwidth, and the number of popular contents. The popularity distributions have been sampled using two different equations. The first popularity distribution follows the normal density function . This popularity distribution has been sampled for 100 contents and depicted in equation . We refer to this distribution by the symbol (DF). ycyycn = yce Oe(. cnOe. OI) /2 Oe(. cnOe. OI)2 /2 Ocycu ycn=1 yce The other popularity distribution has been sampled for the same content size according to ZipfAos law Like the symbol k in Zipf distribution, the symbols OI represents for the degree of skewness in the popularity distribution (DF). The expected load of contents is estimated using the proposed IPTV content status model for the two popularity distributions. Figure 5 depicts the two popularity distributions with different skewness values 0. 01, 0. 04, and 0. According to this figure, the curve of small skewness values . 01 and 0. indicates that the contents popularity are distributed in an uniform manner, which means that all contents have the same popularity However, the curve tends to be more skewed when the skewness value increased. Notably. DF distribution tends to be more skewed than ZipfAos distribution in the figure when the skewness value increased . 08 as in the figur. Consequently, the number of popular contents in DF distribution is less than Int J Elec & Comp Eng. Vol. No. February 2019 : 369 - 385 Int J Elec & Comp Eng ISSN: 2088-8708 the number of popular contents in ZipfAos distribution. For instance. Figure 4 shows that 36% and 66% of contents are popular for DF and ZipfAos distributions, respectively. The same thing can be observed in Figures 6 and 7 where Figure 6 shows that only 6% of the contents are popular at the skewness value equals 1 for DF distribution, whereas the other contents having the popularity value zero. Figure 5. The popularity distributions with different skewness values Figure 1. DF popularity distribution with different skewness values On the other side, at the same skewness value. Figure 7. shows that more than 16% of contents are popular with a long tail of the curve reaches 70% of contents for ZipfAos law distribution. The popularity value of these contents is low, but it does not equal zero. That means that ZipfAos distribution replicates more contents than DF distribution. Therefore, it can be concluded that at the higher skewness values, a small percentage of contents are However, in DF distribution, this percentage is less compared to ZipfAos distribution. This behavior can be interpreted as follows. In the DF distribution, the popularity degree divided among the contents in a smooth manner wherein the contents popularity values are close to each other at the low skewness degree. However, when the skewness value tends to increase, the disparity in popularity degree of contents becomes Unlike DF, the popularity curve starts with popular contents and then cascaded down gradually. On the other side, the DF distribution looks similar to ZipfAos distribution at the lower skewness degree. However unlike the ZipfAos distribution, the DF popularity curve lacks the smooth gradient at higher skewness In general, the popularity degree of popular contents is higher in DF distribution than of those in ZipfAos distribution at the same skewness values. Figure 7. ZipfAos popularity distributions with different skewness values Content-aware resource allocation model for IPTV delivery networks (Suliman M. Fat. A ISSN: 2088-8708 Based on this observation, the load of the popular contents in DF distribution is higher than the load of the same popular contents in Zipfs distribution at the same degree. This behavior of load distribution is depicted in the Figures 8. where the load of the first content is 800 and 5000 for Zipf and DF distributions, respectively. As mentioned through this thesis, the popularity distribution is not the only factor that affects the anticipated load distribution. However, it is the most influential factor. Thus, the anticipated load distribution of the contents takes the same behavior of the popularity distribution, but with some Figure 8. depicts the load distribution for DF distribution at low skewness values. This figure depicts that the popular contents take larger workload portion than the other contents. This is due to the highly skew of DF distribution as pointed out in Figure 6. Figure 8. Expected load according to DF distribution Accordingly, most of the incoming workload will be redirected to a small portion of contents. Unlike ZipfAos law, the load distribution following DF distribution will be more skewed than ZipfAos Such distribution follows the popularity skewness. For instance, a small percentage of contents utilize a higher portion of the load where these popular contents take a higher popularity degree. On the other side, the other contents having very low popularity degrees take the rest portion of the load, which is very low or in many cases equals zero. On the other hand, the expected load of contents that follow ZipfAos law distribution will be distributed more smoothly than DF distribution as depicted in Figure 9. However, definitely the popular contents will take the lionAos share. Figure 9. Expected load according to ZipfAos distribution Based on the different skew behaviors of these two popularity distributions, the effect of content status on the aspect of resources allocation process with accordance to these two popularity distribution is The effect of content status on resource allocation process is investigated and reported below in terms storage, bandwidth, and the number of popular contents. The expected load at different skewness values in the service area is depicted in Figure 10. In this figure, the expected load of DF distribution is higher than the expected load of Zipf distribution. This is due to that the expected load of contents in DF distribution is higher than the contents in Zipf distribution. Int J Elec & Comp Eng. Vol. No. February 2019 : 369 - 385 Int J Elec & Comp Eng ISSN: 2088-8708 Figure 10. The total load for one service area What is apparent from the same figure is that in case of the expected load of contents following Zipf distribution, the expected load increases linearly when the skewness value increases. On the other side, the expected load of contents following DF distribution follows the same behavior at the low skewness values. However, at the higher skewness values, the expected load alters its behavior inversely as it decreases when the skewness value increases. The behavior of DF can be interpreted by the sharp disparity in the popularity degree at higher skewness values. Such sharp disparity leads to store very small percentage of contents with high popularity values. Moreover, in case of redirecting any content with lower popularity, a significant portion of the workload can be redirected. Such redirection leads to decrease the load on the servers. On the other side, the contents with low popularity values may be redirected to other service areas. Redirecting these contents leads to reduce the load on servers. Such reduction in load interprets the dwindling in the total load of DF distribution at higher skewness values. Furthermore, the required bandwidth of any content in the service area is a function of both the active streams at the server and the bit rate of the streamed contents. Thus, the expected bandwidth for the delivery network follows the workload distribution if we assume that the bit rate of all contents is fixed. The expected bandwidth for one service area for bit rate=9. 8 Mbps is depicted in Figure 11. Figure 12 plots the exploited storage versus different skewness degrees for both popularity The figure shows that the exploited storage space tends to decrease when the skewness degree However, the storage space of contents following DF distribution decreases rapidly more than the storage space of those content following Zipf distribution. As depicted in Figure 12, the storage space of contents following Zipf distribution decreases from 58 GB to 44 GB while the storage space of contents following DF distribution decreases from 82 GB to 22 GB for the skewness values . 01Ae. Figure 11. The expected bandwidth for one service area Figure 12. The exploited storage versus different skewness degrees Content-aware resource allocation model for IPTV delivery networks (Suliman M. Fat. A ISSN: 2088-8708 This behavior can be interpreted as follows. In case of DF distribution, the contents are close to each other in their popularity, which leads to replicate more contents in the service area. Moreover, the number of replica of each video is high because of the popularity values are closer to each other than ZipfAos distribution at the same low skewness values, which leads to increase the number of replica for the replicated contents, as depicted in Figure 13. Consequently, the exploited storage space becomes high. From the same Figure, the number of replicas for Zipf distribution-based replicated contents is less than that of DF distribution, which leads to exploit a less storage space for Zipf distribution contents. In contrast, the exploited storage space for both types is decreased at the higher skewness value . at skewness values equals 1 and . due to that only very small percentage of contents will be replicated in the service area, as depicted in Figure 14. However. DF exploits a storage space slightly less than Zipf distribution at higher skewness values. This is due to that DF replicated a percentage of contents slightly smaller than the percentage of contents replicated by ZipfAos distribution. Finally, the number of stored contents in the service area is measured and depicted in Figure 15. From this figure, it can be noticed that the number of stored contents in the service area decreases when the skewness value increases. However, the decreasing for DF distribution is larger than that for Zipf The interpretation for this phenomenon refers to the fact that the skewness in DF distribution is higher than the skewness of Zipf distribution. Figure 13. The number of replica at low skewness values (=0. Figure 14. The number of replica at high skewness values Figure 15. Percentage of popular contents for both distributions Int J Elec & Comp Eng. Vol. No. February 2019 : 369 - 385 Int J Elec & Comp Eng ISSN: 2088-8708 CONCLUSION Integrating the replica placement with resource allocation is a necessity to cope with the fluctuation of demand on IPTV services. In this paper. Content-Aware Resource Allocation Model (CARAM) is CARAM model considers the contentsAo status to decide the required resources. For this purpose, the replica placement problem and the resource allocation problem have been integrated aiming to allocate the resources depending on the necessitated replication scheme. The content-aware resource allocation problem is studied and formulated as a constrained binary optimization problem. The proposed model has been solved using Hybrid Genetic Algorithm. The experiments have been conducted on two popularity The experimental results showed the impact of content status on the resource allocation Thus, considering the contentAos status during the resource allocation process helps in allocating the resources according to the demand without wasting the resources. The future work for this direction will focus on building online optimization method for allocating the resources on the fly. Such online optimization helps in allocation the resources according to the real demand. REFERENCES