Wednesday 18 April 2012

Greedy Algorithm technique

Hi Friends, we shall discuss greedy technique today.

Greedy Algorithm:  Greedy algorithm technique is very simple, straight towards the goal.  Every step you take to reach the includes take best possible step available that moment.  In Greedy once we move from step1 to step2 changing the some values, the values remain persistent.  No matter what, we are not going to change it.

Greedy algorithm need not produce the best optimal solution.  Many a times we find it producing incorrect results, but on applying any other technique might give us the best solutions (algos like DP or BackTracking).
Anyhow in certain problems, if certain conditions are met, we can safely use Greedy.  For example Change-Making problem.  In this case we are given the available denominations of some dollars or any currency and we are expected to produce change for some amount.  Greedy technique is applicable, if each of the smaller denomination is a factor of larger denomination.  Don't worry if that's confusing we'll revisit this problem shortly.

Knapsack Problem: In this problem a thief breaks into a store and finds lot of items precious to cheap goods.  Unfortunately his knapsack bag is of limit capacity.  The limitation to the bag is, it can't carry weight above some weight say 'W'.  Obviously the thief should be choosy in selecting the items to steal.  And to his surprise goods are having different weights with different values.  We as a generous people are going to help the thief with some logic, so let's help him.

Remember as I said, Greedy technique in most of the cases doesn't produce optimal solution.  Optimal solutions could only be obtained for certain problems and under certain conditions.  Here too the optimal solution is possible if the problem has one more condition i.e., it should be a Fractional Knapsack Problem.


There are two variants to Knapsack problem.
1. Fractional: Fractional part of the goods could be taken.  If the items are some liquid or powder etc.
2. 1-0: In this case no fractional part of goods are allowed.  Either you need to take the object as a whole or leave it all together.

Thus Let the question be given this way:
Capacity of knapsack be: W = 20 kgs.
Number of items be: 4
Item ids                :   1,     2,     3
Let their weight be: 10,    5,    15
Let their values be: 60,   35,   45
ration                   :    6,    7,     3   ----------------------->  These values gets populated in step1.
Step 1: Let's take the ratio of value/weight
           result:  6,7,3
Step 2: Sort the ratios in descending order.
           Thus first place is occupied by item-2 then item-1 and then item-3.
Step 3: Choose items with highest ratio from the available list.  Fill the Knapsack till the item gets exhausted or  we run out of knapsack
Step 4:  If  knapsack still has space to store any item, remove the item we just filled from available list and loop back to step3,
else if knapsack runs out of the space go to step 5.
Step 5. exit.

Time Complexity: O(n).

Activities and Practice Room: Here is an other example of Greedy Technique.  Imagine a practice room for a Dance practice.  Since the room is of limited size, it was understood that no two activities can take place in the room at a time.  i.e., given a time only one activity can be conducted (by Activity i mean Dance).  Thus for a given time period and given activities with there start and end times, our target is to find maximum number of activities that can be conducted in that room.

For example, let the input given to us be:

Activity           Start                   End
A1                  2                         6
A2                  5                         8

here A1 and A2 can't be conducted, since we have an hour intersection between A1 and A2.

Let's try to solve this problem bit differently.  We'll try to assume different ways and find if there is any disproof is available.  Note: In case you directly want to see the correct ans. please refer to the 'Correct Approach' section which would be placed after all the assumptions.

Assumption 1 :  We'll take starting timings of all the activities and sort them.  Then we'll select activities that starts first then next early available activity(which doesn't intersect with the already selected task), and so on.

Is this procedure an appreciable approach, let's see:
Disproof:  If the first activity occupies all the other activity time slots. We'll be left with just one activity.  Below is the diagram explaining this issue.

In the above case as we have decided to sort the start time activity, and pick the least start time activity.  We need to choose Activity 'A',  but on preferring 'A' we end up not picking up B,C,D.  Clearly if we would have chosen B,C & D we would have got maximum activities selected.  Thus this approach is not a promising one.

Assumption 2: 

Let us select least sized activity, in this case we end up with smaller activities which has lesser probability of occupying other activity time space.
In this assumption, our task is quite simple, we'll sort all the activities with on basis of their length.  And we'll choose least time length activity to high time length.

disproof:  Imagine if we have 3 activities 2 being large and 3rd being small.  let the larger activities be A and B and let the 3rd activity be C.  In case if C starts before A ends and also if C ends after B starts.  In this case we can neither select A nor B, it is possible if we drop C we might pick both A and B.  The image below illustrates this case.

Thus we can't go with this approach either.

Correct Approach: We'll sort based on the finishing time of activities.  we can blindly place the first activity from the sorted list. Then we'll select the second activity based on the condition that the starting time of the activity to be chosen should be greater than the finishing time of the activity just placed.  Thus we'll loop through the entire list to return the maximum possible activities which can be conducted in the given room with mutually exclusive condition. 

code snippet:

startTime[]=<input>;
endTime[]=<input>;
actName[]=<input>;

sort(endTime, startTime,actName);  //  Though we are sorting on finish time corresponding swapping need to                                                  //be done for activity names and start times.
List activities=......................; // your choice of list can be arraylist, vector, or Linked list
activities.add(actName[0]);
int temp=endTime[0];
for(int i=1;i<actName.length;i++){
     if(startTime[i]>temp)
     {
         activities.add(actName[i]);
         temp=endTime[i];
     }
}

return activities;

Time Complexity: O(n Log n)  ---------------> worst case for sorting
                            +  
                            θ(n) -----------------------> for looping.
thus basically the time complexity is O(n log n).


No comments:

Post a Comment