Thursday, December 8, 2016

Algorithms : Design Paradigms

Algorithms can be classified according to their design paradigm. Some common paradigms include:
  • Exhaustive search. This is the naive method of trying every possible solution to see which is best.

  • Divide and conquer. A divide and conquer algorithm repeatedly reduces an instance of a problem to one or more smaller instances of the same problem until the instances are small enough to solve easily.

  • Dynamic programming. When a problem shows optimal substructure, meaning the optimal solution can be constructed from optimal solutions to subproblems, and has overlapping subproblems, dynamic programming avoids recomputing solutions that have already been computed. Dynamic programming and memoization go together: using memoization (maintaining a table of subproblems already solved), dynamic programming reduces the exponential nature of many problems to polynomial complexity. The main difference between dynamic programming and divide and conquer is that subproblems are more or less independent in divide and conquer, whereas subproblems overlap in dynamic programming.

  • The greedy method. A greedy algorithm is similar to dynamic programming, but the difference is that solutions to the subproblems do not have to be known at each stage; instead a "greedy" choice can be made of what looks best at the moment.

  • Linear programming. When solving a problem using linear programming, specific inequalities involving the inputs are found and then an attempt is made to maximize or minimize some linear function of the inputs.

  • Reduction. This technique involves solving a difficult problem by transforming it into a better known problem for which we have asymptotically optimal algorithms. The goal is to find a reducing algorithm whose complexity is not dominated by the resulting reduced algorithm's.

  1. Randomized algorithms are those that make some choices randomly (or pseudo-randomly); for some problems, it can in fact be proven that the fastest solutions must involve some randomness. There are two large classes of such algorithms:
    1. Monte Carlo algorithms return a correct answer with high probability (e.g. RP is the subclass of these that run in polynomial time).
    2. Las Vegas algorithms always return the correct answer, but their running time is only probabilistically bounded, e.g. ZPP.
  2. In optimization problems, heuristic algorithms do not try to find an optimal solution, but an approximate one, for use when the time or resources available are too limited to find a perfect solution. Examples include local search, tabu search, and simulated annealing, a class of heuristic probabilistic algorithms that vary the solution of a problem by a random amount. The name "simulated annealing" alludes to the metallurgical term for heating and cooling metal to achieve freedom from defects. The purpose of the random variance is to find solutions close to the global optimum rather than merely local optima, the idea being that the random element is decreased as the algorithm settles down to a solution. Approximation algorithms are heuristic algorithms that additionally provide some bound on the error. Genetic algorithms attempt to find solutions to problems by mimicking biological evolutionary processes, with a cycle of random mutations yielding successive generations of "solutions"; thus they emulate reproduction and "survival of the fittest". In genetic programming, this approach is extended to algorithms themselves, by regarding the algorithm as a "solution" to a problem.
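The contrast between divide and conquer and dynamic programming can be illustrated with Fibonacci numbers (a standard textbook example, not from the notes above): the naive recursion recomputes overlapping subproblems exponentially many times, while memoization solves each subproblem once.

```python
from functools import lru_cache

def fib_naive(n):
    # Plain recursion: the subproblems fib(n-1) and fib(n-2) overlap,
    # so the same values are recomputed exponentially many times.
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Memoization: a table of already-solved subproblems turns the
    # exponential recursion into a linear-time computation.
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)
```

Both return the same values; only fib_memo stays fast as n grows.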

Monday, August 8, 2011

Algorithms : Divide and Conquer


A divide and conquer algorithm repeatedly reduces an instance of a problem to one or more smaller instances of the same problem until the instances are small enough to solve easily.



An equal number of nuts and bolts of various sizes are given. Every bolt has a matching nut in the set. Comparison of a bolt with a bolt, or a nut with a nut, is prohibited. How do we make the nut-bolt pairs in the shortest possible time?

We can use divide and conquer here. We cannot sort only the nuts (or only the bolts) according to size because nut-nut (bolt-bolt) comparison is prohibited.

So we try to match a nut against the bolts. Three conditions arise:
a. It's a match.
b. The bolt is smaller.
c. The bolt is bigger.

Now we can divide the bolts into two parts: the smaller ones and the bigger ones. The matching nut and bolt (the pivot elements) are the first pair.


Now try to match the pivot bolt against the remaining nuts. Two conditions arise:
a. The nut is smaller.
b. The nut is bigger.

Now we can divide the nuts into two parts: the smaller ones and the bigger ones. The set of smaller nuts has its matches in the set of smaller bolts, and likewise for the bigger sets.


So after ~2n comparisons (assuming n nuts and n bolts) the bigger problem is divided into two smaller problems of about half the size (~n/2).



Hence the time complexity is:
T(n) = 2T(n/2) + O(n) ~ O(n lg n).
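A minimal Python sketch of this procedure, where integer sizes stand in for the physical nuts and bolts and every comparison pairs a nut with a bolt (all names here are illustrative):

```python
def match_pairs(nuts, bolts):
    # Quicksort-style matching: pick a pivot nut, partition the bolts
    # against it, then partition the remaining nuts against its bolt.
    if not nuts:
        return []
    pivot_nut = nuts[0]
    small_bolts = [b for b in bolts if b < pivot_nut]
    big_bolts   = [b for b in bolts if b > pivot_nut]
    pivot_bolt  = next(b for b in bolts if b == pivot_nut)  # the first match
    small_nuts = [x for x in nuts[1:] if x < pivot_bolt]
    big_nuts   = [x for x in nuts[1:] if x > pivot_bolt]
    return (match_pairs(small_nuts, small_bolts)
            + [(pivot_nut, pivot_bolt)]
            + match_pairs(big_nuts, big_bolts))
```

As with quicksort, taking the first nut as pivot can degrade to ~n² comparisons on unlucky inputs; a randomly chosen pivot keeps the expected count at O(n lg n).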


Monday, August 1, 2011

Simple Algorithms - Find duplicates in a list

Let's take an array of n integers. No integer repeats itself except one.

Given:
a. If n is even then the repeating integer is repeated n/2 times. (That is, the count of the repeating integer equals the count of non-repeating integers.)

b. If n is odd then the repeating integer is repeated (n-1)/2 or (n+1)/2 times. (That is, the count of the repeating integer is one greater or one less than the count of non-repeating integers.)

c. The repeating integer is never placed side by side in the array.

How do we find the repeating integer?


The general method of finding the repeating integer is to build a binary search tree from the given integers. Whenever we encounter a duplicate integer, we can stop building the tree and return the repeating integer.

For the given special case we can use the same algorithm. There are many possible permutations of the input array. If the first and third integers are the same, then only three integers need to be inspected to find the repeating element.

So, what is the worst case for the above problem? Let's create a permutation where the second occurrence of the repeating integer is as far as possible from the starting point.


If the count of the repeating integer is one more than the non-repeating integers, the permutations look like the one given below.

i. start here->1,2,1,3,1,4,1,5,1,6,1,7,1 ---- We can identify 1 as repeating integer after inspecting 3 integers.

If the count of the repeating integer is equal to the non-repeating integers, the permutations look like the two examples given below.

ii. start here->1,2,1,3,1,4,1,5,1,6,1,7 ---- We can identify 1 as repeating integer after inspecting 3 integers.

iii. start here->2,1,3,1,4,1,5,1,6,1,7,1 ---- We can identify 1 as repeating integer after inspecting 4 integers.


If the count of the repeating integer is one less than the non-repeating integers, the permutations look like the examples given below.

iv. start here->1,2,3,4,1 ---- We can identify 1 as repeating integer after inspecting 5 integers.

v. start here-> 2,3,1,4,1 ---- We can identify 1 as repeating integer after inspecting 5 integers.

Now we can generalize. If the count of the repeating integer is two less than the non-repeating integers, the permutations look like the examples given below (not limited to these).

vi. start here-> 1,2,3,4,5,1 ---- We can identify 1 as repeating integer after inspecting 6 integers.

vii. start here-> 2,3,4,1,5,1 ---- We can identify 1 as repeating integer after inspecting 6 integers.

So for the given problem we have to inspect at most 5 integers from the array.



A general formula:

If the count of the repeating integer is x less than the non-repeating integers, then we have to inspect at most x+4 integers from the array, for n greater than or equal to 5.
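A minimal sketch of the scan itself, using a Python set in place of the binary-tree bookkeeping described above (the function name is illustrative):

```python
def find_repeating(arr):
    # Scan left to right; the first value seen twice is the repeating
    # integer. Under the constraints above this inspects at most x + 4
    # elements, where x is how many fewer copies the repeater has than
    # the non-repeating integers.
    seen = set()
    for v in arr:
        if v in seen:
            return v
        seen.add(v)
    return None  # no duplicate found
```

Running it on the worst-case permutations above confirms the bound: each returns the repeater well before the end of the array.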

Sunday, July 17, 2011

Algorithms and heuristics


Given an array of 9 numbers containing numbers between 1 and 10 with no repetition, one number is missing. How do we find the missing number?


Let the numbers in the array be 1,2,4,6,3,7,8,10,9 (9 numbers in total, without repetition). I will try to solve this using the following algorithm that randomly came to my mind.

get_missing_number_1(array){
    array = sort(array);
    for(i = 0; i < 8; i++){              // 9 elements: compare adjacent pairs
        temp = array[i+1] - array[i];
        if(temp > 1) return array[i] + 1;
    }
}


To make things easy, here is a dry run of this algorithm. Let the array after sorting be 1,2,3,4,6,7,8,9,10; only the number five is missing. For consecutive numbers in the array, 2-1 is 1 and 3-2 is 1, but 6-4 is 2, so 5 is missing.

But there exists a very simple method to find the missing number. Just do this:

get_missing_number_2(array){
    sum_array = sum of all numbers in array;
    sum_1_to_10 = sum of all numbers from 1 to 10;
    missing_number = sum_1_to_10 - sum_array;
    return missing_number;
}

So, did we waste our time in the analysis of the first algorithm?

The second method uses a heuristic and exploits particular properties of the problem, but a heuristic is less likely to survive when the problem is generalized. Think: if the array size is 8, that is, two numbers are missing, which of the above algorithms can be modified to find both numbers?

The complexity of get_missing_number_1 is O(n log n) and of get_missing_number_2 is O(n), but the first one is more generic.

Now the question arises whether there exists an algorithm that finds the missing number for the generalized problem in linear time; the algorithm get_missing_number_1 discussed above has time complexity O(n log n).

So, let's do divide and conquer. The image below is self-explanatory: the method is to divide the problem by partitioning around the midpoint of the range and recursing into the half that is short one element.



The time complexity comes out to be n + n/2 + n/4 + ... ~ O(n) (assuming each partition divides the problem into approximately two equal parts).
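The partition method can be sketched as follows (a hypothetical find_missing; the counting test on each half is the partition step described above):

```python
def find_missing(arr, lo, hi):
    # arr holds every integer in [lo, hi] except one, in any order.
    if lo == hi:
        return lo                       # range of one value, and it is absent
    mid = (lo + hi) // 2
    left  = [v for v in arr if v <= mid]
    right = [v for v in arr if v > mid]
    if len(left) < mid - lo + 1:        # the lower half is short one element
        return find_missing(left, lo, mid)
    return find_missing(right, mid + 1, hi)
```

Each level of recursion scans half as many elements as the previous one, giving the n + n/2 + n/4 + ... ~ O(n) total above.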

With a slight modification this algorithm can be used to find multiple missing numbers; for that we have to follow every branch that is missing numbers.

Thursday, July 7, 2011

Simple Algorithms - Generate a Series

Let's see how to generate the series 0,1,2,...,100,99,...,1,0,1,2,... with an additional condition: using only one variable.

First of all we must check the possibility of such an algorithm. Let's make an analogy (refer to the figure below). A person runs from a pillar marked 0 towards another pillar marked 100; the distance between them is 100 meters. We take a variable called currentPosition to denote the position at any time. The value of currentPosition changes like 0,1,2,3,4,...,99,100. Now the person returns to the pillar marked 0. We can use the same variable to denote the current position. Here the direction of movement is different, but we need not add another variable to denote direction: if the value of currentPosition is positive the person is moving from 0 to 100, and if it is negative the person is moving from 100 to 0. By this analogy we can conclude that it is possible to generate the series 0,1,2,...,100,99,...,1,0,1,2,... using a single variable.
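One possible sketch of the sign trick in Python (the function name and term count are illustrative): the single variable's absolute value is the position, and its sign is the direction.

```python
def oscillating_series(n_terms):
    # Generate 0,1,...,100,99,...,1,0,1,2,... with one state variable.
    # Non-negative pos = moving towards 100; negative pos = moving back to 0.
    out = []
    pos = 0
    for _ in range(n_terms):
        out.append(abs(pos))
        if pos == 100:
            pos = -99     # reached the far pillar: turn around
        else:
            pos += 1      # up when positive, down (towards 0) when negative
    return out
```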

You can try this algorithm yourself.


Also, try to generate the series 0,1,2,...,100,99,...,1,0,1,2,...,199,200,199,198,...,2,1,0 using only one variable. Is that possible?

Simple Algorithms - Swapping Values

We have three integer variables a, b, c. What are the steps to move the value of a to b, b to c, and c to a without using any extra variable?

To solve this we will first find a sub-problem; this will make the task easier. A very simple sub-problem is to swap the values of two integer variables without using a third variable. Let a and b be two variables. The solution is:

swap(a,b) {
a=a+b;
b=a-b;
a=a-b;
}

The proof (a0 and b0 denote the original values of a and b):

a = a + b;    // a now holds a0 + b0                      ---- (1)
b = a - b;    // b = (a0 + b0) - b0 = a0, using (1)       ---- (2)
a = a - b;    // a = (a0 + b0) - a0 = b0, using (1), (2)


Now, if we have three variables a, b, c, to move the value of a to b, b to c, and c to a we can follow these steps:

swap(a,b);
swap(a,c);
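A Python sketch of the whole rotation (Python returns the new values rather than mutating the caller's variables, but the arithmetic itself uses no extra storage):

```python
def rotate3(a, b, c):
    # Move a -> b, b -> c, c -> a using only arithmetic swaps.
    a = a + b; b = a - b; a = a - b    # swap(a, b): now a = b0, b = a0
    a = a + c; c = a - c; a = a - c    # swap(a, c): now a = c0, c = b0
    return a, b, c
```

Note that in a language with fixed-width integers the intermediate sums can overflow; the XOR variant (a ^= b; b ^= a; a ^= b) avoids that.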


So, what could the algorithm be if we have four integer variables a, b, c, d?