Showing posts with label Coding. Show all posts
Showing posts with label Coding. Show all posts

Tuesday, April 17, 2012

Compiler's Optimization Techniques - Virtual Inheritance - Part 1

Inheritance is one of the pillars of Object Oriented Programming. Many high level languages have different implementations for inheritance and also they support wider ramification of inheritance. Despite these differences, the core implementation structure of inheritance converges to an accepted design which is common across all languages. The advantage of diving deeper into understanding the implementation details of inheritance is that it imparts greater confidence in you to be able to exactly figure out what compiler is trying to achieve.

The reason behind writing this article is, out of all the high level programming languages C++ stands out by supporting multiple inheritance, which none of the other high lever languages support in it's native form, but rather as interfaces. As a programmer it is always good to understand your compiler and its optimization techniques which ultimately helps you to write efficient code.

The main article will be presented as two separate posts, and this being Part-1 of the article, discusses primitive inheritance technique ie single level inheritance, object layout, memory allocation, implicit class pointer conversion, and pointer offsetting. and Part - 2 of the article provides comprehensive insight into implementation details of multiple and virtual inheritance.

Part - 1
Consider two simple classes, " Mobile_OS " and " WindowsPhone" which is derived from "Mobile_OS" with a public access specifier.

 class Mobile_OS  
 {  
 public:  
   float iKernalVersion;  
   float iReleaseVersion;  
   char* strVendor ;  
   
   virtual float GetKernalVersion();  
   virtual float GetReleaseVerison();  
   Mobile_OS();  
   ~ Mobile_OS();  
 };  
   
 class WindowsPhone : public Mobile_OS  
 {  
 private:  
   char* strCodeName ;  
   char* iHardwarePlatform ;    
   
 public:  
   int iCustomRomVersion ;  
   float GetKernalVersion();  
   float GetReleaseVerison();  
   WindowsPhone();  
   ~ WindowsPhone();  
 };  

The blue print of  "WindowsPhone" class lays out all the non virtual data members in the order of their declaration starting from the base class data members, followed by derived class data members. as shown in the diagram below.
 
Why this object layout is preferred for derived class?
The C++ Standards Committee allows object layout of any ordering of data members each separated by access declarator. But VC++ and most of the other compilers ensures that the objects are laid out in the order of declarations, with derived class data members following the base class data members.

What optimization does compiler achieve with this layout?
Derived class inherits all the public properties and behaviors of base class.  The complete instance of base class's data members are contained within the derived class address space. 
By placing the base class "Mobile_OS" at the starting address of a derived class "WindowsPhone" ensures that the address of the base object  "Mobile_OS" within derived object "WindowsPhone" corresponds to very first byte of  "WindowsPhone".  And hence this layout avoids offset calculations for  base object data access with in derived object.

How does offset calculation is avoided ?
 Mobile_OS* Lumia = new WindowsPhone();  
 Lumia->iKernalVersion;  
Consider the code snippet above, here base class data member "iKernalVersion" is accessed from "Lumia" object pointer whose static type is "Mobile_OS*" but dynamic type is "WindowsPhone*".
Since the implicit upcasting form "WindowPhone" to "Mobile_OS" is succesfull in this case, compiler just extracts value of "iKernalVersion" based on "Mobile_OS" layout, and hence no offset calculations required in this case. So to get to the base class data member compiler just needs to compute

&DataValue = DerivedClass_StartingAddress + Offset_of_DataMember_Within_BaseClass

What if  base class data members followed derived class data members ?
Consider the same code snippet specified above, and if at all base class data members followed derived class data members, accessing any data member of the base class would have been an  overload with offset computation. So to get to the base class data member compiler needs to compute.

&DataValue = DerivedClass_StartingAddress + Offset_To_BaseClassObject_Within_DerivedClass  +Offset_of_DataMember_Within_BaseClass 

I hope this provides convincing explanation about object layout and its advantages in case of single inheritance. And as mentioned earlier Multiple and Virtual Inheritance implementations will be discussed in Part-2 of the article. Kindly revert back if any comments or suggestions.

Thursday, January 19, 2012

Code Sprint - 2 , Algorithmic Problems

This post is for all the self proclaimed geeks who love resolving complex algorithmic problems. Below are the questions  from  code sprint 2 an online coding contest. Time to overclock your brains. Happy coding.

The Living History
Coin toss
You have an unbiased coin which you want to keep tossing until you get N consecutive heads. You've tossed the coin M times and surprisingly, all tosses resulted in heads. What is the expected number of additional tosses needed until you get N consecutive heads? Input: The first line contains the number of cases T. Each of the next T lines contains two numbers N and M. Output: Output T lines containing the answer for the corresponding test case. Print the answer rounded to exactly 2 decimal places. Sample Input : 4 2 0 2 1 3 3 3 2 
Sample Output : 6.00 4.00 0.00 8.00

Permutation
This is an approximation solution type problem. Please read about this type of problem at the bottom before attempting. Given n, print a permutation(p) of (0,1,2...n-1) From the permutation p, you can create n-1 (x,y) coordinates, where x and y are consecutive pairs in the permutation. You are also given the n x n square matrix V. For each coordinate, you can find an associated value in the matrix V. (x and y are 0-indexed) 
Task : Given n and V, find the permutation p that maximizes the sum of the associated values of the consecutive pair coordinates Constraints: n <= 50 Input: First line contains n Next n lines contains n integers, forming the matrix V 
Output : Space separated permutation of (0,1,2....n-1)

Polygon
There are N points on X-Y plane with integer coordinates (x i , y i ). You are given a set of polygons with all of its edges parallel to the axes (in other words, all angles of the polygons are 90 degree angles and all lines are in the cardinal directions. There are no diagonals). For each polygon your program should find the number of points lying inside it (A point located on the border of polygon is also considered to be inside the polygon). Input: First line two integers N and Q. Next line contains N space separated integer coordinates (x i ,y i ). Q queries follow. Each query consists of a single integer M i in the first line, followed by M i space separated integer coordinates (x[i][j],y[i][j]) specifying the boundary of the query polygon in clock-wise order. Polygon is an alternating sequence of vertical line segments and horizontal line segments.
Polygon has M i edges, where (x[i][j],y[i][j]) is connected to (x[i][(j+1)%M i ], y[i][(j+1)%M i ].
For each 0 <= j < M i , either x[i][(j+1)%M i ] == x[i][j] or y[i][(j+1)%M i ] == y[i][j] but not both.
It is also guaranteed that the polygon is not self-intersecting.
Output : For each query output the number of points inside the query polygon in a separate line.

Sub-sequence Weighting
A sub-sequence of a sequence is a sequence which is obtained by deleting zero or more elements from the sequence. You are given a sequence A in which every element is a pair of integers i.e A = [ (a 1 , w 1 ) , (a 2 , w 2 ) ,...,(a N , w N )]. For a sub-sequence B = [ (b 1 , v 1 ) , (b 2 , v 2 ) , .... ,(b M , v M ) ] of the given sequence : We call it increasing if for every i ( 1 <= i < M ) , b i < b i+1 .
Weight(B) = v1 + v2 + ... + vM .
Task : Given a sequence, output the maximum weight formed by an increasing sub-sequence 
Input : First line of input contains a single integer C. C test-cases follow. First line of each test-case contains an integer N. Next line contains a 1 , ... , a N separated by a single space. Next line contains w 1 , ... , w N separated by a single space. 
Output : For each test-case output a single integer number: Maximum weight of increasing subsequences of the given sequence.

Crab Graphs
A crab is an undirected graph which has two kinds of vertices: 1 head, and K feet , and exactly K edges which join the head to each of the legs.( 1 <= K <= T, where T is given) Given an undirected graph, you have to find in it some vertex-disjoint sub-graphs each one of which is a crab . The goal is to select those crabs in such a way that the total number of vertices covered by them is maximized. Note: two graphs are vertex-disjoint if they do not have any vertex in common. 
Input : First line of input contains a single integer C. C test-cases follow. First line of each test-case contains three integers N, T, and M (the number of nodes, max number of feet in the crab graph, and number of edges, respectively). Each of next M lines contains two space separated v 1i , v 2i meaning that the there is an edge between vertexes v 1i and v 2i . Note that the graph doesn't have parallel edges or loops. 
Output : For each test-case, output a single integer indicating the maximum number vertices which can be covered by some vertex disjoint sub-graphs of graph which are crabs

Picking Cards
There are N cards on the table and each has a number between 0 and N. Let us denote the number on the ith card by c i . You want to pick up all the cards. The ith card can be picked up only if at least c i cards have been picked up before it. (As an example, if a card has a value of 3 on it, you can't pick that card up unless you've already picked up 3 cards previously) In how many ways can all the cards be picked up? Input: The first line contains the number of test cases T. T test cases follow. Each case contains an integer N on the first line, followed by integers c i ,...,c N on the second line. Output: Output T lines one corresponding to each test case containing the required answer for the corresponding test case. As the answers can be very big, output them modulo 1000000007. 

Count strings
A regular expression is used to describe a set of strings. For this problem the alphabet is limited to 'a' and 'b'.
We define R to be a valid regular expression if: 1) R is "a" or "b" 2) R is of the form "(R1R2)" where R1 and R2 are regular expressions 3) R is of the form "(R1|R2)" where R1 and R2 are regular expressions 4) R is of the form "(R1*)" where R1 is a regular expression.
Regular expressions can be nested, and will always have two elements in the parenthesis. ('*' is an element, '|' is not; basically, there will always be pairwise evaluation) Additionally, '*' will always be the second element; '(*a)' is invalid.
The set of strings recognised by R are as follows: 1) If R is "a", then the set of strings recognised = {a} 2) If R is "b", then the set of strings recognised = {b} 3) if R is of the form "(R1R2)" then the set of strings recognised = all strings which can be obtained by a concatenation of strings s1 and s2 where s1 is recognised by R1 and s2 by R2. 4) if R is of the form "(R1|R2)" then the set of strings recognised = union of the set of strings recognised by R1 and R2. 5) If R is of the form "(R1*)" then the the strings recognised are the empty string and the concatenation of an arbitrary number of copies of any string recognised by R1.
Task : Given a regular expression and an integer L, count how many strings of length L are recognized by it. Input: The first line contains the number of test cases T. T test cases follow. Each test case contains a regular expression R and an integer L. Output: Output T lines one corresponding to each test case containing the required answer for the corresponding test case. As the answers can be very big, output them modulo 1000000007.
 

ShareThis