#include <hat.h>
Collaboration diagram for Hat< T >:
Public Member Functions | |
void | init (const size_t aExpectedSize=min_hat_size) |
Hat (const size_t aExpectedSize=min_hat_size) | |
Hat default constructor (e.g., default since the class definition assigns a default value of zero to aExpectedSize). | |
Hat (const Hat< T > &hat) | |
~Hat () | |
void | insert (const T &a, const size_t i) |
Insert an element "a" into the array at index "i". | |
void | remove (const size_t i) |
Remove (delete is a reserved word) the element at index "i". | |
void | remove_n (const size_t i, const size_t n) |
remove_n: like remove, but this function removes more than one item. | |
T & | operator[] (const size_t i) |
T | operator[] (const size_t i) const |
T & | at (const size_t i) |
the at function as an l-value | |
T | at (const size_t i) const |
return the address of a data element | |
T | first () const |
return the first data element in the array | |
T | last () const |
Return the last data element in the array. | |
void | append (const T &a) |
Add an element to the end of the array. | |
int | isEmpty () const |
return TRUE if there are no data elements in the array. | |
void | set_to_empty () |
When the number of elements is set to zero, the array is empty of data, but contains storage. | |
size_t | length () const |
Return the number of data elements in the array. | |
void | decrement_length (const unsigned int val) |
Private Member Functions | |
size_t | leafSize () const |
size_t | topSize () const |
size_t | topIndex (const unsigned i) const |
size_t | leafIndex (const unsigned i) const |
void | resize (const size_t newExpectedSize) |
Allocate a new HAT and copy the data from "this" HAT into the new HAT. | |
void | addLeaf (const T &aValue, const int doResize) |
Add a new Leaf to the HAT pointer array and insert the new data element (aValue) into it. | |
size_t | recommendedPower (const size_t s) const |
void | setPower (const size_t p) |
void | add_to_end (const T &aValue, const int doResize=1) |
Private Attributes | |
T ** | top |
top points to leaves | |
size_t | topUsed |
amount of top actually used (number of leaves) | |
size_t | power |
power of 2 used for leaves and top | |
size_t | leafMask |
used to compute the leaf index | |
size_t | numElements |
Number of elements used in the array. | |
size_t | numAvail |
number of elements currently allocated in the array |
Arrays of this type are particularly useful in code generators, where algorithms like register range analysis and peephole optimization work best on arrays of instructions.
This class is implemented by a Hashed Array Tree (see HATs: Hashed Array Trees - very fast variable length arrays by Edward Sitarski, published in the "Algorithm Alley" section of Dr. Dobb's Journal, September 1996). The code here is derived from the public domain code published along with this article (e.g., ftp from the Dr. Dobbs ftp site).
This is a template class. Some C++ compilers (e.g., Sun) will only compile template classes with in-line class functions.
This code is fairly tricky. Memory is only allocated when it is needed and the addLeaf and resize functions end up being recursive co-routines. I like to think that my code is more transparent than this, since trickiness makes code difficult to understand and maintain. However, I've combed through this code and run some tests on it and as far as I can tell, it works correctly.
Hashed Array Trees (hat)
A slightly altered quote from the Hats: Hashed Array Trees article is included below:
Although used to implement one dimensional arrays, HATs are really two dimensional. A HAT consists of a "Top" array and a number of "Leaves" which are pointed to by the Top array. The number of pointers in the Top array and the number of elements in each Leaf is the same, and is always a power of 2.
Because the Top and Leaf arrays are powers of 2, you can efficiently find an element in a HAT using bit operations. Usually appending elements is very fast since the last leaf may have empty space. Less frequently, a new leaf must be added, which is also very fast and requires no copying.
When the Top array is full, the size of new Top and Leaf arrays are calculated (e.g., a Top, that is twice as big, is allocated and enough leaf arrays are allocated to hold the current data set). A new Top is allocated. The existing data is "appended" to the new hat, allocating new leaves and deallocating old leaves as the copying is done. Recopying only occurs when the Top array is full and this happens when the number of elements exceeds the square of a power of two
size_t
You've got to love C++, it's huge and they constantly add to it. There are endless corners of the language that few people know about which allow people like Al Stevens (a Dr. Dobb's columnist) to count coup on the rest of us. One of these obscure features is size_t. From section 5.3.2 of The Annotated C++ Reference Manual:
... size_t, an impementation-dependent unsigned integral type defined in the standard header <stddef.h>. On most systems this is the local version of "unsigned int".
Code bloat
The Sun compiler has a habbit of in-lining constructors. As a result, a lot of code, especially in templates, ends up in line, exploding the size of the program.
Definition at line 122 of file hat.h.
|
Hat default constructor (e.g., default since the class definition assigns a default value of zero to aExpectedSize).
Definition at line 287 of file hat.h. References Hat< T >::init().
00288 { 00289 init( aExpectedSize ); 00290 } |
|
Definition at line 293 of file hat.h. References Hat< T >::init().
00294 { 00295 init( 0 ); 00296 (*this) = hat; 00297 } |
|
Definition at line 299 of file hat.h. References NULL, Hat< T >::top, and Hat< T >::topUsed.
|
|
Definition at line 242 of file hat.h. References Hat< T >::addLeaf(), Hat< T >::numAvail, and Hat< T >::numElements. Referenced by Hat< T >::append(), and Hat< T >::resize().
00243 { 00244 if (numElements == numAvail) { 00245 addLeaf( aValue, doResize ); 00246 } 00247 else { 00248 (*this)[numElements] = aValue; 00249 numElements++; 00250 } 00251 } |
|
Add a new Leaf to the HAT pointer array and insert the new data element (aValue) into it.
Definition at line 182 of file hat.h. References FALSE, Hat< T >::leafSize(), Hat< T >::numAvail, Hat< T >::numElements, Hat< T >::resize(), Hat< T >::top, Hat< T >::topIndex(), Hat< T >::topSize(), Hat< T >::topUsed, and TRUE. Referenced by Hat< T >::add_to_end().
00183 { 00184 /* If the top array is either empty (topUsed == 0) or is full 00185 topUsed == topSize (no new leaves can be added), reallocate the 00186 array. */ 00187 if( topUsed % topSize() == 0 ) 00188 { 00189 int growTop = TRUE; 00190 00191 if( doResize ) 00192 { 00193 resize( numElements ); 00194 00195 // Check if after the resize we have room. 00196 if( topIndex(numElements) < topUsed ) 00197 { 00198 (*this)[numElements++] = aValue; 00199 return; 00200 } 00201 00202 // Check if we have room for a new leaf. 00203 if( topUsed % topSize() != 0 ) 00204 growTop = FALSE; 00205 } 00206 if( growTop ) 00207 { 00208 // Increase the top array by topSize. 00209 T **topNew = new T * [ topUsed + topSize() ]; 00210 for( size_t i = 0; i < topUsed; i++ ) 00211 topNew[i] = top[i]; 00212 delete [] top; 00213 top = topNew; 00214 } 00215 } 00216 /* add a new leaf */ 00217 top[topUsed] = new T [leafSize()]; 00218 top[topUsed][0] = aValue; 00219 numAvail += leafSize(); 00220 topUsed++; 00221 numElements++; 00222 } /* addLeaf */ |
|
Add an element to the end of the array.
Definition at line 441 of file hat.h. References Hat< T >::add_to_end(). Referenced by Hat< T >::insert().
00441 { 00442 add_to_end(a); 00443 } |
|
return the address of a data element
Definition at line 425 of file hat.h.
00426 {
00427 return (*this)[i];
00428 }
|
|
the at function as an l-value
Definition at line 419 of file hat.h.
00420 {
00421 return (*this)[i];
00422 }
|
|
Definition at line 469 of file hat.h. References Hat< T >::numElements.
00469 { 00470 if (val <= numElements) 00471 numElements = numElements - val; 00472 else 00473 numElements = 0; 00474 } // decrement_length |
|
return the first data element in the array
Definition at line 431 of file hat.h.
00431 {
00432 return (*this)[0];
00433 }
|
|
Definition at line 269 of file hat.h. References NULL, Hat< T >::numAvail, Hat< T >::numElements, Hat< T >::recommendedPower(), Hat< T >::setPower(), Hat< T >::top, and Hat< T >::topUsed. Referenced by Hat< T >::Hat().
00270 { 00271 size_t p; 00272 00273 /* get the smallest power of two greater than aExpectedSize */ 00274 p = recommendedPower(aExpectedSize); 00275 setPower( p ); 00276 numElements = 0; 00277 numAvail = 0; 00278 00279 top = NULL; 00280 topUsed = 0; 00281 } // init |
|
Insert an element "a" into the array at index "i". An empty array is an array with no data elements in it. This is not necessarily the same as the number of storage elements currently available in the array. A data element can be inserted into an empty array at index 0 (e.g., appended to an empty array). A data element can be inserted at the end of the array (that is array.insert( a, array.length())). A data element cannot be inserted beyond array.length() and an attempt to do so will result in an assert error. Except for insertion at element 0 of an empty array and insertion at the end of the array, insert can be an expensive operation. All elements from i+1 to the end of the array are moved up one array location. Definition at line 334 of file hat.h. References Hat< T >::append(), Hat< T >::length(), and Hat< T >::numElements.
00335 { 00336 assert( i <= numElements ); 00337 00338 if (numElements == 0 || i == numElements) { 00339 append( a ); 00340 } 00341 else { 00342 assert( length() > 0 ); 00343 00344 size_t last_ix = length() - 1; 00345 00346 // Move all data elements from i+1 up by one index 00347 append( (*this)[last_ix] ); // append the last element to the end of the array 00348 assert( length() == last_ix + 2 ); 00349 00350 int ix; 00351 00352 for (ix = last_ix; ix > i; ix--) { 00353 (*this)[ix] = (*this)[ix-1]; 00354 } 00355 (*this)[i] = a; 00356 } 00357 } // insert |
|
return TRUE if there are no data elements in the array.
Definition at line 447 of file hat.h. References Hat< T >::numElements.
00447 { 00448 return numElements == 0; 00449 } |
|
Return the last data element in the array.
Definition at line 436 of file hat.h. References Hat< T >::numElements.
00436 { 00437 return (*this)[numElements-1]; 00438 } |
|
Definition at line 128 of file hat.h. References GetLeafIndex.
00128 { return GetLeafIndex(i); } |
|
Definition at line 125 of file hat.h. References Hat< T >::power. Referenced by Hat< T >::addLeaf(), Hat< T >::resize(), and Hat< T >::setPower().
00125 { return 1<<power; /* return 2^power */ } |
|
Return the number of data elements in the array. As noted above, this is different from the number of storage elements. Definition at line 465 of file hat.h. References Hat< T >::numElements. Referenced by Hat< T >::insert(), Hat< T >::remove(), and Hat< T >::remove_n().
00465 { 00466 return numElements; 00467 } |
|
Definition at line 412 of file hat.h. References GetLeafIndex, GetTopIndex, Hat< T >::numAvail, and Hat< T >::top.
00413 { 00414 assert( i < numAvail ); 00415 return top[GetTopIndex(i)][GetLeafIndex(i)]; 00416 } |
|
Definition at line 406 of file hat.h. References GetLeafIndex, GetTopIndex, Hat< T >::numAvail, and Hat< T >::top.
00407 { 00408 assert( i < numAvail ); 00409 return top[GetTopIndex(i)][GetLeafIndex(i)]; 00410 } |
|
Definition at line 225 of file hat.h. Referenced by Hat< T >::init().
00226 { 00227 00228 const size_t powerMin = 1; // set a resonable minimum 00229 size_t p; 00230 for( p = powerMin; s > (1<<(p<<1)); p++ ) 00231 ; 00232 return p; 00233 } // recommendedPower |
|
Remove (delete is a reserved word) the element at index "i". Like insert, this can be an expensive operation. Unless the element is at the very end of the array, elements from i+1 to numElements are moved "down" one element. It is an error to attempt to remove an element beyond the end of the array. It is also an error to remove an element from an empty array. Definition at line 370 of file hat.h. References Hat< T >::length(), and Hat< T >::numElements.
00371 { 00372 assert( length() > 0 ); 00373 assert( i < length() ); 00374 00375 if (i < length() - 1) { 00376 int ix; 00377 00378 for (ix = i; ix < length()-1; ix++) { 00379 (*this)[ix] = (*this)[ix+1]; 00380 } 00381 } 00382 00383 numElements--; 00384 } // remove |
|
remove_n: like remove, but this function removes more than one item.
Definition at line 389 of file hat.h. References Hat< T >::length(), and Hat< T >::numElements.
00390 { 00391 assert( length() > 0 ); 00392 assert( i + n <= length() ); 00393 00394 if (i + n < length()) { 00395 int ix; 00396 00397 for (ix = i; ix < length()-n; ix++) { 00398 (*this)[ix] = (*this)[ix+n]; 00399 } 00400 } 00401 00402 numElements -= n;; 00403 } |
|
Allocate a new HAT and copy the data from "this" HAT into the new HAT.
Definition at line 134 of file hat.h. References Hat< T >::add_to_end(), Hat< T >::leafSize(), NULL, Hat< T >::numAvail, Hat< T >::numElements, Hat< T >::power, Hat< T >::setPower(), Hat< T >::top, Hat< T >::topIndex(), and Hat< T >::topUsed. Referenced by Hat< T >::addLeaf().
00135 { 00136 size_t i, leaf_ix; 00137 00138 Hat<T> hatNew( newExpectedSize ); 00139 00140 /* if the new HAT is the same size (e.g., power) as "this" 00141 HAT, return */ 00142 if( hatNew.power == power ) 00143 return; 00144 00145 /* copy the data from "this" hat into the new array */ 00146 for( i = 0, leaf_ix = 0; i < numElements; i++ ) 00147 { 00148 hatNew.add_to_end( (*this)[i], 0 ); // append, but do not resize again 00149 00150 leaf_ix++; 00151 // delete the leaves as we go - this decreases memory overhead. 00152 if( leaf_ix == leafSize() ) 00153 { 00154 delete [] top[topIndex(i)]; 00155 leaf_ix = 0; 00156 } 00157 } 00158 00159 // delete any unused leaves. 00160 for( i = topIndex(numElements); i < topUsed; i++ ) 00161 delete [] top[i]; 00162 00163 // assign the new array to the old. 00164 delete top; 00165 top = hatNew.top; 00166 setPower( hatNew.power ); 00167 topUsed = hatNew.topUsed; 00168 numAvail = hatNew.numAvail; 00169 00170 // clean up so nothing gets corrupted. 00171 hatNew.numElements = 0; 00172 hatNew.topUsed = 0; 00173 hatNew.top = NULL; 00174 } // resize |
|
When the number of elements is set to zero, the array is empty of data, but contains storage. This is the difference between numElements (the number of data items) and numAvail, the number of storage elements available. Definition at line 457 of file hat.h. References Hat< T >::numElements.
00457 { 00458 numElements = 0; 00459 } |
|
Definition at line 235 of file hat.h. References Hat< T >::leafMask, Hat< T >::leafSize(), and Hat< T >::power. Referenced by Hat< T >::init(), and Hat< T >::resize().
|
|
Definition at line 127 of file hat.h. References GetTopIndex. Referenced by Hat< T >::addLeaf(), and Hat< T >::resize().
00127 { return GetTopIndex(i); } |
|
Definition at line 126 of file hat.h. References Hat< T >::power. Referenced by Hat< T >::addLeaf().
00126 { return 1<<power; /* return 2^power */ } |
|
used to compute the leaf index
Definition at line 261 of file hat.h. Referenced by Hat< T >::setPower(). |
|
number of elements currently allocated in the array
Definition at line 266 of file hat.h. Referenced by Hat< T >::add_to_end(), Hat< T >::addLeaf(), Hat< T >::init(), Hat< T >::operator[](), and Hat< T >::resize(). |
|
Number of elements used in the array. numElements always points to the next free element in the array Definition at line 264 of file hat.h. Referenced by Hat< T >::add_to_end(), Hat< T >::addLeaf(), Hat< T >::decrement_length(), Hat< T >::init(), Hat< T >::insert(), Hat< T >::isEmpty(), Hat< T >::last(), Hat< T >::length(), Hat< T >::remove(), Hat< T >::remove_n(), Hat< T >::resize(), and Hat< T >::set_to_empty(). |
|
power of 2 used for leaves and top
Definition at line 259 of file hat.h. Referenced by Hat< T >::leafSize(), Hat< T >::resize(), Hat< T >::setPower(), and Hat< T >::topSize(). |
|
top points to leaves
Definition at line 255 of file hat.h. Referenced by Hat< T >::addLeaf(), Hat< T >::init(), Hat< T >::operator[](), Hat< T >::resize(), and Hat< T >::~Hat(). |
|
amount of top actually used (number of leaves)
Definition at line 257 of file hat.h. Referenced by Hat< T >::addLeaf(), Hat< T >::init(), Hat< T >::resize(), and Hat< T >::~Hat(). |