The programs in the src directory allow you to initialize a hash test database and populate it with random data. Several test programs give you different views of the database.
Test Program | Description |
---|---|
inithsh | Initialize the database in batch |
bldhsh | Populate the database |
delhsh | Delete random records in the database |
tmphsh | Populate a temporary database |
hshsize | Print the number of records |
dbck | Check the database pointers |
chains | Calculate statistics on overflow chains |
getprime | Calculate a prime number |
hreorg | Reorganize the database |
lsthsh | List the database in physical sequence |
dmphsh | Dump the database in hex |
schema | Print the schema for a database |
holdlock | Lock the database while running tstlock |
tstlock | Test the Database Lock |
The inithsh.c program is a batch wrapper
program for the hinit.c subroutine in
the library.
Read
Initialize a Hash Database
for initializing a new database.
The hshsize program counts the total
number of records in the hashing space and
the overflow space. Then it counts the total
number of empty records and deleted overflow
records.
The dbck program performs an integrity
check on a database. If the database is correct
the program ends quietly.
The chains program gathers statistics
on the overflow chains in a database.
As a guideline,
the total number of records in a hash
database should not exceed the totrcds
parameter in
hinit.c.
When this is true, the average length of each
overflow chain should generally not exceed 2.0.
The standard deviation should generally
be around 1.0.
The getprime program calculates
the largest prime number equal or less
than an estimated number.
The total number of records in a hash
database should be a prime number.
You may have to revise the total number of
records in the database as it grows beyond
your original estimate. Use
hreorg
to reorganize your database.
Input parameters to the reorg
program are:
If the total number of records parameter
is zero, the reorg program reorganizes
your database based on parameter input
and existing record counts.
The reorg program is included as a
convenience. However, it is not the best
way to reorganize your database. The best
way to reorganize is to sort the
lsthsh
report by column two and prioritize the groups
of records that hash to the same address.
Then when you insert the prioritized list
into a new database, the chains are optimal
for your application.
The lsthsh report may be used for doing mass
updates to a large database in physical sequence.
It may also be used for analyzing overflow chains.
The first column is the physical sequence number of
each record.
The second column is the offset in the database of
each record's hashing address.
The third column is the offset in the database of
each block. With overflow records, columns two and
three differ.
The fourth column is the offset in the database of
each record's next pointer in the overflow chain.
The last record in the chain always has a next
pointer of zero.
The fifth column is the record. The key is embedded
in columns 5-12 of the test database record.
Sort the report by column two for doing mass
updates to a large database. This allows you
to update in the physical sequence of the
database. More than one record may sort to the
same hashing address in column two. Your mass
update program should prioritize each record
in the record group sharing the same hashing address.
One of the goals for your mass update program should
be to make the overflow chains more efficient.
The fourth column contains the next pointer
of each record. This column is included to
allow you to check for broken chains.
The dmphsh listing may be used for analyzing
the pointers in the database.
The first 256 bytes are schema information about the
database. See the structure hashfmt in the
bthash.h header file for a description of
this schema. The fields from hndl to the end
are temporary. They aren't stored on disk.
The first record block starts in byte 256 of the database.
See the structure h_nodefmt in the bthash.h
header file for a description of the record block. The
first 4 bytes are the next pointer in the overflow
chain. Pointers are stored in little endian
format on Intel machines.
The data portion of the record starts in byte 5 of
the each block.
With this information, you can follow the structure of
the database and repair the database if it is damaged.
The free stack pointer is at location 0x2c of the
header. You may follow the free stack list until
you find a next pointer of zero. No record on the
free block chain should appear in an overflow chain.
The schema report lets you know what
the database characteristics are for a database.
This program prints the information that the
insert and update programs use to update the
database. This information consists of:
This information is saved in the first 256 bytes
of each database. The hashing routine uses these
four fields to calculate the hashing address,
based on the input key.
The bldhsh test program creates a record
with random data using hisrt. It then
reads it back with hget and
places 4 z's at the beginning of the record,
then replaces the record in the database using
the hupdt routine.
After all records have been created and updated,
the next step in the test script prints the database
in physical sequence to show the result of the
bldhsh program. You will see 4 z's in
the beginning of each record.
The tmphsh test program creates a temporary
database and adds records with random data using
hisrt. It then sweeps the database and
prints the records.
The delete program delhsh deletes
random records from the database.
It stores all the keys in a table, then
selects a random key, one at a time, to
delete. The size of the key table is
limited to 100 thousand keys.
Testing the database lock requires two logical
terminals and four steps.
Initializing a Hash Database
Database Size
Database Check
Overflow Chain Statistics
Prime Number Calculation
Database Reorganization
Lsthsh Report
Dmphsh Listing
Schema Report
Creating and Updating the Database
Create and Populate a Temporary Database
Delete Program
Testing the Database Lock
First Terminal | Second Terminal |
---|---|
Run holdlock.sh | |
Run tstlock.sh | |
Terminate holdlock | |
Re-run tstlock.sh |