Programming Assignment: Problem 1: In this assignment you will implement a program which prints out all anagrams of a specified string. Two strings are anagrams of one another if by rearranging letters of one of the strings you can obtain another. For example, the string "toxic" is an anagram of "ioxct". For the purpose of this assignment, we are only interested in those anagrams of a given string which appear in the dictionary. The dictionary you should use is the file words.txt. Algorithm and Implementation: Since we will be performing multiple anagram queries, our first step is to load all of the (25,000) words in the dictionary into an appropriate data structure. A primary requirement is that one must be able to efficiently search this data structure to look for anagrams of a given word. A clever trick that can be used is to first sort the letters of every word we insert into our data structure to produce a key for each word. For example, the key for the string "toxic" is "ciotx", similarly the key for both "star" and "rats" is "arst". (Expert Scrabble players will recognize this procedure as computing the alphagram of a set of letters). We will then use a hash table to store pairs of strings, where the pair consists of the original word and its key. When performing insertions, we will compute the hash of the key of the word to compute the correct bucket. This approach guarantees that all words which are anagrams of one another are stored in the same bucket of the hash table. Similarly, when we are searching for anagrams, we will first compute the key of the word we are searching for, then hash the key, then search that bucket for anagram matches. You should feel free to use the methods described in the text book and in class for appropriate hash functions for hashing strings (but please cite any source which you use). Write and submit two files: 1. hashtable.c -- basic functionality (insert, find) of a chained hash table. 2. anagram.c -- a main program which loads words from the dictionary into the hashtable and then answers anagram queries. Note: You may fix a size for your hash table -- for efficient searching, Recommended size at least 10,000 buckets. (For debugging your code,work with a much smaller practice dictionary, perhaps 10 words, and a much smaller hash table, perhaps 8 buckets). You may disregard any words in the dictionary which contain any characters which are not lowercase alphabetic characters (uppercase, punctuation, numbers). Your program should read anagram queries from a file which has been redirected on standard input. Each query will be on its own line and will simply consist of a string. The output file should echo the string, print the number of matching anagrams and list the anagrams. Problem 2: The goal of this assignment is to empirically verify the performance of a hash table with two different probing strategies (a) linear probing and (b) quadratic probing. In both the cases you have to report the average number of probes. The latter is defined as the total number of times a table location is calculated, i.e., including the one calculated by the hash function. After being sure that your average number of probes is being calculated correctly, empirically verify the performance vs. load factor lambda. Notes: 1. Use words from words.txt. 2. For this assignment, we are interested only in the expected number of probes required to do a single unsuccessful search or an insertion for a given load factor lambda. 3. To get the expected number of probes, you have to measure the expected number of probes that it takes to insert a single item into a hash table that is already “loaded” to a load factor lambda. Naturally, you should be picking words at random to do test insertions, so you should expect some variability in the number of probes it takes to insert each one. Therefore to get meaningful answers, you have to insert several and average the number of probes. A minor problem is that each insertion you do will change the load factor. However, with a large table this effect should be small enough to be ignored. That is, if you have 25,005 words and create table of size 50,000 you will have a load factor of 0.5 after inserting 25,000 of them. Inserting 5 or 10 more will change the load factor very little. 4. Run your program for lambda = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.65, 0.70, 0.75, 0.8, 0.85 and 0.9. Plot the result with a spread sheet, or carefully on graph paper. Problem 3: The birthday paradox is the simple fact that the probability that at least two people in a room of randomly chosen people will share a common birthday is much greater than most people expect. Design and implement a C program that demonstrates this with a simulation. Your program should test values of n = 5, 10, 15, ..., 50 randomly chosen people. For each value of n, run 100 experiments in which n people are chosen at random and determine for what proportion of the 100 trials at least two of the n people have the same birthday. Your program should print the results in a simple table, such as: n % ---- ----- 5 0.013 10 0.121 15 0.347 Problem 4 (HASH): Suppose we want to find the first occurrence of a string p1p2. . . pk in a long input string a_1a_2 . . . a_n. We can solve this problem by hashing the pattern string, obtaining a hash value h_p, and comparing this value with the hash value formed from a_1a_2 . . . a_k,a_2a_3 . . . a_k+1,a_3a_4 .. . a_k+2, and so on until a_n-k+1a_n-k+2 . . . a_n. If we have a match of hash values, we compare the strings character by character to verify the match. We return the position (in a) if the strings actually do match, and we continue in the unlikely event that the match is false. Write a program to implement this algorithm.