Sorting data by frequency in C#

**DimPrawn** · 12 July 2007, 13:02

> Originally posted by ASB
>
> Yes, it is bog standard. But does it work? Yes.
>
> Now it may be that it doesn't work well enough, but I guess you'll never find out since it's rejected simply on the grounds of not being clever.

Please read the spec:

> Just asking opinions on the most efficient and elegant way to achieve the following.

HTH

**Cowboy Bob** · 12 July 2007, 13:22

Doesn't anyone study algorithms anymore? A couple I can think of off the top of my head are Boyer-Moore (or a variation thereof) - http://en.wikipedia.org/wiki/Boyer-Moore - and KMP - http://en.wikipedia.org/wiki/Knuth%E...ratt_algorithm

That should give you a head start.

**ASB** · 12 July 2007, 13:23

> Originally posted by DimPrawn
>
> Please read the spec:
>
> > Just asking opinions on the most efficient and elegant way to achieve the following.
>
> HTH

Yes, I did read that bit. And I conformed with it. The person who is not conforming is you.

What you are in effect saying at the moment is that a "standard" way of dealing with it is neither the most efficient nor the most elegant.

You may turn out to be right - but you do not yet know that, and since you will not try a standard method you will never know how efficient it was (or wasn't).

**Burdock** · 12 July 2007, 14:11

Written before the request that it should be written in binary and injected straight into the processor!

Code:

    using System;
    using System.Collections;

    class WordCount
    {
        static void Main()
        {
            // Placeholder input; substitute your own array of words here.
            string[] words = { "the", "cat", "the", "dog", "", "the", "cat" };

            // Count occurrences of each non-empty word.
            Hashtable wordMap = new Hashtable();
            foreach (string currentWord in words)
            {
                if (currentWord.Length > 0)
                {
                    if (wordMap.ContainsKey(currentWord))
                    {
                        wordMap[currentWord] = (int)wordMap[currentWord] + 1;
                    }
                    else
                    {
                        wordMap.Add(currentWord, 1);
                    }
                }
            }

            // Keys and Values enumerate in the same order, so the two arrays stay paired.
            string[] wordNames = (string[])new ArrayList(wordMap.Keys).ToArray(typeof(string));
            int[] wordFrequencies = (int[])new ArrayList(wordMap.Values).ToArray(typeof(int));

            // Sort both arrays by frequency, lowest first.
            Array.Sort(wordFrequencies, wordNames);
            for (int i = 0; i < wordNames.Length; i++)
            {
                Console.WriteLine(wordNames[i] + "\t" + wordFrequencies[i]);
            }
        }
    }
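
For comparison, the same hash-and-count idea can be written with the generic `Dictionary<string, int>`, which avoids the casts and boxing that `Hashtable` needs. This is only a sketch: the class name `WordFrequency`, the method `CountByFrequency` and the sample words are made up for illustration, and it lists the most frequent word first, which may or may not be the order the original request wanted.

```csharp
using System;
using System.Collections.Generic;

public static class WordFrequency
{
    // Counts non-empty words and returns (word, count) pairs, most frequent first.
    public static List<KeyValuePair<string, int>> CountByFrequency(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (string word in words)
        {
            if (string.IsNullOrEmpty(word)) continue;
            int current;
            counts.TryGetValue(word, out current); // current is 0 when the word is new
            counts[word] = current + 1;
        }

        var result = new List<KeyValuePair<string, int>>(counts);
        // Descending by count; ties are broken by ordinal word order so the output is deterministic.
        result.Sort((a, b) =>
        {
            int byCount = b.Value.CompareTo(a.Value);
            return byCount != 0 ? byCount : string.CompareOrdinal(a.Key, b.Key);
        });
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample input.
        string[] words = { "the", "cat", "the", "dog", "", "the", "cat" };
        foreach (KeyValuePair<string, int> entry in CountByFrequency(words))
        {
            Console.WriteLine(entry.Key + "\t" + entry.Value);
        }
    }
}
```

Counting is one pass over the input with expected constant-time lookups, so the final sort over the distinct words is usually the dominant cost.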

**Churchill** · 12 July 2007, 14:27

Obviously the most efficient way would be to use a binary tree and counter mechanism for the elements.
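
One way to try the tree-and-counter idea without writing a tree by hand is `SortedDictionary<string, int>`, which the .NET documentation describes as a binary search tree with O(log n) retrieval. The sketch below is an assumption about what such an approach might look like, not the poster's own code; `TreeWordCounter` and the sample words are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class TreeWordCounter
{
    // Keeps one counter per distinct word in a tree-backed dictionary.
    public static SortedDictionary<string, int> Count(IEnumerable<string> words)
    {
        var counts = new SortedDictionary<string, int>(StringComparer.Ordinal);
        foreach (string word in words)
        {
            if (string.IsNullOrEmpty(word)) continue;
            int current;
            counts.TryGetValue(word, out current); // 0 if the word is not in the tree yet
            counts[word] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        // Hypothetical sample input.
        string[] words = { "the", "cat", "the", "dog", "", "the", "cat" };
        // Enumeration walks the tree in key order, so words come out sorted by name.
        foreach (KeyValuePair<string, int> entry in Count(words))
        {
            Console.WriteLine(entry.Key + "\t" + entry.Value);
        }
    }
}
```

The tree orders the words by name, not by count, so ranking by frequency still needs a separate sort over the distinct entries afterwards.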

**ASB** · 12 July 2007, 14:36

> Originally posted by Churchill
>
> Obviously the most efficient way would be to use a binary tree and counter mechanism for the elements.

Not necessarily. It will depend on the actual words and how they map to whatever positioning algorithm is chosen.

For the terminally bored there is some discussion here:-

http://forum.java.sun.com/thread.jsp...sageID=4319661

**VectraMan** · 12 July 2007, 15:14

Go through the list once to work out the minimum and maximum length. Then if for example the minimum is 3 and the maximum is 5, you do a search for "aaa", "aaaa", and "aaaaa", then move onto "aaaab", until you've exhausted every possible word.

HTH.

**xoggoth** · 12 July 2007, 20:34

Dunno. I'd just use an efficient sort like Shell-Metzner and count the number in the same blocks. Perhaps instead of sorting on the entire string you could sort on just the first x chars, which would give you many unique items that did not need to be further sorted and could be discarded. Those that did, sort on the next 3 chars and so on.
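
The first half of that suggestion, sort and then count the blocks of identical neighbours, can be sketched as below. Note the swap: it uses the built-in `Array.Sort` rather than a Shell-Metzner sort, and it sorts on the whole string rather than on prefixes. `RunCounter` and the sample input are hypothetical names, and the input is assumed to contain no null entries.

```csharp
using System;
using System.Collections.Generic;

public static class RunCounter
{
    // Sorts a copy of the input, then counts each run of equal adjacent words.
    public static List<KeyValuePair<string, int>> CountRuns(string[] words)
    {
        string[] sorted = (string[])words.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);

        var result = new List<KeyValuePair<string, int>>();
        int start = 0;
        while (start < sorted.Length)
        {
            // Advance end to the first element that differs from sorted[start].
            int end = start + 1;
            while (end < sorted.Length && sorted[end] == sorted[start]) end++;

            if (sorted[start].Length > 0) // skip empty strings, as in the earlier code
            {
                result.Add(new KeyValuePair<string, int>(sorted[start], end - start));
            }
            start = end;
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample input.
        string[] words = { "the", "cat", "the", "dog", "", "the", "cat" };
        foreach (KeyValuePair<string, int> entry in CountRuns(words))
        {
            Console.WriteLine(entry.Key + "\t" + entry.Value);
        }
    }
}
```

The sort costs O(n log n) comparisons over all words, where the hash-based count is a single pass, but this version needs no extra lookup structure.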

**Ardesco** · 13 July 2007, 08:18

I was thinking more along the lines of sorting them by char length and then searching on first character and iterating down each char of the string.

Any word of x number of chars that occurs once can be discarded and then you can start filtering the words that have all the same number of chars.
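
One possible reading of this idea is sketched below: bucket the words by length, treat a word that is alone in its bucket as unique straight away, and only do comparison work inside buckets holding several words. The sketch uses a dictionary for the comparison inside each bucket rather than a character-by-character walk, so it is an approximation of the description. `LengthBuckets` and the sample words are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class LengthBuckets
{
    public static Dictionary<string, int> Count(IEnumerable<string> words)
    {
        // Step 1: group the non-empty words by their length.
        var buckets = new Dictionary<int, List<string>>();
        foreach (string word in words)
        {
            if (string.IsNullOrEmpty(word)) continue;
            List<string> bucket;
            if (!buckets.TryGetValue(word.Length, out bucket))
            {
                bucket = new List<string>();
                buckets[word.Length] = bucket;
            }
            bucket.Add(word);
        }

        // Step 2: a bucket with one word needs no comparisons; count the rest.
        var counts = new Dictionary<string, int>();
        foreach (List<string> bucket in buckets.Values)
        {
            if (bucket.Count == 1)
            {
                counts[bucket[0]] = 1; // the only word of this length, so it occurs once
                continue;
            }
            foreach (string word in bucket)
            {
                int current;
                counts.TryGetValue(word, out current);
                counts[word] = current + 1;
            }
        }
        return counts;
    }

    public static void Main()
    {
        // Hypothetical sample input.
        string[] words = { "the", "cat", "the", "horse", "", "the", "cat" };
        foreach (KeyValuePair<string, int> entry in Count(words))
        {
            Console.WriteLine(entry.Key + "\t" + entry.Value);
        }
    }
}
```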
