Hello fellow .NET boffins,
Just asking for opinions on the most efficient and elegant way to achieve the following.
I have a body of text. This text has been parsed into an array or collection of words. Stop words such as "in", "and", "or", "if" have been removed.
I would like to know the top N most frequent words, computed as efficiently as possible in terms of CPU time (memory use matters less).
So, to summarise: I bung in 100 or 10,000 words pulled from a body of text, say I want the top 5 words, and the routine, in the blink of an eye, says:
"Contractor"
"overpaid"
"lazy"
"tax"
"computer"
That would do for a start.
It needs to run on the sad old 1.1 framework, so no generics, anonymous delegates or yield, please.
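For reference, the naive baseline I have in mind looks roughly like the sketch below: count each word in a Hashtable, then keep a small array of the N best tallies while walking the distinct words once. This is only a starting point under a few assumptions, not a tuned solution: the names WordCounter, Tally and TopWords are made up for illustration, words are compared case-sensitively, the input contains no null entries, and words with equal counts come out in no particular order.

```csharp
using System;
using System.Collections;

public class WordCounter
{
    // Mutable counter object, so an increment does not re-box an int
    // and does not need a second Hashtable write.
    private class Tally
    {
        public string Word;
        public int Count;
    }

    // Returns up to n words, most frequent first.
    public static string[] TopWords(string[] words, int n)
    {
        // Pass 1: count occurrences of each distinct word.
        Hashtable tallies = new Hashtable();
        for (int i = 0; i < words.Length; i++)
        {
            Tally entry = (Tally)tallies[words[i]];
            if (entry == null)
            {
                entry = new Tally();
                entry.Word = words[i];
                tallies[words[i]] = entry;
            }
            entry.Count++;
        }

        if (n > tallies.Count) n = tallies.Count;
        if (n < 0) n = 0;

        // Pass 2: keep the n best tallies in a small array sorted by
        // descending count. Cost is O(distinct words * n), cheap for small n.
        Tally[] top = new Tally[n];
        int filled = 0;
        foreach (Tally candidate in tallies.Values)
        {
            // Walk up from the bottom to find where the candidate belongs.
            int pos = filled;
            while (pos > 0 && top[pos - 1].Count < candidate.Count) pos--;
            if (pos >= n) continue; // not good enough to make the list

            // Shift lower-ranked entries down one slot, dropping the last if full.
            int last = (filled < n) ? filled : n - 1;
            for (int j = last; j > pos; j--) top[j] = top[j - 1];
            top[pos] = candidate;
            if (filled < n) filled++;
        }

        string[] result = new string[filled];
        for (int i = 0; i < filled; i++) result[i] = top[i].Word;
        return result;
    }
}
```

Calling WordCounter.TopWords with the words "tax", "lazy", "tax", "computer", "tax", "lazy" and n = 2 gives "tax" then "lazy". Sorting every distinct word with Array.Sort and an IComparer would also work, but for a small N the insertion into a fixed-size array avoids sorting words that can never make the list.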
How would you approach this?
DP