AtW - Info needed - Contractor UK Bulletin Board

Guest replied

8 March 2004, 12:07
Re: re

Anyway, don't blame me. That was your code - entered before I entered anything.
Guest replied

5 March 2004, 18:02
Re: re

It means what it says: I have rarely had to join a table to itself, i.e.:

select * from
Table T1, Table T2
where T1.ID=T2.ID
and T1.Class=1
and T2.Class=2

I find that having to do that kind of query indicates poor architecture :hat
Guest replied

5 March 2004, 17:55
Re: re

> I never bothered to reference table itself in the same join before I seen it.

What does that mean?
Guest replied

5 March 2004, 00:41
Re: re

> select M1.DocID
> from MainIndex M1,MainIndex M2
> where M1.KeyWordID=1
> and M2.KeyWordID=2
> and M1.DocID=M2.DocID

Listen, I am probably in a weird mood, wanting to suppress guilt, but this code belongs to PerlOfWisdom; I never bothered to reference table itself in the same join before I seen it.

phew, that was good - not feeling guilty anymore
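For anyone following along, here is a minimal sketch of what the quoted self-join does: it returns the documents indexed under both keywords. The table and column names come from the quoted query; using SQLite, the primary key, and the sample rows are assumptions for illustration only.

```python
import sqlite3

def docs_with_both(conn, kw_a, kw_b):
    """Return DocIDs indexed under both keyword IDs, using the quoted self-join."""
    rows = conn.execute(
        """
        SELECT M1.DocID
        FROM MainIndex M1, MainIndex M2
        WHERE M1.KeyWordID = ?
          AND M2.KeyWordID = ?
          AND M1.DocID = M2.DocID
        ORDER BY M1.DocID
        """,
        (kw_a, kw_b),
    ).fetchall()
    return [row[0] for row in rows]

# Hypothetical sample data: (KeyWordID, DocID) pairs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MainIndex ("
    "KeyWordID INTEGER, DocID INTEGER, PRIMARY KEY (KeyWordID, DocID))"
)
conn.executemany(
    "INSERT INTO MainIndex VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 11), (2, 12), (1, 13), (2, 13)],
)

print(docs_with_both(conn, 1, 2))  # documents 11 and 13 carry both keywords
```

Each extra search term would need one more alias of the same table, which is probably why the pattern feels awkward.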
Guest replied

4 March 2004, 23:58
Re: re

MySQL Soundex
Guest replied

4 March 2004, 23:49
Re: re

reynolds, just ignore compression for now, ok? At the end of the day you may find that transparent, behind-the-scenes disk compression is the best way forward. If you want to learn more about hand-optimising your data structures, then read how they did it at Google.
Guest replied

4 March 2004, 21:53
re

"In real world cases you will find that any mechanism has an overhead and you need to see the redundancy of the data being greater than the overhead. Remembering of course that not all applications require 100% preservation of information.

If you think about it - there needs to be a way to encode information within the data to specify how the size has been reduced, that is information so requires space.

Have you read any of Shannon ?"

No, but I've noticed he's big in compression.

I did a test compression on a string and noticed that the string needs to be a certain size to realise any benefits of the compression (like you said the encoding information is bundled in).
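That kind of test is easy to repeat. A rough sketch using Python's zlib (my choice of library and sample strings, not necessarily what was used for the test above): a short string comes out larger because of the fixed header and checksum, while a long repetitive one shrinks.

```python
import zlib

def compressed_sizes(text):
    """Return (raw size, zlib-compressed size) in bytes for a string."""
    raw = text.encode("utf-8")
    return len(raw), len(zlib.compress(raw))

# Sample strings are made up for illustration.
short_raw, short_packed = compressed_sizes("search engine")
long_raw, long_packed = compressed_sizes("search engine " * 200)

print("short:", short_raw, "->", short_packed)  # grows: overhead exceeds savings
print("long: ", long_raw, "->", long_packed)    # shrinks: redundancy exceeds overhead
```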
Guest replied

4 March 2004, 19:46
Re: re

> But if I turn words into wordID's I will still need to refer back
> to that hashtable just to turn them back into words again

Yes, that's what Table 1 is for. You will have a unique index on WordID, and you only join it after you have selected the X results you need, so that you won't be joining _ALL_ found WordIDs, just those you will show on screen. A temporary table is handy for that sort of thing.

Edit: sorry Spod I was a wee bit inconsiderate today, hope I did not hurt your feelings
Guest replied

4 March 2004, 19:45
re:re

In real world cases you will find that any mechanism has an overhead and you need to see the redundancy of the data being greater than the overhead. Remembering of course that not all applications require 100% preservation of information.

If you think about it - there needs to be a way to encode information within the data to specify how the size has been reduced, that is information so requires space.

Have you read any of Shannon ?
Guest replied

4 March 2004, 18:48
Re: re

2 bits, where only 2 of the possible 4 values are ever used.
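To make that concrete, a toy sketch (the function names and the bit-string representation are mine, purely for illustration): if a 2-bit field only ever holds two of its four values, each field can be re-coded as a single bit, provided the decoder is told which two values were in use. That side table is exactly the overhead mentioned elsewhere in this thread.

```python
def pack(symbols, used):
    """Re-code 2-bit symbols as 1 bit each; `used` lists the two values that occur."""
    code = {used[0]: "0", used[1]: "1"}
    return "".join(code[s] for s in symbols)

def unpack(bits, used):
    """Reverse pack(): turn each bit back into the original 2-bit value."""
    return [used[int(b)] for b in bits]

# Hypothetical data: only 0b00 and 0b11 ever appear.
used = (0b00, 0b11)
symbols = [0b00, 0b11, 0b11, 0b00]

packed = pack(symbols, used)
print(packed)                # one character per bit, 4 bits instead of 8
print(unpack(packed, used))  # the original symbols come back unchanged
```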
Guest replied

4 March 2004, 17:01
re

Apologies if this is a sh.it question. What is the smallest size chunk of data that can be compressed?
Guest replied

4 March 2004, 16:45
re

"no need to compress data - just make sure you eliminate redundant words (50% of data easy) and turn them into numbers - this will make tables and indices very compact. "

But if I turn words into wordID's I will still need to refer back to that hashtable just to turn them back into words again. The space I'll save converting words to numbers is going to be used up again by providing a number-word lookup table.
Guest replied

4 March 2004, 16:29
ffs

no need to compress data - just make sure you eliminate redundant words (50% of data easy) and turn them into numbers - this will make tables and indices very compact. Intersect is not necessary; use that other query, which will work just fine. I suggest you download SQL Server and play around with Query Analyser: it displays very nice query plans, which show how efficient your query is. The whole game is about doing enough offline to be able to run very tight queries in real time.

The key to achieving exceptional real time performance is to do as much as possible offline.

And buy that book - even though I learnt most of the stuff on my own, I still found that book very useful. It should be even more so for you, since you have not gone through it all the hard way (trial and error).

Hash tables are not the best - you need a clustered index on it.
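A rough sketch of the "turn words into numbers" step, done offline. Reading "redundant words" as common stop words is an assumption here, and the stop-word list, function name and sample documents are all made up. Each distinct word is stored once in the vocabulary, and the index itself holds only small integer pairs.

```python
# Assumed stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "of", "and", "to", "in"}

def build_index(docs):
    """Return (word -> WordID vocabulary, sorted list of (WordID, DocID) pairs)."""
    word_ids = {}
    postings = set()
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if word in STOP_WORDS:
                continue  # dropped before it ever reaches the index
            # Assign the next free ID the first time a word is seen.
            word_id = word_ids.setdefault(word, len(word_ids) + 1)
            postings.add((word_id, doc_id))
    return word_ids, sorted(postings)

docs = {1: "the cat and the dog", 2: "a dog in the park"}
word_ids, postings = build_index(docs)
print(word_ids)  # each surviving word appears once, with its number
print(postings)  # the index proper: integer pairs only
```

The vocabulary costs space once per distinct word, while the saving applies to every occurrence, so it pays off as the number of documents grows.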
Guest replied

4 March 2004, 16:13
re search engine

Cheers Atw, SupremeSpod, PerlOfWisdom.

I'll look into those techniques. I want to fit the entire index into memory (i.e. wordID, docID, locID) so response times are very quick. Unfortunately (fortunately?) I'm building my own database (a combination of hashtable and random-access files), so I can't use special features such as intersect - I will just use a null pointer as a test to see if a term exists. Also, I am building this into a web server, so the whole package is just one application. AtW, did you compress your data on the database? I've found that this can cause a few problems.
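A minimal sketch of that idea, under stated assumptions: an in-memory hashtable from wordID to the set of docIDs, with a missing entry (None, standing in for the null pointer) meaning the term does not exist. locID is left out to keep it short, and the names and sample data are hypothetical, not the actual design described above.

```python
def build_postings(pairs):
    """Group (wordID, docID) pairs into a hashtable: wordID -> set of docIDs."""
    index = {}
    for word_id, doc_id in pairs:
        index.setdefault(word_id, set()).add(doc_id)
    return index

def search(word_ids, index, terms):
    """Return the docIDs containing every term; empty set if any term is unknown."""
    result = None
    for term in terms:
        word_id = word_ids.get(term)  # None plays the role of the null pointer
        if word_id is None:
            return set()
        docs = index.get(word_id, set())
        # Intersect by hand, since there is no SQL INTERSECT to lean on.
        result = set(docs) if result is None else result & docs
    return result if result is not None else set()

# Hypothetical sample data.
word_ids = {"cat": 1, "dog": 2, "park": 3}
index = build_postings([(1, 1), (2, 1), (2, 2), (3, 2)])

print(search(word_ids, index, ["dog", "park"]))  # only document 2 has both
print(search(word_ids, index, ["fish"]))         # unknown term, nothing found
```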
Guest replied

4 March 2004, 15:47
Re: Just a suggestion...

"I just thought that the "SoundsLike" facility would be of use."

I've never used Soundex myself, but have seen it specified for various applications, normally lookups on people or place names.

To see what sort of results it gives try putting your own name into

resources.rootsweb.com/cg...xconverter

"smith" gives this little lot, by way of example:

SAINT | SAND | SANDY | SANTEE | SANTI | SCHMID | SCHMIDT | SCHMIT |
SCHMITT | SHAND | SHUMATE | SINNOTT | SMITH | SMITHEY | SMOOT |
SMOOTHY | SMYTH | SMYTHE | SNAITH | SNEAD | SNEATH | SNEED |
SNODDY | SOUNDY | SUNDAY |

Sunday sharing the same Soundex code as Smith ??
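It does, and the reason shows up once the algorithm is written out. Below is a sketch of standard American Soundex as I understand it (not necessarily the exact variant the converter above uses): keep the first letter, drop vowels, and map the remaining consonants to digit classes. M and N share a class, and so do D and T, so S-M-T and S-N-D collapse to the same code.

```python
# Consonant classes of American Soundex; vowels, H, W and Y carry no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Return the 4-character Soundex code for a name ('' if it has no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters of the same class
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated class counts again
    return (first + "".join(digits) + "000")[:4]

for name in ("Smith", "Sunday", "Schmidt", "Snoddy"):
    print(name, soundex(name))  # all four map to S530
```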