Re: re
Anyway, don't blame me. That was your code - entered before I entered anything.
Reply to: AtW - Info needed
Previously on "AtW - Info needed"
Guest replied
Re: re
It means what it says - I rarely had to join a table to itself, i.e.:
select * from
Table T1, Table T2
where T1.ID=T2.ID
and T1.Class=1
and T2.Class=2
I find that having to write that kind of query indicates poor architecture.
-
Guest replied
Re: re
I never bothered to reference a table against itself in the same join before I saw it.
-
Guest replied
Re: re
> select M1.DocID
> from MainIndex M1,MainIndex M2
> where M1.KeyWordID=1
> and M2.KeyWordID=2
> and M1.DocID=M2.DocID
Listen, I am probably in a weird mood to want to suppress guilt, but this code belongs to PerlOfWisdom; I never bothered to reference a table against itself in the same join before I saw it.
Phew, that was good - not feeling guilty anymore.
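For anyone following along, here is a minimal sketch of the quoted self-join, run against an in-memory SQLite database from Python. The MainIndex, KeyWordID and DocID names come from the query above; the sample rows, the docs_with_both helper, the added ORDER BY and the choice of SQLite are assumptions made purely for illustration.

```python
import sqlite3


def docs_with_both(conn, keyword_a, keyword_b):
    """Return DocIDs that have a row for BOTH keyword IDs (self-join on MainIndex)."""
    rows = conn.execute(
        """
        SELECT M1.DocID
        FROM MainIndex M1, MainIndex M2
        WHERE M1.KeyWordID = ?
          AND M2.KeyWordID = ?
          AND M1.DocID = M2.DocID
        ORDER BY M1.DocID  -- added only to make the output order predictable
        """,
        (keyword_a, keyword_b),
    ).fetchall()
    return [doc_id for (doc_id,) in rows]


# Hypothetical sample data: (KeyWordID, DocID) pairs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MainIndex (KeyWordID INTEGER, DocID INTEGER)")
conn.executemany(
    "INSERT INTO MainIndex VALUES (?, ?)",
    [(1, 10), (2, 10), (1, 11), (2, 12), (3, 10)],
)

print(docs_with_both(conn, 1, 2))
```

Only document 10 carries both keyword 1 and keyword 2 in this sample, so it is the only one the join returns.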
-
Guest replied
Re: re
reynolds, just ignore compression for now, OK? At the end of the day you may find that a transparent, behind-the-scenes disk compressor is the best way forward. If you want to learn more about hand-optimising your data structures, then read how they did it at Google.
-
Guest replied
re
"In real-world cases you will find that any mechanism has an overhead, and the redundancy in the data needs to be greater than that overhead. Remember, of course, that not all applications require 100% preservation of information.
If you think about it, there needs to be a way to encode information within the data to specify how the size has been reduced; that is itself information, so it requires space.
Have you read any of Shannon?"
No, but I've noticed he's big in compression.
I did a test compression on a string and noticed that the string needs to be a certain size before the compression gives any benefit (as you said, the encoding information is bundled in).
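That observation is easy to reproduce. The sketch below uses Python's standard zlib module as a stand-in for whatever compressor was actually tested (that is an assumption, as are the sample strings): a very short input comes out larger than it went in because of the format's fixed header and checksum, while a long repetitive one shrinks a lot.

```python
import zlib


def sizes(data: bytes):
    """Return (raw size, compressed size) in bytes for the given input."""
    return len(data), len(zlib.compress(data))


# Hypothetical inputs: one tiny string, one long and highly redundant string.
short_text = b"cat"
long_text = b"the cat sat on the mat " * 200

for label, data in (("short", short_text), ("long", long_text)):
    raw, packed = sizes(data)
    print(f"{label}: {raw} bytes raw, {packed} bytes compressed")
```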
-
Guest replied
Re: re
> But if I turn words into wordID's I will still need to refer back
> to that hashtable just to turn them back into words again
Yes, that's what Table 1 is for - you will have a unique index on WordID. You only join it after you have selected the X results you need, so that you won't be joining _ALL_ found WordIDs, just those you will show on screen. A temporary table is handy for that sort of thing.
Edit: sorry Spod, I was a wee bit inconsiderate today, hope I did not hurt your feelings.
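To make the "only translate what you show" idea concrete, here is a rough in-memory sketch in Python. It is not the poster's schema: the dictionaries stand in for the word table and the index, and every name (word_id, postings, doc_words, search) and the sample documents are invented for illustration. The search works purely on integer IDs, and only the X documents that will be displayed are turned back into words.

```python
from collections import defaultdict

word_to_id = {}            # word -> WordID (stand-in for the unique index)
id_to_word = []            # WordID -> word (stand-in for "Table 1")
postings = defaultdict(set)  # WordID -> set of DocIDs
doc_words = {}             # DocID -> list of WordIDs


def word_id(word):
    """Return the ID for a word, assigning the next free ID on first sight."""
    if word not in word_to_id:
        word_to_id[word] = len(id_to_word)
        id_to_word.append(word)
    return word_to_id[word]


def add_document(doc_id, text):
    ids = [word_id(w) for w in text.split()]
    doc_words[doc_id] = ids
    for i in ids:
        postings[i].add(doc_id)


def search(query, limit):
    """Find docs containing every query word; decode only the first `limit` hits."""
    ids = [word_to_id.get(w) for w in query.split()]
    if not ids or None in ids:   # an unknown word means no document can match
        return []
    hits = set.intersection(*(postings[i] for i in ids))
    shown = sorted(hits)[:limit]
    # The ID -> word lookup happens here, for the displayed results only.
    return [(d, " ".join(id_to_word[i] for i in doc_words[d])) for d in shown]


# Hypothetical sample documents.
add_document(1, "cheap flights to paris")
add_document(2, "cheap hotels in paris")
add_document(3, "flights to rome")

print(search("cheap paris", limit=1))
```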
-
Guest replied
re:re
In real-world cases you will find that any mechanism has an overhead, and the redundancy in the data needs to be greater than that overhead. Remember, of course, that not all applications require 100% preservation of information.
If you think about it, there needs to be a way to encode information within the data to specify how the size has been reduced; that is itself information, so it requires space.
Have you read any of Shannon?
-
Guest replied
Re: re
2 bits, where only 2 of the possible 4 values are ever used.
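In Shannon's terms, that answer says the 2-bit field carries at most 1 bit of information per symbol. A small Python sketch of the entropy calculation, assuming (this is not stated in the thread) that the two used values are equally likely and independent; the bits_per_symbol helper and the sample stream are invented for illustration:

```python
import math
from collections import Counter


def bits_per_symbol(symbols):
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical stream of 2-bit fields in which only 0b00 and 0b11 ever occur.
stream = [0b00, 0b11] * 8

print(bits_per_symbol(stream))
```

If one of the two values turned up more often than the other, the figure would drop below 1 bit, which is the extra redundancy a compressor can exploit over a long enough stream.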
-
Guest replied
re
Apologies if this is a sh.it question. What is the smallest size chunk of data that can be compressed?
-
Guest replied
re
"no need to compress data - just make sure you eliminate redundant words (50% of data easy) and turn them into numbers - this will make tables and indices very compact. "
But if I turn words into wordID's I will still need to refer back to that hashtable just to turn them back into words again. The space I save by converting words to numbers is going to be used up again by the number-to-word lookup table.
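A back-of-envelope way to check that worry is to count bytes both ways. The Python sketch below is only an accounting exercise under loud assumptions: 4-byte IDs, one separator byte per raw word, no per-entry overhead for the hashtable, and a made-up token list. The point it illustrates is that the lookup table stores each distinct word once, while the index stores one entry per occurrence.

```python
def raw_size(tokens):
    """Bytes needed to store every occurrence as text, plus one separator byte each."""
    return sum(len(t.encode("utf-8")) + 1 for t in tokens)


def id_size(tokens, id_bytes=4):
    """Bytes for an ID per occurrence plus a lookup table holding each distinct word once."""
    vocabulary = set(tokens)
    index = len(tokens) * id_bytes
    lookup = sum(len(w.encode("utf-8")) + id_bytes for w in vocabulary)
    return index + lookup


# Hypothetical corpus: two words repeated many times.
tokens = ["contractor"] * 1000 + ["umbrella"] * 500

print("as text:", raw_size(tokens), "bytes")
print("as IDs :", id_size(tokens), "bytes")
```

With very short words that each occur only once the IDs come out larger than the text, so whether this wins depends on how often words repeat.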
-
Guest replied
ffs
No need to compress data - just make sure you eliminate redundant words (50% of data easy) and turn them into numbers - this will make tables and indices very compact. Intersect is not necessary; use that other query, which will work just fine. I suggest you download SQL Server and play around with Query Analyzer - it displays very nice query plans which show how efficient your query is - and the whole game is about doing enough offline to be able to run very tight queries in real time.
The key to achieving exceptional real-time performance is to do as much as possible offline.
And buy that book - even though I learnt most of this stuff on my own, I still found that book very useful. It should be even more so for you, since you have not gone through it all the hard way (trial and error).
Hash tables are not the best - you need a clustered index on it.
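For what the clustered index buys you: the rows are kept physically ordered by the index key, so everything for one key sits in one contiguous run and can be read with a single seek plus a scan. A rough Python stand-in (a sorted list with bisect instead of SQL Server's B-tree; the rows, the (WordID, DocID) layout and the docs_for helper are all assumptions for illustration):

```python
import bisect

# Hypothetical (WordID, DocID) rows, kept sorted the way a clustered index
# on (WordID, DocID) would order them on disk.
rows = sorted([(2, 12), (1, 10), (1, 11), (2, 10), (3, 10)])


def docs_for(word_id):
    """Return all DocIDs for one WordID by slicing one contiguous run of rows."""
    start = bisect.bisect_left(rows, (word_id,))      # first row with this WordID
    end = bisect.bisect_left(rows, (word_id + 1,))    # first row of the next WordID
    return [doc_id for _, doc_id in rows[start:end]]


print(docs_for(1))
```

A plain hashtable answers "does this key exist?" quickly, but it gives no ordering, so range scans and merge-style joins over sorted DocIDs are not available for free.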
-
Guest replied
re search engine
Cheers AtW, SupremeSpod, PerlOfWisdom.
I'll look into those techniques. I want to fit the entire index into memory (i.e. wordID, docID, locID) so response times are very quick. Unfortunately (fortunately?) I'm building my own database (a combination of hashtable and random-access files), so I can't use special features such as intersect - I will just use a null pointer as a test to see if a term exists. Also, I am building this into a web server so the whole package is just one application. AtW, did you compress your data in the database? I've found that this can cause a few problems.
-
Guest replied
Re: Just a suggestion...
"I just thought that the "SoundsLike" facility would be of use."
I've never used Soundex myself, but have seen it specified for various applications, normally lookups on people or place names.
To see what sort of results it gives try putting your own name into
resources.rootsweb.com/cg...xconverter
"smith" gives this little lot, by way of example:
SAINT | SAND | SANDY | SANTEE | SANTI | SCHMID | SCHMIDT | SCHMIT |
SCHMITT | SHAND | SHUMATE | SINNOTT | SMITH | SMITHEY | SMOOT |
SMOOTHY | SMYTH | SMYTHE | SNAITH | SNEAD | SNEATH | SNEED |
SNODDY | SOUNDY | SUNDAY |
Sunday sharing the same Soundex code as Smith ??
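It does look odd, but it follows from the rules: Soundex keeps the first letter, drops vowels, and maps the remaining consonants to digits, so S-m-th and S-n-d both reduce to S, then a nasal (5), then a dental (3). Below is a minimal Python sketch of the standard American Soundex rules; it is not the code behind the rootsweb converter, and the soundex function name is just for this example.

```python
def soundex(name):
    """Return the 4-character American Soundex code for a name ("" if no letters)."""
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")
        if code and code != previous:
            result += code          # new consonant sound: keep its digit
        if ch not in "HW":
            previous = code         # vowels reset the run; H and W do not
    return (result + "000")[:4]     # pad with zeros, truncate to 4 characters


for name in ("Smith", "Sunday", "Schmidt", "Snoddy"):
    print(name, soundex(name))
```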