Reply to: Technical challenge
Previously on "Technical challenge"
-
Very good AtW. It's all very good, but so far your search engine returns even fewer matches than www.onlyscanned5sites.com
-
Ok, a bit disappointing not to see any estimates, but I had pretty low expectations in the first place.
Anyway, here are the results of benchmarking (on AMD Athlon x2 3800 - single core used):
0) no caching tricks - search pattern is expected to be evenly distributed
1) 100 mln unique strings of 20 bytes each, data size: 2 GB
2) Indexing takes 30 mins
3) Generated index size is ~810 MB.
4) Running searches for 10,000 randomly selected strings, with 100 runs (total 10 mln searches) results in a sustained performance of ~232,000 (that's thousands) searches per second.
The system supports multiple indices, so it's not as if everything is done just for 100 mln strings. Scalability is perfect: by adding a 2nd CPU you will get double the speed, and the same goes for extra machines, which will offer not only extra speed but also load balancing and redundancy.
Could you do better than that? I doubt it. Not because I think I am so ****ing amazing, but because the low-level algorithms used in the system are mathematically pretty much perfect: you can't cheat them with probabilistic modelling because searches will be evenly distributed, so caching tricks are off.
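The figures quoted in the benchmark above can be turned into a few derived numbers. This is only back-of-the-envelope arithmetic on the posted values, assuming "MB" and "GB" mean decimal units; the variable names are illustrative and nothing here describes how the index is actually built.

```python
# Figures as quoted in the post (units assumed decimal: 1 MB = 10**6 bytes).
n_strings = 100_000_000        # 100 mln unique strings
string_bytes = 20              # bytes per string
index_bytes = 810 * 10**6      # "~810 MB" index
indexing_secs = 30 * 60        # "30 mins"
searches_per_sec = 232_000     # sustained rate reported
required_per_sec = 50          # target from the original task

data_bytes = n_strings * string_bytes            # raw data size
bytes_per_entry = index_bytes / n_strings        # index bytes per string
micros_per_search = 1e6 / searches_per_sec       # average time per search
headroom = searches_per_sec / required_per_sec   # factor above the target
index_rate = n_strings / indexing_secs           # strings indexed per second

print(f"raw data:        {data_bytes / 1e9:.1f} GB")
print(f"index per entry: {bytes_per_entry:.1f} bytes")
print(f"per search:      {micros_per_search:.2f} microseconds")
print(f"headroom:        {headroom:.0f}x the 50/sec target")
print(f"indexing rate:   {index_rate:.0f} strings/sec")
```

At roughly 8 bytes per entry the index is smaller than the 2 GB of raw strings, which suggests it does not hold the full 20-byte strings, although the post does not say what it stores instead.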
-
Originally posted by xoggoth:
Provided it's on a PDP11/44.
-
I used to be a wiz at estimating hardware requirements in my sales support days so I can help. Provided it's on a PDP11/44.
-
Originally posted by AtW:
Benchmarking code is done, generating 100 mln unique strings index, actually slightly higher as the app needs to remove duplicates that may or may not be present, but in the original task I thought to go easy on you
Hey AtW, I can do it 10% quicker than your application. How do I know? Easy, I am a fking genius and anything I do would be much, much better.
-
Benchmarking code is done, generating 100 mln unique strings index, actually slightly higher as the app needs to remove duplicates that may or may not be present, but in the original task I thought to go easy on you
-
Have a look at the operating system TPF to do this, or IMS running under zOS on a z9 EC 54 way. These systems do this type of thing 24/7.
-
Originally posted by PRC1964:
At this point I gave up. There is no point in me living any more. Some fat moustachioed Russian immigrant in a bedsit in Birmingham has the answer to all the world's problems and it is impossible for anyone to be better than him.
If only I could have such a great life.
Next ....
-
I will have my benchmarks in just a few minutes, hurry up with your own estimates because I will actually get it all working
-
don't worry about getting non-optimal answer as it is pretty much impossible to get it done better than my approach.
At this point I gave up. There is no point in me living any more. Some fat moustachioed Russian immigrant in a bedsit in Birmingham has the answer to all the world's problems and it is impossible for anyone to be better than him.
If only I could have such a great life.
-
All very well AtW, wondering about the system, but you ought to know: most of us here could've got the client to pay us more to do the same thing.
threaded in "and here endeth the first lesson" mode
-
There is no current system - new system developed by me is totally brand new, which is in any case irrelevant - what is relevant is that you know what the system should do and I want your own estimate on the basis of your own experience.
How would you even approach this problem - use database?
Anything goes - don't worry about getting a non-optimal answer as it is pretty much impossible to get it done better than my approach.
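As one possible baseline answer to the "how would you approach it" question, and not a description of the system discussed in this thread, here is a minimal sketch for the task as stated: fixed-width 20-byte strings, returning a numeric ID or reporting that the string is absent. It assumes the whole list fits in memory as one sorted flat buffer and uses the record position as the ID; `RECORD_LEN`, `make_key`, `build_index` and `lookup` are hypothetical names invented for the example.

```python
import hashlib

RECORD_LEN = 20  # every string is exactly 20 bytes, as in the task


def make_key(i):
    """Hypothetical test data: SHA-1 digests happen to be 20 bytes long."""
    return hashlib.sha1(str(i).encode()).digest()


def build_index(strings):
    """Remove duplicates, sort, and pack the records into one flat buffer."""
    unique = sorted(set(strings))
    if any(len(s) != RECORD_LEN for s in unique):
        raise ValueError("every string must be exactly 20 bytes")
    return b"".join(unique)


def lookup(blob, key):
    """Binary search over the buffer; return the record position or None."""
    count = len(blob) // RECORD_LEN
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        record = blob[mid * RECORD_LEN:(mid + 1) * RECORD_LEN]
        if record < key:
            lo = mid + 1
        else:
            hi = mid
    if lo < count and blob[lo * RECORD_LEN:(lo + 1) * RECORD_LEN] == key:
        return lo  # position in sorted order serves as the RowID equivalent
    return None


if __name__ == "__main__":
    blob = build_index(make_key(i) for i in range(100_000))
    print(lookup(blob, make_key(42)) is not None)   # present
    print(lookup(blob, make_key(-1)))               # absent
```

With 100 mln records a binary search needs about 27 comparisons per lookup, and the buffer itself would be the full 2 GB of raw data, so this sketch trades memory for simplicity.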
-
Originally posted by AtW:
ok, here is what you have: 100 mln unique strings - 20 bytes each
Your job is to create a system that would allow you to either confirm that a given string does not exist in the list of those unique strings you have, or, if it does exist, return a unique numeric ID that you can associate with each of those unique strings - the database equivalent of RowID.
In terms of performance the system should allow for at least 50 searches per second, or in other words one search should not take more than 0.020 sec.
How fast my system performs (writing benchmark code now) is irrelevant since I want to know what kind of system YOU would need to build in order to get around that performance. It does not have to be exact, say 10 servers with X GB ram and Y CPUs each using Oracle etc: estimated cost £Z.
MarillionFan: if you can't give me an estimate of work then you ain't getting the contract in the first place because clearly you would not have a clue how to execute it in a way that puts my interests first, ie: a very cost effective solution.
Current performance is very relevant since we have no idea of what the software is capable of on its current platform, so we have no indication of the loads it is likely to put on the hardware.
If you are getting X% of your desired performance on a particular hardware platform then we can start to predict performance on other configurations, taking into account OS overheads, disk speeds etc.
Is the idea of this exercise to show how much cheaper your solution is than an Oracle based one?
Anyway, I'm off to do something more interesting with my free time than take on unpaid systems design work
-
Originally posted by AtW:
ok, here is what you have: 100 mln unique strings - 20 bytes each
Your job is to create a system that would allow you to either confirm that a given string does not exist in the list of those unique strings you have, or, if it does exist, return a unique numeric ID that you can associate with each of those unique strings - the database equivalent of RowID.
In terms of performance the system should allow for at least 50 searches per second, or in other words one search should not take more than 0.020 sec.
How fast my system performs (writing benchmark code now) is irrelevant since I want to know what kind of system YOU would need to build in order to get around that performance. It does not have to be exact, say 10 servers with X GB ram and Y CPUs each using Oracle etc: estimated cost £Z.
MarillionFan: if you can't give me an estimate of work then you ain't getting the contract in the first place because clearly you would not have a clue how to execute it in a way that puts my interests first, ie: a very cost effective solution.
Remember what I said earlier about Aspergers Syndrome? This is a classic symptom, not having the basic empathy to understand that your post is quite probably the most boring ever to be posted on CUK ...
If Carlsberg did boring threads, this would be it ...