Phew, scaling this thing up was tough: with this much data no code path is left untested, and it's rather painful to have a long-running process die without being able to restart at the point of failure (it can do that now).
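For anyone wondering what restarting at the point of failure can look like, here is a minimal sketch, not the actual code behind the engine. It assumes the work is an ordered list of items and that progress is stored in a small JSON file; `process_with_checkpoint`, `handle`, and `checkpoint_path` are hypothetical names made up for this illustration.

```python
import json
import os


def process_with_checkpoint(items, handle, checkpoint_path):
    """Process items in order, resuming after the last completed one.

    Returns the index processing started from (0 on a fresh run).
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]

    for i in range(start, len(items)):
        handle(items[i])  # may raise; the checkpoint then still points at i
        # Write to a temp file and rename it, so a crash in the middle of
        # writing never leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": i + 1}, f)
        os.replace(tmp_path, checkpoint_path)
    return start
```

If `handle` dies on item N, the next run picks up at item N instead of starting over. Writing a checkpoint after every single item is the simplest thing that works; a real crawler would probably batch the writes.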
There are a lot more URLs now (~30 million + 7 million links), and it should run faster thanks to additional subindexing and whatnot. It is now possible to search for ".NET" or "C#".
I implemented the suggestion from this board to show META descriptions for sites that have them.
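In case it helps anyone doing something similar, pulling a META description out of a page can be sketched roughly like this. It is only an illustration using Python's standard `html.parser`, not the engine's own code; `MetaDescriptionParser` and `extract_description` are names invented for the example.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Remembers the first non-empty <meta name="description"> content."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta" or self.description is not None:
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "description":
            content = (attr.get("content") or "").strip()
            if content:
                self.description = content


def extract_description(html):
    """Return the page's META description, or None if it has none."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description
```

Pages without the tag simply return `None`, so the result page can fall back to whatever snippet it showed before.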
You can give it a spin here. Your comments will be much appreciated. Keep in mind, though, that the search engine currently runs on a box shared with the server, which slows searching down at times when the server is busy archiving; this will be fixed soon.