• Visitors can check out the Forum FAQ by clicking this link. You have to register before you can post: click the REGISTER link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. View our Forum Privacy Policy.
  • Want to receive the latest contracting news and advice straight to your inbox? Sign up to the ContractorUK newsletter here. Every sign up will also be entered into a draw to WIN £100 Amazon vouchers!

FFS Microsoft

Collapse
X
  •  
  • Filter
  • Time
  • Show
Clear All
new posts

    #61
    Originally posted by AtW
    Look dude, let me explain it in terms that even you should be able to comprehent:

    a) using Array.Sort in my application made that sorting account for about 75% of runtime for one typical class of data: I know this for certain as I use profiler extensively, it lead me to Microsoft's source

    b) I took source code of Array.Sort and optimised and put into my own class Sorter.Sort - I use same data, every thing else is same: now sorting is less than 10% of run time

    Is that clear? If I could achieve such a major speed up without changing data that was sorted or anything else, then something is bloody wrong if Microsoft can't do it out of the box.

    madhippy: the application in question is a search engine that should scale as efficiently as possible to billions of pages.
    Just out of interest, what type and set size are you sorting and what version of the runtime are you using (1.1 or 2.0)? Also what is the application precisely - it may be the wrong use if you have to blow a large chunk of time on a sort like that.

    Microsoft provide "generic" sorting functionality but if you need anything high performance it's up to you. There is no such thing as a high performance, generic type-independent sorting algorithm - you have specialised it to your requirements.

    Oh and I'm self taught as well Via Knuth's books and experimentation.
    Last edited by TheMonkey; 2 August 2006, 08:25.
    Serving religion with the contempt it deserves...

    Comment


      #62
      Originally posted by Diestl
      You can pick up a computer and software, along with books and internet connection fairly cheap. Using this you can learn programming and nobody gets hurt, same can't be said for brain surgery unless you go digging fresh graves.
      Plenty of people get hurt - look at every failed IT project - one common cause is developer incompetence as displayed by AtW.

      Comment


        #63
        A few years back I used to do a lot of VC++/MFC/COM stuff and it was quite common to find problems with stability or performance within MS SDK libraries that required me to write my own version. I think its fair to say that it comes with the territory when working with Microsoft.

        IMHO C# is a good-enough language but .NET libraries need more release cycles before they are "prime-time".

        I'm surprised you didn't use Java, I'll admit its been a bit rubbish for years but 1.5 has made a substantial leap forward.

        Comment


          #64
          Originally posted by TheMonkey
          Just out of interest, what type and set size are you sorting and what version of the runtime are you using (1.1 or 2.0)?
          I use both - profiling 1.1 build since profiler hangs in one place on 2.0 build, must be a bug it it.

          All sorting is done on primitive types: int[] and byte[], no custom comparers (in this case) so it is natural to expect very high performance.

          The application is the search engine that deals with hundreds of millions of pages - I try to avoid sorting where possible, but in this case its not possible to easily avoid it, in fact its not possible at all, just could be different type of sorts used (ie merge sort), but with small enough dataset it should not be necessary that's why I resisted rewriting it and instead just fixed Microsoft's bug.

          Generics are better in this respect, however Microsoft still got same bug in them - look at the Microsoft's source code for Array.Sort to see what I mean, or better look under profiler which code path in their sorting is actually being executed - if its SortGenericArray then you are stuffed because its grossly inefficient and should not be called at all for primitive int[], byte[] etc types.

          C# and .NET are good - say I can use pointers and have direct memory access, this sped up some parts by factor of two, try doing same in Java, my #1 competitor is using it for their search engine, they started earlier and so far I've got superior stuff

          Comment


            #65
            Originally posted by zeitghost
            I can't believe that I'm reading a discussion on sorting in General...
            hush, zg, we're trying to raise the tone a little...

            Comment


              #66
              Grab his tail quick!

              Comment


                #67
                Originally posted by AtW
                I use both - profiling 1.1 build since profiler hangs in one place on 2.0 build, must be a bug it it.

                All sorting is done on primitive types: int[] and byte[], no custom comparers (in this case) so it is natural to expect very high performance.

                The application is the search engine that deals with hundreds of millions of pages - I try to avoid sorting where possible, but in this case its not possible to easily avoid it, in fact its not possible at all, just could be different type of sorts used (ie merge sort), but with small enough dataset it should not be necessary that's why I resisted rewriting it and instead just fixed Microsoft's bug.

                Generics are better in this respect, however Microsoft still got same bug in them - look at the Microsoft's source code for Array.Sort to see what I mean, or better look under profiler which code path in their sorting is actually being executed - if its SortGenericArray then you are stuffed because its grossly inefficient and should not be called at all for primitive int[], byte[] etc types.

                C# and .NET are good - say I can use pointers and have direct memory access, this sped up some parts by factor of two, try doing same in Java, my #1 competitor is using it for their search engine, they started earlier and so far I've got superior stuff
                It not a fu*king bug diptulip - it is an abstract general purpose piece of code - don't talk to me about templates - they just lead to excessive compile bloat.
                If you want a high performace sort routine or Random number generator or anything else high performance you write it yourself or get the code off the net - it's not hard - it is a solved problem.

                Comment


                  #68
                  Originally posted by AtW
                  I use both - profiling 1.1 build since profiler hangs in one place on 2.0 build, must be a bug it it.

                  All sorting is done on primitive types: int[] and byte[], no custom comparers (in this case) so it is natural to expect very high performance.

                  The application is the search engine that deals with hundreds of millions of pages - I try to avoid sorting where possible, but in this case its not possible to easily avoid it, in fact its not possible at all, just could be different type of sorts used (ie merge sort), but with small enough dataset it should not be necessary that's why I resisted rewriting it and instead just fixed Microsoft's bug.

                  Generics are better in this respect, however Microsoft still got same bug in them - look at the Microsoft's source code for Array.Sort to see what I mean, or better look under profiler which code path in their sorting is actually being executed - if its SortGenericArray then you are stuffed because its grossly inefficient and should not be called at all for primitive int[], byte[] etc types.

                  C# and .NET are good - say I can use pointers and have direct memory access, this sped up some parts by factor of two, try doing same in Java, my #1 competitor is using it for their search engine, they started earlier and so far I've got superior stuff
                  Yeah I've looked at it and I agree with you (reflector is a man's best friend). If you're not using an Icomparer it should be bloody fast!

                  Not wishing to sound horrible and waste effort, but have you tried dotLucene for a search engine?
                  Serving religion with the contempt it deserves...

                  Comment


                    #69
                    Originally posted by TheMonkey
                    Yeah I've looked at it and I agree with you (reflector is a man's best friend). If you're not using an Icomparer it should be bloody fast!
                    Exactly - which I was not using in that case, yet it was insisting on using the worst case sorter which was dog slow, this seems to happen because they use some rare .net op to take reference on array or something, but if it fails then resort to slowest method, now I don't know why the feck it was failing, but when I used 2nd slow method directly in my code (with explicit type like int[]) then it all works fast: I've got a couple more improvements for their loop in mind (like removing at least one unneecssary IF), plus variable swap is not efficient - might rewrite it in assembly to get maximum speed.

                    Originally posted by TheMonkey
                    Not wishing to sound horrible and waste effort, but have you tried dotLucene for a search engine?
                    No, and I am a lot better technologically than dotLucene - its not scalable for multi-billion page indices, I've got 1 bln pages now, and plan to add many more in a couple of months, which is why I am profiling code to make sure its fast.

                    Comment


                      #70
                      Originally posted by Jabberwocky
                      It not a fu*king bug diptulip - it is an abstract general purpose piece of code - don't talk to me about templates - they just lead to excessive compile bloat.
                      If you want a high performace sort routine or Random number generator or anything else high performance you write it yourself or get the code off the net - it's not hard - it is a solved problem.
                      Yeah but the performance isn't exactly scalable. They should have thought about it a little more.

                      Templates/generics are fine in C# - they are compiled to machine code with the final types at runtime. Very very very fast.
                      Serving religion with the contempt it deserves...

                      Comment

                      Working...
                      X