
Thought for the day


    #11
    I mean this:

    public int iPos
    {
        get { return oBits[oPos]; }
        set { oBits[oPos] = (value < POS_MAX ? value : POS_MAX); }
    }

    oBits is a BitVector32 and oPos is a BitVector32.Section. This code works slowly not because it's a function call but because of BitVector32 itself - I already changed a few places in the past to deal with hex bitmasks directly. This code is likely to be inlined since it's small, and the performance gains were in the region of 1000%.

    The alternative code that you have written MAY be faster than bog-standard indexing of a BitVector32, but it pays to calculate the masks manually and use them as constants - sure, it's a pain in the arse to maintain, but the performance improvements are far too high to ignore.
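    For illustration, here's a minimal sketch of what replacing the BitVector32.Section access with a hand-computed hex mask might look like - the 6-bit layout, the POS_MASK value and the class name are my own assumptions, not the actual constants from the code above:

    ```csharp
    using System;

    class BitMaskDemo
    {
        // Assumed field layout: the low 6 bits hold a position (0..63).
        public const int POS_MASK = 0x3F;  // hand-computed mask for a 6-bit field
        public const int POS_MAX = 63;

        static int data;  // raw bits, standing in for the BitVector32

        // Direct-mask equivalent of the iPos property above:
        // no Section bookkeeping, just an AND and an OR.
        public static int Pos
        {
            get { return data & POS_MASK; }
            set { data = (data & ~POS_MASK) | (value < POS_MAX ? value : POS_MAX); }
        }

        static void Main()
        {
            Pos = 42;
            Console.WriteLine(Pos);   // 42
            Pos = 100;                // clamped to POS_MAX
            Console.WriteLine(Pos);   // 63
        }
    }
    ```

    The maintenance pain AtW mentions is real: if the field layout changes, every hand-written mask has to be recomputed by hand.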

    Comment


      #12
      i'd have thought that was an ideal area for some nifty assembly. or better still, [why spoil a friday night?], some good old binary...

      Comment


        #13
        Originally posted by scotspine
        i'd have thought that was an ideal area for some nifty assembly.
        Problem with assembly is that you can't do it directly in .NET - you have to link a DLL and make a function call, and the cost of the call will exceed the benefits of the assembly unless you use it in "batch" mode so that one call does lots of work. This is something I am going to do next week, as I have a couple of sections doing some custom decompression that can be sped up big time if I have complete control over the registers.
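        A hedged sketch of that "batch" idea - the DllImport line is commented out because "decompress.dll" and its Decompress export are made-up names, and the managed stand-in just shows the shape: one call per buffer rather than one call per element, so the managed-to-native transition cost is paid once:

        ```csharp
        using System;

        class BatchDemo
        {
            // Hypothetical native entry point - in a real project the assembly
            // routine would be exported from a DLL and declared like this:
            // [System.Runtime.InteropServices.DllImport("decompress.dll")]
            // static extern int Decompress(byte[] src, int srcLen, byte[] dst, int dstLen);

            // Managed stand-in: one call transforms the whole buffer, so the
            // (expensive) call boundary is crossed once instead of src.Length times.
            public static void DecodeBatch(byte[] src, byte[] dst)
            {
                for (int i = 0; i < src.Length; i++)
                    dst[i] = (byte)(src[i] ^ 0x55);   // trivial placeholder transform
            }

            static void Main()
            {
                byte[] src = { 0x00, 0x55, 0xFF };
                byte[] dst = new byte[src.Length];
                DecodeBatch(src, dst);
                Console.WriteLine(BitConverter.ToString(dst));   // 55-00-AA
            }
        }
        ```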

        Comment


          #14
          great stuff as long as you know which cpu[s] you're targeting

          or specifically, instruction set

          Comment


            #15
            Originally posted by scotspine
            great stuff as long as you know which cpu[s] you're targeting
            Well, I am not planning to target anything apart from x86, so the code should run on everything, but I am actually interested in SSE instructions (supported by pretty much all modern x86s now) - they might just be ideal to speed up data processing nicely - next week is going to be interesting for SKA

            Comment


              #16
              Originally posted by VectraMan
              That's just stupid. They're both template container classes to help you out, neither are meant to be the best possible solution. If you really wanted performance, you should have used your own classes and optimised it to do exactly what you needed. And we don't know that you're comparing like with like here.

              Frankly I've always thought STL was needlessly over complicated, and that the MFC container classes are much easier to use.
              Not really. If you want reasonable performance off the shelf, then the STL is okay. And it is powerful. But yeah, roll your own if you want the best performance. Though that can take longer to implement.

              For my recent heap walk code I wrote my own simple list classes, as a) they run much faster than the STL and b) they were easy to write. And that allowed me to use a simple pooled allocation scheme to avoid mallocs.
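              Fungus's code is presumably C++, but the pooled idea translates; a minimal sketch (class and member names are mine, not his) of a linked list that recycles nodes through a free list instead of allocating on every insert:

              ```csharp
              using System;

              // Minimal pooled singly linked list of ints. Popped nodes go onto a
              // free list and are reused by later pushes, so steady-state use does
              // no per-insert allocation - the same idea as avoiding per-node
              // malloc in the C++ version.
              class PooledIntList
              {
                  class Node { public int Value; public Node Next; }

                  Node head;      // list contents
                  Node free;      // pool of recycled nodes
                  public int Count { get; private set; }

                  public void Push(int value)
                  {
                      Node n;
                      if (free != null) { n = free; free = n.Next; }   // reuse pooled node
                      else n = new Node();                             // pool empty: allocate
                      n.Value = value;
                      n.Next = head;
                      head = n;
                      Count++;
                  }

                  public int Pop()
                  {
                      Node n = head;
                      head = n.Next;
                      n.Next = free;   // return node to the pool
                      free = n;
                      Count--;
                      return n.Value;
                  }
              }
              ```

              After a Pop, the next Push reuses that node rather than touching the allocator at all.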

              IMO the STL is well worth learning and more powerful than MFC.

              I tend to roll my own whenever possible, and then save copies of my libraries, and re-use them at the next client.

              A lot of the people at my current client are ex-UNIX bods, and there's an awful lot of open source code from Apache and others. It is another world.

              Fungus

              Comment


                #17
                Originally posted by AtW
                Well, I am not planning to target anything apart from x86, so the code should run on everything, but I am actually interested in SSE instructions (supported by pretty much all modern x86s now) - they might just be ideal to speed up data processing nicely - next week is going to be interesting for SKA
                When I've done performance tuning I've always come across at least one of my juniors making the same assumption.

                It is a bad assumption.

                Even the size of the L2 cache can have significance.

                threaded in "I've achieved superlinearity, more than once, so ner ner" mode
                Insanity: repeating the same actions, but expecting different results.

                Comment


                  #18
                  Originally posted by threaded
                  When I've done performance tuning
                  If this is as true as your performance cars then all I can say is -

                  Comment
