
Regexp help



    I'm trying to split a string, and the method wants a regexp to specify the delimiter.

    The delimiter should be " - ", as in <space>-<space>.

    I have tried \\s-\\s but that still gives me the odd error.

    Anyone know how to specify the delimiter exactly?

    Cheers.

    #2
    try

    Code:
    [ ]-[ ]
    That's a space inside the square brackets.


    you can also use

    Code:
    [[:space:]]-[[:space:]]
    I've not had any problems with the former method
    Last edited by Spacecadet; 22 October 2009, 11:40.
    Coffee's for closers



      #3
      Which regex implementation are you using? \s-\s works for me™ in JDK, Jakarta-ORO Perl 5, Jakarta-ORO AWK, JRegex and Jakarta-Regexp implementations.



        #4
        JDK 1.7.

        It seems to work OK for 100 examples, then just barfs on one for no apparent reason other than the regexp.

        I'll try the others after me lunch.



          #5
          Originally posted by minestrone
          JDK 1.7.

          It seems to work OK for 100 examples, then just barfs on one for no apparent reason other than the regexp.

          I'll try the others after me lunch.
          Check that the failing one is really what it seems to be. If it has, say, an en-dash (–) instead of a hyphen (-)...



            #6
            Code:
            System.out.println(teamNames);
            String[] splitNames = teamNames.split("\\s-\\s");
            System.out.println(splitNames.length);
            System.out.println(splitNames[0]);
            Gives

            Code:
            name1 - name2
            1
            name1 - name2
            Arrghh

            Even went as far as copying the dash out of the debugger and also from the console to make sure the thing was the same as I was looking for.

            I have discovered that the ones that work come from a different list, and I can reproduce the failure on every one from the other list. No idea why that is though, the chars look the same.



              #7
              Originally posted by minestrone
              I have discovered that the ones that work come from a different list, and I can reproduce the failure on every one from the other list. No idea why that is though, the chars look the same.
              Hyphen-minus is 0x2d, en dash is 0x2013. Time for a hex dump
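
              A minimal sketch of that kind of dump in Java, assuming the suspect string is already in a variable. The class name CodePointDump and the method dump are made up for illustration; the idea is just to print each character as a hex code point so lookalike characters stand out.

```java
public class CodePointDump {

    // Returns the code points of s as lowercase hex values separated by spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(cp));
            // Advance by one or two chars, depending on whether cp is a surrogate pair.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Ordinary space, hyphen-minus, ordinary space.
        System.out.println(dump("a - b"));            // 61 20 2d 20 62
        // Same hyphen-minus, but with an en dash in its place.
        System.out.println(dump("a \u2013 b"));       // 61 20 2013 20 62
        // Hyphen-minus surrounded by no-break spaces.
        System.out.println(dump("a\u00a0-\u00a0b"));  // 61 a0 2d a0 62
    }
}
```

              All three strings look much the same in a console or debugger, but the hex values differ.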



                #8
                Ahh, the ones that work have " - " in their data but the ones that fail have &nbsp;-&nbsp;.

                I am reading the XML data into a DOM then going through it but it looks like the 2 are not working the same even though they come out as spaces for both when I look at them in the debugger.



                  #9
                  Originally posted by NickFitz
                  Hyphen-minus is 0x2d, en dash is 0x2013. Time for a hex dump
                  just going to do that.



                    #10
                    The spaces come out as 0x20 if they come from a 'real' space and 0xa0 if they come from a &nbsp;

                    I put the XML through JTidy to get the DOM.

                    Never even knew there were 2 types of space, and the debugger clearly never showed the difference.

                    Reckon I'm just going to replace all the 0xa0s with 0x20.

                    Cheers.
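
                    For anyone hitting the same thing, here is a rough sketch of two ways round it in Java. It assumes the only odd character is the no-break space (U+00A0, which is what &nbsp; decodes to) and relies on the fact that Java's \s does not match U+00A0 by default. The class and method names (NbspSplit, splitTeams, normalizeSpaces) are invented for the example.

```java
import java.util.Arrays;

public class NbspSplit {

    // Hyphen-minus with either ordinary whitespace or a no-break space on each side.
    static final String DELIMITER = "[\\s\\u00A0]-[\\s\\u00A0]";

    // Option 1: make the delimiter regexp accept both kinds of space.
    static String[] splitTeams(String teamNames) {
        return teamNames.split(DELIMITER);
    }

    // Option 2: turn every no-break space into an ordinary space before splitting.
    static String normalizeSpaces(String s) {
        return s.replace('\u00A0', ' ');
    }

    public static void main(String[] args) {
        String plain = "name1 - name2";
        String nbsp = "name1\u00A0-\u00A0name2";

        // The original pattern splits the plain string but not the no-break one.
        System.out.println(plain.split("\\s-\\s").length);  // 2
        System.out.println(nbsp.split("\\s-\\s").length);   // 1

        // Either option handles both inputs.
        System.out.println(Arrays.toString(splitTeams(nbsp)));
        System.out.println(Arrays.toString(normalizeSpaces(nbsp).split("\\s-\\s")));
    }
}
```

                    Normalizing up front is probably the simpler choice if the names are compared or displayed later, since a stray no-break space inside a name would otherwise survive the split.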
