

Previously on "Regexp help"


  • minestrone
    replied
    That is how you lose 2 hours doing the most trivial of things that should be easy.

    Cheers.



  • NickFitz
    replied
    Originally posted by minestrone View Post
    The spaces come out as 0x20 if they come from a 'real' space and 0xa0 if they come from an &nbsp;

    I put the XML through JTidy to get the DOM.

    Never even knew there were 2 types of space, and the debugger clearly never showed the difference.

    Reckon I'm just going to replace all the 0xa0s with 0x20s.

    Cheers.
    Yup, 0xa0 (aka 160) is the non-breaking space. Therefore,

    Code:
    [\s\xa0]-[\s\xa0]
    should do what you want
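
    For anyone trying this from Java: a minimal sketch of the pattern above inside String.split, with the backslashes doubled for the string literal. The class name NbspSplit, the helper splitTeams and the sample team names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; only the regex itself comes from the post above.
public class NbspSplit {
    // A hyphen with either whitespace or a no-break space (0xa0) on each side.
    static final String DELIMITER = "[\\s\\xa0]-[\\s\\xa0]";

    static String[] splitTeams(String teamNames) {
        return teamNames.split(DELIMITER);
    }

    public static void main(String[] args) {
        String plain = "name1 - name2";           // ordinary spaces (0x20)
        String nbsp = "name1\u00a0-\u00a0name2";  // no-break spaces (0xa0)
        System.out.println(Arrays.toString(splitTeams(plain))); // [name1, name2]
        System.out.println(Arrays.toString(splitTeams(nbsp)));  // [name1, name2]
    }
}
```

    By default Java's \s covers only space, tab, newline, vertical tab, form feed and carriage return, which is why 0xa0 has to be listed explicitly.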



  • minestrone
    replied
    The spaces come out as 0x20 if they come from a 'real' space and 0xa0 if they come from an &nbsp;

    I put the XML through JTidy to get the DOM.

    Never even knew there were 2 types of space, and the debugger clearly never showed the difference.

    Reckon I'm just going to replace all the 0xa0s with 0x20s.

    Cheers.



  • minestrone
    replied
    Originally posted by NickFitz View Post
    Hyphen-minus is 0x2d, en dash is 0x2013. Time for a hex dump
    just going to do that.



  • minestrone
    replied
    Ahh, the ones that work have " - " in their data but the ones that fail have "&nbsp;-&nbsp;".

    I am reading the XML data into a DOM and then going through it, but it looks like the two do not behave the same, even though both show up as spaces when I look at them in the debugger.



  • NickFitz
    replied
    Originally posted by minestrone View Post
    I have discovered that the ones that work come from a different list, and I can repeat the failure on every one from the other list. No idea why that is though, the chars look the same.
    Hyphen-minus is 0x2d, en dash is 0x2013. Time for a hex dump
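
    In case it helps, a rough sketch of such a dump in plain Java, printing each character as a hex code point so that look-alikes stand out. The class name CodePointDump, the method dump and the sample strings are hypothetical.

```java
// Hypothetical helper for eyeballing look-alike characters.
public class CodePointDump {
    // Returns the code points of s in hex, separated by single spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(cp));
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Plain space and hyphen-minus.
        System.out.println(dump("a - b"));                 // 61 20 2d 20 62
        // No-break spaces around an en dash.
        System.out.println(dump("a\u00a0\u2013\u00a0b"));  // 61 a0 2013 a0 62
    }
}
```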



  • minestrone
    replied
    System.out.println( teamNames );
    String[] splitNames = teamNames.split("\\s-\\s") ;
    System.out.println( splitNames.length );
    System.out.println( splitNames[0] );
    Gives

    name1 - name2
    1
    name1 - name2
    Arrghh

    Even went as far as copying the dash out of the debugger and also from the console to make sure the thing was the same as I was looking for.

    I have discovered that the ones that work come from a different list, and I can repeat the failure on every one from the other list. No idea why that is though, the chars look the same.



  • NickFitz
    replied
    Originally posted by minestrone View Post
    JDK 1.7.

    It seems to work OK for 100 examples then just barfs on one for no apparent reason other than the regexp.

    I'll try the others after me lunch.
    Check that the failing one is really what it seems to be. If it has, say, an en-dash (–) instead of a hyphen (-)...



  • minestrone
    replied
    JDK 1.7.

    It seems to work OK for 100 examples then just barfs on one for no apparent reason other than the regexp.

    I'll try the others after me lunch.



  • NickFitz
    replied
    Which regex implementation are you using? \s-\s works for me™ in JDK, Jakarta-ORO Perl 5, Jakarta-ORO AWK, JRegex and Jakarta-Regexp implementations.



  • Spacecadet
    replied
    try

    Code:
    [ ]-[ ]
    That's a space inside the square brackets


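    you can also use the POSIX space class, which has to sit inside a bracket expression (Java's java.util.regex spells it \p{Space} instead)

    Code:
    [[:space:]]-[[:space:]]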
    I've not had any problems with the former method
    Last edited by Spacecadet; 22 October 2009, 11:40.
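
    A quick sketch of how these suggestions behave under Java's String.split, assuming the java.util.regex engine. The class name SpaceClassDemo, the helper parts and the sample input are hypothetical. Note that a bare [:space:] is read by Java as an ordinary character class containing ':', 's', 'p', 'a', 'c' and 'e', so it does not match a space at all.

```java
// Hypothetical demo comparing three ways of writing "space, hyphen, space".
public class SpaceClassDemo {
    // Number of pieces produced by splitting input on regex.
    static int parts(String input, String regex) {
        return input.split(regex).length;
    }

    public static void main(String[] args) {
        String s = "name1 - name2";
        // A literal space inside square brackets.
        System.out.println(parts(s, "[ ]-[ ]"));                 // 2
        // Java's spelling of the POSIX space class.
        System.out.println(parts(s, "\\p{Space}-\\p{Space}"));   // 2
        // Bare POSIX-style name: just a set of letters and ':' in Java.
        System.out.println(parts(s, "[:space:]-[:space:]"));     // 1
    }
}
```

    None of these three matches a no-break space (0xa0); that character has to be listed explicitly.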



  • minestrone
    started a topic Regexp help

    Regexp help

    I'm trying to split a string and it wants a regexp to specify the delimiter.

    The delimiter should be " - " as in <space>-<space>.

    I have tried \\s-\\s but that still gives me the odd error.

    Anyone know how to exactly specify the delimiter?

    Cheers.
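
    One way to specify a delimiter exactly, sketched here on the assumption that the split in question is Java's String.split: wrap the literal text in Pattern.quote so that nothing in it is treated as regex syntax. The class name LiteralSplit and the helper splitOnLiteral are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper: split on a delimiter taken literally, not as a regex.
public class LiteralSplit {
    static String[] splitOnLiteral(String input, String delimiter) {
        // Pattern.quote escapes the whole delimiter for the regex engine.
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnLiteral("name1 - name2", " - "))); // [name1, name2]
    }
}
```

    This only matches the exact characters given, so a delimiter typed with ordinary spaces will not match data that contains a different kind of space or dash.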
