Previously on "Regular Expressions"


  • Guest replied
    regex

    Ah, but if I do that in Java, it takes a lot longer. Basically, the scenario is this: I've got a web crawler, written in Java and running across multiple JVMs, that looks for news and needs to recognise certain terms to dump into the DB. Because it's multithreaded, the algorithm needs to be as tight as possible or else it'll bottleneck.
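
A minimal Java sketch of one way to keep the matching cheap in a multithreaded crawler, assuming the character-class pattern from the Perl reply in this thread. The class and method names (`CityMatcher`, `mentionsLens`) are hypothetical. It relies on two documented properties of java.util.regex: a compiled `Pattern` is immutable and safe to share between threads, while a `Matcher` is not, so each call creates its own. No timings are claimed here; measure before relying on it.

```java
import java.util.regex.Pattern;

final class CityMatcher {
    // Compiled once and shared: Pattern instances are immutable and thread-safe.
    private static final Pattern LENS = Pattern.compile(" Lens[ ,.']");

    // Matcher instances are not thread-safe, so a fresh one is created per call.
    static boolean mentionsLens(String text) {
        return LENS.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(mentionsLens("He moved to Lens, France."));
        System.out.println(mentionsLens("A camera Lensman arrived."));
    }
}
```

Compiling the pattern inside the per-page loop would repeat the compilation work on every call, which is the usual thing to rule out first when a Java regex seems slow.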



  • Guest replied
    Re: regex

    Code:
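     use strict;
     use warnings;

     # Only candidates with a leading space and one of [ ,.'] after "Lens" should print.
     foreach my $s (" Lens ", " Lens,", " Lens.", " Lens'", " Lens\\", "Lens", "Lens ", " Lens")
       {
       if ($s =~ / Lens[ ,.']/)   # "." is literal inside a character class
          {
          print "--$s--\n";
          }
       }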
    
    
    -- Lens --
    -- Lens,--
    -- Lens.--
    -- Lens'--



  • Guest replied
    regex

    Reynolds

    Don't forget about start-of-line conditions, e.g.

    blah blah balh.
    Lens is a wonderful place.
    blah blah


    Of course, I forgot that Java itself will strip \\ down to \ in a string literal, so you were right about using \\. to match a dot. Sorry, I haven't done any Java for a long time; I tend to use regex in awk.

    There is a good reference: look at the tutorials at
    www.javaregex.com, specifically tutorial 3.
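
The start-of-line case can be sketched in Java as below. This is an illustration, not the poster's code: it assumes that "Lens" may be preceded either by the start of the input or by any whitespace (a newline counts as whitespace), and the names `LineStartMatcher` and `mentionsLens` are hypothetical. Note the doubled backslash in `\\s`: the Java compiler turns it into the single backslash that the regex engine sees.

```java
import java.util.regex.Pattern;

final class LineStartMatcher {
    // (?:^|\s) accepts the start of the input or one whitespace character,
    // so "Lens" at the beginning of a line is matched as well.
    private static final Pattern LENS = Pattern.compile("(?:^|\\s)Lens[ ,.']");

    static boolean mentionsLens(String text) {
        return LENS.matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "blah blah balh.\nLens is a wonderful place.\nblah blah";
        System.out.println(mentionsLens(text));
    }
}
```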



  • Guest replied
    re

    Cheers, whats. Perl, the space should be at the start of the 4th as well.



  • Guest replied
    Re: regex

    Can't be arsed to optimise, but the following Perl regex should work:

    my $RegExp = qr/ Lens( |,|\.|')/;

    If it's performance-critical (it did not sound like it), I'd do it in two stages:

    my $RegExp=" Lens.";

    '.' will match any character, and in case of a match I'd then check, using a simple switch, what that character is.

    Oh yeah, Perl's question regarding the leading space in the 4th stands; I assumed the space is required in all cases.
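
The two-stage idea above, translated into a Java sketch. The names `TwoStageMatcher` and `mentionsLens` are hypothetical, and the leading space is assumed to be required in all cases. Stage one uses the loose pattern " Lens." with the wildcard captured in a group; stage two inspects that one character with a switch. Whether this beats a single character-class pattern is not established here and would need measuring.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TwoStageMatcher {
    // Stage one: " Lens" followed by any single character, captured as group 1.
    private static final Pattern CANDIDATE = Pattern.compile(" Lens(.)");

    static boolean mentionsLens(String text) {
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            // Stage two: accept only the delimiters the question asks for.
            switch (m.group(1).charAt(0)) {
                case ' ':
                case ',':
                case '.':
                case '\'':
                    return true;
                default:
                    break; // some other character; keep scanning for later candidates
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(mentionsLens("Visit Lensman, then Lens."));
        System.out.println(mentionsLens("Visit Lens-sur-mer"));
    }
}
```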



  • Guest replied
    Re: regex

    Are the leading spaces in the first three cases mandatory (and not in the fourth)?



  • Guest replied
    regex

    Off the top of my head, because I can't be bothered to check it:
    you need java.util.regex, and I think the pattern is

    "[ ]*Lens[ ,\.\']+"


    I'd expect \. to find a full stop; with \\. I would expect to find a backslash followed by any character.



  • Guest started a topic Regular Expressions

    Regular Expressions

    I'm looking for a way to match the city Lens in a variety of texts. It has to match " Lens ", " Lens,", " Lens." and "Lens'"

    Does anyone know the best way of doing this with a Java/Perl regular expression?

    Is this any good?

    String j="Lens";
    String pattern = " "+j+" | "+j+",| "+j+"\\.|"+j+"'";

    And to find any word followed by a full stop, does the full stop have to be written like this: \\. ?
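
On the full-stop question, a short Java sketch of what the doubled backslash does. The class and method names (`DotEscaping`, `followedByFullStop`) are hypothetical. In Java source, the literal "\\." is a two-character string, a backslash and a dot, which the regex engine reads as "a literal full stop"; an unescaped dot would match any character. `Pattern.quote` is used so that the word itself is also treated literally.

```java
import java.util.regex.Pattern;

final class DotEscaping {
    // True if the given word appears in the text immediately followed by a full stop.
    static boolean followedByFullStop(String word, String text) {
        return Pattern.compile(Pattern.quote(word) + "\\.").matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println("\\.".length());              // 2: a backslash and a dot
        System.out.println("LensX".matches("Lens."));    // true: "." matches any character
        System.out.println("LensX".matches("Lens\\."));  // false: only a literal "." is accepted
        System.out.println(followedByFullStop("Lens", "We stayed in Lens."));
    }
}
```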
