Previously on "Data matching, merging and cleansing"

  • TGAOTU
    replied
    You could try giving Datactics a call.

    Here is their website.


  • DataDelta
    replied
    match/merge vendor list on DataDelta website

    There are several high-end vendors that fit the profile you describe. My firm's website lists the main ones, and we offer both a comparison guide and a unique software tool that helps companies scientifically compare vendor results to help select the best one:
    www.DataDelta.com

    Here is a partial list of some of the main ones:
    Trillium Software - www.trilliumsoftware.com
    Firstlogic - www.firstlogic.com
    DataFlux - www.dataflux.com (acquired by SAS)
    Ascential - www.ascential.com (acquired by IBM)
    DWL - www.dwl.com (acquired by IBM)
    Innovative Sys - www.innovativesystems.com
    Initiate Sys - www.initiatesystems.com
    Datanomic - www.datanomic.com (UK based)

    For 100 million records, these vendors can easily cost US$500k+ when all the costs are factored in, and prices of US$1M are not uncommon.

    I'm happy to chat with you to help point you in the right direction & save you some time.

    Regards,
    Ed Allburn, Pres/CEO DataDelta
    [email protected]


  • t0bytoo
    replied
    Trillium data cleansing tools could be what you need.
    I worked on an implementation at a large US utility company.
    It's efficient, and wonderfully expensive.

    http://www.trilliumsoftware.com/site...olutionset.asp


  • Moose
    replied
    If you're validating addresses etc. at point of entry, then I still think Quick Address is your best answer, as you can check against the electoral roll in real time as well as having a batch option for cleansing large datasets.

    We used it at News Int for cleansing about 20 million addresses and it handled that no probs, plus we also used it for our data entry systems.

    All the other systems we looked at (we looked at HelpIt for example) just didn't come up to scratch.

    That being said, this was 3 years ago, so the other available systems could have got a lot better.


  • Skeptical
    replied
    Another one

    Check this one... http://www.helpit.com/, seems they have the features but they don't cite performance metrics.


  • DimPrawn
    replied
    Originally posted by nobody here but us chicke
    easy, outsource it to india.
    will solve all your ID card probs.

    do you only need English or also foreign names/addresses ?
    UK data. Which means a large number of foreign names and addresses of course.


  • nobody here but us chicke
    replied
    errr

    easy, outsource it to india.
    will solve all your ID card probs.

    do you only need English or also foreign names/addresses ?


  • DimPrawn
    replied
    The data is highly sensitive and needs security clearance to view, and the matching and merging must be done in a secure server environment in real time. The data has been entered by officials, so swearing is unlikely.

    Problem with all the systems out there is:

    1. They are too feeble. Must match and merge new data against millions of records in a few milliseconds, all real time, not batch processing.
    2. Too cheap.
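
One common way to keep per-record matching fast against a very large table is "blocking": index existing records by a coarse key so that each new record is compared only against a small candidate set. The following is a minimal sketch of that idea, not a description of how any product mentioned in this thread works. The key (outward postcode plus surname initial), the field names and the `BlockIndex` class are all assumptions made for illustration, and the sketch assumes postcodes contain a space.

```python
from collections import defaultdict


def blocking_key(record):
    """Hypothetical blocking key: outward postcode plus surname initial."""
    parts = record.get("postcode", "").upper().split()
    outward = parts[0] if parts else ""
    surname = record.get("surname", "").strip().upper()
    return (outward, surname[:1])


class BlockIndex:
    """In-memory index that groups records by their blocking key."""

    def __init__(self):
        self._blocks = defaultdict(list)

    def add(self, record):
        # Inserting is a single dictionary lookup plus an append.
        self._blocks[blocking_key(record)].append(record)

    def candidates(self, record):
        # Only records sharing the key are returned for detailed comparison.
        return list(self._blocks.get(blocking_key(record), []))


index = BlockIndex()
index.add({"surname": "Smith", "postcode": "SY1 1AA"})
index.add({"surname": "Smythe", "postcode": "SY1 2BB"})
index.add({"surname": "Jones", "postcode": "ME1 2ZZ"})

new_record = {"surname": "Smyth", "postcode": "SY1 1AB"}
print(len(index.candidates(new_record)), "candidates to compare in detail")
```

The trade-off is that a record with an error in one of the key fields lands in the wrong block and is never compared, so several different keys are usually needed in practice.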


    Anyway, keep pointing at systems out there, as I'm learning a lot about the types of matching, the data feeds involved, etc.


  • Moose
    replied
    Originally posted by DimPrawn
    Thanks Vetran, looks interesting.

    The main issue I have is the volume of data and the fact that the client spec needs to support continuous updates and inserts at rates of several per second. Nutters.
    Quick Address PAF is what we always used, but we did a lot of turd polishing before offering the file to PAF. How are you getting your addresses? If they're keyed, then you'll also need to do stuff like swear checks etc.

    p.s. Be careful how you do your swear checks or people in Scunthorpe will never get any
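
The Scunthorpe issue comes from flagging any text that merely contains a blocked word as a substring. Here is a minimal sketch of the difference between a naive substring check and a whole-word check. The two-word `BLOCKLIST` uses mild placeholder words and both function names are hypothetical.

```python
import re

# Hypothetical blocklist; mild placeholder words stand in for a real list.
BLOCKLIST = {"hell", "damn"}

# Match blocked words only when they stand alone as whole words.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(re.escape(w) for w in BLOCKLIST)) + r")\b",
    re.IGNORECASE,
)


def naive_flag(text):
    """Flags any text containing a blocked word as a substring."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)


def whole_word_flag(text):
    """Flags text only when a blocked word appears as a separate word."""
    return _PATTERN.search(text) is not None


for sample in ["Hellifield", "12 Shelley Road", "go to hell"]:
    print(sample, naive_flag(sample), whole_word_flag(sample))
```

Whole-word matching avoids rejecting legitimate place and street names, at the cost of missing words that have been deliberately run together.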


  • DimPrawn
    replied
    Thanks Vetran, looks interesting.

    The main issue I have is the volume of data and the fact that the client spec needs to support continuous updates and inserts at rates of several per second. Nutters.


  • vetran
    replied
    clean names & addresses

    Tried AFD?


    http://www.afd.co.uk/products.asp

    Refiner looked good last time I looked.

    They are one of the market leaders, support were very helpful and the internet product & client is fairly fast. Last contact with them was 3 years ago, so they may since have been taken over by alien lizards.

    Others like Hopewiser and PostcodeAnywhere are in this trade.


  • DimPrawn
    started a topic Data matching, merging and cleansing

    Data matching, merging and cleansing

    please

    Anyone know of commercial (price not really an issue) tools that perform automated, realtime matching, merging and cleansing of data based around persons and addresses?

    e.g. Robert J Smith, Rob Smith, R J Smythe, Bob John Smith etc

    The tool should cope with building matching rules based on a complex set of fields, give weightings to each field etc.

    For addresses, again intelligent matching. For example, transposed or erroneous digits in postcodes, misspelled street names etc, house number 4 and four, Salop and Shrops and Shropshire. Rules again would need to be tuneable so that, for instance, more weighting is given to postcode than street name.
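
To make the requirement concrete, here is a minimal sketch of weighted, field-by-field fuzzy matching using only the Python standard library. Everything specific in it is an assumption for illustration: the field names, the tiny lookup tables for nicknames, counties and number words, and the weights (chosen so that postcode counts for more than street name). It is not how any particular commercial tool works.

```python
from difflib import SequenceMatcher

# Hypothetical lookup tables; a real tool would ship far larger ones.
NICKNAMES = {"rob": "robert", "bob": "robert"}
COUNTIES = {"salop": "shropshire", "shrops": "shropshire"}
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4"}

# Hypothetical, tuneable weights that sum to 1.0.
WEIGHTS = {
    "forename": 0.15, "surname": 0.25, "house": 0.10,
    "street": 0.10, "county": 0.05, "postcode": 0.35,
}


def normalise(field, value):
    """Lower-case, collapse whitespace, then map known variants."""
    v = " ".join(value.lower().split())
    if field == "forename":
        return NICKNAMES.get(v, v)
    if field == "county":
        return COUNTIES.get(v, v)
    if field == "house":
        return NUMBER_WORDS.get(v, v)
    if field == "postcode":
        return v.replace(" ", "")
    return v


def match_score(rec_a, rec_b, weights=WEIGHTS):
    """Weighted sum of per-field similarity, between 0.0 and 1.0."""
    total = 0.0
    for field, weight in weights.items():
        a = normalise(field, rec_a.get(field, ""))
        b = normalise(field, rec_b.get(field, ""))
        if a and b:  # a field missing on either side contributes nothing
            total += weight * SequenceMatcher(None, a, b).ratio()
    return total


a = {"forename": "Bob", "surname": "Smith", "house": "four",
     "street": "High Street", "county": "Salop", "postcode": "SY1 1AA"}
b = {"forename": "Robert", "surname": "Smythe", "house": "4",
     "street": "High Stret", "county": "Shropshire", "postcode": "SY1 1AA"}
print(round(match_score(a, b), 3))
```

In use, a score above some tuned threshold would be treated as a match, and a middle band would be sent for manual review. `SequenceMatcher` is a generic string similarity measure and is only a stand-in here for the phonetic and address-aware comparisons a dedicated tool would use.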

    The databases involved are pretty huge (SQL Server 2000, 100 million rows +) and the matching and merging of data needs to be very fast as new data is added.

    Anyone know of tools that provide this level of intelligent data cleansing and consolidation?

    Again, forget price (e.g. £100K per license no issue at all).
    Last edited by DimPrawn; 23 November 2005, 17:56.
