Originally posted by Spacecadet
View Post
- Visitors can check out the Forum FAQ by clicking this link. You have to register before you can post: click the REGISTER link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. View our Forum Privacy Policy.
- Want to receive the latest contracting news and advice straight to your inbox? Sign up to the ContractorUK newsletter here. Every sign up will also be entered into a draw to WIN £100 Amazon vouchers!
Journey to SQL Authority
Collapse
X
-
You had a lookup table that is 30 million rows? Sounds like a design issue. Couldn't you use a join instead?"A life, Jimmy, you know what that is? It’s the s*** that happens while you’re waiting for moments that never come." -- Lester Freamon -
Sounds like a UK address table.Originally posted by Freamon View PostYou had a lookup table that is 30 million rows? Sounds like a design issue. Couldn't you use a join instead?Comment
-
it's not and no... although i might give the join another go tonight, just thought of another way of doing it.Originally posted by Freamon View PostYou had a lookup table that is 30 million rows? Sounds like a design issue. Couldn't you use a join instead?
Thanks
bingoOriginally posted by k2p2 View PostSounds like a UK address table.
problem with address data (at
least in this project) is that the incoming format is crap and doesn't join nicely to the PAF dataCoffee's for closersComment
-
I had this in a previous life. I built a neat pattern-matching tool to find likely dupesOriginally posted by Spacecadet View Postproblem with address data (at
least in this project) is that the incoming format is crap and doesn't join nicely to the PAF data
ah, halcyon days
(\__/)
(>'.'<)
("")("") Born to Drink. Forced to WorkComment
-
You'll have fun with that. Last time I did that I ended with a multi-stage process based on the accuracy of the individual records.Originally posted by Spacecadet View Post
bingo
problem with address data (at
least in this project) is that the incoming format is crap and doesn't join nicely to the PAF data
10% of it turned out to be total and utter crap which was surprising as they were all customers who had purchased goods via mail order.merely at clientco for the entertainmentComment
-
fuzzy logic. thats the answer, fuzzy logicOriginally posted by eek View PostYou'll have fun with that. Last time I did that I ended with a multi-stage process based on the accuracy of the individual records.
10% of it turned out to be total and utter crap which was surprising as they were all customers who had purchased goods via mail order.
(\__/)
(>'.'<)
("")("") Born to Drink. Forced to WorkComment
-
Fuzzy Logic doesn't cover the example of postcode given for Main Street, address given as High Row (which has a different postcode).Originally posted by EternalOptimist View Postfuzzy logic. thats the answer, fuzzy logic
merely at clientco for the entertainmentComment
-
yes it doesOriginally posted by eek View PostFuzzy Logic doesn't cover the example of postcode given for Main Street, address given as High Row (which has a different postcode).(\__/)
(>'.'<)
("")("") Born to Drink. Forced to WorkComment
-
That's pretty much the conclusion I'd got to as well, filter out the good, the not so good, the passable and then ignore the rest!Originally posted by eek View PostYou'll have fun with that. Last time I did that I ended with a multi-stage process based on the accuracy of the individual records.
10% of it turned out to be total and utter crap which was surprising as they were all customers who had purchased goods via mail order.Coffee's for closersComment
-
A lookup on a 30 million row table is definitely not a good idea. Lookup is designed to read the whole table into memory when the job starts up, and then access the data in memory for each row that passes through. If you read 30 million rows into memory your job will swap to disk a lot. I'd suggest using a join is definitely the way to go. In a real ETL tool you could also partition your data by the first half of the postcode to improve the join performance.Originally posted by Spacecadet View Postit's not and no... although i might give the join another go tonight, just thought of another way of doing it.
Thanks

Yep, I think most people have been through this pain at some point. Especially when it's addresses that have been manually keyed in with no sense-checking at the point of entry. There's specialist data cleansing tools for address data which may save you a lot of pain. Dataflux does a good job of it (amongst other things).Originally posted by Spacecadet View Postbingo
problem with address data (at
least in this project) is that the incoming format is crap and doesn't join nicely to the PAF data"A life, Jimmy, you know what that is? It’s the s*** that happens while you’re waiting for moments that never come." -- Lester FreamonComment
- Home
- News & Features
- First Timers
- IR35 / S660 / BN66
- Employee Benefit Trusts
- Agency Workers Regulations
- MSC Legislation
- Limited Companies
- Dividends
- Umbrella Company
- VAT / Flat Rate VAT
- Job News & Guides
- Money News & Guides
- Guide to Contracts
- Successful Contracting
- Contracting Overseas
- Contractor Calculators
- MVL
- Contractor Expenses
Advertisers
Contractor Services
CUK News
- Is your Director’s Loan Account (DLS) a target of HMRC’s closer look at close companies? Today 04:45
- Contractors, are you making any of the five big limited company bank account mistakes of 2026? Yesterday 05:51
- ‘Welcome’ increase in HMRC mileage rates for contractors using their own cars for work May 27 05:18
- King’s Speech 2026 including a welcome Late Payments Bill still leaves contractors short May 26 04:42
- Getting a mortgage when you're a contractor. The system wasn't built for you. Is that finally changing? May 22 06:11
- How deepfake AI contractors threaten umbrella company supply chains under JSL May 20 06:31
- Mileage rates review: Will the first AMAP rethink in 15 years benefit contractors? May 19 05:57
- What is a Forward Deployed Engineer (FDE), and are FDE jobs for IT contractors ripe? May 18 04:43
- IT contractor demand lunged towards growth in April 2026 May 13 04:48
- What does PGMOL’s win over HMRC mean for contractors? May 12 07:25


Comment