Originally posted by woohoo
My opinion was that master data with a finite match in the public arena, such as an address or company, would be a good use of AI: train it on the various abbreviations, format variants (internationally that is a doozy), character sets such as Cyrillic or Kana, etc., and it would just blow away most matches.
A brute-force cleansing I personally did on a real UK dataset one customer had got a 60-80% match (60% with rough matching and 80% after writing about 50 remap rules of the kind sketched below). This was a bit of a worry, as they owned all the properties!
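To be concrete about what I mean by remap rules, here is a minimal sketch in Python. The rules and the address are made up for illustration; a real set runs to dozens of rules and handles far more than abbreviations.

```python
# Hypothetical abbreviation remap rules applied before matching.
REMAP_RULES = {
    "rd.": "road", "rd": "road",
    "st.": "street", "st": "street",
    "ave.": "avenue", "ave": "avenue",
}

def normalise_address(raw: str) -> str:
    """Lower-case the address, drop comma noise and apply the remap rules."""
    tokens = raw.lower().replace(",", " ").split()
    return " ".join(REMAP_RULES.get(t, t) for t in tokens)

print(normalise_address("12 High St., CAMBRIDGE"))  # -> "12 high street cambridge"
```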
A similar match with one of the largest company information suppliers and lots of consultants, across a million or so customer records, got to about 70%. So only 300,000 records to fix manually, because they used European address formats or had Cyrillic. Using a well-trained AI with a confidence setting of 98% would probably reduce that to 60,000 records very quickly.
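Something like the sketch below is what I mean by a confidence setting: auto-accept anything the matcher scores at 98% or better and push the rest to a manual queue. It uses the standard library's difflib as a stand-in for a properly trained AI matcher, and the reference lookup is a hypothetical function.

```python
from difflib import SequenceMatcher

def match_confidence(candidate: str, reference: str) -> float:
    """Crude similarity score standing in for a trained matcher's confidence."""
    return SequenceMatcher(None, candidate.lower(), reference.lower()).ratio()

def triage(records, reference_lookup, threshold=0.98):
    """Split records into auto-accepted matches and a manual-review queue."""
    auto_accepted, manual_queue = [], []
    for rec in records:
        ref = reference_lookup(rec)  # hypothetical lookup against the reference data
        if ref and match_confidence(rec, ref) >= threshold:
            auto_accepted.append((rec, ref))
        else:
            manual_queue.append(rec)
    return auto_accepted, manual_queue
```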
But when you get to government data it's quickly in the billions once you add people and transaction data. IoT is the same.
Reference data is frequently a rule set: if OldCategoryA = 1 and OldCategoryB = 7 then the new category = V17. You need to document and explain that so retrospective reports make sense; just doing it with a black box is not on.
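A sketch of how that can stay explainable rather than black-box: keep a human-readable reason alongside each rule so a retrospective report can be traced back. The codes are just the ones from my example above and the reason text is invented.

```python
# Reference-data remap rules with an audit trail for retrospective reporting.
RULES = [
    {
        "when": {"OldCategoryA": 1, "OldCategoryB": 7},
        "then": {"NewCategory": "V17"},
        "reason": "A=1 with B=7 merged into V17 in the category restructure",
    },
]

def remap(record: dict) -> dict:
    """Apply the first matching rule and record why the value changed."""
    for rule in RULES:
        if all(record.get(k) == v for k, v in rule["when"].items()):
            return {**record, **rule["then"], "_audit": rule["reason"]}
    return {**record, "_audit": "no rule matched - left unchanged"}

print(remap({"OldCategoryA": 1, "OldCategoryB": 7}))
```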
I expect tools like these to be produced quickly for address brokers.
Now, data inside systems may not have the variance of obvious stupidity that produces the incorrect addresses. Also, working out what is wrong normally requires an SME to identify it, and after doing the data analysis you normally need to agree and document a fix before changing data. Maybe making the AI an SME-grade analyst first is a good step, so teach it about countries, states, currencies and languages so it can spot outliers easily, like someone in Abu Dhabi having a state called Cambridge.
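A sketch of that kind of SME check, with a deliberately tiny and incomplete country-to-state table just to show the outlier flagging; it only surfaces candidates for review, it does not change anything.

```python
# Illustrative (incomplete) reference table: which states/regions are valid per country.
VALID_STATES = {
    "United Arab Emirates": {"Abu Dhabi", "Dubai", "Sharjah"},
    "United Kingdom": {"England", "Scotland", "Wales", "Northern Ireland"},
}

def flag_outliers(records):
    """Yield records whose state is not valid for their country, for SME review."""
    for rec in records:
        allowed = VALID_STATES.get(rec.get("country"), set())
        if rec.get("state") not in allowed:
            yield rec

rows = [{"city": "Abu Dhabi", "country": "United Arab Emirates", "state": "Cambridge"}]
print(list(flag_outliers(rows)))  # flags the Abu Dhabi record with state "Cambridge"
```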
Once you have done address, company, country, etc., there is loads of unstructured data you can move on to.