MySQL Question - Contractor UK Bulletin Board

Visitors can check out the Forum FAQ by clicking this link. You have to register before you can post: click the REGISTER link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below. View our Forum Privacy Policy.
Want to receive the latest contracting news and advice straight to your inbox? Sign up to the ContractorUK newsletter here. Every sign up will also be entered into a draw to WIN £100 Amazon vouchers!

You are not logged in or you do not have permission to access this page. This could be due to one of several reasons:

You are not logged in. If you are already registered, fill in the form below to log in, or follow the "Sign Up" link to register a new account.
You may not have sufficient privileges to access this page. Are you trying to edit someone else's post, access administrative features or some other privileged system?
If you are trying to post, the administrator may have disabled your account, or it may be awaiting activation.

Sysman replied

20 March 2011, 13:28
Originally posted by Freamon View Post

I know there's commercial Data Masking tools you can get to anonymize customer data (e.g. for dev/test databases) - Informatica have one - but I can't instantly find any free ones that seem worth looking at. Good project for someone to code an open source one.

Data anonymisation for testing purposes was a project a former client had a lot of difficulty with. And yes, this was a bank.

I had a play around with a few freebies several months ago, but their data was horribly Yank. Hands up anyone who has ever met a Zacchary.
Leave a comment:
Freamon replied

20 March 2011, 00:02
Originally posted by Cliphead View Post

No real problem to solve just setting up a duplicate system for a new dev to work on but wanted to strip out all the real customer data.

Thanks to the excellent pointers I got the job done with minimum of fuss and nothing broken.

I know there's commercial Data Masking tools you can get to anonymize customer data (e.g. for dev/test databases) - Informatica have one - but I can't instantly find any free ones that seem worth looking at. Good project for someone to code an open source one.
Leave a comment:
Cliphead replied

19 March 2011, 17:15
Originally posted by Freamon View Post

I suppose it's worth asking why you want to do this?

There's probably a better way to solve the underlying problem that you're trying to solve.

No real problem to solve just setting up a duplicate system for a new dev to work on but wanted to strip out all the real customer data.

Thanks to the excellent pointers I got the job done with minimum of fuss and nothing broken.
Leave a comment:
Sysman replied

16 March 2011, 06:40
Originally posted by NickFitz View Post

I've sometimes thought it would be useful to create a storage engine that returns result sets using a randomly-chosen way of ordering: each time you run a query without an ORDER BY clause it just picks one of order-of-insertion, reverse-order-of-insertion, every-other-record-by-order-of-insertion-followed-by-the-rest-in-reverse-order-of-insertion-permuted-according-to-the-fibonacci-series... and of course randomly-ordered, along with lots of others. It would be a useful training tool, reminding people that if they want things ordered a particular way, they need to say so. I've also thought of doing the same thing with the "for-each" construct in XSLT - too many people think it operates like a loop, but of course there's no such thing as a loop (which implies a temporal sequence of events) in declarative languages like SQL or XSLT. It all happens outside time

Now you're taking me back to a database I learnt many years ago, when hierarchical rather than relational databases were the norm. It constructed a (hopefully unique*) hash of the primary key and used that to determine the location of the data. This was done for performance and avoided ploughing through index levels on data lookups. At this point my memory gets dodgy, but I do recall sorts on a mainframe being blazingly fast.

* logic was there to cope with non-unique hashes, but of course these would incur a performance penalty with extra I/O.
Leave a comment:
NickFitz replied

16 March 2011, 03:04
Originally posted by Cliphead View Post

I have several tables containing over 20k rows. Is there a quick way to delete all but the first 20 rows in each of the tables?

I agree with:

Originally posted by Freamon View Post

I suppose it's worth asking why you want to do this?

There's probably a better way to solve the underlying problem that you're trying to solve.

if only because tables in relational databases are inherently unordered (they're sets), and therefore the concept of "the first 20 rows" doesn't apply: it's like referring to "the first twenty places in Berkshire", in that you need to define what criteria are used to determine where comes before where once all the "places in Berkshire" have been identified.

When you do "SELECT * FROM doobry" the database might well return the records in order of insertion, but that's merely an artefact of the implementation, and must not be relied on. It's entirely possible that the implementation (in MySQL's case, the implementation of the specific storage engine used for that table) could choose to rearrange the way it's stored the contents of the table (maybe to optimise disk access once the table grows beyond a certain size, or some such reason), and then the same query could return the results in a different order. This is completely legitimate behaviour, and a classic example of why one should program to the abstraction's defined behaviour rather than relying on the implementation's circumstantial behaviour leaking to a higher level.

I've sometimes thought it would be useful to create a storage engine that returns result sets using a randomly-chosen way of ordering: each time you run a query without an ORDER BY clause it just picks one of order-of-insertion, reverse-order-of-insertion, every-other-record-by-order-of-insertion-followed-by-the-rest-in-reverse-order-of-insertion-permuted-according-to-the-fibonacci-series... and of course randomly-ordered, along with lots of others. It would be a useful training tool, reminding people that if they want things ordered a particular way, they need to say so. I've also thought of doing the same thing with the "for-each" construct in XSLT - too many people think it operates like a loop, but of course there's no such thing as a loop (which implies a temporal sequence of events) in declarative languages like SQL or XSLT. It all happens outside time

Last edited by NickFitz; 16 March 2011, 03:21.
Leave a comment:
Freamon replied

15 March 2011, 22:04
I suppose it's worth asking why you want to do this?

There's probably a better way to solve the underlying problem that you're trying to solve.
Leave a comment:
TheFaQQer replied

10 March 2011, 16:24
Originally posted by administrator View Post

How many tables? Do the tables have an ID column? As other posters have said you can simply delete where ID > 20 so

DELETE from TABLE where ID > 20;

That only works if ID is sequential with no gaps though.
Leave a comment:
wobbegong replied

10 March 2011, 16:09
Load the MySQL ODBC driver. Within MS-Access, link to the tables, open them and highlight the rows you don't want, hit 'Del'.
Leave a comment:
yorkshireman replied

10 March 2011, 16:06
Originally posted by Cliphead View Post

I have several tables containing over 20k rows. Is there a quick way to delete all but the first 20 rows in each of the tables?

Use phpmyadmin or Mysql Workbench and select/delete with your mouse.
Leave a comment:
2BIT replied

10 March 2011, 14:35
Originally posted by Cliphead View Post

I'm clueless when it comes to sql

dunno mysql but in Oracle you could use the rownum, mysql prob has something similar,

or perhaps SELECT * FROM TABLE WHERE NOT IN (SELECT TOP 20 * FROM TABLE)
Leave a comment:
administrator replied

10 March 2011, 13:58
How many tables? Do the tables have an ID column? As other posters have said you can simply delete where ID > 20 so

DELETE from TABLE where ID > 20;
Leave a comment:
lilelvis2000 replied

10 March 2011, 13:01
Good stuff.
Leave a comment:
eek replied

10 March 2011, 10:13
Originally posted by Cliphead View Post

I have several tables containing over 20k rows. Is there a quick way to delete all but the first 20 rows in each of the tables?

well this will give you the first 20 rows

SELECT tableid FROM {table} ORDER BY {row} asc LIMIT 20

so something like will work
delete from table where tableid not in (
SELECT tableid FROM {table} ORDER BY {row} asc LIMIT 20)
Leave a comment:
Cliphead replied

10 March 2011, 10:03
Originally posted by FiveTimes View Post

Can you not use the order by limit 20 in the delete statement ?

I'm clueless when it comes to sql
Leave a comment:
FiveTimes replied

10 March 2011, 10:01
Can you not use the order by limit 20 in the delete statement ?
Leave a comment: