Good people,
I have observed that the MS Access ORDER BY clause sorts records in a non-ASCII way. This is different from MySQL, which generally follows ASCII order. Let me give you a little background so you understand why this is a problem for me.
Back in 2010, I wrote a generic database transaction logger. The goal was to detect changes occurring on (theoretically) any SQL database and log them in another database. To do this, I use a shadow MySQL database where I maintain a copy of the entire source database. The shadow database is designed using the EAV model so that it is agnostic to the source database schema.
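To make the shape concrete, the shadow tables follow roughly this pattern (a simplified sketch only; the table and column names here are illustrative, not my exact schema):

    -- Illustrative EAV layout for the shadow database: one row per
    -- (source table, source record, source column) combination.
    CREATE TABLE shadow_values (
        source_table VARCHAR(64)  NOT NULL,  -- table name in the source DB
        entity_pk    VARCHAR(255) NOT NULL,  -- primary key of the source record, as text
        attribute    VARCHAR(64)  NOT NULL,  -- source column name
        value        TEXT NULL,              -- column value, serialized as text
        PRIMARY KEY (source_table, entity_pk, attribute)
    );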
Every once in a while, I read out of both the source and shadow databases, order the records based on their primary keys and format the records to correspond one-to-one. Then, I do a full database compare using a merge algorithm.
This solution worked fine until last week, when a user set it up against an Access database with string primary keys that are not always alphanumeric. All of a sudden, the software started logging ghost transactions that had not happened on the source database.
On closer examination, I found out that MS Access orders non-alphanumeric characters in a fashion different from MySQL. As such, my merge algorithm, which assumes similar sort order for both source and shadow records, started to fail.
Now, I have figured out a way I could tweak my software to "cure" such primary keys before using them, but it would help a great deal if I knew precisely what the nature of MS Access's ordering scheme is. Any ideas will be highly appreciated.
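To illustrate the kind of "cure" I mean (just a sketch; shadow_records and pk are placeholder names): on the MySQL side I can force a byte-wise order that does not depend on collation, and reproduce the same byte-wise comparison for the Access rows in application code before the merge.

    -- Force a deterministic, byte-wise sort on the MySQL shadow side.
    -- The Access rows would be sorted by the same byte-wise rule in
    -- application code, so both inputs to the merge agree on the order.
    SELECT pk
    FROM shadow_records
    ORDER BY CAST(pk AS BINARY);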
PS: Let me know if there's anything I need to clarify. I am trying to avoid typing too much of what may not be useful.
I had a difficult time with this a few years ago. I'm sorry I didn't retain the solution, but it used VBA and it was not concise or elegant.
I opened the tables as DAO recordsets, advanced through the records, and used the strcomp() function to compare keys. I experimented a lot with the binary/text option of strcomp() and I believe it was finally necessary to insert an error-handling component!
I have been a lurking visitor here for years, but I think this is my first time asking a question, so here goes:
Is there a way in Teradata SQL Assistant 16.20 to find out how often specific table columns are used in queries, without being a DB admin? Basically, we have a rather large table that keeps growing in number of columns, and we would like to deprecate the ones that are not being used by anyone, but we can't very well ask over 100 users which columns they use. My team "manages" this table to the extent of creating it and populating the data, but we do not have full admin rights to the database. Any solution that would require full DB access isn't really an option without submitting a ticket, which I can do, but I would rather find a solution I can implement on my own.
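In case it helps frame answers: the only direction I have thought of so far is the query log, assuming DBQL logging is enabled and I could be granted SELECT on the DBC.QryLogSqlV view - but I have not been able to verify any of that without admin rights, so treat the names below as assumptions.

    -- Rough sketch only: count logged queries whose SQL text mentions a column.
    -- Assumes DBQL is enabled and SELECT access on DBC.QryLogSqlV; the view and
    -- column names may be restricted or differ at a given site.
    SELECT COUNT(*) AS query_count
    FROM DBC.QryLogSqlV
    WHERE UPPER(SqlTextInfo) LIKE '%MY_COLUMN_NAME%';  -- MY_COLUMN_NAME is a placeholder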
Thanks in advance!
I need to implement custom fields in my database so that every user can add any fields they want to their forms/entities.
The user should be able to filter and/or sort their data by any custom field.
I want to work with MySQL because the rest of my data is a very good fit for SQL. So, unless you have a great idea, SQL will be preferred over NoSQL.
We have thought about a few solutions:
JSON field - Great for a dynamic schema. Can be filtered and sorted. The problem is that it is slower than regular columns.
Dynamic indexes could solve that, but is it too risky to add indexes dynamically? (One possible shape of such an index is sketched below.)
Key-value table - A simple solution but a really slow one. You can't index it properly and the queries are awful.
Static placeholder columns - Create N columns and hold a map of each field to its placeholder. A good solution in terms of performance, but it makes the DB unreadable and the number of columns is limited.
Any thoughts on how to improve any of these solutions, or any ideas for a new solution?
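For the JSON option, one concrete shape the indexing could take is a generated column over one JSON path plus a normal index on it (MySQL 5.7+). This is just an illustrative sketch, and all names are made up:

    -- Sketch: JSON column for the custom fields, plus a stored generated
    -- column that exposes one field so a normal index can be used on it.
    CREATE TABLE entities (
        id            BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        custom_fields JSON NOT NULL,
        cf_priority   INT AS (CAST(custom_fields->>'$.priority' AS SIGNED)) STORED,
        KEY idx_cf_priority (cf_priority)
    );

    -- Filtering/sorting on the extracted field can then use the index:
    -- SELECT * FROM entities WHERE cf_priority = 3 ORDER BY cf_priority;

The obvious limitation is that each custom field we want indexed still needs its own generated column, which brings back some of the schema-change problem.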
As many of the commenters have remarked, there is no easy answer to this question. Depending on which trade-offs you're willing to make, I think the JSON solution is neatest - it's "native" to MySQL, so easiest to explain and understand.
However, given that you write that the columns are specified only at set up time, by technically proficient people, you could, of course, have the set-up process include an "alter table" statement to add new columns. Your database access code and all the associated view logic would then need to be configurable too; it's definitely non-trivial.
However...it's a proven solution. Magento and Drupal, for instance, have admin screens for adding attributes to the business entities, which in turn adds columns to the relational database.
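As a sketch of what such a set-up step could generate when a new custom field is added (column and index names are made up):

    -- Example of a statement the set-up process could emit when an admin
    -- adds a custom "due date" field (names are made up):
    ALTER TABLE entities
        ADD COLUMN cf_due_date DATE NULL,
        ADD INDEX idx_cf_due_date (cf_due_date);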
I am pretty excited about the new MySQL XML functions.
Now I can finally embed something like "object-oriented" documents in my old-school relational database.
For an example use case, consider a user who signs up at your website using Facebook Connect.
You can fetch an object for the user using the Graph API and get nice information. This information, however, can vary vastly. Some fields may or may not be set, some may be added over time, and so on.
Well, if you are just interested in very specific fields (for example friend relations, gender, movies...), you can project them into your relational database schema.
However, using the XML functions you could store the whole object inside a field, and then your different models can access the data using the ExtractValue function. You can store everything right away without needing to worry about what you will need later.
But what will the performance be?
For example, I have a table with 50,000 entries which represent users.
I have an enum field that states "male", "female" (or various other genders, to be politically correct).
The performance of, for example, fetching all males will be very fast.
But what about something like WHERE ExtractValue(userdata, '/gender') = 'male'?
How will the performance vary if the object gets bigger?
Can I maybe somehow put an index on specified XPath selections?
How do field types work together with these functions and their performance? Varchar/blob?
Do I need fulltext indexes?
To sum up my question:
MySQL XML functions look great. And I am sure they are really great if you just want to store structured data that you fetch and analyze further in your application.
But how will they hold up in procedures where internal scans/sorting/comparisons/calculations are performed on them?
Can MySQL replace document-oriented databases like CouchDB/Sesame?
What are the gains and trade-offs of XML functions?
How and why are they better/worse than a dynamic application that stores various data as attributes?
For example, a key/value table with an XPath as the key and the extracted value as the value, linked to the document entity (a sketch of what I mean is below).
Has anyone had any other experiences with this, or noticed anything worth mentioning?
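To make that last alternative concrete, I mean something like this (all names are made up):

    -- Key/value alternative: one row per xpath per stored document.
    CREATE TABLE document_attributes (
        document_id BIGINT UNSIGNED NOT NULL,  -- refers to the row holding the raw XML
        xpath       VARCHAR(255)    NOT NULL,  -- e.g. '/user/gender'
        value       TEXT NULL,
        PRIMARY KEY (document_id, xpath),
        KEY idx_xpath_value (xpath, value(64))
    );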
I tend to make comments similar to Pekka's, but I think the reason we cannot laugh this off is your statement "This information however can vary vastly." That means it is not realistic to plan to parse it all and project it into the database.
I cannot answer all of your questions, but I can answer some of them.
Most notably, I cannot tell you about performance on MySQL. I have seen this in SQL Server and tested it, and found that SQL Server performs in-memory XML extractions very slowly; to me it seemed as if it were reading from disk, though that is a bit of an exaggeration. Others may dispute this, but that is what I found.
"Can Mysql replace document oriented databases like CouchDB/Sesame?" This question is a bit over-broad but in your case using MySQL lets you keep ACID compliance for these XML chunks, assuming you are using InnoDB, which cannot be said automatically for some of those document oriented databases.
"How and why are they better/worse than a dynamic application that stores various data as attributes?" I think this is really a matter of style. You are given XML chunks that are (presumably) documented and MySQL can navigate them. If you just keep them as-such you save a step. What would be gained by converting them to something else?
The MySQL docs suggest that the XML file will go into a clob field. Performance may suffer on larger docs. Perhaps then you will identify sub-documents that you want to regularly break out and put into a child table.
Along these same lines, if there are particular sub-docs you know you will want to know about, you can make a child table, "HasDocs", do a little pre-processing, and populate it with the names of sub-docs and their counts. This would make for faster statistical analysis and also make it faster to find docs that have certain sub-docs.
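A sketch of what that child table could look like (names are illustrative, not a prescription):

    -- Pre-processed child table: one row per (document, sub-doc name),
    -- populated by a small pre-processing step when the XML is stored.
    CREATE TABLE HasDocs (
        document_id  BIGINT UNSIGNED NOT NULL,  -- refers to the row holding the XML
        subdoc_name  VARCHAR(64)     NOT NULL,  -- e.g. 'friends', 'movies'
        subdoc_count INT             NOT NULL DEFAULT 0,
        PRIMARY KEY (document_id, subdoc_name),
        KEY idx_subdoc (subdoc_name)
    );

    -- Finding docs that contain a given sub-doc no longer needs to parse XML:
    -- SELECT document_id FROM HasDocs WHERE subdoc_name = 'movies' AND subdoc_count > 0;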
Wish I could say more, hope this helps.
We downloaded Red Gate's Toolbelt today, in order to automate some database tasks that take a long time in our company.
The first one came up with a 15 GB database we have, with a lot of indexes, constraints and also several triggers. We want this database to be migrated exactly as it is - schema, all the data, triggers, etc. - to a new DB, with the idea of reducing its size and also getting better performance by hiding all the mistakes committed in the past. Unfortunately, this was the first customer-release DB of one of our products, and we used it to test a lot of things that did not always work very well. We are sure that if we do something like this, we will get more than 50% of the size back on our disk.
Can one Toolbelt tool, or some of them combined, be useful for this? If the answer is no, is there another tool available that is useful for this task?
One common way this can happen is if you are not selecting all your tables to be included in the compare. For example, you may have selected a child table and not the parent table. This could lead to a FK error like you describe.
I was wondering if somebody knows an elegant solution to the following:
Suppose I have a table that holds orders, with a bunch of data. I'm at 1M records, and searches are beginning to take time. So I want to speed things up by archiving data that is more than 3 years old - saving it into a table called orders-archive and then purging those rows from the orders table. If we need to research something, or a customer wants to pull older information, they still can, but 99% of the lookups are done on orders no older than a year and a half, so there is no reason to keep looking through the older data all the time. These move-and-purge operations can then be cron'd to run on a weekly basis. I have already done some tests and I know that I will cut my search times by about a factor of 4. So far so good, right?
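Roughly, each weekly cron run would do something like this per table (just a sketch; I am using orders_archive and order_date as placeholders for my actual names):

    -- Copy rows older than three years into the archive table, then purge them.
    -- In practice this would run inside a transaction (or in batches) per table.
    INSERT INTO orders_archive
    SELECT * FROM orders
    WHERE order_date < DATE_SUB(CURDATE(), INTERVAL 3 YEAR);

    DELETE FROM orders
    WHERE order_date < DATE_SUB(CURDATE(), INTERVAL 3 YEAR);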
However, I was thinking about how to implement the older archival lookups, and the only reasonable thing I can think of is some sort of if-else: if not found in orders, do a search in orders-archive. The trouble is, I have about 20 tables that I want to archive and god knows how many searches/finds are done throughout the code, which I don't want to modify. So I was wondering whether there is an elegant Rails-way solution to this problem, by extending a model somehow? Has anyone dealt with a similar case before?
Thank you.
MySQL 5.x can handle this natively using Horizontal Partitioning.
The basic idea behind partitioning is that you tell the database to store records in a certain range in a separate file. You can still query against all the records, but as long as you're querying only current records, the database engine won't be encumbered with all of the archived records.
You can use the order_date column or something similar as the cutoff for your partitions. This is the elegant solution.
Overview of Partitioning in MySQL
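For example, something along these lines (a sketch; note that MySQL requires the partitioning column to be part of every unique key, so the primary key here includes order_date):

    -- Range-partition the orders table by year of order_date.
    CREATE TABLE orders (
        id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
        order_date DATE            NOT NULL,
        -- ... other columns ...
        PRIMARY KEY (id, order_date)
    )
    PARTITION BY RANGE (YEAR(order_date)) (
        PARTITION p2008 VALUES LESS THAN (2009),
        PARTITION p2009 VALUES LESS THAN (2010),
        PARTITION p2010 VALUES LESS THAN (2011),
        PARTITION pmax  VALUES LESS THAN MAXVALUE
    );

Queries that filter on order_date then only touch the relevant partitions (partition pruning), which is what keeps the current-data lookups fast.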
Otherwise, your if/else idea with dynamically generated queries seems about right. You can add year numbers after the archival tables and use reflection to build a list of tables, then have at it.