MSAccess: SELECT INTO NewTable cuts long texts

When I use (in MS Access 2003 SP3):
SELECT * INTO NewTable FROM SomeQuery;
MEMO fields are converted to TEXT fields (which are limited to 255 characters), so longer texts are cut.
The output of the query itself is fine and not truncated; the text is cut only in the new table that is created.
Update: I have managed to narrow the problem down to the IIF statement in my query.
When I remove the IIF, the new table contains the MEMO field, but with the IIF the same field is created as TEXT. The weird thing is that the query output shows the long strings in full, even when the IIF is used. The text is cut only when it is copied to the new table by the INTO clause.
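For concreteness, the failing pattern looks something like this (the field names here are hypothetical):
SELECT ID,
       IIf(UseNoteA, NoteA, NoteB) AS CombinedNote
INTO NewTable
FROM SomeQuery;
-- NoteA and NoteB are MEMO fields, and the query's own output shows
-- the full strings; but in NewTable, CombinedNote is created as a
-- TEXT field and truncated to 255 characters.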
Do you know of any problems that IIF may cause to MEMO fields?
Thank you for your answers.

There are known workarounds for avoiding truncation of Memo fields.
In your case, the truncation may be caused by the query's Property Sheet having "Unique Values" set to Yes (which forces comparison of Memo fields, triggering the truncation), or by the Format property of the field, e.g. forcing display in upper case (>) or lower case (<).
Which version of Access are you using, and what format are you saving your document into? In an Access 2000-compatible format, the cells are in Excel 5.0/95 format: 255 characters max.
Do you have any other (non-Memo) field with a lengthy value you could try to select, just to see if it also gets truncated?
If the output is fine, but the export in a new table does truncate the Memo fields, could you check the following:
In the export dialog, under Advanced, even though it looks like you can only include the name, if you click very carefully to expand the columns that don't appear, you can change the data type to Memo.

I have just tested this in A2K3 with a make-table query and with appending a memo field. I had no difficulty getting it to append full data to a memo field.
Perhaps you could post the SQL for the query you're using to populate your table. If you're sorting (or grouping) on the memo fields that could do it, because sorting on memo fields is supposed to truncate them to 255 characters (though in the test I just ran on A2K3 SP3 with all the latest post-SP3 patches, mere sorting doesn't truncate but GROUP BY does).
Another issue is that it's usually not advisable to have a Make Table query in a production app. Anything that's happening repeatedly enough that you programmed for it really ought to be appending to a pre-defined table, instead of replacing an existing table. For one, a pre-defined table can have indexes defined on it, which makes it much more efficient to use after it's been populated. Sure, you have to delete existing records before appending your new data, but the benefit is pretty big in terms of indexing. And, yes, you could redefine indexes each time you run your Make Table query, but, well, if it's too much trouble to delete existing data, isn't it even more work to add indexes to the newly-created table?
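A minimal sketch of that append pattern, assuming a pre-defined target table and hypothetical names:
DELETE * FROM ResultsTable;

INSERT INTO ResultsTable (ID, LongNote)
SELECT ID, LongNote
FROM SomeQuery;
-- ResultsTable is created once, with LongNote defined as a Memo field
-- and any useful indexes already in place, so the append cannot
-- silently change the column's data type.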
I hardly ever use Make Table queries except when I'm manipulating data that I'm massaging for some other purpose. It's not always predictable what data types you'll end up with in a target table because it is partly dependent on the data in your source table. That alone makes it inadvisable to use them in most situations.

SP3 of Access 2003 is notorious for bugs; this may be related to that. There is a hotfix:
http://support.microsoft.com/default.aspx/kb/945674

Related

Is there a way to get only the numeric elements of a string in mysql?

I'm looking to make it easier for clients to search for things like phone/mobile/fax numbers. For that to happen, I want to strip both the search value and the relevant columns in my database of any non-numeric characters before comparing them. I'm using these functions to get only the numeric elements of the strings in MySQL, but they slow my queries down to a crawl when I use them.
Is there any way to do it without blowing my run times sky high?
The reason your query times are exploding is that using such functions prevents MySQL from using any index. Since you are not searching directly on a field, but on the output of a function, there is no way MySQL can use an index to execute the query.
This is in addition to the fact that the function output has to be computed for each record.
The best way around these run times, if you have access and permission to do so, is to add a new column holding the stripped content you're filtering on. Add insert/update triggers to fill the column with the stripped values, run a script that backfills the field once for all existing records, and add an index on the new column. Then, in your application, use the new column when searching for a phone number. The downsides are table schema alterations and added code in the business logic and/or data abstraction layer.
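A minimal sketch of that approach, with hypothetical names throughout: a customers table with a phone column, and a user-defined STRIP_NON_NUMERIC() function standing in for whichever stripping function the question refers to:
ALTER TABLE customers ADD COLUMN phone_digits VARCHAR(32);

-- one-time backfill of existing rows
UPDATE customers SET phone_digits = STRIP_NON_NUMERIC(phone);

-- keep the column in sync on writes
CREATE TRIGGER customers_digits_ins BEFORE INSERT ON customers
FOR EACH ROW SET NEW.phone_digits = STRIP_NON_NUMERIC(NEW.phone);
CREATE TRIGGER customers_digits_upd BEFORE UPDATE ON customers
FOR EACH ROW SET NEW.phone_digits = STRIP_NON_NUMERIC(NEW.phone);

CREATE INDEX idx_customers_phone_digits ON customers (phone_digits);

-- searches can now use the index; strip the search value in the application
SELECT * FROM customers WHERE phone_digits = '5551234567';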

MySQL varchar(2000) vs text?

I need to store on average a paragraph of text, which would be about ~800 characters in the database. In some rare cases it may go up to 2000-2500~ characters. I've read the manual and I know there are many of these questions already, but I've read over 10+ questions on stackoverflow and I still find it a bit hard to figure out whether I should simply use text or something like varchar(2000). Half seem to say use varchar, while the other half say text. Some people say always use text if you have more than 255 characters (yes, this was after 5.0.3, which allowed varchar up to 65k). But then I thought to myself: if I were to use text every time the characters were over 255, then why did MySQL bother increasing the size at all, if that was always the best option?
They both have a variable size in storage, I've read, so would there be no difference in my situation? I was personally leaning towards varchar(2000), but then I read that varchar stores the data inline while text doesn't. Does this mean that if I constantly select this column, storing the data as varchar would be better, and conversely, if I rarely select this column, using text would be better? If that is true, I think I would now choose the text column, as I won't be selecting this column most of the times I run a query on the table. If it matters, this table is also frequently joined to (though the column won't be selected then); would that further the benefit of using text?
Are my assumptions correct that I should go with text in this case?
When a table has TEXT or BLOB columns, the table can't be stored in memory. This means every query (one that doesn't hit the cache) has to access the file system, which is orders of magnitude slower than memory.
Therefore you should store this TEXT column in a separate table which is only accessed when you actually need it. This way the original table can be stored in memory and will be much faster.
Think of it as separating the data into one "memory table" and one "file table". The reason for doing this is to avoid accessing the filesystem except when necessary (i.e. only when you need the text).
You don't earn anything by storing the text in multiple tables. You still have to access the file system.
Sorry, what I meant was, for example, a forum script: in the posts table they might be storing 20 columns of post data, and they also store the actual post as a text field in the same table. So that post column should be separated out?
Yes.
It seems weird to have a table called post, but the actual post isn't stored there, maybe in another table called "actual_post", not sure lol.
You can try (posts, post_text) or (post_details, posts) or something like that.
I have a tags table that has just three fields: tag_id, tag, and description. So that description column should also be separated out? So I need a tags table and a tags_description table just to store 3 columns?
If the description is a TEXT column and you run queries against this table that don't need the description, it would certainly be preferable.
I think you summarized it well. Another thing you could consider is just moving the "text" to another table... and join back to the master record. That way every time you are actually using the master table, that extra data of where the "text" is isn't even taking up space in the master record. When you need it you can join to that table. This way you can store it as a varchar just in case you want to do something like " where text like... "
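A minimal sketch of that split, with hypothetical table and column names:
CREATE TABLE posts (
    post_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    author_id INT UNSIGNED NOT NULL,
    created_at DATETIME NOT NULL
    -- ...the other fixed-size metadata columns...
);

CREATE TABLE post_text (
    post_id INT UNSIGNED PRIMARY KEY,
    body TEXT NOT NULL,
    FOREIGN KEY (post_id) REFERENCES posts (post_id)
);

-- listing queries touch only the narrow posts table; join to
-- post_text only when the body is actually needed:
SELECT p.post_id, pt.body
FROM posts p
JOIN post_text pt ON pt.post_id = p.post_id
WHERE p.post_id = 42;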

Calculated columns in Access 2003 have null value when inserting into new table

I have a make table query where some of the columns are calculated. An example of how one of those columns looks is as follows:
SQFTCost: (([SUPPLY_MASTER]![LAST_COST]+[SUPPLY_MASTER]![FREIGHT_COST])/[SUPPLY_MASTER]![SQFT_PER_CTN])
In this case, LAST_COST is a decimal with a precision of 9 and a scale of 3, FREIGHT_COST is a decimal with a precision of 8 and a scale of 3, and SQFT_PER_CTN is a decimal with a precision of 7 and a scale of 3.
Whenever I run the make table query, that column and all the others like it are filled with nulls. I know that they are actually null, because I tested that in a routine that I wrote.
However, if I change the query to a SELECT query, all is well. The values are correct.
Does anyone have any idea what can be done to fix this? I am using Access 2003.
Just a few suggestions:
Try adding a CLng() or something equivalent in front of your expressions, to force a well-defined data type.
I avoid Make Table queries, preferring Append queries. Just make a "template" table that is properly set up, and use a copy of it with Append queries. It's the only way to have a clean design.
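A minimal sketch of the first suggestion; since the result here is fractional, CDbl() (or CCur() for currency values) would be the natural equivalent rather than CLng():
SQFTCost: CDbl(([SUPPLY_MASTER]![LAST_COST]+[SUPPLY_MASTER]![FREIGHT_COST])/[SUPPLY_MASTER]![SQFT_PER_CTN])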
I have come across the correct answer to this issue. It is the result of a sometimes flaky ODBC driver that we are using here. Our data rests in COBOL files that are in the Vision file format. The trouble is coming from using fields that are defined in the xfds as REDEFINES of other fields in the file. Using the original field fixes the issue.

How to search for rows containing a substring?

If I store the contents of an HTML TEXTAREA in my ODBC database each time the user submits a form, what's the SELECT statement to retrieve 1) all rows which contain a given substring and 2) all rows which don't (and is the search case-sensitive)?
Edit: if LIKE "%SUBSTRING%" is going to be slow, would it be better to get everything and sort it out in PHP?
Well, you can always try WHERE textcolumn LIKE "%SUBSTRING%" - but this is guaranteed to be pretty slow, as the leading wildcard means the query can't use an index.
It depends on the field type - a textarea usually won't be saved as VARCHAR, but rather as (a kind of) TEXT field, so you can use the MATCH AGAINST operator.
To get the columns that don't match, simply put a NOT in front of the like: WHERE textcolumn NOT LIKE "%SUBSTRING%".
Whether the search is case-sensitive or not depends on how you store the data, especially what COLLATION you use. By default, the search will be case-insensitive.
Updated answer to reflect question update:
Doing WHERE field LIKE "%value%" is slower than WHERE field LIKE "value%" when the column field has an index, but it is still considerably faster than getting all values and having your application filter them. Both scenarios:
1/ If you do SELECT field FROM table WHERE field LIKE "%value%", MySQL will scan the entire table and only send the rows containing "value".
2/ If you do SELECT field FROM table and then have your application (in your case PHP) filter only the rows with "value" in them, MySQL will also scan the entire table, but it will send all the rows to PHP, which then has to do additional work. This is much slower than case #1.
Solution: Please do use the WHERE clause, and use EXPLAIN to see the performance.
Info on MySQL's full-text search. This is restricted to MyISAM tables, so it may not be suitable if you want to use a different table type.
http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
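A minimal sketch of the full-text approach, assuming a MyISAM table named messages with a TEXT column named body (both names hypothetical):
CREATE FULLTEXT INDEX ft_body ON messages (body);

-- rows containing the search term:
SELECT id, body FROM messages WHERE MATCH (body) AGAINST ('searchterm');

-- rows NOT containing it:
SELECT id, body FROM messages WHERE NOT MATCH (body) AGAINST ('searchterm');
Note that full-text search matches whole words rather than arbitrary substrings, so it is not a drop-in replacement for LIKE "%SUBSTRING%".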
Even if WHERE textcolumn LIKE "%SUBSTRING%" is going to be slow, I think it is probably better to let the Database handle it rather than have PHP handle it. If it is possible to restrict searches by some other criteria (date range, user, etc) then you may find the substring search is OK (ish).
If you are searching for whole words, you could pull all the individual words out into a separate table and use that to restrict the substring search. (So when searching for "my search string", you look for the longest word, "search", and only do the substring search on records containing the word "search".)
I simply use SELECT ColumnName1, ColumnName2, ... FROM TableName WHERE LOCATE(substr, ColumnNameX) <> 0
to get rows whose ColumnNameX contains the substring.
Replace <> with = to get rows NOT containing the substring.

How do I search part of a column?

I have a MySQL table containing 40 million records that is being populated by a process over which I have no control. Data is added only once every month. This table needs to be searchable by the Name column, but the Name column contains the full name in the format 'Last First Middle'.
In the sphinx.conf, I have
sql_query = SELECT Id, OwnersName,
substring_index(substring_index(OwnersName,' ',2),' ',-1) as firstname,
substring_index(OwnersName,' ',2) as lastname
FROM table1
How do I use Sphinx search to search by firstname and/or lastname? For example, I would like to be able to search for 'Smith' in only the first name.
Per-row functions in SQL queries are always a bad idea for tables that may grow large. If you want to search on part of a column, it should be extracted out to its own column and indexed.
I would suggest, if you have power over the schema (as opposed to the population process), adding new columns called OwnersFirstName and OwnersLastName, along with an update/insert trigger which extracts the relevant information from OwnersName and populates the new columns appropriately.
This means the expense of figuring out the first name is only done when a row is changed, not every single time you run your query. That is the right time to do it.
Then your queries become blindingly fast. And, yes, this breaks 3NF, but most people don't realize that it's okay to do that for performance reasons, provided you understand the consequences. And, since the new columns are controlled by the triggers, the data duplication that would be cause for concern is "clean".
Most problems people have with databases come down to the speed of their queries. Wasting a bit of disk space to gain a large amount of performance improvement is usually okay.
If you have absolutely no power over even the schema, another possibility is to create your own database with the "correct" schema and populate it periodically from the real database. Then query yours. That may involve a fair bit of data transfer every month however so the first option is the better one, if allowed.
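A minimal sketch of the first option, assuming the 'Last First Middle' format (last name = first word, first name = second word) and hypothetical trigger/index names:
ALTER TABLE table1
    ADD COLUMN OwnersLastName VARCHAR(100),
    ADD COLUMN OwnersFirstName VARCHAR(100);

-- one-time backfill of the existing rows
UPDATE table1 SET
    OwnersLastName  = SUBSTRING_INDEX(OwnersName, ' ', 1),
    OwnersFirstName = SUBSTRING_INDEX(SUBSTRING_INDEX(OwnersName, ' ', 2), ' ', -1);

-- keep the columns current as the monthly load inserts rows
CREATE TRIGGER table1_split_name BEFORE INSERT ON table1
FOR EACH ROW SET
    NEW.OwnersLastName  = SUBSTRING_INDEX(NEW.OwnersName, ' ', 1),
    NEW.OwnersFirstName = SUBSTRING_INDEX(SUBSTRING_INDEX(NEW.OwnersName, ' ', 2), ' ', -1);

CREATE INDEX idx_owners_last  ON table1 (OwnersLastName);
CREATE INDEX idx_owners_first ON table1 (OwnersFirstName);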
Judging by the other answers, I may have missed something... but to restrict a search in Sphinx to a specific field, make sure you're using the extended (or extended2) match mode, and then use a query string such as @firstname Smith.
You could use substring functions to get the parts of the field that you want to search in, but that will slow down the process. The query cannot use any kind of index for the comparison, so it has to touch every record in the table.
The best approach would be not to store several values in the same field, but to put the name components in three separate fields. When you store more than one value in a field, there are almost always problems accessing the data. I see this over and over in different forums...
This is an intractable problem, because full names can contain prefixes, suffixes, middle names or no middle name, composite first and last names with and without hyphens, etc. There is no reasonable way to do this with 100% reliability.