How can I select from a database all of the rows whose IDs are stored in a comma-separated varchar? For example, I have a table with this:
, 7, 9, 11
How can I SELECT the rows with those IDs?
Normalize your database. You should be using a lookup table most likely.
You have 2 options:
Use a function to split the string into a temp table and then join the table you are selecting from to that temp table.
Use dynamic SQL to query the table where id in (#variable) --- this is the worse of the two choices.
select * from table_name where id in (7, 9, 11)
If you do typically have that comma at the start, you will need to remove it first.
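A minimal sketch of the clean-up-then-query idea, using Python's sqlite3 as a stand-in for whatever database and driver are actually in use. The table name items and its columns are made up for illustration. The stored string is split, the empty piece left by the leading comma is dropped, and the values are bound as parameters instead of being pasted into the SQL text.

```python
import sqlite3

def parse_id_list(raw):
    """Turn a string such as ', 7, 9, 11' into [7, 9, 11]."""
    return [int(part) for part in raw.split(",") if part.strip()]

# Hypothetical table, only here so the query has something to run against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"item {i}") for i in range(1, 13)])

ids = parse_id_list(", 7, 9, 11")
placeholders = ", ".join("?" for _ in ids)   # one "?" per ID
rows = conn.execute(
    f"SELECT id, name FROM items WHERE id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()
print(rows)
```

Only the placeholder string is interpolated into the SQL; the ID values themselves always travel as bound parameters.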
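Use MATCH(column) AGAINST('7,9,11')
This will show all rows whose varchar ID column contains 7, 9 or 11.
But you have to be sure that your column has a FULLTEXT index. Also note that with the default minimum word length (ft_min_word_len is 4 for MyISAM, innodb_ft_min_token_size is 3 for InnoDB) short tokens such as 7, 9 and 11 are not indexed, so that setting would have to be lowered first.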
Just yesterday I was fixing a bug in an old application here and saw where they handled it like this:
AND (T.ServiceIDs = '#SegmentID#' OR T.ServiceIDs LIKE '#SegmentID#,%'
OR T.ServiceIDs LIKE '%,#SegmentID#,%' OR T.ServiceIDs LIKE '%,#SegmentID#')
I am assuming you are saying something like the value of ServiceIDs from the database might contain 7,9,11 and that the variable SegmentID is one or more values. It was inside a CFIF statement checking to see that SegmentID in fact had a value (which was always the case, due to prior logic that would default it).
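The four LIKE cases above (only value, first, middle, last) can be collapsed into one by wrapping both the stored list and the search value in commas. The sketch below shows that variant in Python's sqlite3; the sample rows are invented, and || is SQLite's concatenation operator (MySQL would use CONCAT, and it also has a built-in FIND_IN_SET for the same job). It handles a single search value at a time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (id INTEGER PRIMARY KEY, ServiceIDs TEXT)")
# Invented sample rows, chosen to include near-misses such as 17, 19 and 79.
conn.executemany("INSERT INTO T VALUES (?, ?)", [
    (1, "7,9,11"),
    (2, "17,19"),
    (3, "9"),
    (4, "1,79"),
])

def rows_containing(segment_id):
    # ',7,9,11,' LIKE '%,9,%' matches, while ',17,19,' LIKE '%,9,%' does not.
    pattern = f"%,{int(segment_id)},%"
    sql = "SELECT id FROM T WHERE ',' || ServiceIDs || ',' LIKE ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (pattern,))]

print(rows_containing(9))
print(rows_containing(7))
```

Like the original version, this cannot use an ordinary index on ServiceIDs, so it scans every row.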
Personally, though, I would do as others have suggested and create what I always refer to as a bridging table, which allows you to have 0 to many PKs from one table related to the PK of another.
I had to tackle this problem years ago where I could not change the table structure and I created a custom table type and a set of functions so I could treat the values via SQL as if they were coming from a table. That custom table type solution though was specific to Oracle and I'd not know how to do that in MySQL without some research on my part.
There is a reason querying lists is so difficult: databases are not designed to work with delimited lists. They are optimized to work best with rows (or sets) of data. Creating the proper table structure will result in much better query performance and simpler sql queries. (So while it is technically possible, you should seriously consider normalizing your database as Todd and others suggested.)
Many-to-many relationships are best represented by three (3) tables. Say you are selling "widgets" in a variety of "sizes". Create two tables representing the main entities:
Widget (unique widgets)
WidgetID | WidgetTitle
1 | Widget 1
2 | Widget 2
....
Size (unique sizes)
SizeID | SizeTitle
7 | X-Small
8 | Small
9 | Medium
10 | Large
11 | X-Large
Then create a junction table to store the relationships between those two entities, i.e. which widgets are available in which sizes:
WidgetSize (available sizes for each widget)
WidgetID | SizeID
1 | 7 <== Widget 1 + "X-Small"
1 | 8 <== Widget 1 + "Small"
2 | 7 <== Widget 2 + "X-Small"
2 | 9 ....
2 | 10
2 | 11
....
With that structure, you can easily return all widgets having any (or all) of a list of sizes. Not tested, but something similar to the sql below should work.
Find widgets available in any of the sizes: <cfset listOfSizeIds = "7,9,11">
SELECT w.WidgetID, w.WidgetTitle
FROM Widget w
WHERE EXISTS
( SELECT 1
FROM WidgetSize ws
WHERE ws.WidgetID = w.WidgetID
AND ws.SizeID IN (
<cfqueryparam value="#listOfSizeIds#"
cfsqltype="cf_sql_integer" list="true" >
)
)
Find widgets available in all three sizes: <cfset listOfSizeIds = "7,9,11">
SELECT w.WidgetID, w.WidgetTitle, COUNT(*) AS MatchCount
FROM Widget w INNER JOIN WidgetSize ws ON ws.WidgetID = w.WidgetID
WHERE ws.SizeID IN (
<cfqueryparam value="#listOfSizeIds#"
cfsqltype="cf_sql_integer" list="true" >
)
GROUP BY w.WidgetID, w.WidgetTitle
HAVING MatchCount = 3
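To check the two queries against the sample data, here is a small sketch that builds the three tables in an in-memory SQLite database from Python and runs both. SQLite and the ? placeholders stand in for MySQL and cfqueryparam; the required match count is taken from the length of the list instead of being hard-coded as 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Widget (WidgetID INTEGER PRIMARY KEY, WidgetTitle TEXT);
CREATE TABLE Size   (SizeID INTEGER PRIMARY KEY, SizeTitle TEXT);
CREATE TABLE WidgetSize (
    WidgetID INTEGER REFERENCES Widget(WidgetID),
    SizeID   INTEGER REFERENCES Size(SizeID),
    PRIMARY KEY (WidgetID, SizeID)      -- one row per widget/size pair
);
INSERT INTO Widget VALUES (1, 'Widget 1'), (2, 'Widget 2');
INSERT INTO Size VALUES (7,'X-Small'),(8,'Small'),(9,'Medium'),(10,'Large'),(11,'X-Large');
INSERT INTO WidgetSize VALUES (1,7),(1,8),(2,7),(2,9),(2,10),(2,11);
""")

size_ids = [7, 9, 11]
marks = ", ".join("?" for _ in size_ids)

# Widgets available in ANY of the sizes.
any_match = conn.execute(f"""
    SELECT w.WidgetID FROM Widget w
    WHERE EXISTS (SELECT 1 FROM WidgetSize ws
                  WHERE ws.WidgetID = w.WidgetID AND ws.SizeID IN ({marks}))
    ORDER BY w.WidgetID""", size_ids).fetchall()

# Widgets available in ALL of the sizes.
all_match = conn.execute(f"""
    SELECT w.WidgetID FROM Widget w
    JOIN WidgetSize ws ON ws.WidgetID = w.WidgetID
    WHERE ws.SizeID IN ({marks})
    GROUP BY w.WidgetID
    HAVING COUNT(*) = ?""", size_ids + [len(size_ids)]).fetchall()

print(any_match, all_match)
```

The composite primary key on WidgetSize matters for the second query: without it, a duplicated pair would inflate COUNT(*).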
Let's say we have a table called Workorders and another table called Parts. I would like to have a column in Workorders called parts_required. This column would contain a single item that tells me what parts were required for that workorder. Ideally, this would contain the quantities as well, but a second column could contain the quantity information if needed.
Workorders looks like
WorkorderID date parts_required
1 2/24 ?
2 2/25 ?
3 3/16 ?
4 4/20 ?
5 5/13 ?
6 5/14 ?
7 7/8 ?
Parts looks like
PartID name cost
1 engine 100
2 belt 5
3 big bolt 1
4 little bolt 0.5
5 quart oil 8
6 Band-aid 0.1
Idea 1: create a string like '1-1:2-3:4-5:5-4'. My application would parse this string and show that I need --> 1 engine, 3 belts, 5 little bolts, and 4 quarts of oil.
Pros - simple enough to create and understand.
Cons - will make deep introspection into our data much more difficult. (costs over time, etc)
Idea 2: use a binary number. For example, to reference the above list (engine, belt, little bolts, oil) using an 8-bit integer would be 54, because 54 in binary representation is 110110.
Pros - datatype is optimal concerning size. Also, I am guessing there are tricky math tricks I could use in my queries to search for parts used (don't know what those are, correct me if I'm in the clouds here).
Cons - I do not know how to handle quantity using this method. Also, even a 64-bit BIGINT only gives me 64 parts that can be in my table. I expect many hundreds.
Any ideas? I am using MySQL. I may be able to use PostgreSQL, and I understand that they have more flexible datatypes like JSON and arrays, but I am not familiar with how querying those would perform. Also it would be much easier to stay with MySQL
Why not create a Relationship table?
You can create a table named Workorders_Parts with the following content:
|workorderId, partId|
So when you want to get all parts from a specific workorder you just type:
select p.name
from parts p inner join workorders_parts wp on wp.partId = p.partId
where wp.workorderId = x;
what the query says is:
Give me the name of parts that belongs to workorderId=x and are listed in table workorders_parts
Remember that INNER JOIN returns only the rows that have a match in both tables; in other words, the value being joined on (generally the id) must exist in both tables.
It will give you all part names that are used to build workorder x.
Let's say we have workorderId = 1 with partID = 1,2,3. It will be represented in our relationship table as:
workorderId | partId
1 | 1
1 | 2
1 | 3
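The question also asked about quantities. One way to cover that, sketched here as an assumption and not something the answer above specifies, is an extra quantity column on the relationship table. The example uses Python's sqlite3 in place of MySQL and stores the question's '1-1:2-3:4-5:5-4' string as rows for workorder 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parts (partId INTEGER PRIMARY KEY, name TEXT, cost REAL);
CREATE TABLE workorders_parts (
    workorderId INTEGER,
    partId      INTEGER REFERENCES parts(partId),
    quantity    INTEGER NOT NULL,       -- assumed extra column
    PRIMARY KEY (workorderId, partId)
);
INSERT INTO parts VALUES
    (1,'engine',100),(2,'belt',5),(3,'big bolt',1),
    (4,'little bolt',0.5),(5,'quart oil',8),(6,'Band-aid',0.1);
-- 1 engine, 3 belts, 5 little bolts, 4 quarts of oil
INSERT INTO workorders_parts VALUES (1,1,1),(1,2,3),(1,4,5),(1,5,4);
""")

# Which parts, and how many of each, does workorder 1 need?
needed = conn.execute("""
    SELECT p.name, wp.quantity
    FROM parts p
    INNER JOIN workorders_parts wp ON wp.partId = p.partId
    WHERE wp.workorderId = ?
    ORDER BY p.partId""", (1,)).fetchall()

# The "deep introspection" that the string format made hard: total parts cost.
total_cost = conn.execute("""
    SELECT SUM(p.cost * wp.quantity)
    FROM parts p
    INNER JOIN workorders_parts wp ON wp.partId = p.partId
    WHERE wp.workorderId = ?""", (1,)).fetchone()[0]

print(needed, total_cost)
```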
Currently the database looks like this: (product IDs with name value pairs)
id, attribute_name, attribute_value
1, Clockspeed, 1.6Ghz
1, Screen, 13.3"
2, Clockspeed, 1.8Ghz
2, Screen, 15.1"
I would like to convert the above data to the following format (separated by product ID, with only one line per id) for migrating to a new platform.
id, Clockspeed, Screen
1, 1.6Ghz, 13.3"
2, 1.8Ghz, 15.1"
What is the easiest way to achieve this result? My gut tells me this is going to be done with the concat or group_concat function but I need a point in the right direction, going bald from pulling my hair out.
This points out one of the problems with the entity-attribute-value database design.
There are two methods to use SQL to pivot the attributes into columns, as though you had stored the data in a conventional table:
SELECT id, MAX(CASE attribute_name WHEN 'Clockspeed' THEN attribute_value END) AS Clockspeed,
MAX(CASE attribute_name WHEN 'Screen' THEN attribute_value END) AS Screen
FROM eav_table
GROUP BY id;
SELECT id, c.attribute_value AS Clockspeed, s.attribute_value AS Screen
FROM eav_table AS c
JOIN eav_table AS s USING(id)
WHERE c.attribute_name = 'Clockspeed' AND s.attribute_name = 'Screen'
Output of both queries after testing on MySQL 5.6:
+------+------------+--------+
| id | Clockspeed | Screen |
+------+------------+--------+
| 1 | 1.6GHz | 13.3" |
| 2 | 1.8GHz | 15.1" |
+------+------------+--------+
The latter solution requires N-1 joins to output N attributes. It doesn't scale well.
Both of the above solutions require that you write quite a bit of application code to format the SQL query, according to the number of attributes you want to fetch. And that means if the number of attributes varies (which is likely because that's one of the primary advantages of using EAV), then it's possible to fetch too many attributes for the query to have good performance.
Another solution is to forget about pivoting the data using only SQL. Instead, fetch the rows of data back to your application one attribute per row, as they are stored in the database. Then write application code to post-process the results into one object.
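A minimal sketch of that post-processing step in Python. The rows are hard-coded here to keep the example self-contained; in practice they would come from a plain SELECT id, attribute_name, attribute_value FROM eav_table. The function name pivot is made up for illustration.

```python
from collections import defaultdict

# One attribute per row, exactly as stored in the EAV table.
rows = [
    (1, "Clockspeed", "1.6Ghz"),
    (1, "Screen", '13.3"'),
    (2, "Clockspeed", "1.8Ghz"),
    (2, "Screen", '15.1"'),
]

def pivot(eav_rows):
    """Collect (id, name, value) rows into one dict of attributes per id."""
    products = defaultdict(dict)
    for product_id, name, value in eav_rows:
        products[product_id][name] = value
    return dict(products)

products = pivot(rows)
print(products)
```

The query stays the same no matter how many distinct attributes exist, which is the main attraction of doing the pivot outside SQL.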
I actually have a table with 30 columns. In one day this table can get around 3000 new records!
The column data looks like this:
IMG Name Phone etc..
http://www.site.com/images/image.jpg John Smith 123456789 etc..
http://www.site.com/images/image.jpg Smith John 987654321 etc..
I'm looking for a way to optimize the size of the table and also the response time of the SQL queries. I was thinking of doing something like:
Column1
http://www.site.com/images/image.jpg|John Smith|123456789|etc..
And then via php i would store each value into an array..
Would it be faster ?
Edit
So to take an example of the structure, let's say i have two tables :
package
package_content
Here is the structure of the table package :
id | user_id | package_name | date
Here is the structure of the table package_content :
id | package_id | content_name | content_description | content_price | content_color | etc.. > 30columns
The thing is, for each package I can get up to 16 rows of content. For example:
id | user_id | package_name | date
260 11 Package 260 2013-7-30 10:05:00
id | package_id | content_name | content_description | content_price | content_color | etc.. > 30columns
1 260 Content 1 Content 1 desc 58 white etc..
2 260 Content 2 Content 2 desc 75 black etc..
3 260 Content 3 Content 3 desc 32 blue etc..
etc...
Then in PHP I do something like this:
select * from package
while not EOF {
show package name, date etc..
select * from package_content where package_content.package_id = package.id and package.id = package_id
while not EOF{
show package_content name, desc, price, color etc...
}
}
Would it be faster? Definitely not. If you needed to search by Name or Phone or etc... you'd have to pull those values out of Column1 every time. You'd never be able to optimize those queries, ever.
If you want to make the table smaller it's best to look at splitting some columns off into another table. If you'd like to pursue that option, post the entire structure. But note that the number of columns doesn't affect speed that much. I mean it can, but it's way down on the list of things that will slow you down.
Finally, 3,000 rows per day is about 1 million rows per year. If the database is tolerably well designed, MySQL can handle this easily.
Addendum: partial table structures plus sample query and pseudocode added to question.
The pseudocode shows the package table being queried all at once, then matching package_content rows being queried one at a time. This is a very slow way to go about things; better to use a JOIN:
SELECT
package.id,
user_id,
package_name,
date,
package_content.*
FROM package
INNER JOIN package_content ON package.id = package_content.package_id
WHERE whatever
ORDER BY whatever
That will speed things up right away.
If you're displaying on a web page, be sure to limit results with a WHERE clause - nobody will want to see 1,000 or 3,000 or 1,000,000 packages on a single web page :)
Finally, as I mentioned before, the number of columns isn't a huge worry for query optimization, but...
Having a really wide result row means more data has to go across the wire from MySQL to PHP, and
It isn't likely you'll be able to display 30+ columns of information on a web page without it looking terrible, especially if you're reading lots of rows.
With that in mind, you'll be better off picking specific package_content columns in your query instead of picking them all with a SELECT *.
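Putting the JOIN and the narrower column list together, here is a sketch using Python's sqlite3 as a stand-in for PHP and MySQL. The table layouts are trimmed to a few of the columns from the question, and the user_id filter is only an example of a WHERE clause. One query fetches everything, and the application groups consecutive rows by package.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE package (id INTEGER PRIMARY KEY, user_id INTEGER,
                      package_name TEXT, date TEXT);
CREATE TABLE package_content (
    id INTEGER PRIMARY KEY, package_id INTEGER REFERENCES package(id),
    content_name TEXT, content_price REAL, content_color TEXT);
INSERT INTO package VALUES (260, 11, 'Package 260', '2013-07-30 10:05:00');
INSERT INTO package_content VALUES
    (1, 260, 'Content 1', 58, 'white'),
    (2, 260, 'Content 2', 75, 'black'),
    (3, 260, 'Content 3', 32, 'blue');
""")

# One round trip instead of one query per package.
rows = conn.execute("""
    SELECT p.id, p.package_name, c.content_name, c.content_price
    FROM package p
    INNER JOIN package_content c ON c.package_id = p.id
    WHERE p.user_id = ?
    ORDER BY p.id, c.id""", (11,)).fetchall()

# Rows arrive sorted by package, so consecutive rows can be grouped.
packages = {
    key: [(name, price) for _, _, name, price in group]
    for key, group in groupby(rows, key=lambda r: (r[0], r[1]))
}
print(packages)
```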
Don't combine any columns, this is no use and might even be slower in the end.
You should put indexes on the columns you query on. I have a website with a table of about 30 columns that currently holds around 600,000 rows. If you put EXPLAIN in front of a query, you can see whether it uses any indexes. If you have a JOIN on two columns and a WHERE on the same table, you should make a combined index on the three columns, ordered from JOIN to WHERE. If you join to the same table more than once, treat each join as needing its own index.
For example:
SELECT p.name, p.id, c.name, c2.name
FROM product p
JOIN category c ON p.cat_id=c.id
JOIN category c2 ON c.parent_id=c2.id AND c.name='Niels'
WHERE p.filterX='blaat'
You should have a combined index on category:
parent_id, name
and
id (probably the auto-increment primary key)
And an index on product:
cat_id
filterX
With this easy solution you can optimize queries from NOT DOABLE to 0.10 seconds, or even faster.
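To see the effect of an index on the query plan, the sketch below compares the plan before and after adding a combined index on product. It uses SQLite's EXPLAIN QUERY PLAN from Python as a stand-in for MySQL's EXPLAIN, so the output format differs from what MySQL prints; the index name and the simplified single-table query are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, cat_id INTEGER, "
             "filterX TEXT, name TEXT)")

query = "SELECT id, name FROM product WHERE cat_id = ? AND filterX = ?"

def plan_for(sql, params):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)   # last column is the detail text

before = plan_for(query, (3, "blaat"))           # no index yet: full table scan
conn.execute("CREATE INDEX idx_product_cat_filter ON product (cat_id, filterX)")
after = plan_for(query, (3, "blaat"))            # plan now names the index

print("before:", before)
print("after: ", after)
```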
If you use MySQL 5.6 you should switch to InnoDB, because MySQL is better at optimizing JOINs and subqueries with it. MySQL will also try to keep the data in memory, which makes it a lot faster as well. Please keep in mind that backing up InnoDB tables might need some extra attention.
You might also think about using MEMORY tables for very fast querying (you still need indexes).
You can also optimize by using 4-byte integers (4 bytes, not 11 characters) and by not always using VARCHAR(255).
Firstly I'd like to start by apologizing for the potentially misleading title... I am finding it difficult to describe what I am trying to do here.
With the current project I'm working on, we have setup a 'dynamic' database structure with MySQL that looks something like this.
item_details ( Describes the item_data )
fieldID | fieldValue | fieldCaption
1 | addr1 | Address Line 1
2 | country | Country
item_data
itemID | fieldID | fieldValue
12345 | 1 | Some Random Address
12345 | 2 | United Kingdom
So as you can see, if for example I wanted to lookup the address for the item 12345 I would simply do the statement.
SELECT fieldValue FROM item_data WHERE fieldID=1 and itemID=12345;
But here is where I am stuck... the database is relatively large with around ~80k rows and I am trying to create a set of search functions within PHP.
I would like to be able to perform a query on the result set of a query as quickly as possible...
For example, search for an address name within a certain country, i.e. search the fieldValue of the results that have the same itemIDs as the results from the query:
SELECT itemID FROM item_data WHERE fieldID=2 AND fieldValue='United Kingdom';
Sorry If I am unclear, I have been struggling with this for the past couple of days...
Cheers
You can do this in a couple of ways. One is to use multiple joins to the item_data table with the fieldID limited to whatever it is you want to get.
SELECT *
FROM
Item i
INNER JOIN item_data country
ON i.itemID = country.itemID
and country.fieldID = 2
INNER JOIN item_data address
ON i.itemID = address.itemID
and address.fieldID = 1
WHERE
country.fieldValue = 'United Kingdom'
and address.fieldValue = 'Whatever'
As an aside, this structure is often referred to as an Entity-Attribute-Value or EAV database.
Sorry in advance if this sounds patronizing, but (as you suggested) I'm not quite clear what you are asking for.
If you are looking for one query to do the whole thing, you could simply nest them. For your example, pretend there is a table named CACHED with the results of your UK query, and write the query you want against that, but replace CACHED with your UK query.
If the idea is that you have ALREADY done this UK query and want to (re-)use its results, you could save the results to a table in the DB (which may not be practical if there are a large number of queries executed), or save the list of IDs as text and paste that into the subsequent query (...WHERE ID in (...) ... ), which might be OK if your 'cached' query gives you a manageable fraction of the original table.
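As a sketch of the nesting option, the UK query can be dropped in as a subquery where the hypothetical CACHED table would have gone. The example runs against an in-memory SQLite database from Python; the extra sample rows and the LIKE search term are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_data (itemID INTEGER, fieldID INTEGER, fieldValue TEXT);
INSERT INTO item_data VALUES
    (12345, 1, 'Some Random Address'),    (12345, 2, 'United Kingdom'),
    (12346, 1, 'Another Random Address'), (12346, 2, 'France'),
    (12347, 1, 'Baker Street'),           (12347, 2, 'United Kingdom');
""")

# Outer query: search addresses (fieldID = 1).
# Inner query: the "UK query" from the question (fieldID = 2).
sql = """
    SELECT itemID, fieldValue
    FROM item_data
    WHERE fieldID = 1
      AND fieldValue LIKE ?
      AND itemID IN (SELECT itemID FROM item_data
                     WHERE fieldID = 2 AND fieldValue = ?)
    ORDER BY itemID"""
matches = conn.execute(sql, ("%Random%", "United Kingdom")).fetchall()
print(matches)
```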
Ok, I have 5 tables which I need to pull information from based on one variable.
gameinfo
id | name | platforminfoid
gamerinfo
id | name | contact | tag
platforminfo
id | name | abbreviation
rosterinfo
id | name | gameinfoid
rosters
id | gamerinfoid | rosterinfoid
The 1 variable would be gamerinfo.id, which would then pull all relevant data from gamerinfo, which would pull all relevant data from rosters, which would pull all relevant data from rosterinfo, which would pull all relevant data from gameinfo, which would then pull all relevant data from platforminfo.
Basically it breaks down like this:
gamerinfo contains the gamers basic
information.
rosterinfo contains basic information about the rosters
(ie name and the game the roster is
aimed towards)
rosters contains the actual link from the gamer to the
different rosters (gamers can be on
multiple rosters)
gameinfo contains basic information about the games (ie
name and platform)
platform info contains information about the
different platforms the games are
played on (it is possible for a game
to be played on multiple platforms)
I am pretty new to SQL queries involving JOINs and UNIONs and such, usually I would just break it up into multiple queries but I thought there has to be a better way, so after looking around the net, I couldn't find (or maybe I just couldn't understand what I was looking at) what I was looking for. If anyone can point me in the right direction I would be most grateful.
There is nothing wrong with querying the required data step-by-step. If you use JOINs in your SQL over 5 tables, be sure to have useful indexes on all important columns. Also, this could create a lot of duplicate data:
Imagine this: You need 1 record from gamerinfo, maybe 3 from gameinfo, 4 out of rosters and 3 each out of the remaining two tables. This would give you a result of 1*3*4*3*3 = 108 records, which will look like this:
ID Col2 Col3
1 1 1
1 1 2
1 1 3
1 2 1
... ... ...
You can see that you would fetch the ID 108 times, even if you only need it once. So my advice would be to stick with mostly single, simple queries to get the data you need.
There is no need for UNION; multiple JOINs should do the work:
SELECT gameinfo.id AS g_id, gameinfo.name AS g_name, platforminfo.name AS p_name, platforminfo.abbreviation AS p_abb, rosterinfo.name AS r_name
FROM gameinfo
LEFT JOIN platforminfo ON gameinfo.platforminfoid = platforminfo.id
LEFT JOIN rosterinfo ON rosterinfo.gameinfoid = gameinfo.id
LEFT JOIN rosters ON rosters.rosterinfoid = rosterinfo.id
WHERE gameinfo.id = XXXX
This should pull all info about a game based on the game id.
Indexing all the id columns used in the joins (gameinfoid, platforminfoid, rosterinfoid) will help performance.
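Since the question starts from gamerinfo.id and not from a game id, here is a sketch of the same chain of JOINs walked from the gamer outwards, following the column names in the question's table list. It runs on an in-memory SQLite database from Python, and all sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE platforminfo (id INTEGER PRIMARY KEY, name TEXT, abbreviation TEXT);
CREATE TABLE gameinfo   (id INTEGER PRIMARY KEY, name TEXT, platforminfoid INTEGER);
CREATE TABLE gamerinfo  (id INTEGER PRIMARY KEY, name TEXT, contact TEXT, tag TEXT);
CREATE TABLE rosterinfo (id INTEGER PRIMARY KEY, name TEXT, gameinfoid INTEGER);
CREATE TABLE rosters    (id INTEGER PRIMARY KEY, gamerinfoid INTEGER, rosterinfoid INTEGER);
-- made-up sample rows
INSERT INTO platforminfo VALUES (1, 'Personal Computer', 'PC');
INSERT INTO gameinfo   VALUES (1, 'Game A', 1), (2, 'Game B', 1);
INSERT INTO gamerinfo  VALUES (1, 'Alice', 'alice@example.com', 'ali'),
                              (2, 'Bob', 'bob@example.com', 'bob');
INSERT INTO rosterinfo VALUES (1, 'A Team', 1), (2, 'B Team', 2);
INSERT INTO rosters    VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
""")

# gamerinfo -> rosters -> rosterinfo -> gameinfo -> platforminfo
rows = conn.execute("""
    SELECT gr.name, ri.name, g.name, p.abbreviation
    FROM gamerinfo gr
    LEFT JOIN rosters r      ON r.gamerinfoid = gr.id
    LEFT JOIN rosterinfo ri  ON ri.id = r.rosterinfoid
    LEFT JOIN gameinfo g     ON g.id = ri.gameinfoid
    LEFT JOIN platforminfo p ON p.id = g.platforminfoid
    WHERE gr.id = ?
    ORDER BY ri.id""", (1,)).fetchall()
print(rows)
```

With this sample data the result has one row per roster the gamer belongs to, with the gamer's own columns repeated on each row.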