I have 3 Tables:
Regions Table which has 1500 Towns in it.
Categories Table which has 1800 categories.
Companies Table containing businesses.
What I need to do is grab a town, for example Birmingham, and build an array of the categories that have businesses there, using our main Companies table, so the array doesn't contain any categories with no businesses in Birmingham.
The problem I have is the size of the array being stored: once I populate all the towns with the serialized array, I can't even open the table to browse it. See the example array below:
a:9:{s:8:"Bailiffs";s:1:"1";s:20:"Business Consultants";s:1:"1";s:25:"Car Garages and Mechanics";s:1:"1";s:35:"Farming Livestock and Other Animals";s:1:"2";s:19:"Fashion Accessories";s:1:"1";s:6:"Hotels";s:1:"1";s:20:"Post Office Services";s:1:"1";s:13:"Schools State";s:1:"1";s:14:"Wood Craftsmen";s:1:"1";}
Can anyone suggest an alternative solution?
Cheers
I'd suggest a totally different approach that gets rid of the storage problem entirely, and should make your app more efficient. Storing serialized arrays full of information that can be retrieved from your database anyway is redundant and highly inefficient. The best approach here would be to normalize your data.
You should create a fourth table, perhaps called 'region_categories', which will be a simple lookup table:
CREATE TABLE region_categories (
  regionId int unsigned not null,
  categoryId int unsigned not null,
  PRIMARY KEY (regionId, categoryId)
);
Now, instead of saving everything to an array, for each town/region you should instead populate this table with the categories that are in that town. Your data size is very small, as all you are storing is a pair of ids.
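A minimal sketch of populating it (assuming your Companies table has regionId and categoryId columns; those column names are my guess, not taken from your schema):

INSERT INTO region_categories (regionId, categoryId)
SELECT DISTINCT regionId, categoryId
FROM Companies;
-- Re-running this once the table is populated will collide with the
-- primary key; either TRUNCATE first or use INSERT IGNORE to refresh.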
When it comes time to retrieve the categories for a given region, you just have to run a simple SELECT statement:
SELECT c.*
FROM region_categories AS rc
LEFT JOIN categories AS c ON rc.categoryId = c.categoryId
WHERE rc.regionId = [whatever region you're dealing with]
Now you can iterate through your results, and you'll have all the categories for that region.
I'm working on a MySQL SELECT and cannot find a solution to this tricky problem.
There's one table "words" with id and names of objects (in this case possible objects in a picture).
words
ID   object
1    house
2    tree
3    car
…    …
In the other table, "pictures", all the information about a picture is saved. Besides information on resolution etc., there is in particular information on the objects in the picture. These are saved in the objectsinpicture column as IDs from the words table, like 1,5,122,345, etc.
The pictures table also has a column "location", holding the ID of the place where I took the picture.
pictures
location   objectsinpicture   ...
1          1,2,3,4
2          1,5,122,34
1          50,122,345
1          91,35,122,345
2          1,14,32
1          1,5,122,345
To tag new pictures of a particular place, I want to get suggestions from the information already saved. That way I can create buttons in PHP to update the database, instead of using a dropdown with multiple select.
What I have tried so far is the following:
SELECT words.id, words.object
FROM words, pictures
WHERE location = 2 AND FIND_IN_SET(words.id, pictures.objectsinpicture)
GROUP BY words.id
ORDER BY words.id
This nearly shows the expected values. But some information is missing. It doesn't show all the possible objects and I cannot find any reason for this.
What I want is, for example, all IDs for location 2 joined to the words table, with duplicate entries from objectsinpicture grouped together. The two rows for location 2 are:
1,5,122,34
1,14,32
Combined and de-duplicated that gives:
1,5,14,32,34,122
which should resolve to the object names:
house
...
...
...
...
...
Maybe I need to use GROUP_CONCAT with a comma separator, but that doesn't work either. The problem seems to be the WHERE condition on the location.
I hope someone has an idea how to solve this.
Thanks in advance for any support!
This is a classic case of denormalization causing problems.
What you need to do is store each object/picture association separately, in another table:
create table objectsinpicture (
  picture_id int,
  object_id int,
  primary key (picture_id, object_id)
);
Instead of storing a comma-separated list, you would store one association per row in this table. It will grow to a large number of rows of course, but each row is just a pair of id's so the total size won't be too great.
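For example, the second picture above would be stored as four rows (assuming it has picture id 2; the question doesn't show picture ids):

INSERT INTO objectsinpicture (picture_id, object_id)
VALUES (2, 1), (2, 5), (2, 122), (2, 34);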
Then you can query:
SELECT DISTINCT w.id, w.object
FROM pictures AS p
JOIN objectsinpicture AS o ON o.picture_id = p.id
JOIN words AS w ON o.object_id = w.id
WHERE p.location = 2;
(The DISTINCT matters: the same object can appear in several pictures at the same location.)
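And if you also want the combined, de-duplicated ID list as a single string (as in the example output above), GROUP_CONCAT over the same joins produces it:

SELECT GROUP_CONCAT(DISTINCT w.id ORDER BY w.id) AS object_ids
FROM pictures AS p
JOIN objectsinpicture AS o ON o.picture_id = p.id
JOIN words AS w ON o.object_id = w.id
WHERE p.location = 2;
-- With the sample data this returns: 1,5,14,32,34,122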
I'm working on an application that previously had unique handles for users only, but now we want to have handles for events, groups, places, etc.: unique string identifiers for many different first-class objects. I understand the thing to do is adopt something like the Party Model, where every entity has its own unique partyId and handle. That said, it means that on pretty much every data-fetching query we're adding a join to get that handle! Certainly for every user.
So just what is the performance loss here? For a table with just three or four columns, is a join like this negligible? Or is there a better way of going about this?
Example Table Structure:
Party
int id
int party_type_id
varchar(256) handle
Events
int id
int party_id
varchar(256) name
varchar(256) time
int place_id
Users
int id
int party_id
varchar(256) first_name
varchar(256) last_name
Places
int id
int party_id
varchar(256) name
-- EDIT --
I'm getting a bad rating on this question, and I'm not sure I understand why. In PLAIN TERMS, I'm asking,
If I have three first-class objects that must all share a UNIQUE HANDLE property, unique across all three objects, does adding an additional table that must be joined on in almost every request incur a significant performance hit? Is there a better way of accomplishing this in a relational database like MySQL?
-- EDIT: Proposed Queries --
Getting one user
SELECT * FROM Users u LEFT JOIN Party p ON u.party_id = p.id WHERE p.handle='foo'
Searching users
SELECT * FROM Users u LEFT JOIN Party p ON u.party_id = p.id WHERE p.handle LIKE '%foo%'
Searching all parties... I guess I'm not sure how to do this in one query. Would you have to select all Parties matching the handle and then get the individual objects in separate queries? E.g.
db.makeQuery("SELECT * FROM Party p WHERE p.handle LIKE '%foo%'")
  .then(function (results) {
    // iterate through results, assemble lists of matching parties by type,
    // then fetch those objects in separate queries
  })
This last example is what I'm most concerned about I think. Is this a reasonable design?
The queries you show should be blazingly fast on any modern implementation, and should scale to tens or hundreds of millions of records without too much trouble.
Relational Database Management Systems (of which MySQL is one) are designed explicitly for this scenario.
In fact, the slow part of your second query:
SELECT * FROM Users u LEFT JOIN Party p ON u.party_id = p.id WHERE p.handle LIKE '%foo%'
is going to be WHERE p.handle LIKE '%foo%', as a LIKE with a leading wildcard cannot use an index. Once you have a large table, this part of the query will be many times slower than the join.
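One index is worth adding regardless (a sketch; the index name is mine): since handles are supposed to be unique across all party types, a unique index on Party.handle both enforces that rule and turns the exact-match lookup into an index seek:

ALTER TABLE Party ADD UNIQUE INDEX uq_party_handle (handle);
-- If your MySQL version limits index width (handle is varchar(256)),
-- index a prefix instead, e.g. handle(191) under utf8mb4.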
I am designing a database application for an award. It has a 75-year history and numerous categories that have changed over time. Right now, the design I am thinking of has two kinds of tables:
entities
people
publishers
categories
novel
movie
author
artist
and such like. Each category has data particular to that category, for example:
NOVEL
title varchar(1024)
author int #FK into people table ID
publisher int #FK into publisher table ID
year year(4)
winner bool
or
ARTIST
name int
year year(4)
winner bool
So far so good. However, there are 38 (!) of these categories that have existed over time (some do not exist anymore) and I really can't imagine doing a query for say, all of the winners from 1963 by doing:
SELECT * from table1,table2,...,table38 WHERE year=1963 and winner=TRUE;
These tables will never be that large (each category usually has at most five nominees, so even after 100 years there would be at most 500 rows per table, and a lot less for the early categories that weren't continued). So this isn't a performance question. It is just that the query feels very, very wrong to me, if only because every query will have to be changed every time a new category is created or an old one removed. That happens every few years or so.
The questions then are:
is this query evidence that I've designed this wrong?
if not, is there a better way to do that query?
I keep thinking there must be some way to create a lookup table which pulls from other tables, but I could be misremembering. Is there some way of doing such a thing?
Many thanks,
Glenn
You could do that with 3 tables.
The first one is entities. It contains data about all publishers/artists/etc.
entities
id int #PK
name varchar(1024)
publisher bool
The second is data, where the rows from all categories are stored.
data
title varchar(1024)
name int #FK into entities table ID (author or artist)
publisher int #FK into entities table ID
year year(4)
winner bool
category int #FK into category table ID
The third is category, in which you can find all category names with their IDs.
category
ID int
name varchar(1024)
Now you have to join only three tables.
select * from entities e, data d, category c where d.name = e.id and d.category = c.id and d.winner = true and d.year = 1963;
You would do better to have a table for categories, either as a key/value store or just a normal category table, and then save only the category row's ID in the other table:
for example,
Table: Category
columns: id, name, slug, status, active_since, inactive_since etc...
In slug, you can keep a slugified form of the category to make it easy for queries and URLs: for example, the Industry Innovations category would be saved as industry-innovations.
In status, keep 0 or 1 to show whether it is active now. You can also keep the dates when it became active and inactive in the active_since and inactive_since fields.
When you search, you can then filter on those that have status 1, for example. I don't think your problem is complex; it is very simple for MySQL to search when you join tables.
There are projects where dozens of tables are joined and it is ok.
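A minimal sketch of that category table (types and lengths are my own assumptions):

CREATE TABLE category (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  slug VARCHAR(255) NOT NULL,             -- e.g. 'industry-innovations'
  status TINYINT(1) NOT NULL DEFAULT 1,   -- 1 = active, 0 = discontinued
  active_since YEAR NULL,
  inactive_since YEAR NULL
);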
I am applying a group of data mining algorithms to a dataset comprised of a set of customers along with a large number of descriptive attributes that summarize various aspects of their past behavior. There are more than 10,000 attributes, each stored as a column in a table with the customer id as the primary key. For several reasons, it is necessary to pre-compute these attributes rather than calculating them on the fly. I generally try to select customers with a specified attribute set. The algorithms can combine any arbitrary number of these attributes together in a single SELECT statement and join the required tables. All the tables have the same number of rows (one per customer).
I am wondering what's the best way to structure these tables of attributes. Is it better to group the attributes into tables of 20-30 columns, requiring more joins on average but fewer columns per SELECT, or have tables with the maximum number of columns to minimize the number of joins, but having potentially all 10K columns joined at once?
I also thought of using one giant 3-column customerID-attribute-value table and storing all the info there, but it would be harder to structure the "select all customers with these attributes"-type query that I need.
I'm using MySQL 5.0+, but I assume this is a general SQL-ish question.
From my experience, using tables with 10,000 columns is a very, very bad idea. What if this number increases in the future?
If there are a lot of attributes, you shouldn't use horizontally scaled tables (with a large number of columns). You should create a new attributes table and place all attribute values in it, then connect it to the main entity table with a many-to-one relationship.
A second option would be to use a NoSQL system (like MongoDB).
As @odiszapc said, you have to use a meta-model structure, like for instance:
CREATE TABLE customer (
  ID INT NOT NULL PRIMARY KEY,
  NAME VARCHAR(64)
);

CREATE TABLE customer_attribute (
  ID INT NOT NULL,
  ID_CUSTOMER INT NOT NULL,
  NAME VARCHAR(64),
  VALUE VARCHAR(1024)
);
Return basic information about a given customer:
SELECT * FROM customer WHERE NAME='John';
Return customer(s) matching certain attributes:
SELECT c.*
FROM customer c
INNER JOIN customer_attribute a1 ON a1.id_customer = c.id
    AND a1.name = 'address'
    AND a1.value = '1078, c/ los gatos madrileños'
INNER JOIN customer_attribute a2 ON a2.id_customer = c.id
    AND a2.name = 'age'
    AND a2.value = '27'
Your generator should generate the inner joins on the fly.
Proper indexes on the tables should allow all this to run relatively fast (if we assume 10k attributes per customer, and 10k customers, that's actually pretty much a challenge...).
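For instance (a sketch; the index name and prefix length are my choices), a composite index covering the name/value predicates plus the customer id lets each self-join be resolved from the index:

CREATE INDEX idx_attr_name_value
ON customer_attribute (name, value(255), id_customer);
-- value is VARCHAR(1024), so only a prefix of it is indexed
-- to stay within MySQL's index-size limits.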
10,000 columns is a lot. The SELECT statement will be very long and messy unless you use *. I think you can narrow the attributes down to the most useful and meaningful ones, eliminating the others.
I have a database that holds a list of businesses.
The Master table has the customer (aka business) details:
id
firstname
lastname
tradingname
storeaddress
state
postcode
In a table called Otherstores, I have the following:
master_id
store_id
storeaddress
state
postcode
phonenumber
What I now need to do is write a PHP script that shows all the stores in a list, but here is the catch:
I only want to show 8 stores from different categories, picked at random.
However, it must NOT show the same store twice in the same search.
The sub-stores (aka Otherstores) also need to be randomly added into the query, so that they are visible as well.
I'm wondering the best way to do this.
WHY I DON'T HAVE ANY CODE:
It's tough to show you code, as my idea was to do a LEFT JOIN or INNER JOIN and limit it to one row per id. However, I know that won't work, because I would need to be able to join the tables together somehow, and I want each sub-store to behave as if it were a master store; if I simply join Otherstores to the Master table I can't see that working, and instead you will just get errors.
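One possible approach, sketched against the schema above (not a tested implementation; the "different categories" requirement would need a category column, which the schema shown doesn't include): UNION the master stores and sub-stores into one uniform set, so every sub-store looks like a master store, then pick 8 rows at random.

SELECT *
FROM (
    -- Master stores, one row each
    SELECT id AS store_id, tradingname, storeaddress, state, postcode
    FROM Master
    UNION ALL
    -- Sub-stores, borrowing the trading name from their master record
    SELECT o.store_id, m.tradingname, o.storeaddress, o.state, o.postcode
    FROM Otherstores AS o
    JOIN Master AS m ON m.id = o.master_id
) AS all_stores
ORDER BY RAND()   -- fine at this scale; avoid on very large tables
LIMIT 8;

Because each store appears exactly once in the derived table, no store can show up twice in the same search.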