MySQL: how to write a table of views

I think my question is best explained via an example: Suppose I want to write a webserver for streaming music. I have the following columns in the songs table:
| song_id | song_name | song_genre | song_length | artist |
I would like to enable my users to save playlists, but I don't want a playlist to be defined by explicitly listing its songs. Instead it should be defined by something like "all songs by ringo starr", so that when new songs by ringo starr are added to the table, the playlist automatically includes them. In effect, what I want is a table called playlists that contains a list of MySQL views.
The most naive approach would be to simply have a table called playlists with a column called playlist_query, which for the above example would store something like the string "select song_id from songs where artist='ringo starr'".
This is of course incredibly insecure, but it would suffice if there is no better solution, since this application is used only internally inside our company by just a few employees, all good programmers who know their way around MySQL.
So what do you suggest? Should I go with this ugly solution, or is there some easier way to do this?

You could store the name of the view in the playlists table instead of the query itself.
You'd still have to create the required views though and I don't know how that helps your "security problem".
Could you elaborate on what kind of security is required?

I'd define a Playlist as a list of Filters, where a filter might be a song ID, an artist ID, a song name, a genre, and so on. I'd then fetch all songs for each filter, either in a single query combining the filters with a simple OR, or using UNION if you want the playlist to be in the same order as the filters. Note though that it takes some additional effort to get results from a UNION in the order of the queries; see this answer.
The advantages of this approach:
no millions of views to manage
no SQL queries stored in DB
pretty flexible filtering
Admittedly, I can't say much about the performance of this solution. Shouldn't be too bad though.
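A minimal sketch of the filter idea, combining filters with OR. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the playlist_filters table, its field/value columns, the sample rows and the playlist_song_ids helper are all assumptions made for illustration, not part of the original schema.

```python
import sqlite3

# SQLite stands in for MySQL here; playlist_filters and its columns are assumed names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE songs (
    song_id INTEGER PRIMARY KEY,
    song_name TEXT, song_genre TEXT, song_length INTEGER, artist TEXT
);
CREATE TABLE playlist_filters (
    playlist_id INTEGER NOT NULL,
    field TEXT NOT NULL,   -- which songs column the filter applies to
    value TEXT NOT NULL    -- value that column must equal
);
""")
conn.executemany(
    "INSERT INTO songs VALUES (?, ?, ?, ?, ?)",
    [(1, "Song A", "rock", 180, "ringo starr"),
     (2, "Song B", "rock", 200, "someone else"),
     (3, "Song C", "jazz", 240, "another artist")],
)
conn.executemany(
    "INSERT INTO playlist_filters VALUES (?, ?, ?)",
    [(1, "artist", "ringo starr"), (1, "song_genre", "jazz")],
)

# Column names cannot be bound as parameters, so only whitelisted ones are accepted.
ALLOWED_FIELDS = {"song_id", "song_name", "song_genre", "artist"}

def playlist_song_ids(conn, playlist_id):
    """Return ids of songs matching ANY filter of the playlist (filters are ORed)."""
    filters = conn.execute(
        "SELECT field, value FROM playlist_filters WHERE playlist_id = ?",
        (playlist_id,),
    ).fetchall()
    clauses, params = [], []
    for field, value in filters:
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"unsupported filter field: {field}")
        clauses.append(f"{field} = ?")
        params.append(value)
    if not clauses:
        return []
    sql = ("SELECT song_id FROM songs WHERE "
           + " OR ".join(clauses) + " ORDER BY song_id")
    return [row[0] for row in conn.execute(sql, params)]

print(playlist_song_ids(conn, 1))  # prints [1, 3]
```

Because the playlist stores only filters, a song added later that matches one of them shows up the next time the playlist is evaluated, and no SQL text is ever stored in the database.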

Related

Laravel Many to Many Relationship : Pivot VS JSON

I wanted to get your expert opinion on this dilemma: choosing between JSON and a pivot table.
Let's just say we have 2 tables here:
people
jobs
A person may have multiple jobs, and a job may have multiple people subscribed to it.
What is the best approach to it?
Method 1: JSON
I would have a jobs column in the people table that contains a JSON array of that person's job IDs, for example: [1,2,4]
Method 2: Pivot
I would create a pivot table job_person with job_id and person_id columns, in the usual Laravel Eloquent style for many-to-many pivot tables.
I have done some searching and found articles favouring each method: some say JSON is better because it is much simpler, others say a pivot is better because that is how a relational database should work, and so on.
But I want to know which one I should use in which scenario. If it is just a simple case like the scenario above, would JSON be better?
What if there are other variables included, like additional pivot columns?
(Maybe each pivot also contains a status column that can be set to active or past_job.)
Or what if in the future we want to be able to get all people who have a specific job? In that case a pivot would be preferable, I think.
What if, instead of jobs, the other table were books, and a person could have an extensive collection of books, so that we might have tens or even hundreds of pivot records for just one person? And what if there are another hundred persons?
What if, instead of books, the other table were stocks, in which case a person might subscribe to and unsubscribe from multiple stocks multiple times?
And back to the basic principle: what are each one's advantages and disadvantages?
Thank you very much.
I would not choose JSON: there's no benefit to it, and you will sacrifice many of the database's features and make querying the data difficult and slow.
What if there are other variables included, like additional pivot columns? (Maybe each pivot also contains a status column that can be set to active or past_job.)
Job and Person do not depend on each other, so you need an associative table between them, something like "PersonJob", to which you can add the necessary information. This is easy to traverse in Laravel.
Or what if in the future we want to be able to get all people who have a specific job? In that case a pivot would be preferable, I think.
You could easily query this using the associative table.
And back to the basic principle: what are each one's advantages and disadvantages?
It's just that relational databases are made for this kind of thing, and JSON offers no value, just hardship.
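To make the associative-table answer concrete, here is a rough sketch using Python's built-in sqlite3 as a stand-in for the real database (so it is plain SQL, not Eloquent). The job_person table follows the naming in the question; the status default, the sample rows and the people_with_job helper are assumptions for illustration.

```python
import sqlite3

# SQLite stands in for the real database; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE job_person (
    job_id INTEGER NOT NULL REFERENCES jobs(id),
    person_id INTEGER NOT NULL REFERENCES people(id),
    status TEXT NOT NULL DEFAULT 'active',  -- extra pivot column: 'active' or 'past_job'
    PRIMARY KEY (job_id, person_id)
);
INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO jobs VALUES (1, 'Cook'), (2, 'Driver');
INSERT INTO job_person VALUES (1, 1, 'active'), (2, 1, 'past_job'), (1, 2, 'active');
""")

def people_with_job(conn, job_id, status=None):
    """Names of people linked to a job, optionally restricted by pivot status."""
    sql = ("SELECT p.name FROM people p "
           "JOIN job_person jp ON jp.person_id = p.id "
           "WHERE jp.job_id = ?")
    params = [job_id]
    if status is not None:
        sql += " AND jp.status = ?"
        params.append(status)
    sql += " ORDER BY p.name"
    return [row[0] for row in conn.execute(sql, params)]

print(people_with_job(conn, 1))              # prints ['Ann', 'Bob']
print(people_with_job(conn, 2, "past_job"))  # prints ['Ann']
```

With a JSON array in people.jobs, the same "who has job X" lookup would need to scan and parse every person's array, and there would be no natural place to hang the status column.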

Implementing Comments and Likes in database

I'm a software developer. I love to code, but I hate databases... Currently, I'm creating a website on which a user will be allowed to mark an entity as liked (like in FB), tag it and comment.
I got stuck on the database table design for handling this functionality. The solution is trivial if we only need this for one type of thing (e.g. photos). But I need to enable it for 5 different things (for now; I also assume that this number can grow as the whole service grows).
I found some similar questions here, but none of them have a satisfying answer, so I'm asking this question again.
The question is how to properly, efficiently and flexibly design the database so that it can store comments, likes and tags for different tables. A design pattern as an answer would be best ;)
Detailed description:
I have a table User with some user data, and 3 more tables: Photo with photographs, Articles with articles, Places with places. I want to enable any logged-in user to:
comment on any of those 3 tables
mark any of them as liked
tag any of them with some tag
I also want to count the number of likes for every element and the number of times that particular tag was used.
1st approach:
a) For tags, I will create a table Tag [TagId, tagName, tagCounter], then I will create many-to-many relationships tables for: Photo_has_tags, Place_has_tag, Article_has_tag.
b) The same goes for comments.
c) I will create the tables LikedPhotos [idUser, idPhoto], LikedArticles [idUser, idArticle], LikedPlace [idUser, idPlace]. The number of likes will be calculated by queries (which, I assume, is bad). And...
I really don't like this design for the last part; it smells bad to me ;)
2nd approach:
I will create a table ElementType [idType, TypeName == some table name] which will be populated by the administrator (me) with the names of tables that can be liked, commented or tagged. Then I will create tables:
a) LikedElement [idLike, idUser, idElementType, idLikedElement] and the same for Comments and Tags with the proper columns for each. Now, when I want to make a photo liked I will insert:
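SET @typeId = (SELECT idType FROM ElementType WHERE TypeName = 'Photo');
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (@userId, @typeId, @photoId);
and for places:
SET @typeId = (SELECT idType FROM ElementType WHERE TypeName = 'Place');
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (@userId, @typeId, @placeId);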
and so on... I think that the second approach is better, but I also feel like something is missing in this design as well...
Lastly, I also wonder where the best place is to store the counter of how many times an element was liked. I can think of only two ways:
in element (Photo/Article/Place) table
by select count().
I hope that my explanation of the issue is more thorough now.
The most extensible solution is to have just one "base" table (connected to "likes", tags and comments), and "inherit" all other tables from it. Adding a new kind of entity involves just adding a new "inherited" table - it then automatically plugs into the whole like/tag/comment machinery.
The entity-relationship term for this is "category" (see the ERwin Methods Guide, section: "Subtype Relationships"). The category symbol is:
Assuming a user can like multiple entities, the same tag can be used for more than one entity, but a comment is entity-specific, your model could look like this:
BTW, there are roughly 3 ways to implement the "ER category":
All types in one table.
All concrete types in separate tables.
All concrete and abstract types in separate tables.
Unless you have very stringent performance requirements, the third approach is probably the best (meaning the physical tables match 1:1 the entities in the diagram above).
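As a rough illustration of the third approach (every concrete and abstract type in its own table), here is a sketch using Python's built-in sqlite3 in place of the real DBMS. The table names entity, photo, article, app_user and entity_like, and the helper functions, are hypothetical; only likes are shown, and tags and comments would hang off entity in the same way.

```python
import sqlite3

# SQLite stands in for the real DBMS; all names below are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE app_user (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE entity (entity_id INTEGER PRIMARY KEY);      -- the "base" table
CREATE TABLE photo (                                       -- "inherited" table
    entity_id INTEGER PRIMARY KEY REFERENCES entity(entity_id),
    file_name TEXT NOT NULL
);
CREATE TABLE article (                                     -- "inherited" table
    entity_id INTEGER PRIMARY KEY REFERENCES entity(entity_id),
    title TEXT NOT NULL
);
CREATE TABLE entity_like (                                 -- likes attach to the base
    user_id INTEGER NOT NULL REFERENCES app_user(user_id),
    entity_id INTEGER NOT NULL REFERENCES entity(entity_id),
    PRIMARY KEY (user_id, entity_id)
);
INSERT INTO app_user VALUES (1, 'ann'), (2, 'bob');
""")

def new_entity(conn):
    """Create a row in the base table and return its id."""
    return conn.execute("INSERT INTO entity DEFAULT VALUES").lastrowid

def add_photo(conn, file_name):
    entity_id = new_entity(conn)
    conn.execute("INSERT INTO photo VALUES (?, ?)", (entity_id, file_name))
    return entity_id

def add_article(conn, title):
    entity_id = new_entity(conn)
    conn.execute("INSERT INTO article VALUES (?, ?)", (entity_id, title))
    return entity_id

def like(conn, user_id, entity_id):
    # INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE);
    # the primary key makes a repeated like a no-op.
    conn.execute("INSERT OR IGNORE INTO entity_like VALUES (?, ?)",
                 (user_id, entity_id))

def like_count(conn, entity_id):
    return conn.execute("SELECT COUNT(*) FROM entity_like WHERE entity_id = ?",
                        (entity_id,)).fetchone()[0]

photo_id = add_photo(conn, "beach.jpg")
article_id = add_article(conn, "Hello")
like(conn, 1, photo_id)
like(conn, 2, photo_id)
like(conn, 1, photo_id)      # duplicate, ignored
like(conn, 2, article_id)
print(like_count(conn, photo_id), like_count(conn, article_id))  # prints 2 1
```

Adding a new kind of entity, say place, would then only need one more table keyed by entity_id; the like table and its queries stay untouched.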
Since you "hate" databases, why are you trying to implement one? Instead, solicit help from someone who loves and breathes this stuff.
Otherwise, learn to love your database. A well designed database simplifies programming, engineering the site, and smooths its continuing operation. Even an experienced d/b designer will not have complete and perfect foresight: some schema changes down the road will be needed as usage patterns emerge or requirements change.
If this is a one man project, program the database interface into simple operations using stored procedures: add_user, update_user, add_comment, add_like, upload_photo, list_comments, etc. Do not embed the schema into even one line of code. In this manner, the database schema can be changed without affecting any code: only the stored procedures should know about the schema.
You may have to refactor the schema several times. This is normal. Don't worry about getting it perfect the first time. Just make it functional enough to prototype an initial design. If you have the luxury of time, use it some, and then delete the schema and do it again. It is always better the second time.
This is the general idea.
Please don't pay much attention to the styling of the field names; focus on the relations and structure.
This pseudocode will get all the comments on the photo with ID 5:
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff = 'photo'
AND actions.typeAction = 'comment';
This pseudocode will get all the likes, or the users who liked, the photo with ID 5
(you may use COUNT() to get just the number of likes):
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff = 'photo'
AND actions.typeAction = 'like';
As far as I understand, several tables are required, with a many-to-many relation between them.
A table that stores user data such as name, surname and birth date, with an identity field.
A table that stores data types. These types may be photos, shares, links. Each type must have its own table, so there is a relation between the individual tables and this table.
Each data type has its own table, for example status updates, photos, links.
The last table handles the many-to-many relation, storing an id, user id, data type and data id.
Look at the access patterns you are going to need. Do any of them seem to be made particularly difficult or inefficient by one design choice or the other?
If not, favour the one that requires fewer tables.
In this case:
Add Comment: you either pick a particular many/many table or insert into a common table with a known specific identifier for what is being commented on. I think client code will be slightly simpler in your second case.
Find comments for item: here it seems using a common table is slightly easier, as we just have a single query parameterised by type of entity.
Find comments by a person about one kind of thing: a simple query in either case.
Find all comments by a person about all things: this seems a little gnarly either way.
I think your "discriminated" approach, option 2, yields simpler queries in some cases and doesn't seem much worse in the others so I'd go with it.
Consider using a table per entity for comments and the rest. More tables means better sharding and scaling. Managing many similar tables is not a problem for any framework I know.
One day you'll need to optimize reads from such a structure. You can easily create aggregating tables over the base ones and lose a bit on writes.
One big table with a dictionary may become unmanageable one day.
Definitely go with the second approach, where you have one table and store the element type for each row; it will give you a lot more flexibility. Basically, when something can logically be done with fewer tables, it is almost always better to go with fewer tables. Two advantages come to mind for your particular case. If you want to delete all liked elements of a certain user, with your first approach you need to issue one query for each element type, but with the second approach it can be done with only one query. And if you want to add a new element type, the first approach involves creating a new table for each new type, but with the second approach you don't need to create anything...
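The two advantages claimed for the second approach can be sketched as follows, again with Python's built-in sqlite3 standing in for the real database. ElementType and LikedElement follow the column names given in the question; the UNIQUE constraint and the helper functions add_like, count_likes and delete_user_likes are assumptions added for illustration.

```python
import sqlite3

# SQLite stands in for the real database; helper names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ElementType (
    idType INTEGER PRIMARY KEY,
    TypeName TEXT NOT NULL UNIQUE
);
CREATE TABLE LikedElement (
    idLike INTEGER PRIMARY KEY,
    idUser INTEGER NOT NULL,
    idElementType INTEGER NOT NULL REFERENCES ElementType(idType),
    idLikedElement INTEGER NOT NULL,
    UNIQUE (idUser, idElementType, idLikedElement)  -- one like per user per element
);
INSERT INTO ElementType (TypeName) VALUES ('Photo'), ('Article'), ('Place');
""")

def add_like(conn, user_id, type_name, element_id):
    """Look up the type id and insert the like in one statement."""
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.execute(
        "INSERT OR IGNORE INTO LikedElement (idUser, idElementType, idLikedElement) "
        "SELECT ?, idType, ? FROM ElementType WHERE TypeName = ?",
        (user_id, element_id, type_name),
    )

def count_likes(conn, type_name, element_id):
    """Count likes with a query instead of a stored counter."""
    return conn.execute(
        "SELECT COUNT(*) FROM LikedElement le "
        "JOIN ElementType et ON et.idType = le.idElementType "
        "WHERE et.TypeName = ? AND le.idLikedElement = ?",
        (type_name, element_id),
    ).fetchone()[0]

def delete_user_likes(conn, user_id):
    """Remove every like of a user, across all element types, in one query."""
    return conn.execute("DELETE FROM LikedElement WHERE idUser = ?",
                        (user_id,)).rowcount

add_like(conn, 7, "Photo", 5)
add_like(conn, 7, "Photo", 5)   # duplicate, ignored
add_like(conn, 8, "Photo", 5)
add_like(conn, 7, "Place", 5)
print(count_likes(conn, "Photo", 5))  # prints 2
```

Supporting a new element type here is a single INSERT into ElementType. The trade-off is that idLikedElement cannot be a real foreign key, since it may point into any of several tables.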

Getting the most efficient query based on multiple tables/primary & foreign keys

I have a site where some pages (we call them gateway pages) are based loosely on certain departments in the organization. Each department has classes associated with it. Unfortunately some of my pages are not associated with a specific department, but do display information about several classes from a department so I can't just query the database strictly on department alone.
Would it be smarter to create a table called gateway_classes with a fk from the gateway table in each class or form a query to somehow filter out exactly what I need from my existing tables using an array of classes to be pulled during the query?
Here are my tables:
departments_classes | classes_vendors | departments | vendors | classes | products | gateway
Any guidance is greatly appreciated.
More Info: There are roughly 350 classes and 18 departments and 12 gateway pages...
Your indexing table idea sounds like it'd work just fine. The only downside to that is that you've got to maintain it separately, and you want to make sure that the data you hold in that table isn't being duplicated in any of your existing tables.
If you don't want to maintain that data differently than you're currently doing so, you can use CF's arrays (or structs) to hold that correlation data (which you'd have to pull from the db in a separate query) and then loop over it as you construct the query that pulls the classes for a given page.
Either way would work okay, it's more a matter of how you prefer to do it, and what you think would be easiest to build, test, and maintain.
One thing about efficiency - make sure you not only link your tables via Foreign Keys (which helps to maintain data integrity), but also put in (nonclustered) indices, which helps the efficiency of the joins and lookups your queries will be doing.
I've seen dramatic speed improvements in my queries (CFQUERYs operating against MS SQL) with the simple act of putting in indices.
In MS SQL, you do so like this:
CREATE NONCLUSTERED INDEX yourIndexName ON yourTableName(yourFieldName)
I hope this helps!
Your problem sounds similar to a common scenario for determining user rights. A User may belong to some Group that has Rights associated with it or the User may be assigned Rights individually. In your case, the User is the Gateway, the Group is the Department, and the Rights are the Classes. A Gateway can then be linked to any number of Departments and/or Classes.
Using this model, you just need to add the gateway_classes table as you describe along with a gateway_departments table.
You could then use UNION to merge the "gateway classes" query with the "gateway departments" query (or perhaps something more elegant), but I think this schema will do what you need without introducing any redundant information.
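A small sketch of that UNION, using Python's built-in sqlite3 as a stand-in for MS SQL. The column names, the sample ids and the gateway_class_ids helper are assumptions; only departments_classes comes from the question's table list, and gateway_classes and gateway_departments are the two link tables proposed above.

```python
import sqlite3

# SQLite stands in for MS SQL; column names and sample ids are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments_classes (
    department_id INTEGER NOT NULL, class_id INTEGER NOT NULL,
    PRIMARY KEY (department_id, class_id)
);
CREATE TABLE gateway_classes (
    gateway_id INTEGER NOT NULL, class_id INTEGER NOT NULL,
    PRIMARY KEY (gateway_id, class_id)
);
CREATE TABLE gateway_departments (
    gateway_id INTEGER NOT NULL, department_id INTEGER NOT NULL,
    PRIMARY KEY (gateway_id, department_id)
);
INSERT INTO departments_classes VALUES (1, 10), (1, 11), (2, 20);
INSERT INTO gateway_departments VALUES (1, 1);           -- gateway 1 shows all of dept 1
INSERT INTO gateway_classes VALUES (1, 10), (1, 20), (2, 11);  -- plus individual classes
""")

def gateway_class_ids(conn, gateway_id):
    """Classes linked directly to a gateway, plus classes of its linked departments."""
    sql = """
    SELECT class_id FROM gateway_classes WHERE gateway_id = ?
    UNION
    SELECT dc.class_id
    FROM gateway_departments gd
    JOIN departments_classes dc ON dc.department_id = gd.department_id
    WHERE gd.gateway_id = ?
    ORDER BY 1
    """
    return [row[0] for row in conn.execute(sql, (gateway_id, gateway_id))]

print(gateway_class_ids(conn, 1))  # prints [10, 11, 20]
print(gateway_class_ids(conn, 2))  # prints [11]
```

UNION (rather than UNION ALL) removes the duplicate when a class is linked both directly and through its department, as class 10 is for gateway 1 here.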

Guidelines for join/link/many to many tables

I have my own theories on the best way to do this, but I think it's a common topic and I'd be interested in the different methods people use. Here goes:
What's the best way to deal with many-to-many join tables, particularly as far as naming them goes, what to do when you need to add extra information to the relationship, and what to do when there are multiple relationships between two tables?
Let's say you have two tables, Users and Events, and need to store the attendees. So you create an EventAttendees table. Then a requirement comes up to store the organisers. Should you
create an EventOrganisers table, so each new relationship is modelled with a join table
or
rename EventAttendees to UserEventRelationship (or some other name, like User2Event or UserEventMap or UserToEvent), and add an IsAttending column and an IsOrganiser column, i.e. you have a single table in which you store all relationship info between the two tables
or
a bit of both (really?)
or
something else entirely?
Thoughts?
The easy answer to a generic question like this is, as always, "It all depends on the details".
But in general, I try to create fewer tables when this can be done without abusing the data definitions unduly. So in your example, I would probably add an isOrganizer column to the table, or maybe an attendeeType to allow for easy future expansion from audience/organizer to audience/organizer/speaker/caterer or whatever may be needed. Creating an extra table with essentially identical columns, where the table name is in effect a flag identifying the "attendee type", seems to me the wrong way to go both from a pristine design perspective and also from a practical point of view.
A single table is more flexible. With one table and a type field, if we want to know just the organizers -- like when we're sending invitations to a planning meeting -- fine, we write "select userid from userevent where eventid=? and attendeetype='O'". If we want to know everyone who will be there -- like when we're printing name cards for the lunch tables -- we just don't include the attendeetype test.
But suppose we have two tables. Then if we want just the organizers, okay, that's easy, join on the organizer table. But if we want both organizers and audience, then we have to do a union, which makes for more complicated queries and is usually slow. And if you're thinking, What's the big deal doing a union?, note that there may be more to the query. Perhaps a person can have multiple phone numbers and we care about this, so the query is not just joining user and eventAttendee but also phone. Maybe we want to know if they've attended previous conferences because we give special deals to "alumni", so we have to join in eventAttendee a second time, etc etc. A ten-table join with a union can get very messy and confusing to read.
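The single-table variant can be sketched like this, with Python's built-in sqlite3 as a stand-in database. The userevent table and the 'O' code come from the query quoted above; the 'A' code for audience, the primary key, the sample rows and the attendees helper are assumptions.

```python
import sqlite3

# SQLite stands in for the real database; 'A' = audience is an assumed code.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE userevent (
    userid INTEGER NOT NULL,
    eventid INTEGER NOT NULL,
    attendeetype TEXT NOT NULL CHECK (attendeetype IN ('A', 'O')),  -- audience / organizer
    PRIMARY KEY (userid, eventid)
);
INSERT INTO userevent VALUES (1, 100, 'O'), (2, 100, 'A'), (3, 100, 'A'), (2, 200, 'O');
""")

def attendees(conn, eventid, attendeetype=None):
    """User ids for an event; pass a type to restrict, omit it to get everyone."""
    sql = "SELECT userid FROM userevent WHERE eventid = ?"
    params = [eventid]
    if attendeetype is not None:
        sql += " AND attendeetype = ?"
        params.append(attendeetype)
    sql += " ORDER BY userid"
    return [row[0] for row in conn.execute(sql, params)]

print(attendees(conn, 100, "O"))  # prints [1]
print(attendees(conn, 100))       # prints [1, 2, 3]
```

Getting everyone is the same query minus one condition, with no UNION involved. Adding speakers or caterers later would only mean allowing another attendeetype value.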

Save a list of user ids to a mysql table

I need to save a list of user ids who viewed a page, streamed a song and / or downloaded it. What I do with the list is add to it and show it. I don't really need to save more info than that, and I came up with two solutions. Which one is better, or is there an even better solution I missed:
The KISS solution - 1 table with the primary key the song id and a text field for each of the three interactions above (view, download, stream) in which there will be a comma separated list of user ids. Adding to it will be just a concatenation operation.
The "best practice" solution - Have 3 tables with the primary key the song id and a field of user id that did the interaction. Each row has one user id and I could add stuff like date and other stuff.
One thing that makes me lean towards option 2 is that it may be easier to check whether the user has already voted on a song.
tl;dr version - Is it better to use a text field to save arrays as comma-separated values, or to have each item in the array in a separate table row?
Definitely the 2nd:
You'll be able to scale your application as it grows
It will be less programming language dependent
You'll be able to make queries faster and cleaner
It will be less painful for any other programmer coding / debugging your application later
Additionally, I'd add a new table called "operations" with their ID, so you can add different operations if you need later, storing the operation ID instead of a string on each row ("view", "download", "stream").
It's definitely better to have each item in a separate row. Manipulating text fields has performance disadvantages by itself. But if ever you want to find out which songs user 1234 has viewed/listened to/etc., you'd have to do something like
SELECT * FROM songactions WHERE userlist LIKE '%,1234,%' OR userlist LIKE '1234,%' OR userlist LIKE '%,1234' OR userlist='1234';
It'd be just horribly, horribly painful.
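For contrast, here is a sketch of the one-row-per-interaction design, using Python's built-in sqlite3 as a stand-in for MySQL. It uses a single song_actions table with an action column rather than three separate tables, which is an assumption on my part, as are the column names and the helper functions.

```python
import sqlite3

# SQLite stands in for MySQL; song_actions and its columns are assumed names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE song_actions (
    song_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    action TEXT NOT NULL CHECK (action IN ('view', 'stream', 'download')),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (song_id, user_id, action)
);
""")

def has_action(conn, song_id, user_id, action):
    """True if this user already performed this action on this song."""
    row = conn.execute(
        "SELECT 1 FROM song_actions WHERE song_id = ? AND user_id = ? AND action = ?",
        (song_id, user_id, action),
    ).fetchone()
    return row is not None

def record_action(conn, song_id, user_id, action):
    """Store the interaction once; return True if it was newly recorded."""
    if has_action(conn, song_id, user_id, action):
        return False
    conn.execute(
        "INSERT INTO song_actions (song_id, user_id, action) VALUES (?, ?, ?)",
        (song_id, user_id, action),
    )
    return True

def songs_for_user(conn, user_id, action):
    """Which songs has this user viewed/streamed/downloaded?"""
    return [row[0] for row in conn.execute(
        "SELECT song_id FROM song_actions WHERE user_id = ? AND action = ? "
        "ORDER BY song_id",
        (user_id, action),
    )]

record_action(conn, 1, 1234, "view")
record_action(conn, 2, 1234, "view")
record_action(conn, 2, 1234, "download")
print(songs_for_user(conn, 1234, "view"))  # prints [1, 2]
```

The "which songs has user 1234 viewed" lookup becomes a plain equality filter that an index can serve, instead of the four LIKE patterns above.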