I want to store folders and documents, where every document belongs to a folder. Folders can have an unlimited number of child folders. What is the best MySQL schema for this, in your opinion? Do you think this is good?
Table Folders
id
name
parent (if null the root)
auth_user (access control type)
created_date
created_by
Table documents
id
name
type
idFolder (FK id of folders)
auth_user (access control type)
created_date
created_by
Do you think the above is good, or is it going to cause problems later? Can I get the folder tree quickly and easily with it? (I think ORDER BY parent ASC would return the tree in the right order.)
Adjacency lists are nice for inserts and for moving subtrees, but querying deeper than one level is a pain in the a** because you end up with n joins to go n levels deep. An example: show me all descendants or ancestors of folder X.
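For what it's worth, on servers that support recursive common table expressions (MySQL 8.0 and later do) the "all descendants of folder X" case can be handled on a plain adjacency list without writing n joins by hand. Below is a minimal sketch, not a definitive implementation. It assumes the `folders` table from the question reduced to `id`, `name` and `parent`, and it uses SQLite through Python's `sqlite3` module only so that it runs without a server. The sample rows and the `descendants` helper are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the WITH RECURSIVE query is the point.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE folders (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    parent INTEGER REFERENCES folders(id)  -- NULL marks a root folder
);
INSERT INTO folders (id, name, parent) VALUES
    (1, 'root',    NULL),
    (2, 'docs',    1),
    (3, 'img',     1),
    (4, 'reports', 2),
    (5, '2023',    4);
""")


def descendants(folder_id):
    """Return (id, name, depth) for every folder below folder_id."""
    return conn.execute(
        """
        WITH RECURSIVE subtree (id, name, depth) AS (
            -- anchor: direct children of the starting folder
            SELECT id, name, 1 FROM folders WHERE parent = ?
            UNION ALL
            -- recursive step: children of rows already found
            SELECT f.id, f.name, s.depth + 1
            FROM folders AS f
            JOIN subtree AS s ON f.parent = s.id
        )
        SELECT id, name, depth FROM subtree ORDER BY depth, id
        """,
        (folder_id,),
    ).fetchall()
```

Note that a plain `ORDER BY parent ASC` only groups siblings together; it does not by itself return rows in tree order, which is why some form of recursion or an extra model is needed.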
I suggest to use the adjacency list (the parent_id) in combination with one of the following models:
Nested Sets
Materialized Paths
I really like the nested set model, but it has a drawback: inserts are slow, because other rows have to be renumbered. Usually, though, you will have more reads (browsing the structure) than inserts of new nodes.
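To make the trade-off concrete, here is a rough sketch of nested sets, assuming two extra integer columns that I call `lft` and `rgt` (the names are a common convention, not something from the question). Reads are a single range comparison, while an insert has to shift the bounds of every row to the right of the insertion point. SQLite via Python's `sqlite3` is used only so the example is runnable; the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE folders (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    lft  INTEGER NOT NULL,
    rgt  INTEGER NOT NULL
);
-- root(1,10) contains docs(2,7) and img(8,9);
-- docs contains reports(3,6), which contains 2023(4,5)
INSERT INTO folders (id, name, lft, rgt) VALUES
    (1, 'root',    1, 10),
    (2, 'docs',    2, 7),
    (3, 'reports', 3, 6),
    (4, '2023',    4, 5),
    (5, 'img',     8, 9);
""")


def descendants(folder_id):
    """All folders whose interval lies strictly inside the given folder's."""
    rows = conn.execute(
        """
        SELECT child.name
        FROM folders AS node
        JOIN folders AS child
          ON child.lft > node.lft AND child.rgt < node.rgt
        WHERE node.id = ?
        ORDER BY child.lft
        """,
        (folder_id,),
    ).fetchall()
    return [r[0] for r in rows]


def ancestors(folder_id):
    """All folders whose interval encloses the given folder's, root first."""
    rows = conn.execute(
        """
        SELECT parent.name
        FROM folders AS node
        JOIN folders AS parent
          ON parent.lft < node.lft AND parent.rgt > node.rgt
        WHERE node.id = ?
        ORDER BY parent.lft
        """,
        (folder_id,),
    ).fetchall()
    return [r[0] for r in rows]


def insert_child(parent_id, name):
    """Append a new last child; this is the slow part, it renumbers rows."""
    with conn:  # one transaction: the shifts and the insert go together
        (parent_rgt,) = conn.execute(
            "SELECT rgt FROM folders WHERE id = ?", (parent_id,)
        ).fetchone()
        conn.execute("UPDATE folders SET rgt = rgt + 2 WHERE rgt >= ?", (parent_rgt,))
        conn.execute("UPDATE folders SET lft = lft + 2 WHERE lft > ?", (parent_rgt,))
        conn.execute(
            "INSERT INTO folders (name, lft, rgt) VALUES (?, ?, ?)",
            (name, parent_rgt, parent_rgt + 1),
        )
```

Calling `insert_child(2, 'drafts')` touches `docs`, `img` and `root` as well as the new row, which is the cost the model pays for its cheap reads.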
Another thing:
I usually put folders and documents in the same table and flag them with a boolean is_folder column. I like to think of folders/files as "nodes" in a tree so they're basically the same. Further metadata will be stored in another table.
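As an illustration of the single-table idea combined with a materialized path, here is a small sketch. The table name `nodes`, the `path` column and its `/1/2/3/` format are my assumptions, not a fixed convention; SQLite via Python's `sqlite3` is used only to keep the example runnable, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES nodes(id),  -- adjacency list, NULL = root
    name      TEXT NOT NULL,
    is_folder INTEGER NOT NULL,              -- 1 = folder, 0 = document
    path      TEXT NOT NULL                  -- ids from the root down to this node
);
INSERT INTO nodes (id, parent_id, name, is_folder, path) VALUES
    (1, NULL, 'root',       1, '/1/'),
    (2, 1,    'docs',       1, '/1/2/'),
    (3, 2,    'reports',    1, '/1/2/3/'),
    (4, 3,    'q1.pdf',     0, '/1/2/3/4/'),
    (5, 1,    'readme.txt', 0, '/1/5/'),
    (6, 2,    'notes.txt',  0, '/1/2/6/');
""")


def documents_under(folder_id):
    """Names of all documents anywhere below the given folder."""
    (path,) = conn.execute(
        "SELECT path FROM nodes WHERE id = ?", (folder_id,)
    ).fetchone()
    rows = conn.execute(
        # a prefix match on the path finds the whole subtree in one query
        "SELECT name FROM nodes WHERE path LIKE ? AND is_folder = 0 ORDER BY path",
        (path + "%",),
    ).fetchall()
    return [r[0] for r in rows]
```

The price is that moving a folder means rewriting the `path` of everything beneath it, so `parent_id` is kept alongside as the source of truth.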
I am trying to create a database that can be used like Twitter works. That is:
Tree structure: any node can have multiple child nodes.
All nodes have a timestamp
Criteria 1 and 2 suggest a table structure built on basic columns, something like:
NodeID (int)
ParentNodeID (int)
UserID (int)
TS (TimeStamp)
MSG (varchar)
When viewing any node (n) all parent nodes until and including root should be selected, that is easy using the ParentNodeID pointer.
Here comes the caveat: in addition to the parent nodes, all child nodes of the current node (n) should also be selected from the table, in chronological order (based on TS). That means every node in the subtree rooted at (n), no matter which child branch it sits in.
How do I best (better) structure the table for such queries?
You should take a look at how Twitter has evolved, and check whether your use case is similar enough.
A good start could be this article with database schema examples: https://web.archive.org/web/20161224194257/http://www.cubrid.org/blog/dev-platform/decomposing-twitter-database-perspective/
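Independently of how Twitter does it (this is not taken from the linked article), the two lookups described in the question can be sketched on the proposed columns with recursive queries, assuming a server that supports `WITH RECURSIVE` (MySQL 8.0 and later do). The table name `nodes`, the sample rows and both helper functions are hypothetical; SQLite via Python's `sqlite3` is used only so the sketch runs as is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
    NodeID       INTEGER PRIMARY KEY,
    ParentNodeID INTEGER REFERENCES nodes(NodeID),
    UserID       INTEGER NOT NULL,
    TS           TEXT NOT NULL,
    MSG          TEXT NOT NULL
);
INSERT INTO nodes (NodeID, ParentNodeID, UserID, TS, MSG) VALUES
    (1, NULL, 10, '2024-01-01 09:00', 'root post'),
    (2, 1,    11, '2024-01-01 09:05', 'reply A'),
    (3, 1,    12, '2024-01-01 09:07', 'reply B'),
    (4, 2,    10, '2024-01-01 09:20', 'reply to A'),
    (5, 3,    11, '2024-01-01 09:10', 'reply to B'),
    (6, 4,    12, '2024-01-01 09:30', 'deep reply');
""")


def ancestors(node_id):
    """NodeIDs from the direct parent up to and including the root."""
    rows = conn.execute(
        """
        WITH RECURSIVE up (NodeID, ParentNodeID, depth) AS (
            SELECT NodeID, ParentNodeID, 0 FROM nodes WHERE NodeID = ?
            UNION ALL
            SELECT n.NodeID, n.ParentNodeID, up.depth + 1
            FROM nodes AS n
            JOIN up ON n.NodeID = up.ParentNodeID
        )
        SELECT NodeID FROM up WHERE depth > 0 ORDER BY depth
        """,
        (node_id,),
    ).fetchall()
    return [r[0] for r in rows]


def subtree_chronological(node_id):
    """NodeIDs of every node below node_id, ordered by TS across branches."""
    rows = conn.execute(
        """
        WITH RECURSIVE down (NodeID) AS (
            SELECT NodeID FROM nodes WHERE ParentNodeID = ?
            UNION ALL
            SELECT n.NodeID
            FROM nodes AS n
            JOIN down ON n.ParentNodeID = down.NodeID
        )
        SELECT n.NodeID
        FROM nodes AS n
        JOIN down ON down.NodeID = n.NodeID
        ORDER BY n.TS
        """,
        (node_id,),
    ).fetchall()
    return [r[0] for r in rows]
```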
Imagine that I have these entities: Folder, Group, Document.
Now, each Folder has many Documents, and each of these Documents may or may not belong to a Group inside that Folder.
So, a Folder would contain both grouped and ungrouped Documents.
How would I structure such relations?
I was thinking of two ways:
Document belongs to Group and Group belongs to Folder. This requires me to always have an "ungrouped" Group for every Folder (really ugly).
Document belongs to both Folder and Group, Group belongs to Folder. This way I have the reference to the Folder and the Group, where the group_id can be null to represent "ungrouped". But this opens a door for error, by assigning a Document to a Group/Folder where the Group doesn't belong to that Folder.
Neither solution seems right. This probably needs some kind of composite key and the Document would reference that.
What would be the proper way of doing this?
I would expect your second approach to give you the least headache in a real-world application. Although it might depend on specifics of the actual application.
One other alternative is to have groups not belong to folders (since it wasn't an explicit requirement), so that the same group_id might potentially span across various folders. Then you would interpret a pair (folder_id, group_id) as an actual group identifier in the Document table.
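One way to close the "door for error" in the second approach is the composite key the question hints at: let the Document reference the pair (folder_id, group_id) rather than group_id alone. The sketch below is only an illustration under assumptions. The table name `doc_groups` is hypothetical (chosen to stay clear of SQL keywords), and SQLite via Python's `sqlite3` is used so it runs as is. It relies on the usual rule that a composite foreign key is not checked when one of its columns is NULL, which is also how InnoDB behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite only
conn.executescript("""
CREATE TABLE folders (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE doc_groups (
    id        INTEGER PRIMARY KEY,
    folder_id INTEGER NOT NULL REFERENCES folders(id),
    name      TEXT NOT NULL,
    UNIQUE (folder_id, id)         -- target of the composite foreign key
);
CREATE TABLE documents (
    id        INTEGER PRIMARY KEY,
    folder_id INTEGER NOT NULL REFERENCES folders(id),
    group_id  INTEGER,             -- NULL means "ungrouped"
    name      TEXT NOT NULL,
    -- the group, if any, must belong to the same folder as the document
    FOREIGN KEY (folder_id, group_id) REFERENCES doc_groups (folder_id, id)
);
INSERT INTO folders (id, name) VALUES (1, 'inbox'), (2, 'archive');
INSERT INTO doc_groups (id, folder_id, name) VALUES
    (1, 1, 'invoices'),
    (2, 2, 'old invoices');
""")


def add_document(folder_id, group_id, name):
    """Insert a document; return False if the database rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO documents (folder_id, group_id, name) VALUES (?, ?, ?)",
                (folder_id, group_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, `add_document(1, None, ...)` and `add_document(1, 1, ...)` are accepted, while `add_document(1, 2, ...)` is refused because group 2 lives in folder 2.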
I have a type of data called a chain. Each chain is made up of a specific sequence of another type of data called a step. So a chain is ultimately made up of multiple steps in a specific order. I'm trying to figure out the best way to set this up in MySQL that will allow me to do the following:
Look up all steps in a chain, and get them in the right order
Look up all chains that contain a step
I'm currently considering the following table set up as the appropriate solution:
TABLE chains
id date_created
TABLE steps
id description
TABLE chains_steps (this would be used for joins)
chain_id step_id step_position
In the table chains_steps, the step_position column would be used to order the steps in a chain correctly. It seems unusual for a JOIN table to contain its own distinct piece of data, such as step_position in this case. But maybe it's not unusual at all and I'm just inexperienced/paranoid.
I don't have much experience in all this, so I wanted to get some feedback. Are the three tables I suggested the correct way to do this? Are there any viable alternatives and, if so, what are the advantages and drawbacks?
You're doing it right.
Consider a database containing the Employees and Projects tables, and how you'd want to link them in a many-to-many fashion. You'd probably come up with an Assignments table (or Project_Employees in some naming conventions).
At some point you'd decide you want not only to store each project assignment, but you'd also want to store when the assignment started, and when it finished. The natural place to put that is in the assignment itself; it doesn't make sense to store it either with the project or with the employee.
In further designs you might even find it necessary to store further information about the assignment, for example in an employee review process you may wish to store feedback related to their performance in that project, so you'd make the assignment the "one" end of a relationship with a Review table, which would relate back to Assignments with a FK on assignment_id.
So in short, it's perfectly normal to have a junction table that has its own data.
That looks fine, and it's not unusual for the join table to contain a position/rank field.
Look up all steps in a chain, and get them in the right order
SELECT * FROM chains_steps
LEFT JOIN steps ON steps.id = chains_steps.step_id
WHERE chains_steps.chain_id = ?
ORDER BY chains_steps.step_position ASC
Look up all chains that contain a step
SELECT DISTINCT chain_id FROM chains_steps
LEFT JOIN chains ON chains.id = chains_steps.chain_id
WHERE chains_steps.step_id = ?
I think that the plan you've outlined is the correct approach. Don't worry too much about the presence of step_position on your mapping table. After all the step_position is a bit of data that is directly related to a step in the context of a chain. So the chains_steps table is the right place for it IMHO.
Some things to think about:
Foreign keys - use 'em!
Unique key on the chains_steps table - can a step be present in more than one position in a single chain? What about in different chains?
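To make those two points concrete, here is one possible version of the three tables with foreign keys and a key on the mapping table, plus both lookups from the question. It is a sketch under one assumption: a step may appear more than once in a chain, so the key is (chain_id, step_position) rather than (chain_id, step_id). If repeats are not allowed, add a unique key on (chain_id, step_id) as well. SQLite via Python's `sqlite3` is used only to keep it runnable, and the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite only
conn.executescript("""
CREATE TABLE chains (id INTEGER PRIMARY KEY, date_created TEXT NOT NULL);
CREATE TABLE steps  (id INTEGER PRIMARY KEY, description  TEXT NOT NULL);
CREATE TABLE chains_steps (
    chain_id      INTEGER NOT NULL REFERENCES chains(id),
    step_id       INTEGER NOT NULL REFERENCES steps(id),
    step_position INTEGER NOT NULL,
    PRIMARY KEY (chain_id, step_position)  -- one step per position in a chain
);
INSERT INTO chains (id, date_created) VALUES (1, '2024-01-01'), (2, '2024-01-02');
INSERT INTO steps (id, description) VALUES (1, 'wash'), (2, 'rinse'), (3, 'dry');
INSERT INTO chains_steps (chain_id, step_id, step_position) VALUES
    (1, 1, 1), (1, 2, 2), (1, 1, 3), (1, 3, 4),  -- chain 1 repeats 'wash'
    (2, 2, 1), (2, 3, 2);
""")


def steps_in_chain(chain_id):
    """Step descriptions of one chain, in position order."""
    rows = conn.execute(
        """
        SELECT s.description
        FROM chains_steps AS cs
        JOIN steps AS s ON s.id = cs.step_id
        WHERE cs.chain_id = ?
        ORDER BY cs.step_position
        """,
        (chain_id,),
    ).fetchall()
    return [r[0] for r in rows]


def chains_with_step(step_id):
    """Ids of every chain that contains the given step at least once."""
    rows = conn.execute(
        "SELECT DISTINCT chain_id FROM chains_steps WHERE step_id = ? ORDER BY chain_id",
        (step_id,),
    ).fetchall()
    return [r[0] for r in rows]
```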
Good luck!
----- PHP and mySQL -----
I have two quick questions and need some advice.
On my site I will allow users to upload the following kinds of files: PDFs, videos and photos. All files uploaded by a user are shown on their profile page. All uploaded files can be searched by name, tags and file type.
What would be the best mysql database design?
Store all files in one table: easier to display on the user’s profile page and to search by type, etc.
One table per type, e.g. pdf, videos and photos: this might be better for performance, but I don’t know how searching would work.
Second question is, I allow users to create their own menus/categories with parent and children categories for example:
->parent category
    > child category
    > child category
->parent category
    > child category
    > child category
At the moment I have two database tables: one stores all the parent categories for each user, and the second stores the child categories with a foreign key (id) to the parent category.
To get all the categories, I first get all the parent categories and iterate over them with a foreach loop.
Within the loop I call a function to get the child categories by parent id.
I want to know whether this is the best approach, or whether it can be done in a MySQL query without looping.
thanks guys !!!
For your first question, it depends on what information you want to store about the files.
If it's generic across all types, (name, date, filetype, size, etc.) then a single Files table by itself with a type column makes sense.
But if you're going to save attributes of the files that have to do with what kind of file they are, frame rate of a video file, height and width of an image file, author of a PDF, for example, then you will also need some ancillary tables to store that information. You don't want to have a bunch of columns hanging off your file table that are only useful each for a certain file type.
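A rough sketch of that layout: one generic `files` table with a type column, plus small per-type tables that share the file's id. All table and column names here are hypothetical, tags are left out, and SQLite via Python's `sqlite3` is used only so the example runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    name    TEXT NOT NULL,
    type    TEXT NOT NULL CHECK (type IN ('pdf', 'video', 'photo'))
);
-- type-specific attributes live in ancillary tables keyed by the file id
CREATE TABLE photo_details (file_id INTEGER PRIMARY KEY REFERENCES files(id),
                            width INTEGER, height INTEGER);
CREATE TABLE video_details (file_id INTEGER PRIMARY KEY REFERENCES files(id),
                            frame_rate REAL);
CREATE TABLE pdf_details   (file_id INTEGER PRIMARY KEY REFERENCES files(id),
                            author TEXT);
INSERT INTO files (id, user_id, name, type) VALUES
    (1, 7, 'holiday.jpg', 'photo'),
    (2, 7, 'holiday.mp4', 'video'),
    (3, 7, 'thesis.pdf',  'pdf'),
    (4, 8, 'holiday.pdf', 'pdf');
INSERT INTO photo_details VALUES (1, 1920, 1080);
INSERT INTO video_details VALUES (2, 29.97);
INSERT INTO pdf_details   VALUES (3, 'A. Author');
""")


def search_files(user_id, name_part="", file_type=None):
    """Search one user's files by name fragment and, optionally, by type."""
    rows = conn.execute(
        """
        SELECT name FROM files
        WHERE user_id = ?
          AND name LIKE ?
          AND (? IS NULL OR type = ?)
        ORDER BY name
        """,
        (user_id, "%" + name_part + "%", file_type, file_type),
    ).fetchall()
    return [r[0] for r in rows]


def file_details(file_id):
    """Generic columns plus whichever type-specific columns exist."""
    return conn.execute(
        """
        SELECT f.name, f.type, p.width, p.height, v.frame_rate, d.author
        FROM files AS f
        LEFT JOIN photo_details AS p ON p.file_id = f.id
        LEFT JOIN video_details AS v ON v.file_id = f.id
        LEFT JOIN pdf_details   AS d ON d.file_id = f.id
        WHERE f.id = ?
        """,
        (file_id,),
    ).fetchone()
```

Profile pages and searches only ever touch `files`; the detail tables are joined in when a single file is displayed.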
For your second question, the rough SQL is based on a JOIN between your parent category table and your child category table.
Example pseudo code:
SELECT p.userid, p.parentcategoryid, c.childcategoryid
FROM ParentCategory p
INNER JOIN ChildCategory c
    ON p.parentcategoryid = c.parentcategoryid
WHERE p.userid = ?   -- bind the user's id here
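To show how one joined result set replaces the query-per-parent loop, here is a small sketch of the application side. It is in Python rather than PHP, with SQLite via `sqlite3` so it runs without a server. The `name` columns and the sample rows are assumptions. It uses a LEFT JOIN instead of the INNER JOIN so that parent categories without children still appear in the menu.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ParentCategory (
    parentcategoryid INTEGER PRIMARY KEY,
    userid           INTEGER NOT NULL,
    name             TEXT NOT NULL
);
CREATE TABLE ChildCategory (
    childcategoryid  INTEGER PRIMARY KEY,
    parentcategoryid INTEGER NOT NULL
        REFERENCES ParentCategory(parentcategoryid),
    name             TEXT NOT NULL
);
INSERT INTO ParentCategory VALUES
    (1, 1, 'Music'), (2, 1, 'Sports'), (3, 1, 'Empty'), (4, 2, 'Other');
INSERT INTO ChildCategory VALUES
    (1, 1, 'Rock'), (2, 1, 'Jazz'), (3, 2, 'Tennis'), (4, 4, 'Misc');
""")


def category_menu(user_id):
    """Build {parent name: [child names]} from a single query."""
    rows = conn.execute(
        """
        SELECT p.name, c.name
        FROM ParentCategory AS p
        LEFT JOIN ChildCategory AS c
               ON c.parentcategoryid = p.parentcategoryid
        WHERE p.userid = ?
        ORDER BY p.parentcategoryid, c.childcategoryid
        """,
        (user_id,),
    ).fetchall()
    menu = {}
    for parent_name, child_name in rows:
        children = menu.setdefault(parent_name, [])
        if child_name is not None:  # NULL means the parent has no children
            children.append(child_name)
    return menu
```

The loop still exists, but it runs over rows already fetched, so the database is queried once per page instead of once per parent category.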
I have two tables, applications and applicationRevisions, structured:
applications (id, time, user, views)
applicationRevisions(id, application, time, user, name, description)
The latter holds 'revisions' of an application page, containing title and description, so it's a one-to-many relationship. When an application page is shown, the latest applicationRevisions row with the matching application ID is chosen.
However, I have no way of knowing if an application with a certain name exists at any particular time because previous revisions may have different names, and the name is not stored in the applications table.
I have a workaround: store the current name as a field in applications, and when an application is edited, add the row to applicationRevisions as usual, but then also update the name in the applications row.
Is there a better way to do this?
So, you'd like to search by name to find the application and then get the most recent revision of that application, which may no longer have that name. This is certainly possible with subqueries, but what are you going to do about applications that happen to share the same name in some revisions?
It would be much clearer if the applications table held the name and description. To be honest, this could just be a single table, since time and user in applications would likely be the same as those of the first revision of each application. Only the views field is left, and that could go in a table of its own, applications_views (application, views).
Anyway, if you want to avoid major changes to the schema and the name confusion between applications is OK, you could make a query something like this:
select * from applications join applicationRevisions
on (applications.id=applicationRevisions.application)
where applicationRevisions.id in
(select max(id)
from applicationRevisions
where name = 'foobar'
group by application);
If I guessed the relationships correctly, this will give you, for each application that ever used the name 'foobar', all fields from the most recent revision that carried that name (which is not necessarily that application's latest revision).
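Because the subquery filters on the name before taking max(id), it picks the newest revision that itself carries the name. The sketch below contrasts that with a check on the current name, which is closer to "does an application with this name exist right now". It assumes, like the query above, that revision ids grow over time. The sample rows and function names are made up, and SQLite via Python's `sqlite3` is used only so it runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applications (
    id INTEGER PRIMARY KEY, time TEXT, user INTEGER, views INTEGER
);
CREATE TABLE applicationRevisions (
    id INTEGER PRIMARY KEY,
    application INTEGER NOT NULL REFERENCES applications(id),
    time TEXT, user INTEGER, name TEXT, description TEXT
);
INSERT INTO applications VALUES (1, '2024-01-01', 7, 0), (2, '2024-01-05', 8, 0);
INSERT INTO applicationRevisions VALUES
    (1, 1, '2024-01-01', 7, 'foobar', 'first draft'),
    (2, 1, '2024-02-01', 7, 'bazqux', 'renamed'),      -- app 1 was renamed
    (3, 2, '2024-01-05', 8, 'foobar', 'another app');
""")


def ever_named(name):
    """(application id, revision id) of the newest revision carrying name."""
    return conn.execute(
        """
        SELECT a.id, r.id
        FROM applications AS a
        JOIN applicationRevisions AS r ON a.id = r.application
        WHERE r.id IN (SELECT MAX(id)
                       FROM applicationRevisions
                       WHERE name = ?
                       GROUP BY application)
        ORDER BY a.id
        """,
        (name,),
    ).fetchall()


def currently_named(name):
    """Ids of applications whose latest revision has the given name."""
    rows = conn.execute(
        """
        SELECT a.id
        FROM applications AS a
        JOIN applicationRevisions AS r ON r.application = a.id
        WHERE r.id = (SELECT MAX(id)
                      FROM applicationRevisions
                      WHERE application = a.id)
          AND r.name = ?
        ORDER BY a.id
        """,
        (name,),
    ).fetchall()
    return [r[0] for r in rows]
```

Here application 1 shows up in `ever_named('foobar')` through its old revision, but only application 2 is returned by `currently_named('foobar')`.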
Correct me if I am missing something here: every application entry must have at least one applicationRevisions row, right?
Why not use foreign key constraints? Make the application field of the applicationRevisions table a foreign key. Identify an application by its id and not by a name. Make the name a property of the revision.
So let's say you want to search for an application which has the name "wxyz"; you would run
select id, application from applicationRevisions where name = 'wxyz' order by time DESC LIMIT 1;
This gives you the application id. You can do a JOIN and get the application fields in a single query.
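A possible shape for that single query, as a sketch only: the foreign key is declared on `applicationRevisions.application`, and the JOIN pulls in the application's own fields. The sample rows and the `find_by_name` helper are hypothetical, and SQLite via Python's `sqlite3` is used so it can be run without a MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed in SQLite only
conn.executescript("""
CREATE TABLE applications (
    id INTEGER PRIMARY KEY, time TEXT, user INTEGER, views INTEGER
);
CREATE TABLE applicationRevisions (
    id INTEGER PRIMARY KEY,
    application INTEGER NOT NULL REFERENCES applications(id),
    time TEXT, user INTEGER, name TEXT, description TEXT
);
INSERT INTO applications VALUES (1, '2024-01-01', 7, 5), (2, '2024-03-01', 8, 9);
INSERT INTO applicationRevisions VALUES
    (1, 1, '2024-01-01', 7, 'wxyz',    'first'),
    (2, 1, '2024-02-01', 7, 'wxyz v2', 'renamed'),
    (3, 2, '2024-03-01', 8, 'wxyz',    'other app');
""")


def find_by_name(name):
    """Newest revision with this name, joined to its application row."""
    return conn.execute(
        """
        SELECT a.id, a.views, r.id, r.name
        FROM applicationRevisions AS r
        JOIN applications AS a ON a.id = r.application
        WHERE r.name = ?
        ORDER BY r.time DESC
        LIMIT 1
        """,
        (name,),
    ).fetchone()
```

As with the two-step version, this returns whichever application most recently had a revision with that name, so it says nothing about whether the name is still current.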