I'm considering storing a working directory (i.e. a recursive/nested structure of folders and files) in a MySQL database.
The idea is to have a 'projects' table, and a table that contains all folders / files with their corresponding paths in the tree (varchar).
Querying a project should return folders and files as a list of String paths (+some meta-data) that I will use to build the tree at the client.
I should mention that the 'file' records in my working directory are meta-data that represent JSON files in a MongoDB datastore (they describe webpages as complex nested JSON). File records will be queried frequently and will be linked to other tables, while folders don't get queried at all. Files have a different meaning for my app than folders, which only matter for my working directory.
My question is what would be the best option:
Store files and folders in separate tables
Store files and folders in the same table (the records will be near identical), and use an FK to a lookup table with two records, "file" and "folder"
Short answer: "classic" relational databases (including MySQL) are not very good at this. Here is a good link: Managing hierarchical data in MySQL
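For illustration, a minimal sketch of your second option combined with the simplest technique from that article (an adjacency list plus a stored path) might look roughly like this in MySQL; all table and column names here are hypothetical:

```sql
-- Hypothetical single-table layout: one row per file or folder,
-- with a self-referencing parent_id (adjacency list) and a type lookup.
CREATE TABLE node_types (
    id   TINYINT UNSIGNED PRIMARY KEY,
    name VARCHAR(10) NOT NULL               -- 'file' or 'folder'
);

CREATE TABLE nodes (
    id         INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    project_id INT UNSIGNED NOT NULL,
    parent_id  INT UNSIGNED NULL,            -- NULL for the project root
    type_id    TINYINT UNSIGNED NOT NULL,
    name       VARCHAR(255) NOT NULL,
    path       VARCHAR(1000) NOT NULL,       -- materialized path, e.g. '/pages/home'
    FOREIGN KEY (parent_id) REFERENCES nodes (id),
    FOREIGN KEY (type_id)   REFERENCES node_types (id)
);

-- All paths for one project, ready to rebuild the tree on the client:
SELECT n.path, t.name AS node_type
FROM nodes n
JOIN node_types t ON t.id = n.type_id
WHERE n.project_id = 1
ORDER BY n.path;
```

Fetching a whole project that way is a single query; it's operations like "all descendants of one folder" done purely in SQL where the techniques in that article (nested sets, closure tables) start to matter.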
On the other hand, WordPress is an application that does many of the things you're trying to do - and it does it all in MySQL (or an equivalent RDBMS).
Look here, look especially at the "taxonomies" section.
Another approach might be to look at a graph database, like Neo4j.
'Hope that helps ...
I have a system that produces CSV data on a regular event-driven basis (say, daily). Each event triggers the creation of a new folder and a fixed set of CSV files, each representing different types of data. For instance:
PlansDB.csv - data for plans of action
StepsDB.csv - descriptions of steps used by different plans
GroupsDB.csv - data on groups that can handle plans
RoomsDB.csv - data on places where a group can work on a plan
ResultsDB.csv - the records of results from steps of a plan
These have fields that identify the relationships between the different files, and I have no problems creating a data model for the CSVs in any given folder.
But how do I switch between folders? Once I have a working data model and some reports built off it, I'd like to view those reports on specific folders of data. How does that work? Can I switch easily to yesterday's folder, or last week's, etc., with minimal effort (preferably just by pointing to the folder)?
The CSV files keep the same names across folders, and those names indicate the type of data they store. Can Power BI use that?
And can I run reports over multiple folders while maintaining this data model? I know of the Folder merge capability, but my attempts at using it just merge all files as if they were the same type, whereas I would need each type merged separately.
You need to change the data source. To do this, from "Edit Queries" select "Data source settings".
Then click the "Change Source..." button and select the new folder. After that, Power BI Desktop will prompt you to apply the changes and will reload the data from the new folder.
I work at a company with three different departments and about 15 different GIS users (ESRI). Our work is in natural resources and we have hundreds of large orthophotos and lidar data that take up a lot of space.
Right now, the three departments "share" all the GIS data, but they store it on different servers (and have for 15 years). So at this point, even though most departments have the same data, it is stored in different places, organized differently, under different naming conventions, and takes up three times the storage space (since each department keeps its own copy of the same data).
I have been tasked to consolidate all the data into one, shared, organized folder. We want to have everything stored in this "master" folder and have the departments using this master folder going forward.
We plan to have separate folders for orthophotos, lidar files, vector data, etc. Within those folders, we have agreed that every file should be named with some variation of the following metadata: year, source, and geography. Everyone has their own way of organizing their data, and has had for 15 years. Some people want it organized by year_source_geography, some want geography_year_source, some want source_year_geography, etc.
QUESTION:
1) Let's say that we all agree to sort the master folder by year_source_geography: is there a way to create other folders with different naming conventions that "shortcut" to the master data? The idea is that everyone can still have their files (shortcuts) organized the way they want, but without creating duplicate files. For example: in addition to the master folder, which is sorted by year_source_geography, can we have another folder, sorted by geography_year_source, that "shortcuts" to the master list?
2) Do you have any comments on how I'm organizing this data? I'm fairly new to GIS organization, so any suggestions or comments on how I should be organizing this data are welcome.
Thanks!
I would implement a DB rather than plain files.
Start from the following: http://geojson.org/
You can add your own properties, such as year, filenames, paths, etc.
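For example, one catalog entry could be a GeoJSON Feature whose footprint is the image extent and whose properties carry your naming metadata (all the values below are made-up placeholders):

```json
{
  "type": "Feature",
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [[-120.5, 46.0], [-120.0, 46.0], [-120.0, 46.5], [-120.5, 46.5], [-120.5, 46.0]]
    ]
  },
  "properties": {
    "year": 2015,
    "source": "NAIP",
    "geography": "YakimaCounty",
    "path": "\\\\server\\master\\orthophotos\\2015_NAIP_YakimaCounty.tif"
  }
}
```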
Even if your users still want the data in plain files, I would still manage the links and paths in a NoSQL DB. This will give you a great deal of flexibility. Push the data into AWS or a similar platform.
You can start testing with a free MongoDB service such as http://www.mlab.com
I am still learning SQL Server.
The scenario is that I have a lot of .txt files with names in the format DIAGNOSIS.YYMMDDHHSS.txt, where only the YYMMDDHHSS part differs from file to file. They are all saved in the folder Z:\diagnosis.
How could I write a stored procedure to upload all .txt files with a name in the format DIAGNOSIS.YYMMDDHHSS.txt from folder Z:\diagnosis? Each file should only be loaded once.
Thank you
I would not do it using a stored proc; I would use SSIS. It has a Foreach Loop container that can iterate over the files in a folder. Once a file has been loaded, I would move it to an archive location so that it doesn't get processed the next time. Alternatively, you could create a table where you store the names of the files that were successfully processed and have the loop skip any file in that table, but then you just keep getting more and more files to loop through; it's better to move processed files to a different location if you can.
And personally, I would also put the file data in a staging table before loading it into the final table. We use two of them: one for the raw data and one for the cleaned data. We then transform into staging tables that match the relational tables in production, to make sure the data will meet the needs there before touching production, and send records that can't be inserted for one reason or another to an exception table. Working in the health care environment, you will want to make sure your process meets your country's regulations for the storage of patient records, if they exist (see HIPAA in the US). You may have to load directly to production, or severely limit access to the staging tables and files.
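If you do end up staying in T-SQL instead of SSIS, a rough sketch of the "track what has already been loaded" idea might look like this. Everything here is an assumption (the table names, the single-column raw staging table, and the fact that something outside this snippet - the SSIS loop, a PowerShell step, etc. - supplies the file names, since BULK INSERT itself cannot enumerate a folder):

```sql
-- Hypothetical staging and bookkeeping tables.
CREATE TABLE dbo.DiagnosisStagingRaw (
    RawLine NVARCHAR(MAX) NULL               -- whole lines; split/clean in a later step
);

CREATE TABLE dbo.ProcessedFiles (
    FileName NVARCHAR(260) NOT NULL PRIMARY KEY,
    LoadedAt DATETIME2     NOT NULL DEFAULT SYSUTCDATETIME()
);

-- Load one file only if it has not been loaded before.
DECLARE @file NVARCHAR(260) = N'DIAGNOSIS.2401151030.txt';   -- supplied by the loop

IF NOT EXISTS (SELECT 1 FROM dbo.ProcessedFiles WHERE FileName = @file)
BEGIN
    DECLARE @sql NVARCHAR(MAX) =
        N'BULK INSERT dbo.DiagnosisStagingRaw
          FROM ''Z:\diagnosis\' + @file + N'''
          WITH (FIELDTERMINATOR = ''\0'', ROWTERMINATOR = ''\n'');';  -- adjust to the real file layout
    EXEC (@sql);

    INSERT INTO dbo.ProcessedFiles (FileName) VALUES (@file);
END;
```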
Background:
I am making a website where I want modular administrative rights for read/write/edit privileges. My intent is to allow for any number of access level types, and to base it on the folder structure.
As an example, root admins would have read/write/edit for all site pages. Group A may have read/write/edit to all files in the path www.example.com/section1/ (including subfolders), Group B would have read/write/edit to all files in www.example.com/section2/, and so on.
I have considered two options to implement this. The first is to create a MySQL database table that would hold:
Group Name (reference name for the access group)
Read (comma-separated list of folders the group can read)
Write (comma-separated list of folders the group can write new content to)
Edit (comma-separated list of folders the group can change already-existing information in)
The other option I considered is creating a 'GroupAccess.txt' file somewhere and hand-jamming the information into that to reference.
Question: What are the advantages of each of these systems? Specifically, what do I gain from putting admin access information in a database versus a text file, and vice versa? (I'm looking for information on potential speed issues, ease of maintainability, and ease of editing/changing the information that will be stored.)
Note: I'm not looking for a 'which is better', I want to know specific advantages so I can make a better informed decision on what's best for me.
The first thing that comes to mind is that the database would be more secure than a text file, for the simple reason that a text file can be read over the internet: most web servers serve .txt files by default. That would allow users with restricted access, and even non-users of the site, to see the whole structure of your site, which in turn can leave you more open to attacks on certain areas of it.
Another benefit of using a database is that you can easily use a join to check whether a user has access to some content in the database, whereas with a file you'd have to read the file, get the permissions, and then go build the SQL and fetch the data from the database.
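To make the join point concrete, a normalized layout (one row per group/folder permission instead of comma-separated lists) and the access check might look roughly like this; all names here are made up:

```sql
-- Hypothetical normalized permission tables.
CREATE TABLE access_groups (
    id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE folders (
    id   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    path VARCHAR(255) NOT NULL                -- e.g. '/section1/'
);

CREATE TABLE group_folder_permissions (
    group_id  INT UNSIGNED NOT NULL,
    folder_id INT UNSIGNED NOT NULL,
    can_read  BOOLEAN NOT NULL DEFAULT FALSE,
    can_write BOOLEAN NOT NULL DEFAULT FALSE,
    can_edit  BOOLEAN NOT NULL DEFAULT FALSE,
    PRIMARY KEY (group_id, folder_id),
    FOREIGN KEY (group_id)  REFERENCES access_groups (id),
    FOREIGN KEY (folder_id) REFERENCES folders (id)
);

-- "May group 3 write anywhere under the folder that contains this page?"
SELECT MAX(p.can_write) AS can_write
FROM group_folder_permissions p
JOIN folders f ON f.id = p.folder_id
WHERE p.group_id = 3
  AND '/section1/page.php' LIKE CONCAT(f.path, '%');
```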
Those are just two of the things that stood out from reading your question; hope it helps.
I'm looking to make a hopefully rather simple document storage system in SQL Server 2008. We have a general idea of the elements we need (some metadata storage, FILESTREAM, etc.), but there are a few things we aren't quite sure of.
Specifically, we would like to implement a fake folder structure, as well as some (flexible) permissions. Permissions could be at a group level or for individual users, and we should be able to specify no access, read, or read/write, at either the file level or the folder level.
I'm not looking for someone to write this schema for me. But what I am hoping is that someone has resources that cover these topics.
Thanks
~Prescott
I think you should go with the classic route of having a documents table that holds the docs (if you're using 2008 or above, look at FILESTREAM). The meta tables would then link to that.
Your folder structure could be achieved by having a folder table; the documents table could then have a field to show which folder the document is in.
To get sub-folder levels, you would just have a parent-folder field in your folders table linking back to the same table. You can then render that in a treeview control in whatever language you want.
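As a rough sketch of that layout in SQL Server (all names here are made up, and the permission and FILESTREAM details are left out):

```sql
-- Hypothetical folders table that references itself, plus a documents table.
CREATE TABLE dbo.Folders (
    FolderId       INT IDENTITY(1,1) PRIMARY KEY,
    ParentFolderId INT NULL REFERENCES dbo.Folders (FolderId),  -- NULL = root folder
    Name           NVARCHAR(255) NOT NULL
);

CREATE TABLE dbo.Documents (
    DocumentId INT IDENTITY(1,1) PRIMARY KEY,
    FolderId   INT NOT NULL REFERENCES dbo.Folders (FolderId),
    Name       NVARCHAR(255) NOT NULL,
    Content    VARBINARY(MAX) NULL      -- candidate for FILESTREAM in SQL Server 2008+
);

-- One treeview node expands into its child folders and the documents it holds:
DECLARE @FolderId INT = 1;

SELECT f.FolderId, f.Name FROM dbo.Folders f WHERE f.ParentFolderId = @FolderId;
SELECT d.DocumentId, d.Name FROM dbo.Documents d WHERE d.FolderId = @FolderId;
```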
Have you looked at FILESTREAM Storage in SQL Server 2008?