I'm learning CouchDB and decided to build a bookmark manager as a practice project.
I'm a little stuck in RDBMS ways of thinking and keep wanting to create documents as if they were tables.
So I'm wondering if this approach is "correct" in terms of using a document database like CouchDB?
Each document contains user data, I name the document user12345612345.json and then structure the data inside like this:
{
"username":"todd",
"password":"hSnxS987sDJf77600sHjdyDhskJShdskshjS75sa765sa",
"bookmarks":
[
{
"url":"http://www.hello.com",
"title":"Hello website"
},
... etc...
]
}
So, I am storing all bookmarks for one user in a single document. That way, I can load this on login, and manage the data, and if there is an update, I update that one document.
I'm thinking with a number of users, if I stored each bookmark as a document, that would be zillions of documents that would need to be indexed by user - not sure that would be the right approach.
My next step is to add folders and tags. I'm thinking I would simply add those arrays into this user document.
Right?
You are doing the right thing. The biggest mental step you need to take when moving from structured to "less structured" storage is to keep an object's properties together. So if a user is unique and bookmarks are associated with that user, keep the user and the bookmarks together, as you are doing now.
The same goes for your folders and tags. Think about what your application does. You would probably like to retrieve a user with their bookmarks, tags, folders, etc. in just one call instead of going over several tables, joins, etc. That's the beauty of NoSQL. ;-)
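As a rough sketch of what keeping everything in one user document could look like once folders and tags are added: the "folders" and "tags" field names, the per-bookmark "folder" and "tags" keys, and the add_bookmark helper are all assumptions for illustration, not something CouchDB prescribes. The code only builds and updates the document in memory; saving it to CouchDB is left out.

```python
import json

# Hypothetical user document; "folders" and "tags" are assumed field names.
user_doc = {
    "_id": "user12345612345",
    "username": "todd",
    "bookmarks": [
        {"url": "http://www.hello.com", "title": "Hello website",
         "folder": "misc", "tags": ["greeting"]},
    ],
    "folders": ["misc"],
    "tags": ["greeting"],
}


def add_bookmark(doc, url, title, folder=None, tags=()):
    """Append a bookmark and keep the folder and tag lists in sync."""
    doc["bookmarks"].append(
        {"url": url, "title": title, "folder": folder, "tags": list(tags)}
    )
    if folder is not None and folder not in doc["folders"]:
        doc["folders"].append(folder)
    for tag in tags:
        if tag not in doc["tags"]:
            doc["tags"].append(tag)
    return doc


add_bookmark(user_doc, "http://couchdb.apache.org", "CouchDB",
             folder="databases", tags=["nosql", "docs"])

# The whole document is written back in one piece after any change.
serialized = json.dumps(user_doc)
```

Everything the application needs after login comes from this one document, at the cost of rewriting the whole document on every change.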
I want to build an application that uses data from several endpoints.
Let's say I have:
JSON API for getting cinema data
XML Export for getting data about ???
Another JSON API for something else
A CSV file for some more stuff ...
In my application I want to bring all this data together and build views for it and so on ...
My idea was to set up a database by creating schemas for all these data sources, so I can write some kind of "import scripts" which I can call whenever I want to get the latest data.
I thought of schemas because I want to be able to easily adapt to a new API with any kind of schema.
Please enlighten me of the possibilities and best practices out there (theory and practice if possible :P)
You are totally right on making a database. But the real problem is probably not going to be how to store your data. It's going to be how to make it fit together logically and semantically.
I suggest you first take a good look at what your endpoints can provide. Get several samples from every source and analyze them if you can. How will you know which data is new? How can you match it against existing data and against data from other sources? If existing data changes or gets deleted, how will you detect and handle that? What if sources disagree on something? How and when should you run the synchronization? What will you do if one of your sources goes down? Etc.
It is extremely difficult to make data consistent if your data sources are not. As a rule, if the sources are different, they are not consistent. Thus the proverb "garbage in, garbage out". We, humans, have no problem dealing with small inconsistencies, but algorithms cannot work correctly if there are discrepancies. Even if everything fits together on paper, one usually forgets that data can change over time...
At least that's my experience in such cases.
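To make the "import script" idea and the matching questions above a bit more concrete, here is a minimal sketch. Every field name, sample payload, and function name in it is invented for illustration; real endpoints will look different. Each source is mapped to one common record shape, and records are keyed by source and source ID so that a re-import can tell new or changed data from unchanged data.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# All field names and sample payloads below are invented for illustration.
JSON_SAMPLE = '[{"id": 1, "name": "Rex Cinema"}]'
XML_SAMPLE = '<items><item id="7"><name>Odeon</name></item></items>'
CSV_SAMPLE = "id,name\n3,Plaza\n"


def from_json(text):
    return [{"source": "json_api", "source_id": str(r["id"]), "name": r["name"]}
            for r in json.loads(text)]


def from_xml(text):
    root = ET.fromstring(text)
    return [{"source": "xml_export", "source_id": item.get("id"),
             "name": item.findtext("name")}
            for item in root.iter("item")]


def from_csv(text):
    return [{"source": "csv_file", "source_id": row["id"], "name": row["name"]}
            for row in csv.DictReader(io.StringIO(text))]


def merge(store, records):
    """Upsert records keyed by (source, source_id); return the keys that changed."""
    changed = []
    for rec in records:
        key = (rec["source"], rec["source_id"])
        if store.get(key) != rec:
            store[key] = rec
            changed.append(key)
    return changed


store = {}
for records in (from_json(JSON_SAMPLE), from_xml(XML_SAMPLE), from_csv(CSV_SAMPLE)):
    merge(store, records)
```

This only answers "what is new or changed per source". Matching the same real-world item across different sources, and detecting deletions, still needs rules that depend on the actual data.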
I'm not sure whether you want the application to display all the data in the same view or to have different views for each of the sources. If you want to display the data in the same view, like a grid, I would recommend using inheritance or an interface, depending on your data and needs. I would recommend setting this structure up in the database too, using different tables for the different sources and a parent table related to all of them that has a type associated with it.
Here's a good thread with discussion about choosing an interface or inheritance.
Inheritance vs. interface in C#
And here are some examples of representing inheritance in a database.
How can you represent inheritance in a database?
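One way to read the "parent table with a type" suggestion is sketched below, using SQLite through Python's sqlite3 module. The table and column names (item, item_type, cinema_item, csv_item) are hypothetical; the point is only that shared columns live in the parent and source-specific columns live in one child table per source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    id INTEGER PRIMARY KEY,
    item_type TEXT NOT NULL,   -- tells which child table holds the details
    title TEXT NOT NULL
);
CREATE TABLE cinema_item (
    item_id INTEGER PRIMARY KEY REFERENCES item(id),
    city TEXT
);
CREATE TABLE csv_item (
    item_id INTEGER PRIMARY KEY REFERENCES item(id),
    raw_line TEXT
);
""")


def add_item(item_type, title):
    """Insert the shared part of a record and return its id."""
    cur = conn.execute(
        "INSERT INTO item (item_type, title) VALUES (?, ?)", (item_type, title)
    )
    return cur.lastrowid


cinema_id = add_item("cinema", "Rex Cinema")
conn.execute("INSERT INTO cinema_item (item_id, city) VALUES (?, ?)",
             (cinema_id, "Berlin"))
csv_id = add_item("csv", "Imported row")
conn.execute("INSERT INTO csv_item (item_id, raw_line) VALUES (?, ?)",
             (csv_id, "3,Plaza"))

# A combined grid view only needs the parent table.
grid = conn.execute("SELECT item_type, title FROM item ORDER BY id").fetchall()
```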
I'm starting a project for a social application, so I need to track users' actions over time. To avoid an epic-sized table, I'm thinking about creating a table for every single user and recording actions per user.
I've never heard of links of this type (row to table) and I don't know where to find documentation on this particular topic.
Also, my boss wants to use Drupal for this project. Does anyone have information about this kind of structure, in particular in Drupal?
Hmm... maybe you should go for some "lower level" solution instead of Drupal. Drupal is a CMS, and if you build the website the Drupal way you won't have the freedom to define your tables the way you like. Rather, you have one table defining some common (default) content type fields, and for every new field you add, Drupal actually creates a new table, so you end up with a complex database structure.
Of course you can manually create your tables and use them also manually instead of using drupal's nodes and views and stuff, but then...what's the point of using Drupal?
So, IMHO some framework or even plain PHP would be more suitable for your project.
I have a huge MS Word file I use for personal notes, but I want it to be more flexible. The file is made up of short articles (600 words), each with a date, a title, and sometimes a table or some images. I came up with the idea of separating the individual articles and putting them in Access, so I can extract them from the database with queries, add tags, and sort chronologically.
One big question is: which format should I use? I tried Access 2010 Rich Text Edit but it doesn't show tables, and I don't know where to store images. My idea is to store images outside the file.
Another thing I tried is to store the files as HTML in the database directory, but when I try to add some interface functionality I encounter problems with the most trivial things, like making VBA open the associated file. I don't like storing outside Access also because I don't have full-text search.
The primary requirement for this application is that it must not be cumbersome: it's a prototype I want to use to see if my model of storing notes works, so I don't want to spend a month programming a user interface, and if I notice any defect at runtime I must be able to switch to design mode and fix it in minutes. If I want to write something, I don't want to have to worry about HTML syntax, but I want to be able to add a simple table or image.
What I'm ultimately looking for is an HTML viewer in the Access interface that receives an HTML string (composed by a query) and displays it.
At the moment I'm considering staying with my MS Word file because switching seems too complex, although I don't like the sequential ordering of articles and the hierarchy of chapters/subchapters, which is what made me think about this idea.
The answer to all those problems was Evernote, which is like a Wiki you can edit quickly also from a smartphone, with or without an internet connection, which syncs to a master version on an Evernote server and without the constraint of having to invent a title for every page/idea.
If I had a huge Word document like yours, I'd probably split it into individual files and use something like dtSearch.
I have a web application that uses a relational database (MySQL). We're adding a new feature that will allow certain users to dynamically construct 'forms' from a pool of optional form elements and distribute these forms for completion/submission to other users.
The problem lies in storing the completed form submissions. Each form can and will vary in the number and combination of form elements, and with a relational database my options are somewhat limited to dynamically creating a new table to hold the submissions of each form (seems like a bad path to go down) or storing each of the submitted forms as JSON in a TEXT column (losing all useful querying abilities of RDBMSs)
I've never actually used MongoDB in a production project before, but I'm thinking it might be a good idea to use my MySQL relational database to store all the forms created by certain users of my application, and then store all the submissions in MongoDB with each document referencing the UUID of the form in MySQL.
The first disadvantage I can think of with this approach is there's no referential integrity between form submissions and the forms located in MySQL. If I delete a form in MySQL, all of the form submissions will have to be manually deleted (if I want to replicate the 'Cascade' effect)
Would I store all of my form submissions for all of my forms in a single MongoDB collection as individual documents? Any advice is greatly appreciated. :)
EDIT 1
Based on the documentation here: http://www.mongodb.org/display/DOCS/Using+a+Large+Number+of+Collections
I'm now considering creating a new collection to hold all the submissions from each unique form type.
EDIT 2
After some careful consideration and the advice of others I've decided to abandon my dual-database approach to solving this problem in favor of a relational-database schema that I think solves the problem of creating dynamic forms and saving the form submissions in such a way that they're easily query-able for complex reporting.
Essentially every record in 'forms' represents a unique form that was built by a user. 'forms_fields' has a foreign key that references the form and an enum-type with the options:
1. checkbox
2. textfield
3. textarea
4. select
5. multi-select
6. date
'forms_fields_options' contains all of the 'options' a select field would have.
With these three tables, users can create customized forms.
When another user fills out & submits the form, a record is created in forms_submissions. For each field, a corresponding record will be created in 'forms_submissions_fields' that references the form submission and the forms_fields_id. The final table, 'forms_submissions_options_multiselect' is essentially a join-table to indicate which options from a multi-select form field the user selected.
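The six tables described above could be sketched roughly as follows. This uses SQLite through Python's sqlite3 module rather than MySQL, so a CHECK constraint stands in for the enum type; the column names (title, label, value) and the insert helper are assumptions, since only the table names and the forms_fields_id key are given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE forms (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE forms_fields (
    id INTEGER PRIMARY KEY,
    form_id INTEGER NOT NULL REFERENCES forms(id),
    label TEXT NOT NULL,
    -- SQLite has no ENUM, so a CHECK constraint stands in for it
    field_type TEXT NOT NULL CHECK (field_type IN
        ('checkbox', 'textfield', 'textarea', 'select', 'multi-select', 'date'))
);
CREATE TABLE forms_fields_options (
    id INTEGER PRIMARY KEY,
    forms_fields_id INTEGER NOT NULL REFERENCES forms_fields(id),
    label TEXT NOT NULL
);
CREATE TABLE forms_submissions (
    id INTEGER PRIMARY KEY,
    form_id INTEGER NOT NULL REFERENCES forms(id)
);
CREATE TABLE forms_submissions_fields (
    id INTEGER PRIMARY KEY,
    forms_submissions_id INTEGER NOT NULL REFERENCES forms_submissions(id),
    forms_fields_id INTEGER NOT NULL REFERENCES forms_fields(id),
    value TEXT
);
CREATE TABLE forms_submissions_options_multiselect (
    forms_submissions_fields_id INTEGER NOT NULL
        REFERENCES forms_submissions_fields(id),
    forms_fields_options_id INTEGER NOT NULL
        REFERENCES forms_fields_options(id),
    PRIMARY KEY (forms_submissions_fields_id, forms_fields_options_id)
);
""")


def insert(sql, params=()):
    """Run an INSERT and return the new row id."""
    return conn.execute(sql, params).lastrowid


FIELD_SQL = "INSERT INTO forms_fields (form_id, label, field_type) VALUES (?, ?, ?)"
OPTION_SQL = "INSERT INTO forms_fields_options (forms_fields_id, label) VALUES (?, ?)"
ANSWER_SQL = ("INSERT INTO forms_submissions_fields "
              "(forms_submissions_id, forms_fields_id, value) VALUES (?, ?, ?)")

# A user builds a form with a text field and a multi-select field.
form = insert("INSERT INTO forms (title) VALUES (?)", ("Survey",))
name_field = insert(FIELD_SQL, (form, "Name", "textfield"))
colors_field = insert(FIELD_SQL, (form, "Colors", "multi-select"))
red = insert(OPTION_SQL, (colors_field, "red"))
blue = insert(OPTION_SQL, (colors_field, "blue"))
green = insert(OPTION_SQL, (colors_field, "green"))

# Another user submits the form, choosing red and blue.
submission = insert("INSERT INTO forms_submissions (form_id) VALUES (?)", (form,))
insert(ANSWER_SQL, (submission, name_field, "Todd"))
colors_answer = insert(ANSWER_SQL, (submission, colors_field, None))
conn.executemany(
    "INSERT INTO forms_submissions_options_multiselect VALUES (?, ?)",
    [(colors_answer, red), (colors_answer, blue)],
)

# Which options were selected for the multi-select answer?
selected = [row[0] for row in conn.execute("""
    SELECT o.label
    FROM forms_submissions_options_multiselect m
    JOIN forms_fields_options o ON o.id = m.forms_fields_options_id
    WHERE m.forms_submissions_fields_id = ?
    ORDER BY o.id
""", (colors_answer,))]
```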
A colleague of mine recently led a webinar on just this subject, titled "Hybrid Applications with MongoDB and RDBMS". You can view it here:
http://www.10gen.com/events/hybrid-applications
From the comments, it looks as though you have already decided to go the RDBMS route, but hopefully this can give you some ideas for a future project, or be beneficial to someone else reading this thread.
Good luck with your application!
This can definitely be done in SQL using EAV. So NoSQL is definitely not required.
Using a tool like MongoDB could be a good fit for the flexible results you want to save. However, there are some trade-offs here, and they may not be exactly what you're expecting.
... storing each of the submitted forms as JSON in a TEXT column (losing all useful querying abilities of RDBMSs)
How many form submissions are you planning to have? What type of querying are you planning to do?
My experience with MongoDB is that it performs very poorly when you're querying against data that is not indexed. In addition, aggregation is generally done in batches using Map/Reduce (or the new Aggregation Framework).
If you compare the complexity of doing roll-ups or the efficiency of doing queries, it's not clear that MongoDB is significantly better here than EAV.
If I delete a form in MySQL, all of the form submissions will have to be manually deleted
Oddly, I have rarely seen this as a problem, as you will probably never delete the form in SQL. You will likely do a logical delete and never really remove anything. So this is probably a moot point.
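A logical delete can be as small as a nullable timestamp column. The sketch below uses SQLite through Python's sqlite3 module; the deleted_at column and the two helper functions are assumed names, not part of the schema in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE forms ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, deleted_at TEXT)"
)
conn.execute("INSERT INTO forms (title) VALUES ('Survey'), ('Signup')")


def soft_delete(form_id):
    """Mark a form as deleted without removing the row."""
    conn.execute(
        "UPDATE forms SET deleted_at = datetime('now') WHERE id = ?", (form_id,)
    )


def active_forms():
    """Return the titles of forms that have not been logically deleted."""
    return [row[0] for row in conn.execute(
        "SELECT title FROM forms WHERE deleted_at IS NULL ORDER BY id"
    )]


soft_delete(1)
```

Submissions stored elsewhere still point at a row that exists, so nothing has to cascade.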
Would I store all of my form submissions for all of my forms in a single MongoDB collection as individual documents?
Again, that depends on how many forms and submissions you're planning to get. If you have lots of both, then using a collection / submission is going to be very difficult to shard later on.
Honestly, I would use a single collection and then override the _id field to something that can reasonably be used as a shard key. There are some fancy tricks you can play here, but that's beyond the scope of this little write-up.
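One possible reading of "override the _id field" is to build the _id from the form's UUID plus a per-submission UUID, so that all submissions of a form share a key prefix. This is only an assumption about what was meant; the make_submission helper and the form_uuid and answers field names are hypothetical, and the code builds plain dictionaries without calling any MongoDB driver.

```python
import uuid


def make_submission(form_uuid, answers):
    """Build a submission document whose _id starts with the form's UUID."""
    submission_uuid = str(uuid.uuid4())
    return {
        "_id": f"{form_uuid}:{submission_uuid}",  # custom _id, not an ObjectId
        "form_uuid": form_uuid,                    # reference to the form in MySQL
        "answers": answers,
    }


form_uuid = str(uuid.uuid4())
first = make_submission(form_uuid, {"name": "Todd"})
second = make_submission(form_uuid, {"name": "Ann", "colors": ["red", "blue"]})
```

Whether such a key distributes well across shards depends on how many forms there are and how evenly submissions are spread over them.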
Summary
End of the day, you can definitely use MongoDB for this problem, but it's not a "home run". If you are unfamiliar with MongoDB, this is definitely a fair "learning project", but expect to hit some roadblocks around querying and aggregation.
I think you're overlooking the fact that an RDBMS will allow things like EAV (Entity-Attribute-Value, which is horrid if you overuse it, but can be great in moderation) or join tables, to construct multiple ordered relations from a single form to various form elements.
I'm not suggesting that an RDBMS is perfect for everything, or even your situation, but I know I have had to build similar systems and have never had to go noSQL to support them in a reasonable way.
Edit: More to the point... storing actual field values puts you one more relation out from the original form elements, but if your UI keeps things consistent you can do this generically. I'd say looking further into which noSQL solutions allow the particular kinds of value-based querying you need might shed more light on your options.
I've got several pages about products that I want to load into a database and instead of creating a separate html page for each product, I was thinking of creating a single page that will display whatever product the user clicks on. Each product page will have a similar structure with its name, picture, description, bullet points for features (varies from product to product), price.
My question is: if I want to store all that information in a database (I imagine I would need a different field for each paragraph, picture, name, each bullet point, etc.), is there a way to get around that? Can I store all that information in a single field, or in as few as possible, and still keep the formatting? It seems like I would be overloaded with the number of fields I have to manage.
I'm starting to doubt if this was even a good idea to begin with...
Do not store all that information in a single field. If you are going to do that, then just create the HTML page and save yourself the trouble of having a database that you aren't properly utilizing.
What you need to do is identify the relationships between all the parts of your page. For example, if a single product can have multiple photos, you would want to define a one-to-many relationship between the Product and ProductImage tables.
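A minimal sketch of that one-to-many relationship, using SQLite through Python's sqlite3 module. Only the Product and ProductImage table names come from the text above; the columns, the sample rows, and the load_product helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    description TEXT,
    price_cents INTEGER NOT NULL
);
CREATE TABLE ProductImage (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES Product(id),  -- many images per product
    path TEXT NOT NULL
);
""")

product_id = conn.execute(
    "INSERT INTO Product (name, description, price_cents) VALUES (?, ?, ?)",
    ("Widget", "A sample product", 1999),
).lastrowid
conn.executemany(
    "INSERT INTO ProductImage (product_id, path) VALUES (?, ?)",
    [(product_id, "widget-front.jpg"), (product_id, "widget-back.jpg")],
)


def load_product(pid):
    """Collect everything a single product page template would need."""
    name, description, price_cents = conn.execute(
        "SELECT name, description, price_cents FROM Product WHERE id = ?", (pid,)
    ).fetchone()
    images = [row[0] for row in conn.execute(
        "SELECT path FROM ProductImage WHERE product_id = ? ORDER BY id", (pid,)
    )]
    return {"name": name, "description": description,
            "price_cents": price_cents, "images": images}


page_data = load_product(product_id)
```

The variable-length feature bullet points could follow the same pattern with a third table keyed by product_id.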
Grasping how relational databases relate to the data you are working with can be difficult at first and it might pay off to hire someone for a few hours to go over what you are trying to do and how to implement this effectively using a DB. Since it is a real world example for you it will be an excellent way to learn. Good luck!
You're not the first person to want to do something like this. It's a very common problem that has a well-established solution. You need to use what's called a web content management system. WCMSs allow you to use a common template throughout your website while filling in specific content for each page. I recommend Joomla because it's easy to set up, easy to use, and most web hosts support it. But you can also look at WordPress or Drupal. WordPress is more blog-centric, though, and Drupal has a steep learning curve.