Is a relational database needed? [closed] - mysql

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I am in the middle of helping my company digitize its history. One project is taking maintenance records for equipment and putting them into a MySQL database. The idea is to be able to pull up a history at any time without flipping through piles and piles of paper.
My experience is limited to using phpMyAdmin to create tables and fumbling through PHP to output data how I want it. I've never used a relational setup.
The data fields would always be the same, the database would be populated via copy/paste from Excel (until comma-delimited importing can be figured out), and this data would not need to be edited by end users. It is strictly for viewing/printing purposes.
Example fields:
id, unit_number, unit_type, date, maintenance_performed
My question is, would putting all this into one table be an acceptable way to accomplish this task? Or would a relational setup be better due to the different types of units? Why?

I would focus on getting the data into the database and not on its storage. You are going to have enough problems copy-and-pasting the data in. For instance, how will you ensure that the dates are always in a consistent format?
After the data is loaded into tables, you can worry about how to optimize it for querying. How will new records continue to be uploaded? That will be a very important part of the process (I would recommend having a creation date field in the database, in addition to the other information in the record).
After the data is loaded, you can worry about the best structure for organizing it. This is analogous to a real archivist, who tends to start by gathering lots and lots of data, and then figures out the best way to organize it.
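To make the loading concerns concrete, here is a minimal sketch of what I mean. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the accepted date formats, the created_at column, and the function names are my own assumptions, not anything from the question. It forces every date into one format, stamps each row with a creation time, and sets aside rows it cannot parse instead of guessing.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone

# Date formats we assume might show up in the Excel export (hypothetical).
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def normalize_date(raw):
    """Return the date as ISO 8601 (YYYY-MM-DD), or raise ValueError."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def load_rows(conn, csv_text):
    """Insert CSV rows into one table; return the rows rejected for bad dates."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS maintenance (
               id INTEGER PRIMARY KEY,
               unit_number TEXT,
               unit_type TEXT,
               date TEXT,
               maintenance_performed TEXT,
               created_at TEXT
           )"""
    )
    rejected = []
    # One load timestamp for the whole batch (the "creation date" field).
    now = datetime.now(timezone.utc).isoformat()
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            day = normalize_date(row["date"])
        except ValueError:
            rejected.append(row)  # keep for manual review
            continue
        conn.execute(
            "INSERT INTO maintenance (unit_number, unit_type, date,"
            " maintenance_performed, created_at) VALUES (?, ?, ?, ?, ?)",
            (row["unit_number"], row["unit_type"], day,
             row["maintenance_performed"], now),
        )
    conn.commit()
    return rejected
```

Calling load_rows(sqlite3.connect(":memory:"), text) with a comma-delimited export returns the rows that need fixing by hand. Everything else lands in a single table, which you can reorganize later once you know how it gets queried.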

Related

How to calculate/deal with big amounts of data? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have a table in MySQL with about 50 million records (and growing) that tracks subscription consumption.
Every day I have to select these records and run calculations on them to classify different kinds of consumption and clients: for example, whether a client is active or inactive, how long it has been active, whether it has changed product, and so on.
At the moment, I have different queries that select the different business cases, and then I load the data into the staging area and data warehouse. However, some of these queries are very slow and they are overloading the production environment.
I would like to know if there is a known solution or technology for this kind of daily task.
I am open to continuing with MySQL or trying a new big data technology: for example, copying the millions of raw records every day to a staging area/ODS and then working on them with some other technology.
Does anybody know good solutions for this kind of task?
Thank you.
One option might be replication - http://dev.mysql.com/doc/refman/8.0/en/replication.html
That way you can run whatever queries you want on the replicated DB without impacting the live DB.

Which database should I prefer? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I am thinking of storing people's contact data centrally. There will be many people, and each will have their own contact list. Updates and selects will dominate, since users will be searching their contacts or searching for a person who is not in their contact list, and people may update their contact details. Inserts will be limited because enrollment happens only once. I am unsure whether to use MySQL or Neo4j. For searching for a person, Neo4j seems better; for handling millions of records, MySQL seems better. Can anyone suggest which database suits this best: MySQL, Neo4j, both together, or some other database?
Neo4j allows you to store the connections between the people via their contacts, so if you want to leverage the network effect in your application it makes sense to look into that.
It all depends on how you want people to search and interact with your app. If you treat people as individual records with no connections, then MySQL is good enough. Otherwise Neo4j would probably work better.
If you have the time, build a tiny PoC with some realistic data in both, and then decide for yourself.
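To show what "connections" means in practice, here is a rough sketch in plain Python (neither Neo4j nor MySQL; the sample data and the within_hops name are made up for illustration). It answers "who is within N hops of this person?", the kind of traversal a graph database expresses directly, whereas in a relational table each extra hop typically means another self-join on the contacts table.

```python
from collections import deque

# Hypothetical contact lists: each person maps to the people in their list.
contacts = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"erin"},
    "dave": set(),
    "erin": set(),
}


def within_hops(graph, start, max_hops):
    """Return {person: hops} for everyone reachable within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if seen[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for contact in graph.get(person, ()):
            if contact not in seen:
                seen[contact] = seen[person] + 1
                queue.append(contact)
    del seen[start]  # the starting person is not their own contact
    return seen
```

If your app only ever needs within_hops(contacts, "alice", 1), that is, a person's own contact list, a plain table lookup covers it. The graph model starts to pay off once you ask for two or more hops.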
You can use the latest version of MySQL; it is quite simple and fits your needs. You just need to use the database's locking, or lock the table when inserting or updating.

MongoDB multiple/single collection and MySQL advice [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I have a project that uses Node.js with different entities, for example people and places.
I need to be able to find both types of entities together by location. What I was thinking of doing is having an index on a field called type, which would be either person or place, and making use of geospatial indexes. Does this sound like a good way to do this, or is there a better way?
I will probably need a lot of joins too, so should I use MySQL alongside MongoDB and use MongoDB just for the location-based queries?
Thanks
This question is a poor fit for Stack Overflow, but here are some random bullet points:
PostgreSQL supports both joins and geospatial queries. Personally, I'd pick that first, absent other details that warrant a different data store.
A totally valid option would be to keep people and places separate and query multiple collections as necessary. However, if you need to sort the results, then yes, it is best to throw them in the same collection.
You could also keep people and places in separate MongoDB collections but have a map-reduce job translate them into a locations collection for search purposes.
Generally, there are lots of ways to do this, and the best one depends very much on the specific aspects of your application: reads vs. writes, data stability, data size, query load, etc.
My broad word of advice is start with the most logical, easiest-to-follow, straightforward data organization (separate collections), and deviate from that when you understand the specific pain you have and how doing something more complicated or unusual will be an overall win.
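As a rough illustration of the single-collection option, here is a sketch in plain Python with no database at all. The documents, field names, and the nearby helper are hypothetical, and a real geospatial index would replace the linear scan. The point is only that when both entity types live together, one pass can filter by type and return a single list sorted by distance.

```python
import math

# Hypothetical documents; "type" distinguishes people from places.
entities = [
    {"type": "person", "name": "Ann", "lat": 51.5074, "lon": -0.1278},
    {"type": "place", "name": "Cafe", "lat": 51.5090, "lon": -0.1300},
    {"type": "place", "name": "Museum", "lat": 48.8606, "lon": 2.3376},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def nearby(docs, lat, lon, max_km, types=None):
    """Return (distance_km, doc) pairs within max_km, nearest first."""
    hits = []
    for doc in docs:
        if types is not None and doc["type"] not in types:
            continue
        d = haversine_km(lat, lon, doc["lat"], doc["lon"])
        if d <= max_km:
            hits.append((d, doc))
    hits.sort(key=lambda pair: pair[0])
    return hits
```

With separate collections you would run this once per collection and merge the two sorted lists yourself, which is the extra work the sorting remark above is about.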

Voting system on NoSQL [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
Is it possible/reasonable to build a voting system on a NoSQL database?
For example, how would one store a Stack Overflow question in a NoSQL database?
I can easily imagine almost everything except how the relation between question, vote, and user would work. Everything else can be stored in one document, like tags, comments (assuming a relatively small number of comments per post; in my case I will not have comments anyway), user information, etc. But I can't imagine how to store user votes, as the document would become huge. One option is to store votes in a separate collection/document, but that means that loading a question requires another request to check whether the user has voted on it.
A good reference is the MongoDB documentation on embedded documents vs. referenced documents, since those are what you seem to be referring to in your question. There's no perfect solution, as both have their trade-offs. You just have to make the best decision based on the types of operations/queries you expect to run on your database and their frequencies.
Honestly, until your database starts getting some serious traffic, the difference between SQL and NoSQL won't matter. Premature optimization can end up doing more harm than good, so I would just go with the one that is easiest to deploy and that you're most comfortable with to begin with.
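For what it's worth, here is one way the referenced layout could look, sketched with plain Python dicts standing in for two collections (all names are hypothetical, and this is not MongoDB API code). The question document embeds only a running score, so it stays small. Individual votes live in their own collection keyed by question and user, so checking "has this user voted?" is one keyed lookup rather than a scan of a huge array.

```python
# Dicts stand in for two collections; all names here are hypothetical.
questions = {}  # question_id -> question document
votes = {}      # (question_id, user_id) -> +1 or -1


def add_question(question_id, title, tags):
    """Create a question document with an embedded running score."""
    questions[question_id] = {"title": title, "tags": list(tags), "score": 0}


def cast_vote(question_id, user_id, value):
    """Record or change a user's vote and keep the embedded score in sync."""
    if value not in (1, -1):
        raise ValueError("value must be 1 or -1")
    key = (question_id, user_id)
    previous = votes.get(key, 0)
    votes[key] = value
    questions[question_id]["score"] += value - previous


def load_question(question_id, user_id):
    """Return the question plus this user's vote: one lookup per collection."""
    doc = dict(questions[question_id])
    doc["my_vote"] = votes.get((question_id, user_id), 0)
    return doc
```

The cost is the second lookup the question worries about, plus keeping score consistent with the votes collection. Whether that matters depends on the read/write mix mentioned above.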

What database to use for a click stream application: relational or NoSQL? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I need to create a database for dealing with a click stream (from ~240 subdomains). I use JavaScript to grab information (Host, Page, Date, userID, Referer, HostName, RefererPath, uniqueUserID) for each click and then insert the data into the database through a Java dynamic web application. There are about 9 million new records each day and I have to insert new records every minute. Another application needs to be able to retrieve information about pageviews/unique visitors/etc. for a certain article/subdomain in the last 10 min, 20 min, 30 min, 1 hour ... 24 hours. I only need to keep records for the last 3 months.
Initially I thought about using MySQL, as I'm only interested in open source. But I'm thinking about NoSQL solutions. The problem is that I've only had experience with relational databases and am not really able to tell whether NoSQL would be a better solution here. Also, which database should I use if I choose to go with NoSQL? And would a key-value store be the best way to go?
I'm guessing data consistency isn't critical here (statistics?), so you could indeed trade away a bit of consistency. NoSQL seems a good choice and a key-value store would also be my pick. Now the real question is: which one is best suited?
I'd consider Redis and Riak (which are basically the most well-known ones):
Riak (AP system):
Fault-tolerant (masterless with partitioning and replication)
Map reduce
Full text search
BASE
Redis (CP system):
Really fast
In-memory: you need RAM! That also means you want replication so you don't lose everything on a crash. Redis also uses disk snapshots, I believe.
Master/Slave with reelection
BASE
Both have a lot more features; you should read the documentation for gotchas. Redis is primarily used as a cache since it's fast, whereas Riak focuses on fault tolerance. Given your scalability requirements, both can satisfy your needs, so you must choose according to the trade-offs above.
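To give an idea of how a key-value layout could serve the "last 10 min / 1 hour / 24 hours" queries, here is a minimal sketch in which Python dicts stand in for the store. The key scheme and function names are my own assumptions, not an API of Redis or Riak. Each click increments a per-article, per-minute counter and adds the user to a per-minute set; a window query then sums the relevant minute buckets.

```python
from collections import defaultdict

# Dicts stand in for a key-value store; the key layout is a hypothetical choice.
pageviews = defaultdict(int)  # "pv:<article>:<minute>" -> count
visitors = defaultdict(set)   # "uv:<article>:<minute>" -> set of user IDs


def record_click(article, user_id, ts):
    """Count one click in the minute bucket containing Unix time ts."""
    minute = int(ts // 60)
    pageviews[f"pv:{article}:{minute}"] += 1
    visitors[f"uv:{article}:{minute}"].add(user_id)


def stats(article, now, window_minutes):
    """Return (pageviews, unique visitors) over the last window_minutes."""
    current = int(now // 60)
    views, users = 0, set()
    for minute in range(current - window_minutes + 1, current + 1):
        # .get() avoids creating empty buckets while reading.
        views += pageviews.get(f"pv:{article}:{minute}", 0)
        users |= visitors.get(f"uv:{article}:{minute}", set())
    return views, len(users)
```

In a real store you would also expire buckets older than your three-month retention, and a 24-hour query touches 1440 minute buckets per article, so coarser hourly buckets alongside the minute ones may be worth considering.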