How should I store similar entities - in one table or several? - mysql

I am creating a CV website, but unlike most, I am trying to build it on a database. Such websites are usually static, with all of the information hard-coded in the HTML. Since I am a back-end developer, I would like everything, including buttons and welcome messages, to be taken from the database. I am trying to store the projects that I have worked on. There are several types:
GitHub repository - a project that lives purely on GitHub.
Work related - a project I did at work that has no GitHub repository, only a link to view the final result.
UpWork or other freelance website - as a freelancer I take on projects to fix something on a website. Those projects can be viewed only on my profile there, and I would like to list them with a link to UpWork or wherever there is information on what exactly I was hired to do.
Now my question is: should I have different entities, and therefore different tables, for these types of projects, or should I keep all of the possible properties in one table? For example, a GitHub project has a repository field, a work-related project has a company field, and a freelance project has a link to the website I was hired on. There are also different sub-types: web applications, desktop applications, games and so on.
As you can guess, the differences are small (1 or 2 properties). I could very easily leave some properties empty and add another property, projectType, but is this the right way? Or should I have different tables and entities for them?
To give some info: I can work with both MySQL and NoSQL, and I haven't decided yet which one my website should be built on. I am currently leaning towards NoSQL. This means I am asking how to store the projects in both MySQL and NoSQL (by NoSQL I mean MongoDB). If it helps, the languages I am choosing from are PHP (with MySQL) and JavaScript (with NoSQL).
I know that questions without code are usually downvoted, but this is more of a logic problem: I know how to do it, but I don't know the best practices for my situation. That being said, here is a small piece of code for you -
console.log('Thank you in advance')

MongoDB lends itself very well to this exact situation.
You can create a collection in which documents simply leave out the fields that their type does not need. MongoDB's query operators let you check $exists on a field if you need to, and a field that is left out takes up no space in the stored document.
You can even set up a sparse index, which only contains entries for the documents that have the indexed field. As long as the core document structure is the same, it is a good idea to keep the documents in one collection and vary them based on their type.
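A minimal sketch of the single-collection idea, using plain JavaScript objects in place of a real collection so that it runs without a database. The field names (type, subType, repository, company, profileUrl) and all sample values are assumptions made up for illustration; they do not come from the question.

```javascript
// Hypothetical project documents: every one shares the core fields
// (title, type, subType); type-specific fields are simply left out.
const projects = [
  { title: 'CV website', type: 'github', subType: 'web', repository: 'https://github.com/example/cv' },
  { title: 'Intranet portal', type: 'work', subType: 'web', company: 'Example Ltd', url: 'https://example.com' },
  { title: 'Checkout bug fix', type: 'freelance', subType: 'web', profileUrl: 'https://example.com/freelancer/jobs/1' },
];

// In-memory stand-in for a MongoDB query such as
//   db.projects.find({ repository: { $exists: true } })
const hasField = (field) => (doc) => Object.prototype.hasOwnProperty.call(doc, field);

const withRepository = projects.filter(hasField('repository'));

// In-memory stand-in for db.projects.find({ type: 'work' })
const byType = (type) => projects.filter((doc) => doc.type === type);

console.log(withRepository.map((doc) => doc.title));
console.log(byType('work').map((doc) => doc.title));
```

Against a real MongoDB collection, the sparse index mentioned above would be created with something like db.projects.createIndex({ repository: 1 }, { sparse: true }).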

Related

Best practice to use several APIs or data sources for one application

I want to build an application that uses data from several endpoints.
Let's say I have:
JSON API for getting cinema data
XML Export for getting data about ???
Another JSON API for something else
A CSV file for some more stuff ...
In my application I want to bring all this data together and build views for it and so on ...
My idea was to set up a database by creating schemas for all these data sources, so I can write some kind of "import scripts" which I can call whenever I want to get the latest data.
I thought of schemas because I want to be able to easily adapt to a new API with any kind of schema.
Please enlighten me about the possibilities and best practices out there (theory and practice if possible :P)
You are totally right about setting up a database. But the real problem is probably not going to be how to store your data. It's going to be how to make it fit together logically and semantically.
I suggest you first take a good look at what your endpoints can provide. Get several samples from every source and analyze them if you can. How will you know which data is new? How can you match it against existing data and against data from other sources? If existing data changes or gets deleted, how will you detect and handle that? What if sources disagree on something? How and when should you run the synchronization? What will you do if one of your sources goes down? Etc.
It is extremely difficult to make data consistent if your data sources are not. As a rule, if the sources are different, they are not consistent. Thus the proverb "garbage in, garbage out". We, humans, have no problem dealing with small inconsistencies, but algorithms cannot work correctly if there are discrepancies. Even if everything fits together on paper, one usually forgets that data can change over time...
At least that's my experience in such cases.
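To make a few of those questions concrete, here is a rough sketch of an import script that answers "which data is new?" and "what changed?" by keying every record on its source plus the source's own identifier. The source names, raw field names and the updatedAt comparison are all assumptions for illustration; a Map stands in for a table with a unique key on (source, externalId).

```javascript
// Hypothetical normalizers: one per source, each mapping a raw record
// to the same internal shape. The raw field names are made up.
const normalizers = {
  cinemaApi: (raw) => ({ externalId: String(raw.id), name: raw.title.trim(), updatedAt: raw.modified }),
  csvExport: (raw) => ({ externalId: raw.code, name: raw.label.trim(), updatedAt: raw.changed }),
};

// In-memory stand-in for a table with a unique key on (source, externalId).
const store = new Map();

function importRecords(source, rawRecords) {
  const stats = { inserted: 0, updated: 0, unchanged: 0 };
  for (const raw of rawRecords) {
    const record = { source, ...normalizers[source](raw) };
    const key = `${source}:${record.externalId}`;
    const existing = store.get(key);
    if (!existing) {
      stats.inserted += 1;
    } else if (existing.updatedAt === record.updatedAt) {
      stats.unchanged += 1;
      continue; // the source reports no change, keep what we have
    } else {
      stats.updated += 1;
    }
    store.set(key, record);
  }
  return stats;
}

const firstRun = importRecords('cinemaApi', [
  { id: 1, title: ' Odeon ', modified: '2020-01-01' },
  { id: 2, title: 'Rex', modified: '2020-01-01' },
]);
const secondRun = importRecords('cinemaApi', [
  { id: 1, title: 'Odeon', modified: '2020-01-01' },
  { id: 2, title: 'Rex Cinema', modified: '2020-02-01' },
]);
console.log(firstRun, secondRun);
```

This sketch deliberately ignores deletions, conflicts between sources and sources that are down; each of those needs its own decision.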
I'm not sure whether you want to display all the data in the same view or create a different view for each source. If you want to display the data in the same view, like a grid, I would recommend using inheritance or an interface, depending on your data and needs. I would recommend setting up the same structure in the database too: use a separate table for each source, plus a parent table that is related to all of them and has a type column.
Here's a good thread with discussion about choosing an interface or inheritance.
Inheritance vs. interface in C#
And here are some examples of representing inheritance in a database.
How can you represent inheritance in a database?
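A small sketch of the parent-table-plus-type layout described in this answer, with arrays standing in for tables. The table names, columns and sample rows are hypothetical.

```javascript
// Hypothetical "tables": a parent table with the shared columns and a type,
// plus one child table per source holding only source-specific columns.
const items = [
  { id: 1, type: 'cinema', title: 'Odeon' },
  { id: 2, type: 'event', title: 'Film night' },
];
const cinemaDetails = [{ itemId: 1, seats: 250 }];
const eventDetails = [{ itemId: 2, startsAt: '2020-03-01T19:00' }];

// Maps the parent's type to the child table that holds its details.
const detailTables = { cinema: cinemaDetails, event: eventDetails };

// Equivalent of joining the parent row to the child table chosen by its type.
function loadItem(id) {
  const parent = items.find((row) => row.id === id);
  if (!parent) return null;
  const child = detailTables[parent.type].find((row) => row.itemId === id);
  const { itemId, ...details } = child || {}; // drop the foreign key column
  return { ...parent, ...details };
}

// A combined grid view only needs the parent table.
const grid = items.map(({ id, type, title }) => ({ id, type, title }));
console.log(grid, loadItem(1));
```

The grid reads only the shared columns, while a detail view pulls in the child row for that type.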

How do I set up the architecture for a "big data" analysis project?

A friend of mine and I are in our senior year and will be starting a senior project soon. We had the idea to do a data analysis and data visualization project for it. Our project involves reading a CSV file that is updated every 2 minutes, parsing that data, then storing it in a database. Once that data is stored we want to run some analysis on it and provide an API through which we could access that data to visualize in some way. Our end goal would be to build an Android app that displays some of the raw data from the CSV and the analysis in a user-friendly format.
I talked to another CS major and he explained that I would need a few different servers to accomplish this: one for the storage, another for analysis, and another for some type of queue that would make sure things don't get screwy while we are doing scraping and analysis.
The problem is, I don't really know where to start with this. I've done some work with a SQL database before and a PHP front end, but nothing with multiple servers. I've heard of tools to use with big data projects like Hadoop, but I'm not exactly sure where it fits in. If someone could point me to a resource of some kind to explain, or explain themselves, how I would start to structure this kind of project, that would be awesome!
Since you don't have much experience with these things you'll probably want to look at projects like Cloudera. Specifically their resources page has a nice set of videos and articles.
Another source of solid information (that I personally use) is to click on a Stack Overflow tag and select the votes option. Many good questions on a plethora of big data topics already exist.

Loading contents of a webpage into database

I've got several pages about products that I want to load into a database. Instead of creating a separate HTML page for each product, I was thinking of creating a single page that will display whatever product the user clicks on. Each product page will have a similar structure with its name, picture, description, bullet points for features (these vary from product to product) and price.
My question is: if I want to store all that information in a database, I imagine I would need a different field for each paragraph, picture, name, bullet point, etc. Is there a way to get around that, so that all that information is stored in a single field, or in as few as possible, and still keeps its formatting? It seems like I would be overwhelmed by the number of fields I have to manage.
I'm starting to doubt if this was even a good idea to begin with...
Do not store all that information in a single field. If you are going to do that, then just create the HTML page and save yourself the trouble of having a database that you aren't properly utilizing.
What you need to do is identify the relationships between all parts of your page. For example, if a single product can have multiple photos, you would want to define a one-to-many relationship between the Product and ProductImage tables.
Grasping how relational databases relate to the data you are working with can be difficult at first and it might pay off to hire someone for a few hours to go over what you are trying to do and how to implement this effectively using a DB. Since it is a real world example for you it will be an excellent way to learn. Good luck!
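As a rough illustration of that one-to-many idea, the sketch below uses arrays in place of the Product and ProductImage tables, plus a hypothetical ProductFeature table for the bullet points. All column names and sample data are assumptions, and a real template should escape values before putting them into HTML.

```javascript
// Hypothetical rows; in a relational database these would be three tables,
// with productId acting as the foreign key in the two child tables.
const products = [{ id: 1, name: 'Kettle', description: 'Boils water.', price: 24.99 }];
const productImages = [
  { productId: 1, url: '/img/kettle-front.jpg' },
  { productId: 1, url: '/img/kettle-side.jpg' },
];
const productFeatures = [
  { productId: 1, position: 2, text: 'Auto shut-off' },
  { productId: 1, position: 1, text: '1.7 litre capacity' },
];

// Collect one product together with the rows on the "many" side.
function loadProductPage(productId) {
  const product = products.find((p) => p.id === productId);
  if (!product) return null;
  return {
    ...product,
    images: productImages.filter((i) => i.productId === productId).map((i) => i.url),
    features: productFeatures
      .filter((f) => f.productId === productId)
      .sort((a, b) => a.position - b.position)
      .map((f) => f.text),
  };
}

// One template serves every product (no HTML escaping in this sketch).
function renderProduct(page) {
  const features = page.features.map((f) => `<li>${f}</li>`).join('');
  return `<h1>${page.name}</h1><p>${page.description}</p><ul>${features}</ul><p>${page.price.toFixed(2)}</p>`;
}

const page = loadProductPage(1);
console.log(renderProduct(page));
```

Each bullet point is a row rather than a column, so a product can have any number of features without the table growing new fields.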
You're not the first person to want to do something like this. It's a very common problem that has a well-established solution. You need to use what's called a web content management system. WCMSs allow you to use a common template throughout your website while filling in the specific content for each page. I recommend Joomla because it's easy to set up, easy to use, and most web hosts support it. But you can also look at alternatives like WordPress or Drupal. WordPress is more blog-centric, though, and Drupal has a steep learning curve.

Multi language in the database with CodeIgniter

I'm developing a CodeIgniter-based site that will be very multi-language heavy. The plan is to launch with 5 languages but to expand rapidly. A lot of the content will be user generated and split across multiple tables. In the past I have used the built-in language files, but I don't think they are going to work in this case. What's the best way to do translations in the database? Should I have a translation table for each table in my DB, e.g.:
ProductsLang
RetailersLang
CategoriesLang
Etc
Or should I look at creating some sort of central dictionary table? Has anyone done this in CI in the past? I couldn't find any existing libraries out there. Your thoughts would be much appreciated.
From my point of view it really depends on the solution you need. It seems like you're developing an online shop? If that's the case, I would combine both options, with static language files for labels (and other content that shouldn't change).
That said, IMO the product table itself shouldn't be aware of translations. I would rather use the category table: put the languages in as top-level categories and place the product-specific categories below each language category.
At this point you might be thinking of all the duplicate products attached to each language category, but I believe this is a flexible solution for each of the languages.
A simple script could then copy one language category to another, making the exact same products available for translation.
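For comparison, the per-table approach from the question (a ProductsLang table next to Products) could look roughly like the sketch below. The columns, the sample rows and the fallback to a default language are assumptions for illustration, not part of either post; arrays stand in for the two tables.

```javascript
// Hypothetical tables: Products holds language-neutral columns,
// ProductsLang holds one row per (product, language).
const products = [{ id: 1, sku: 'KET-01', price: 24.99 }];
const productsLang = [
  { productId: 1, lang: 'en', name: 'Kettle' },
  { productId: 1, lang: 'de', name: 'Wasserkocher' },
];

const DEFAULT_LANG = 'en';

// Look up the requested language, falling back to the default one
// when no translation exists yet.
function getProduct(id, lang) {
  const product = products.find((p) => p.id === id);
  if (!product) return null;
  const rows = productsLang.filter((r) => r.productId === id);
  const row = rows.find((r) => r.lang === lang) || rows.find((r) => r.lang === DEFAULT_LANG);
  return { ...product, lang: row ? row.lang : null, name: row ? row.name : null };
}

console.log(getProduct(1, 'de').name);
console.log(getProduct(1, 'fr').name);
```

Adding a sixth language then means adding rows, not columns or tables.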

When designing a new CMS database, what would be the most important features to add?

While this question asked something similar too, I'm interested in this from another angle. I'm not interested in the GUI part but in the database/domain part.
(My preference is SQL Server with C#/ASP.NET but this Q should be language agnostic.)
When designing a CMS system, data needs to be stored in tables and a business layer needs to provide access to this data. First of all, a user database with user roles, of course. A mailbox for private messages per user would be nice. Allowing users to set up their own profiles with images and eye-candy would make it even more interesting but let's not focus on the users.
What else should there be in the database for a CMS system? And how should it relate to the other tables?
My focus is to get a clear domain model to use as a basis for any CMS system: something they all share in common. I'm only interested in the design, so that I can later evaluate several existing systems against the preferred domain model and see which one comes closest to the ideal.
You need to decide what features your application needs, and what features beyond that you want to have in your application. From there you can work out what needs to go in your database. You're thinking about this backwards. The database supports the application, not the other way around (unless you're writing phpMyAdmin or similar!).
If it's a CMS for [I assume] dynamically maintaining a website, the first and foremost features should enable the users to:
Add/Edit/Remove pages (aka. nodes) - the structure of the site
Add/Edit/Remove menus, links - the navigation of the site
Add/Edit media (photos, video, etc.) - the content of the site
Then you can think of the other website-specific management features such as:
Managing products/customers if it's going to manage e-commerce websites
Various kinds of interactivity features -- comment/contact forms, etc.
A CMS is a big project to tackle, especially if you want to make it portable, reusable and/or expandable. I have already been through three versions of my own CMS -- the first two in PHP and the latest one in .NET.
When designing a database for a CMS, you must be careful about the product/content data tables, the registered-user data tables and the tables that hold UI-related changes.
UI-related data is essentially themes and skins, which are very important.
Most CMS applications are developed as the requirements come in, so you often end up breaking table relationships and normalization rules.
It still helps to get the basic DB structure right up front: content page or content data tables, product tables, user permission tables (if you are using the ASP.NET tables, even better), order management tables (for e-commerce sites), etc.
The business layer also plays a very important role when big changes come to a CMS project.
Sometimes the tables change significantly without affecting the pages; in that case the change has to be made in the business layer.
What you are asking is a technology question, which is not necessarily related to the features the CMS itself has. My advice, and the way I would approach it, would be to use an ORM. The reason is that when you start by designing the database, you get bogged down in DB concepts, whereas the purpose of a CMS is to provide an easy way to manage content.
In this regard, extensibility is far more important than your database design and the user will not be able to see the database design. In short, start with an API, or if that is too much - just start with what you need to expose to the topmost layer. Then just somehow map this to the database without thinking about the structure itself. In this way, you would be concentrating on the problem you are solving rather than on the technology that might be needed to solve it. It feels more natural that way.
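A minimal sketch of that "start with the API" approach: the top layer is written first against a small storage interface, and the mapping to a database is filled in later. The class names, method names and page fields are hypothetical, and an in-memory class stands in for whatever ORM-backed storage would eventually be used.

```javascript
// Hypothetical top-layer API: the rest of the CMS only sees these methods.
class PageService {
  constructor(storage) {
    // Any object with save(page), findBySlug(slug) and remove(slug) will do.
    this.storage = storage;
  }

  publish(slug, title, body) {
    if (!slug || !title) throw new Error('slug and title are required');
    const page = { slug, title, body, publishedAt: new Date().toISOString() };
    this.storage.save(page);
    return page;
  }

  view(slug) {
    return this.storage.findBySlug(slug) || null;
  }

  unpublish(slug) {
    return this.storage.remove(slug);
  }
}

// In-memory storage used here; an ORM-backed class with the same three
// methods could replace it without touching PageService.
class MemoryStorage {
  constructor() { this.pages = new Map(); }
  save(page) { this.pages.set(page.slug, page); }
  findBySlug(slug) { return this.pages.get(slug); }
  remove(slug) { return this.pages.delete(slug); }
}

const cms = new PageService(new MemoryStorage());
cms.publish('about', 'About us', 'Hello');
console.log(cms.view('about').title);
```

The table layout only becomes a concern once MemoryStorage is swapped for real persistence, which is the order of work this answer argues for.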