SQL table structure for less structured or unpredictable data - mysql

I'm new to (My)SQL and it's been difficult to find good info on best practices of table design.
I want to save sequences of moves on a chess board. Say I have an array $a = ['e4 e5', 'Nf3 Nc6', ...].
Being new, my first idea is a dumb little table with 2 columns, one for the Game ID and one for the moves. The moves (array) would be serialized and stored in a string. I guess this would technically work, but reading and writing potentially huge serialized arrays from a DB - perhaps on every page load - seems suboptimal to me.
Caching the array on the user side might not be possible for various reasons and is not something I'm curious about.
I'm interested in learning how best to store data whose format can't be entirely predicted (e.g. the number of moves can vary from 1 to 1000).

Rather than storing the entire game history in a single row, why not store the game ID, the move, and the sequence number of that move, one row per move?
That way you would retrieve the entire history of a given game by doing something like:
SELECT *
FROM MovesTable
WHERE gameID = id
ORDER BY sequence
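A minimal sketch of that one-row-per-move layout, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name `moves`, the column names and the helpers `save_moves` / `load_moves` are assumptions made for illustration, not part of the question.

```python
import sqlite3

# SQLite stands in for MySQL; the schema below is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE moves (
        game_id  INTEGER NOT NULL,
        sequence INTEGER NOT NULL,
        move     TEXT    NOT NULL,
        PRIMARY KEY (game_id, sequence)
    )
    """
)


def save_moves(conn, game_id, moves):
    """Insert one row per move, numbering the moves from 1."""
    rows = [(game_id, i, m) for i, m in enumerate(moves, start=1)]
    conn.executemany(
        "INSERT INTO moves (game_id, sequence, move) VALUES (?, ?, ?)", rows
    )


def load_moves(conn, game_id):
    """Return the moves of one game in playing order."""
    cur = conn.execute(
        "SELECT move FROM moves WHERE game_id = ? ORDER BY sequence", (game_id,)
    )
    return [row[0] for row in cur]


save_moves(conn, 1, ["e4 e5", "Nf3 Nc6"])
save_moves(conn, 2, ["d4 d5"])
print(load_moves(conn, 1))
```

The composite primary key on (game_id, sequence) doubles as the index that the WHERE ... ORDER BY query needs, so a game of 1 move and a game of 1000 moves are read the same way.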

In general, I would use a table in one of the following forms:
GameId, MoveNumber, Move
GameId, MoveNumber, FromSquare, ToSquare
Which one you use will depend on what you will need to query against and how the data will be presented, but I would lean toward the latter suggestion.
You can then combine this with a parent table that contains the GameId and some data about the game itself, such as dates or players.
If you're only going to consume the moves as an entire block (that is, you always want the entire move chain and will never query individual moves), you could store the string as you suggest. This has the added benefit that there is only one row of data to return, which will be very fast. The downside, of course, is that you will have to deserialize/parse the data once you receive it.

I'd do something like
Game, MoveNumber, Move
----------------------
Game1, 1, e4 e5
Game1, 2, Nf3 Nc6
...
Game2, 1, ....

You will need a number of tables: one for the players, one for games, one for the pieces and one for moves.
A game would have players and colours, and things like date and time.
A piece would tell you how it moves: is it a knight, a queen, etc.
A player would have the player's name, etc.
A move would have the piece, game, player, start position and end position.
You would link these tables together by relationships based on id fields in each of the tables.

Try two tables.
First table contains information pertaining to each game. Who are the players, where and when did they play, who won, and whatever other information is pertinent to each game and makes each game unique.
Games
  Game_id PK
  Black_name
  White_name
  Game_Location
  Game_Date
  Game_Time
  Winner
The second table contains all the moves for each game. This contains all the pertinent information for each move: was a piece taken, was the other player put in check, was this a regular move or did they castle, etc. Each move carries the Game_id of the game it belongs to.
Moves
  Move_id PK
  Game_id FK
  Move_number
  Who_moved
  Piece_moved
  Square_from
  Square_to
  Piece_taken
  Move_type
  Check_YorN
Now each row in the Games table (game) is joined to many rows (moves) in the Moves table.
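A rough sketch of that one-to-many relationship, again with Python's sqlite3 standing in for MySQL. It assumes the foreign key lives on the moves side (each move row stores the id of its game), which is what lets one game own many moves. The trimmed column list, the sample data and the helper `moves_for_game` are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript(
    """
    CREATE TABLE games (
        game_id    INTEGER PRIMARY KEY,
        white_name TEXT NOT NULL,
        black_name TEXT NOT NULL,
        winner     TEXT
    );
    CREATE TABLE moves (
        move_id     INTEGER PRIMARY KEY,
        game_id     INTEGER NOT NULL REFERENCES games(game_id),
        move_number INTEGER NOT NULL,
        who_moved   TEXT NOT NULL,
        square_from TEXT NOT NULL,
        square_to   TEXT NOT NULL,
        UNIQUE (game_id, move_number, who_moved)
    );
    """
)

conn.execute("INSERT INTO games VALUES (1, 'Alice', 'Bob', NULL)")
conn.executemany(
    "INSERT INTO moves (game_id, move_number, who_moved, square_from, square_to) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "white", "e2", "e4"),
        (1, 1, "black", "e7", "e5"),
        (1, 2, "white", "g1", "f3"),
    ],
)


def moves_for_game(conn, game_id):
    """Join one game row to its many move rows, in playing order."""
    cur = conn.execute(
        """
        SELECT m.move_number, m.who_moved, m.square_from, m.square_to
        FROM games AS g
        JOIN moves AS m ON m.game_id = g.game_id
        WHERE g.game_id = ?
        ORDER BY m.move_number, m.move_id
        """,
        (game_id,),
    )
    return cur.fetchall()


print(moves_for_game(conn, 1))
```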

Related

What kind of database or implementation approach to use in MySQL

I have created a database with a table where I'm saving the game activity of a football match (goals, assists, red/yellow cards, 11 m penalty kicks, ...). I want to add another game, e.g. basketball, but also two-player sports such as table tennis and tennis. The problem is that a lot of things are already implemented in this database, and it is MySQL.
My question is: what is the best way to do this? Create a new activity table for every sport (because the activities differ), or create a column that holds something like an XML document, with each sport module (Java) working with that XML? Or is there another solution?
Thank you for your response.
Do not store junk in your database. Normalize the data you put into it, always, for your own good.
Having been a programmer for a long time, I've never seen anything good come, a couple of years down the line, from storing XML or JSON blobs in the database. Imagine, for instance, that you want to change the structure of the stored XML/JSON data, or that there is an error in it. Now you need an extra tool to work with it.
I've worked in the sports stats business. You will want to normalize your data for every specific game type (maybe even give each one a separate database). They are all just slightly different, and soon you will want to be tracking game, set, and match for tennis, heats for races, and halves for soccer, as well as rankings within different game types and scoreboards for wildly different game systems.
You don't want to store unrelated data in the same table.
I would suggest a separate database per sport or, at a minimum, a separate table per sport. You obviously don't want to store a tennis player with the baseball players.
I would suggest making new tables for each sport. Name the DB something that relates to all of them, like Sports. Each table could be named after its sport.

Database + Class design guidance - Is this more complicated than it needs to be?

Novice relational database design question here. This is an "I feel like I'm doing it wrong. What is it?" question. What it boils down to is: how can I avoid unnecessary complexity when designing a relational database? I also realise that this is as much a question about class structure design, seeing as I'm using an ORM and I'm really thinking of these tables as objects.
Background
Let's say I want to record the results of a number of competitive games between an unspecified number of "players". These players all belong to a "leaderboard", so the leaderboard has multiple players and records multiple results. The "score" for each of the players is recorded at the end of each game and belongs to a single "result" instance (see image). A score is also parented by the player to which it belongs.
edit: An example
Each row in the leaderboard table represents a collection of players who together form a league. For example, all of the players who belong to a tennis league will have the same leaderboard_id in the player table.
A row in the results table represents a match that has taken place between players that belong to a particular league. So the leaderboard_id associated with our players is recorded in each result in this league. The results table doesn't hold the score of each player; rather, I've attempted to normalise (apologies for potentially inappropriate use of that term) these into a score table.
Bringing this all together: we have a league in the Leaderboard table, in which a game has taken place between two players. These players belong to the league in question. The two players have just played a match and their scores are recorded as rows in the score table. These rows are collected together under a single results_id, referring to a row in the results table.
Question
Q1. Does this make sense? Is there anything glaringly obvious that is wrong with this design?
As is, I can easily query the scores a particular player has accumulated over time, look up the players that played in a particular result, etc. However, there are some actions that really feel like they should be simple, but, to me, feel overly complicated.
For instance, suppose I want to return the most recent results for a particular player (i.e. not the player's scores, but rather the results that contain a score belonging to our player).
So, hand-wavy Q2. Maybe this is just lack of experience with SQL, but I shouldn't have to do a JOIN to look this up, should I? But then, what's the alternative? Should I create a one-to-many composition between player and results, so that I can simply look up a player's results?
With the current design, to find the most recent result for a player I would need to do something like this (Python, SQLAlchemy):
Score.query.join(Result, Player)\
    .filter(Player.id == player_id)\
    .order_by(Result.timestamp.desc()).first()
Is this bad?
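For comparison, here is roughly what that lookup looks like in plain SQL, run through Python's sqlite3. The schema image is not available, so the table and column names (`player`, `result`, `score`, `result_id`, `player_id`) and the sample rows are assumptions, and the leaderboard table is left out. The point of the sketch is that a single join from result to score is enough, because score already carries the player's id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE result (id INTEGER PRIMARY KEY, timestamp TEXT NOT NULL);
    CREATE TABLE score (
        id        INTEGER PRIMARY KEY,
        result_id INTEGER NOT NULL REFERENCES result(id),
        player_id INTEGER NOT NULL REFERENCES player(id),
        value     INTEGER NOT NULL
    );
    INSERT INTO player VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO result VALUES
        (10, '2024-01-01T10:00:00'),
        (11, '2024-02-01T10:00:00'),
        (12, '2024-03-01T10:00:00');
    INSERT INTO score (result_id, player_id, value) VALUES
        (10, 1, 21), (10, 2, 15),
        (11, 1, 18), (11, 3, 21),
        (12, 2, 21), (12, 3, 19);
    """
)


def latest_result_for(conn, player_id):
    """Most recent result containing a score that belongs to this player."""
    return conn.execute(
        """
        SELECT r.id, r.timestamp
        FROM result AS r
        JOIN score AS s ON s.result_id = r.id
        WHERE s.player_id = ?
        ORDER BY r.timestamp DESC
        LIMIT 1
        """,
        (player_id,),
    ).fetchone()


print(latest_result_for(conn, 1))
```

A join like this is ordinary relational usage rather than a sign of a design problem; the score table is already acting as the link between players and results.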

MYSQL Relational Schema Planning

I'm trying to figure out the best way to manage this data storage problem....
I have a table of players, teams, and competitions.
A team may be involved in let's say 3 competitions.
A player belongs to a team, but may only be eligible to play in 2 of the 3 competitions that his or her team plays in. Likewise another player of the same team may be eligible for all 3.
I don't want to add a column to the player table for each competition as I'm then moving away from the relational model. Do I need another table 'competition_eligiblity' - this seems like a lot of work though!
Any suggestions?
Thanks,
Alan.
Yes, you do need a table for competition eligibility.
It really is no more work to put it there. Actually, it will be less work:
Adding a new competition in the future will be a real pain if it involves adding a new column to a table.
If the competition eligibility is stored in columns, performing a query to get information on eligibility becomes a nightmare.
Suppose you wanted to list all the competitions players are eligible for. Here would be your query:
select player, 'competition1' as competition from players where competition1_eligible = 1
union all
select player, 'competition2' from players where competition2_eligible = 1
union all
select player, 'competition3' from players where competition3_eligible = 1
union all
select player, 'competition4' from players where competition4_eligible = 1
Sounds like fun, eh? Whereas, if you have an eligibility table, this information will be very simple to get.
Update: storing all the eligibility info in a single value would be even more of a nightmare, because imagine trying to extract that information back out of the string. That is beyond the limits of a sane SQL query.
Creating a new table is really a trivial piece of work, and you only have to do that once. Everything after that will be much easier if you do it that way.
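As a rough illustration of how simple the eligibility-table version is, here is a sketch using Python's sqlite3 in place of MySQL. The table names `player`, `competition` and `competition_eligibility`, the sample rows and the helper `eligible_competitions` are assumptions; the team table is omitted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE competition (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per (player, competition) pair the player is eligible for.
    CREATE TABLE competition_eligibility (
        player_id      INTEGER NOT NULL REFERENCES player(id),
        competition_id INTEGER NOT NULL REFERENCES competition(id),
        PRIMARY KEY (player_id, competition_id)
    );
    INSERT INTO player VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO competition VALUES (1, 'Cup'), (2, 'League'), (3, 'Shield');
    INSERT INTO competition_eligibility VALUES
        (1, 1), (1, 2),
        (2, 1), (2, 2), (2, 3);
    """
)


def eligible_competitions(conn):
    """List every (player, competition) pair, with no UNION per competition."""
    return conn.execute(
        """
        SELECT p.name, c.name
        FROM competition_eligibility AS e
        JOIN player AS p ON p.id = e.player_id
        JOIN competition AS c ON c.id = e.competition_id
        ORDER BY p.name, c.name
        """
    ).fetchall()


print(eligible_competitions(conn))
```

Adding a fourth competition is then an INSERT into `competition` plus a few eligibility rows; neither the schema nor the query changes.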

Database design to store lottery information

I am designing a system in which I need to store different types of lottery data (results + tickets).
Currently focusing on US Mega Millions and Singapore Pool Toto. They both have a similar format.
Mega Millions: Five different numbers from 1 to 56 and one number from 1 to 46.
Toto: 6 numbers from 1 to 45
I need to come up with an elegant database design to store the user tickets and corresponding results.
I thought of two ways to go about it.
Just store the six numbers in 6 columns.
OR
Create another table (many-to-many) which has ball-number and ticket_id.
I need to store the ball-numbers for the results as well.
For Toto, if your numbers match 4 or more winning numbers, you win a prize.
For Mega millions there is a similar process.
I'm looking for the pros and cons or possibly a better solution?
I have done a lot of research and paper work, but I am still confused which way to go about it.
Two tables
tickets
  ball_number
  ticket_id
player
  player_id
  ticket_id
// optional
results
  ball_number
  lottery_id
With two tables you could use a query like:
select ticket_id, count(ball_number) hits
from tickets
where ball_number in (wn1, wn2, ...) -- wn: winning number
group by ticket_id
having hits = x
Of course you could take winning numbers from lottery results table (or store them in the balls_table under special ticket numbers).
Preparing statistics would also be easier. With
select ball_number, count(ticket_id) as times_picked
from tickets
group by ball_number
order by times_picked desc
you could easily see which numbers are picked most often.
You might also use some field like lottery number to be able to narrow down the queries as most of them would concern just one lottery.
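To make the hit-counting query concrete, here is a small sketch with Python's sqlite3 standing in for MySQL. The sample tickets, the winning numbers and the helper `tickets_with_hits` are made up for illustration; the HAVING clause repeats the aggregate rather than using the alias so that it does not rely on any one database's alias rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tickets (
        ticket_id   INTEGER NOT NULL,
        ball_number INTEGER NOT NULL,
        PRIMARY KEY (ticket_id, ball_number)
    )
    """
)

# Hypothetical Toto-style tickets: six numbers from 1 to 45 each.
picks = {
    1: [3, 8, 21, 24, 27, 44],
    2: [3, 8, 21, 30, 31, 32],
    3: [1, 2, 4, 5, 6, 7],
    4: [3, 8, 21, 24, 40, 41],
}
conn.executemany(
    "INSERT INTO tickets (ticket_id, ball_number) VALUES (?, ?)",
    [(ticket, n) for ticket, numbers in picks.items() for n in numbers],
)


def tickets_with_hits(conn, winning_numbers, min_hits):
    """Return (ticket_id, hits) for tickets matching at least min_hits numbers."""
    placeholders = ", ".join("?" for _ in winning_numbers)
    sql = f"""
        SELECT ticket_id, COUNT(ball_number) AS hits
        FROM tickets
        WHERE ball_number IN ({placeholders})
        GROUP BY ticket_id
        HAVING COUNT(ball_number) >= ?
        ORDER BY ticket_id
    """
    return conn.execute(sql, [*winning_numbers, min_hits]).fetchall()


print(tickets_with_hits(conn, [3, 8, 21, 24, 27, 44], 4))
```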
One table
Using one table with a column for each number might make the queries much more complex, especially since, as I believe, the numbers are sorted and there are prizes for hitting all but one (or two) numbers. Then you might have to compare 1, 2, 3, ... with 2, 3, 4, ..., which is not as short or straightforward as the queries above.
One column
Storing all entries in a string in just one column violates all normalization practices, forces you to split the column for most of the queries and takes away all optimization carried out by the database. Also storing numbers requires less disk space than storing text.
Since this is a once a day thing, I think I'd store the data in an easy to edit, maintain, visualize way. Your many-many approach would work. Mainly, I'd want it easy to find users that chose a particular ball_number.
users
  id
  name
drawings
  id
  type # Mega Millions or Singapore (maybe subclass Drawing)
  drawing_on
wining_picks
  drawing_id
  ball_number
ticket
  drawing_id
  user_id
  correct_count
picks
  id
  ticket_id
  ball_number
Once you get the numbers in, find all user_ids that pick a particular number in a drawing
Get the drawing by date
drawing = Drawing.find_by_drawing_on(drawing_date)
Get the users by ball_number and drawing.
picked_1 = User.picked(1,drawing)
picked_2 = User.picked(2,drawing)
picked_3 = User.picked(3,drawing)
This is a scope on User
class User < ActiveRecord::Base
  def self.picked(ball_number, drawing)
    joins(:tickets => :picks).where(:picks => {:ball_number => ball_number}, :tickets => {:drawing_id => drawing.id})
  end
end
Then do quick array intersections to get the user_ids that got 3, 4, 5 or 6 picks correct. You'd loop through the combinations of the winning numbers.
For example if the winning numbers were 3,8,21,24,27,44
some_3_correct_winner_ids = picked_3 & picked_8 & picked_21 # Array intersection
For each winner - update the ticket with correct count.
I may potentially store winners separately, but with an index on correct_count, and not too much data in tickets, this would probably be ok for now.
I would just concatenate them using a convention and store them in one column.
Something like '10~20~30~40~50~!60'
~ separates numbers
! indicates special number ( powerball, etc)
Have a sql table valued function split the result if you really need to have it in columns.
Firstly, let me say that I'm an Oracle person, not a MySQL person.
Secondly, I'd usually say to go for a normalised design, but I'm tempted here to think of a very unconventional alternative which I'll float out here for comment.
How about denormalising it to the extent of using one column for all the number choices?
ticket_id integer
nums bit(56)
special_number integer
It would be a pretty compact representation, and you could perhaps use bit-wise operations to find the winners or potential winners.
No idea if this is workable ... open for comments.
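To give the idea something concrete to comment on, here is a sketch of the bitmask matching done in application code (Python). It assumes bit n-1 represents ball number n; `to_mask` and `count_hits` are hypothetical helpers. On the database side, MySQL's bitwise `&` operator together with its `BIT_COUNT()` function could presumably do the same comparison in a query, though that is not tried here.

```python
def to_mask(numbers):
    """Pack ball numbers (1-based) into an integer with one bit per number."""
    mask = 0
    for n in numbers:
        mask |= 1 << (n - 1)
    return mask


def count_hits(ticket_mask, winning_mask):
    """Count the numbers a ticket shares with the winning draw."""
    return bin(ticket_mask & winning_mask).count("1")


winning = to_mask([3, 8, 21, 24, 27, 44])
ticket = to_mask([3, 8, 21, 30, 31, 32])
print(count_hits(ticket, winning))
```

The special number would still be compared separately, since it comes from a different range and must not be mixed into the same mask.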

Is this an acceptable situation to store an array in a database?

I've been developing an application and I've run into a situation where I would like to take a snapshot of the current data.
For example, in this application, users will have varying stats and be able to enter matches. How they place in the matches depends on their stats. When the matches are determined, the application will pull all of the users' current stats and determine their points to see who wins.
Now, after a match is over, I want users to be able to view past matches, and the problem arises when I want to display what the participants' points were at the time of the match. I would think it would be acceptable to store an array structured like so:
array(
    array(username, points),
    array(username, points),
    etc.
)
Now normalizing the data may be the best practice normally but in this situation:
There can be anywhere between 2 and 25 participants in a match.
The data will never be updated, only read.
I would think having it in an array structure in the database will save me time from having to construct an array in my back-end code.
EDIT: The data is not permanent. Match records will be deleted 7 days after the match has ended.
Can anyone tell me if this solution will cause any problems?
EDIT
I would be saving the data after serializing the array so in my database I would have a table called 'matches' and it would have a column called 'results'.
The rows for this column would contain serialized arrays. So if the array looked as such:
$array["a"] = "Foo";
$array["b"] = "Bar";
$array["c"] = "Baz";
$array["d"] = "Wom";
Then the row in the database would look like this:
a:4:{s:1:"a";s:3:"Foo";s:1:"b";s:3:"Bar";s:1:"c";s:3:"Baz";s:1:"d";s:3:"Wom";}
This solution wouldn't pose any problems in the short term - but say you wanted to eventually add in functionality to show all of the games a user has played in, or their highest scoring games... having this data in an inaccessible-from-sql array would not allow you to have those features.
I'm thinking a table like this would be perfect:
CREATE TABLE game_scores(
    id int AUTO_INCREMENT NOT NULL PRIMARY KEY,
    game_id int,
    user_id int,
    final_score int,
    KEY(game_id), KEY(user_id)
);
At the end of every game, you'd simply insert a row for every user that was playing that round with their corresponding score and the game id. Later, you'd be able to select all of the scores for a certain game:
SELECT * FROM game_scores WHERE game_id=?
... or show all scores by a certain user:
SELECT * FROM game_scores WHERE user_id=?
etc. Have fun with it!
If you're really committed to the use cases you've outlined in the question along with the qualification in your comment to Sean Johnson, then I don't see any problems with your approach.
I still might qualify that by suggesting that you normalize the data if you think there's a chance you'll want to mine historical information. Dumping an array into the database as a relatively long-lived sort of cache might still make sense. In other words, store it in both formats: the main line of the use case you've outlined would just hit the array format, but you'd still have the data in a queryable form if you ever wanted it.