MySQL - storing a range of values

I have a resource that has an availability field that lists what hours of the day it's available for use.
E.g. res1 is available between hours 0-8 and 19-23 of a day; the value here can be a comma-separated list of hour ranges, e.g. 0-23 for 24-hour access, 0-5,19-23 or 0-5,12-15,19-23.
What's the best way to store this? Is char a good option? When the resource is being accessed, my PHP needs to check the current hour against the hours defined here and then decide whether to allow the access or not. Can I ask MySQL to tell me whether the current hour is in the range specified here?

I'd store item availability in a separate table, where each row would have (given your example):
id, startHour, endHour, resourceId
And I'd just use integers for the start and end times. You can then do queries against a join to see availability given a certain hour of the day using HOUR(NOW()) or what have you.
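A minimal sketch of that lookup, assuming a resources table plus an availability table with the columns above (the names are illustrative, not a fixed schema):

# resources available in the current hour
SELECT r.id
FROM resources r
JOIN availability a ON a.resourceId = r.id
WHERE HOUR(NOW()) BETWEEN a.startHour AND a.endHour;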
(On the other hand, I would've preferred a non-relational database like MongoDb for this kind of data)

1) create a table for resource availability, normalized.
CREATE TABLE res_avail
(
ra_resource_id int,
ra_start TIME,
ra_end TIME
# add appropriate keys for optimization here
);
2) populate with ($resource_id, '$start_time', '$end_time') for each range in your list (use explode())
3) then, you can query: (for example, PHP)
sql = "SELECT ra_resource_id FROM res_avail where ('$time' BETWEEN ra_start AND ra_end)";
....
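For example, the "0-5,19-23" availability from the question could be stored and checked like this (a sketch; the exact boundary times are an assumption):

# resource 1 is available 0-5 and 19-23
INSERT INTO res_avail (ra_resource_id, ra_start, ra_end) VALUES
    (1, '00:00:00', '05:59:59'),
    (1, '19:00:00', '23:59:59');

# is resource 1 available right now?
SELECT ra_resource_id
FROM res_avail
WHERE ra_resource_id = 1
  AND TIME(NOW()) BETWEEN ra_start AND ra_end;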

I know this is an old question, but since v5.7 MySQL supports storing values in JSON format. This means you can store all ranges in one JSON field. This is great if you want to display opening times in your front-end using JavaScript. But it's not the best solution when you want to show all places that are currently open, because querying on a JSON field means a full table scan. It would be okay if you only need to check one place at a time, for example when you load a page showing the details of one place and display whether it's open or closed.
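If you do go the JSON route on a newer server, a rough sketch of the check could look like this (it uses JSON_TABLE, which requires MySQL 8.0+, and assumes a resources table with an availability JSON column holding an array of start/end hour ranges; it still scans the table, as noted above):

# e.g. availability = '[{"start": 0, "end": 5}, {"start": 19, "end": 23}]'
SELECT r.id
FROM resources r,
     JSON_TABLE(r.availability, '$[*]'
         COLUMNS (start_hour INT PATH '$.start',
                  end_hour   INT PATH '$.end')) AS ranges
WHERE HOUR(NOW()) BETWEEN ranges.start_hour AND ranges.end_hour;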

Related

Rails - Saving Daily Metrics

I'm currently developing a Rails application, on top of PostgreSQL, that stores daily data for our company. We run ads on Facebook, and we have a few hundred ads running at any one time. I pull metrics every day, and import to my application, which then either creates or updates based on if it exists. However, I want to be able to see daily performance over the course of, say a week or month. What would be the easiest way to accomplish this?
My facebook_ad model has X amount of rows, 1 for each ad campaign. Each column denotes a specific metric, i.e. amount spent, clicks, etc. Should I create a new table for each date? Is there a way to timestamp every entry and include the time in my queries? I've made good progress up until here, and no amount of searching has brought me to a strategy I could use.
Side note: we are hoping to get access to their API, which would probably solve most of this. But we want to build something in the interim, so we can be as efficient as possible until then, which could be 6 months or more.
Edited:
I want to query and graph the data based on the daily data. For example, grab the metrics from 10/01/14 - 10/08/14 for one ad, and be able to see 10/01/14: MetricA = 1, MetricB = 2; 10/02/14: MetricA = 4, MetricB = 5; 10/03/14: MetricA = 6, Metric B = 3, etc. We want to be able to see trends and see how changes affect performance.
I would definitely not recommend creating a new table for each date -- that would be a data management nightmare. There shouldn't be any reason you can't have each ad campaign in the same table based on what you've said above. You could have a created and updated column in the table which defaults to now(), and if you update it for any reason, set the updated column to now() again. (I like to add those columns to just about every table I create -- it's often useful for a variety of queries).
You could then query that table based on the desired timeframe to get your performance statistics. Depending upon the exact nature of what you want to query, Window Functions may prove to be quite useful.
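As a rough sketch of that layout in PostgreSQL (the table and column names here are assumptions, not the poster's actual schema): one row per ad per day, queried over a date range, with a window function for a running total:

-- one row per ad per day
CREATE TABLE facebook_ad_metrics (
    ad_id        integer   NOT NULL,
    metric_date  date      NOT NULL,
    amount_spent numeric,
    clicks       integer,
    created_at   timestamp NOT NULL DEFAULT now(),
    updated_at   timestamp NOT NULL DEFAULT now(),
    PRIMARY KEY (ad_id, metric_date)
);

-- daily values for one ad over a week, with a running spend total
SELECT metric_date, amount_spent, clicks,
       SUM(amount_spent) OVER (ORDER BY metric_date) AS spend_to_date
FROM facebook_ad_metrics
WHERE ad_id = 42
  AND metric_date BETWEEN '2014-10-01' AND '2014-10-08'
ORDER BY metric_date;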

MySQL Partitioning, Delete old data from multiple related tables

I am new to MySQL partitioning, therefore any example will be appreciated.
I am trying to create a sort of ageing mechanism for data that is distributed between several MyISAM tables.
My question will actually include several sub-questions.
The relevant tables are:
The first table contains raw data with a high input frequency (each record has an auto-incremented id).
The second table contains processed results; there is one result record per raw data record (the result record contains the auto-incremented id of its source raw data record).
Questions:
1) I need to be able to partition the raw data table and the result data table similarly, so that both of them include only 10 weeks of data in a single partition (each raw data record contains a unix timestamp field). How do I do it? Can someone write a small example case for two such tables?
2) I want to be able to change the 10 weeks constraint on the fly.
3) I want that whenever the current partition is filled or a new partition is created, the previous (10 weeks old) partition is deleted automatically.
4) I don't want the auto-increment id integer to overflow. As far as I understand, the ids are unique for the partition only, so if I am not wrong the auto-increment id will start from zero for the next partition? But what if the previous partition still exists - will I have 2 duplicated ids, and how do I know to reference only the last id when I present a result record?
5) I want to load raw data using LOAD DATA INFILE ... instead of multiple inserts; is MySQL's partitioning functionality affected?
6) And the last question: would you suggest some other approach to implement the ageing mechanism? (I am writing a Java product that processes around 1 GB of raw data per day and stores the results in MySQL.)
It's hard to give a real answer on this question since it depends on your data. But let me give you some things to think about.
I assume we're talking about some kind of logs with recent data (so not spanning multiple years). You can partition by range. You could add one field to your table with the year/week number (i.e. 201201, 201202, etc.). If this question is related to your question about importing into multiple tables, you can easily do this in that import script.
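A small example of that idea, assuming MySQL 5.x (where partitioned MyISAM tables are supported) and a year/week column that the import script fills in; the table and column names here are made up:

CREATE TABLE raw_data (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
    recorded_at INT UNSIGNED NOT NULL,  # the unix timestamp field from the question
    yearweek    INT NOT NULL,           # e.g. 201201, set by the import script
    payload     VARCHAR(255),
    PRIMARY KEY (id, yearweek)
) ENGINE=MyISAM
PARTITION BY RANGE (yearweek) (
    PARTITION p201201 VALUES LESS THAN (201202),
    PARTITION p201202 VALUES LESS THAN (201203),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);

# dropping a whole partition is much cheaper than a DELETE
ALTER TABLE raw_data DROP PARTITION p201201;

The result table could be partitioned the same way by copying the raw record's yearweek into each result row.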
On the fly as in repartitioning your data on the fly (70GB?)? I would not recommend it. But you could do it if you had the week number in there. If you later want to change it to 12 days, you could add a column for the date and partition by that.
Well it won't be deleted automatically but a cron job can handle that right? Just check how many partitions there are, and if there are 3(?) delete the first one.
The partition needs to have a primary index on the field that you partition by (if you want to use auto increment). Therefore you can never fully rely on the auto-increment id alone. I don't see a way around this.
I'm not sure what you mean.
If your data is just some logs in chronological order then you might just use separate tables for each period. Then, before you start the new period (at 00:00), check the last id of the last table, create a new table and set the auto increment to that value + 1. Then your import will decide when a new period begins, so it can be easily changed. Your import script can use a small table where it stores the next period.
LOAD DATA is really quite fast. I would just have two steps (in no particular order): LOAD DATA and then 'delete .. where date < 10 weeks'. Auto-increment will go on for as long as the datatype you're using allows. If you wanted to be super careful you could push it back to zero periodically.
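A sketch of those two steps (the file name, table name, and column list are assumptions):

# bulk load the new raw rows
LOAD DATA INFILE '/tmp/raw_import.csv'
INTO TABLE raw_data
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(recorded_at, payload);

# keep only the last 10 weeks
DELETE FROM raw_data
WHERE recorded_at < UNIX_TIMESTAMP(NOW() - INTERVAL 10 WEEK);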
Once the data is in the 'raw' table, run your routine to create the 'processed' table. We use a very similar process where I work. We keep a separate table that has 'write' and 'parse' pointers to all of our 'raw' tables. As new data comes in and gets parsed, the appropriate row pointers get set. If the 'raw' table gets truncated you can reset the 'write' pointer but leave the 'parse' pointer. (We store the offset in another table when this happens - just to be sure.)
I would also recommend creating an index on each of the related columns; that also improves performance when deleting old data from multiple related tables, since you compare index numbers rather than strings.
I wonder if your tables are being sorted or not.

Tridion 2009 embedded metadata storage format in the broker

I'm fairly new to Tridion and I have to implement functionality that will allow a content editor to create a component and assign multiple date ranges (available dates) to it. These will need to be queried from the broker to provide a search functionality.
Originally, this only required a single start and end date, and so it was implemented as individual metadata fields.
I am proposing to use an embedded schema within the schema's 'available dates' metadata field to allow multiple start and end dates to be assigned.
However, as the field now allows multiple values, the data is stored in the broker as comma-separated values in the 'KEY_STRING_VALUE' column rather than as a date value in the 'KEY_DATE_VALUE' column, as it was when only single start and end values were allowed.
eg.
KEY_NAME | KEY_STRING_VALUE
end_date | 2012-04-30T13:41:00, 2012-06-30T13:41:00
start_date | 2012-04-21T13:41:00, 2012-06-01T13:41:00
This is now causing issues with my broker querying as I can no longer use simple query logic to retrieve the items I require for the search based on the dates.
Before I start to write C# logic to parse these comma separated dates and search based on those, I was wondering if anyone had had similar requirements/experiences in the past and had implemented this in a different way to reduce the amount of code parsing required and to use the broker querying to complete the search.
I'm developing this on Tridion 2009 but using the 5.3 Broker (for legacy reasons) so the query currently looks like this (for the single start/end dates):
query.SetCustomMetaQuery("(KEY_NAME='end_date' AND KEY_DATE_VALUE>'" + startDateStr + "') AND (ITEM_ID IN (SELECT ITEM_ID FROM CUSTOM_META WHERE KEY_NAME='start_date' AND KEY_DATE_VALUE<'" + endDateStr + "'))");
Any help is greatly appreciated.
Just wanted to come back and give some details on how I finally approached this should anyone else face the same scenario.
I proposed the set number of fields to the client (as suggested by Miguel) but the client wasn't happy with that level of restriction.
Therefore, I ended up implementing the embeddable schema containing the start and end dates which gave most flexibility. However, limitations in the Broker API meant that I had to access the Broker DB directly - not ideal, but the client has agreed to the approach to get the functionality required. Obviously this would need to be revisited should any upgrades be made in the future.
All the processing of dates and the available periods were done in C# which means the performance of the solution is actually pretty good.
One thing that I did discover that caused some issues was that if you have multiple values for the field using the embedded schema (ie in this case, multiple start and end dates) then the meta data is stored in the KEY_STRING_VALUE column in the CUSTOM_META table. However, if you only have a single value in the field (i.e. one start and end date) then these are stored as dates in the KEY_DATE_VALUE column in the same way as if you'd just used single fields rather than an embeddable schema. It seems a sensible approach for Tridion to take but it serves to make it slightly more complicated when writing the queries and the parsing code!
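For anyone taking the same direct-to-database route, the raw rows can be pulled for parsing in C# with a query along these lines (a sketch only; the column names come from the question above, and the actual broker schema may differ):

SELECT ITEM_ID, KEY_NAME, KEY_STRING_VALUE, KEY_DATE_VALUE
FROM CUSTOM_META
WHERE KEY_NAME IN ('start_date', 'end_date');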
This is a complex scenario, as you will have to go through all the DCPs and parse those strings to determine whether they match the search criteria.
There is a way you could convert that metadata (comma-separated) into single values in the broker, but the names of the fields need to be different: Range1, Range2, ..., RangeN.
You can do that with a deployer extension, where you change the XML structure of the package and convert each of those strings into different values (1, 2, ..., n).
This extension can take some time if you are not familiar with deployer extensions, and it doesn't solve your scenario 100%.
The problem with this is that you still have to apply several conditions to retrieve those values, and there is always a limit you have to set (versus the user, who can add as many values as they want).
Sample:
query.SetCustomMetaQuery((KEY_NAME='end_date1'
query.SetCustomMetaQuery((KEY_NAME='end_date2'
query.SetCustomMetaQuery((KEY_NAME='end_date3'
query.SetCustomMetaQuery((KEY_NAME='end_date4'
Probably the fastest and easiest way to achieve that is, instead of using a multi-value field, to use different fields. I understand that it is not the most generic scenario and there are business-requirements implications, but it can simplify the development.
My previous comments are in the context of using only the Broker API, but you can take advantage of a search engine if one is part of your architecture.
You can index the Broker Database and massage the data.
Using the Search Engine API you can extract the ids of the Components/Component Templates and use the Broker API to retrieve the proper information

MySQL and Scheduled Updates by User Preference?

I'm developing an application that
stores an e-mail address (to a user) in a table.
stores the number of days the user would like to stay in the table.
takes the user off the table when the number of days is up.
I don't really know how to approach this, so here are my questions:
1) Each second, do I have the application check every table entry against the time that's currently stored in, let's say, the time_left column?
2) Wouldn't (1) be inefficient if I'm expecting a significant number (10,000+) of users?
3) If not (2), what's the best algorithm to implement for such a task?
4) What's the name of what I'm trying to do here? I'd like to do some more research on it before and while I'm writing the script, so I need a good search query to start with.
I plan on writing this script in Perl, although I'm open to suggestions with regards to language choice, frameworks, etc... I'm actually new to web development (both on the back-end and front-end), so I'd appreciate it if you could advise me precisely.
Thank you!
After posting, Topener asked a valid question:
Why would you store users if they won't get requested?
Assume the user is just sitting in the database.
Let's say I'm using the user's e-mail address every 5 minutes from the time the user was added to the database (so if the user's entry was born at 2:00PM-October 18, the user would be accessed at 2:05, 2:10, etc...).
If the user decides that they want out of the database in 10 days, that means their entry is being accessed normally (every 5 minutes from 2:00PM-October 18) until 2:00PM-October 28.
So to clarify, based on this situation:
The system would have to constantly compare the current time with the user's expiration date, wouldn't it?
You should not store the time_left variable; you should store validTo instead. This way, whenever the user is requested from the database, you can check whether it is still valid.
If not, then do whatever you want with it.
This approach won't require any cron jobs, nor will it cost you extra load.
Hey Mr_spock, I like the above answer from Topener. Instead of storing the number of days the user would like to be valid, store the day the user would like to be removed.
Adding a field like validToDate, which would be a DATETIME field type, you can do a query like
delete from tablename where validToDate <= NOW()
where
the query above is the SQL query to run
tablename is the name of the table in question
NOW() is a valid SQL function that returns the current DATETIME
validToDate is a field of type DATETIME
This has whatever efficiency the SQL server promises; I think it is fairly good.
You could write a separate program/script which runs the delete query on a set interval. If you are on a Linux machine you can create a cron job to do it. Doing it every second may become very resource-intensive for slower machines and larger tables, but I don't believe that will become an issue for a simple delete query.
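If you would rather keep the scheduling inside MySQL than use a cron job, the same delete can be run by MySQL's event scheduler - a sketch, where the subscribers table and its columns are illustrative, not from the question:

CREATE TABLE subscribers (
    email       VARCHAR(255) NOT NULL PRIMARY KEY,
    validToDate DATETIME NOT NULL
);

SET GLOBAL event_scheduler = ON;

# purge expired rows every 5 minutes
CREATE EVENT purge_expired_subscribers
ON SCHEDULE EVERY 5 MINUTE
DO
    DELETE FROM subscribers WHERE validToDate <= NOW();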

Which is a better design, storing days in database

I would like to let the user see the dates for a course. For example, a computer course will be held on
1/9, 2/9, 3/9.
So, how should I store these days in the database?
one column that stores a string like this:
1/9/2011,2/9/2011,3/9/2011
or a separate table like this:
event_id date
1 1/9/2011
1 2/9/2011
1 3/9/2011
Thank you.
The separate table is the right design, because that's how your schema will be normalized. A table column should hold a single type of value.
First normal form of database normalization states:
Every row-and-column intersection contains exactly one value from the
applicable domain (and nothing else).
Almost every database under the sun has a DATE datatype, which will do exactly this.
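A minimal sketch of that normalized design (the names are illustrative):

CREATE TABLE event_dates (
    event_id   INT  NOT NULL,
    event_date DATE NOT NULL,
    PRIMARY KEY (event_id, event_date)
);

# all dates for course 1, in order
SELECT event_date FROM event_dates WHERE event_id = 1 ORDER BY event_date;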
Also, use data normalisation: the separate table is the way to go.
I would have a table with (at least) 3 columns, using the date type for the start and end date
event_id start_date end_date
1 1/9/2011 3/9/2011
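As a sketch, assuming the course really does run on consecutive days so a start/end pair is enough (the names are illustrative):

CREATE TABLE events (
    event_id   INT  NOT NULL PRIMARY KEY,
    start_date DATE NOT NULL,
    end_date   DATE NOT NULL
);

# which courses run on 2 September 2011?
SELECT event_id FROM events WHERE '2011-09-02' BETWEEN start_date AND end_date;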
A column should store exactly one value only. Separate table.
As a CSV, how can you
find the course starting 2/9/2011?
order by date?
delete 2/9/2011, add 4/9/2011, etc.?
Both methods have their own benefits.
The first method suits a simple data structure: your data size could be small, your project scope is small, and you don't wish to spend too much effort on it. The downside is that it is hard to maintain in the long run.
The second method is better for normalization, but it requires more code / a JOIN to get the same piece of information (which you can do in one step with the first method).
Storing the date as a string in d/m/Y format is bad; let the presentation layer determine the locale. The ISO format YYYY-MM-DD is the better choice.