I have a table as follows
Table item {
ID - Primary Key
content - String
published_date - When the content was published
create_date - When this database entry was created
}
Every hour (or specified time interval) I run a process to update this table with data from different sources (websites). I want to display the results according to the following rules.
1. The entries created each time the process runs should be grouped together. So the entries from the 2nd process run will always be after the entries from the first process run even if the published_date of an entry from the first run is after the published_date of an entry from the 2nd run.
2. Within the grouping by run, the entries are sorted by published_date.
3. Another restriction is that I prefer that data from the same source not be grouped together. If I do the sort by create_date, published_date I will end up with data from source a, data from source b etc. I prefer that the data within each hour be mixed up for better presentation
If I add a column to this table and store a counter that increments each time the process runs, I can write a query that sorts first by the counter and then by published_date. Is there a way to do it without adding a field? I'm using Hibernate over MySQL.
e.g.
Hour 1 (run 1)
4 rows collected from site a (rows 1-4)
3 rows collected from site b (rows 5-7)
Hour 2 (run 2)
2 rows collected from site a (rows 8-9)
3 rows collected from site b (rows 10-12)
...
After each run, new records are added to the database from each website. The create date is the time when the record was created in the database. The published date is part of the content and is read in from the external source.
When the results are displayed I would like rows to be grouped together based on the hour (run) they were collected in. So rows 1-7 would be displayed before rows 8-12. Within each hourly grouping, I would like to sort the results by published date (timestamp). This is necessary so that the posts collected in that hour are not grouped by site but rather mixed in with each other.
If you add a counter, you can definitely order items by counter first and then published date:
from Item item order by item.counter desc, item.publishedDate
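As a rough illustration of the counter approach, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The column name run_counter and the sample rows are made up for the example. It sorts ascending on the counter so that run 1 comes before run 2, as rule 1 asks; use desc, as in the HQL above, if the newest run should come first.

```python
import sqlite3

# In-memory stand-in for the item table; run_counter is the assumed extra column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item ("
    " id INTEGER PRIMARY KEY,"
    " content TEXT,"
    " published_date TEXT,"
    " run_counter INTEGER)"
)
rows = [
    # (content, published_date, run_counter) -- illustrative data only
    ("a1", "2020-01-01 09:10", 1),
    ("a2", "2020-01-01 09:40", 1),
    ("b1", "2020-01-01 09:20", 1),
    ("a3", "2020-01-01 09:30", 2),  # published before a2, but collected in run 2
    ("b2", "2020-01-01 10:05", 2),
]
conn.executemany(
    "INSERT INTO item (content, published_date, run_counter) VALUES (?, ?, ?)",
    rows,
)

# Group by run first, then interleave the sources by published date.
ordered = [
    r[0]
    for r in conn.execute(
        "SELECT content FROM item ORDER BY run_counter, published_date"
    )
]
print(ordered)  # ['a1', 'b1', 'a2', 'a3', 'b2']
```

Within run 1 the rows from site a and site b are interleaved by published date, and a3 stays in run 2 even though it was published before a2.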
I am trying to create a database application for daily tasks. I am converting a paper form I use.
Each row is a task and each column is a date. Every day I go through and complete the task and initial the cell that corresponds with the date. But not every task is required daily. I included an example of how it will appear in the browser.
How do I structure the database?
You could do:
TasksDefinition
    Id        PK
    TaskName  NN
TasksWhen
    Id        PK
    TaskId    FK, TasksDefinition.Id
    Day       NN --> the day on which task TaskId should be completed
History
    Id        PK
    TaskId    FK, TasksDefinition.Id
    Date      NN
    Done      NN, boolean, default False
PK: primary key
FK: foreign key
NN: not null
Each task is defined in TasksDefinition
TasksWhen stores the day(s) on which each task should be completed: one entry per task per day of the month (1 to 31), or 0-6 if you want to use weekdays. Using a table allows a task to be due on several days. For example, task X due on days 1, 4, and 28 would require 3 entries in TasksWhen.
At 00:01 each day, your application does the following:
Add each task that has to be completed that day to the History table, with the current date and Done == False.
When you have completed a task, change History.Done to True.
When you build your interface, you query the history table only. This will give you which tasks have been done (or not) on each day. The status of completion goes to the History table as well.
You can use day of month or week day to specify which tasks must be done on each day. You could even use a mix of both. As long as your application can figure it out, you would be fine.
The monthly report is built from data in the history table.
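The daily job described above could be sketched roughly as follows, using Python's built-in sqlite3 module as a stand-in for whatever database you use. The table and column names follow the layout above; the helper names schedule_tasks and mark_done, the sample tasks, and the date are made up for illustration, and the sketch assumes Day holds the day of the month.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TasksDefinition (Id INTEGER PRIMARY KEY, TaskName TEXT NOT NULL);
CREATE TABLE TasksWhen (
    Id INTEGER PRIMARY KEY,
    TaskId INTEGER NOT NULL REFERENCES TasksDefinition(Id),
    Day INTEGER NOT NULL            -- day of month, 1 to 31
);
CREATE TABLE History (
    Id INTEGER PRIMARY KEY,
    TaskId INTEGER NOT NULL REFERENCES TasksDefinition(Id),
    Date TEXT NOT NULL,
    Done INTEGER NOT NULL DEFAULT 0 -- 0 = False, 1 = True
);
INSERT INTO TasksDefinition (Id, TaskName) VALUES (1, 'Task X'), (2, 'Task Y');
INSERT INTO TasksWhen (TaskId, Day) VALUES (1, 1), (1, 4), (1, 28), (2, 4);
""")

def schedule_tasks(conn, today):
    """Add one History row (Done = 0) for every task due on today's day of month."""
    conn.execute(
        "INSERT INTO History (TaskId, Date) "
        "SELECT TaskId, ? FROM TasksWhen WHERE Day = ?",
        (today.isoformat(), today.day),
    )

def mark_done(conn, task_id, day):
    """Flag a scheduled task as completed for the given date."""
    conn.execute(
        "UPDATE History SET Done = 1 WHERE TaskId = ? AND Date = ?",
        (task_id, day.isoformat()),
    )

today = date(2024, 3, 4)          # hypothetical run date
schedule_tasks(conn, today)       # what the 00:01 job would do
mark_done(conn, 1, today)         # the user initials Task X

history = conn.execute(
    "SELECT TaskId, Date, Done FROM History ORDER BY TaskId"
).fetchall()
print(history)  # [(1, '2024-03-04', 1), (2, '2024-03-04', 0)]
```

The interface and the monthly report would then read only from History, as described above.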
Hi, I would like to find a query for the following. I am trying to calculate a value from two columns, but only for the rows that share a selected value in a third column.
Unfiltered
Start Time | Disconnect Time | Signalling IP
12:59:00.3 | 13:26:03.3 | 1.1.1.1
10:59:00.3 | 11:03:03.3 | 2.2.2.2
19:59:00.3 | 20:02:03.3 | 1.1.1.1
Filtered
Start Time | Disconnect Time | Signalling IP
12:59:00.3 | 13:26:03.3 | 1.1.1.1
19:59:00.3 | 20:02:03.3 | 1.1.1.1
If you see the table above, I want the selected IP only which is 1.1.1.1, and then from there, calculate the total duration of time from the Start Time and Disconnect Time for that Egress IP.
So column 3 has multiple values, however I need to select the same value, then from there calculate the sum of column 1 and 2 based on column 3.
Please let me know if you have anything in mind, as I have tried multiple queries but can't get the correct one
To calculate the difference between two times,
you can use TIME_TO_SEC to convert each time value to seconds
and subtract the start time from the end time to get the duration in seconds.
You can turn it back into time format with SEC_TO_TIME.
Example:
select
column3,
SEC_TO_TIME(sum(TIME_TO_SEC(column2) - TIME_TO_SEC(column1)))
from
your_table
group by column3
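To check the arithmetic outside the database, the same per-IP total can be sketched in plain Python on the sample rows from the question. The function name total_duration_by_ip is made up for this example, and the sketch assumes no call crosses midnight.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows from the question: (start time, disconnect time, signalling IP).
calls = [
    ("12:59:00.3", "13:26:03.3", "1.1.1.1"),
    ("10:59:00.3", "11:03:03.3", "2.2.2.2"),
    ("19:59:00.3", "20:02:03.3", "1.1.1.1"),
]

def total_duration_by_ip(rows):
    """Sum (disconnect - start) per IP; assumes no call crosses midnight."""
    fmt = "%H:%M:%S.%f"
    totals = defaultdict(timedelta)
    for start, end, ip in rows:
        totals[ip] += datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return dict(totals)

totals = total_duration_by_ip(calls)
print(totals["1.1.1.1"])  # 0:30:06
print(totals["2.2.2.2"])  # 0:04:03
```

The two calls for 1.1.1.1 last 27:03 and 3:03, which is where the 30:06 total comes from.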
I'm afraid I'm stuck with this situation:
I have a MySQL table with just 3 columns: ID, CREATED, TOTAL_VALUE.
A new TOTAL_VALUE is recorded roughly every 60 seconds, so about 1440 times a day.
I am using PHP to generate some CanvasJS code that plots the MySQL records as a line graph, so that I can see how TOTAL_VALUE changes over time.
It works great for displaying 1 day's worth of data, but for 1 week (7*1440 = 10080 plot points) things get really slow.
And a date range of, for example, 1-JAN-2016 to 1-SEP-2016 just leads to timeouts in the PHP script.
How can I write some MySQL that still selects records within a date range but limits the rows returned to, say, at most 1000?
I need to optimize this by limiting the number of data points that need to be plotted.
Can MySQL do some clever stuff where it decides to skip 1 every so many rows and return 1000 averaged values - this so that my line graph would by approximation still be correct- but using fewer data points?
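One way to get averaged points like this is to split the date range into equal time buckets and return one average per bucket. The sketch below shows the idea with Python's built-in sqlite3 module standing in for MySQL. It assumes CREATED is stored as epoch seconds, and the table name readings, the function name downsample, and the synthetic data are made up for the example. In MySQL the bucket expression would need integer division (DIV or FLOOR) rather than a plain /.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    " id INTEGER PRIMARY KEY,"
    " created INTEGER,"      # assumption: epoch seconds
    " total_value REAL)"
)
# One week of synthetic per-minute readings (7 * 1440 = 10080 rows).
conn.executemany(
    "INSERT INTO readings (created, total_value) VALUES (?, ?)",
    ((minute * 60, float(minute % 100)) for minute in range(7 * 1440)),
)

def downsample(conn, start, end, max_points=1000):
    """Return at most max_points (bucket start, average value) rows for [start, end)."""
    span = end - start
    bucket = max(1, -(-span // max_points))  # ceiling division: seconds per bucket
    return conn.execute(
        "SELECT MIN(created), AVG(total_value) FROM readings "
        "WHERE created >= ? AND created < ? "
        "GROUP BY (created - ?) / ? "          # integer division picks the bucket
        "ORDER BY 1",
        (start, end, start, bucket),
    ).fetchall()

points = downsample(conn, 0, 7 * 1440 * 60)
print(len(points))  # never more than max_points
```

Because the bucket width is the range divided by max_points, rounded up, the query can never return more than max_points rows, whatever the date range.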
I have a table with the following structure:
Entry ID | Date | Approved
Whenever a new entry is made, Entry ID auto increments and date is set to whenever the entry was made through the web application. These entries are not necessarily made every day, so there are gaps between dates.
I need to find all "missing" entries, meaning that there is no entry for that date. For instance, if there was an entry for 2015-06-01 and the next one didn't come until 2015-06-07, I need a query that returns the list of dates from 2015-06-02 to 2015-06-06 and an indication of their approved status from that field. I've been looking for a while but can't seem to find a method to get a list of entries that don't exist. Is there a method for this, or should I restructure?
Create a temp table with all possible dates and do
SELECT Date FROM temp_table WHERE Date NOT IN (SELECT Date FROM your_table);
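A minimal sketch of the calendar-table idea, with Python's built-in sqlite3 module standing in for the real database. The names entries, entry_date and all_dates are made up for the example; the dates come from the question. Note that NOT IN returns no rows at all if the subquery yields a NULL, so the date column is assumed to be non-null here.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    " entry_id INTEGER PRIMARY KEY,"
    " entry_date TEXT NOT NULL,"
    " approved INTEGER)"
)
conn.executemany(
    "INSERT INTO entries (entry_date, approved) VALUES (?, ?)",
    [("2015-06-01", 1), ("2015-06-07", 0)],
)

# Temporary calendar table holding every date in the range of interest.
conn.execute("CREATE TEMP TABLE all_dates (d TEXT PRIMARY KEY)")
start, end = date(2015, 6, 1), date(2015, 6, 7)
conn.executemany(
    "INSERT INTO all_dates (d) VALUES (?)",
    [((start + timedelta(days=i)).isoformat(),)
     for i in range((end - start).days + 1)],
)

# Dates present in the calendar but absent from the entries table.
missing = [
    r[0]
    for r in conn.execute(
        "SELECT d FROM all_dates "
        "WHERE d NOT IN (SELECT entry_date FROM entries) "
        "ORDER BY d"
    )
]
print(missing)
# ['2015-06-02', '2015-06-03', '2015-06-04', '2015-06-05', '2015-06-06']
```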
Question about SQL performance when selecting a 'blog post' based on user views by date.
I want to record the user views of each post and select the most viewed posts using 'daily' and 'monthly' as parameters:
PS:
Most viewed posts of the day, or month.
To record the views, I created a table into which I insert, after every page load, the date of each view.
I then select (count) them by DAY() and MONTH() when needed.
The problem is that as the table grows, or as more users request this information, the select gets slower, because the number of rows (views) is multiplied by the number of posts.
One alternative I thought of is to create one table for daily records and another for monthly records. On every page load the code checks whether there is a row for the selected date; if the row exists, the script increments its view count, and if it doesn't, the script inserts the row with view count = 1.
Ps:
Daily Views
Post ID | Views | Date
1 | 898 | 2014-07-11
2 | 676 | 2014-07-11
1 | 333 | 2014-07-10
This way every post can have only one row per day.
Is there a better option? What do you think about my alternative? Or is it unnecessary?
I think the best solution is:
Create a table with statistical data with fields:
id
date (store date m-d-y)
day
month
year
views (store number of visits)
page (store blog post)
One unique row per day, and update programmatically as needed.
Then you can query using the day, month, and year fields. You can even add a weeknum field to obtain statistics grouped by week.
In addition, you can add a second table to store the full date (m-d-y h:m:s) of each visit, with fields like browser, ip, etc.
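A rough sketch of the one-row-per-post-per-day counter, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name daily_views and the helper names record_view and top_posts are made up for the example, and it keeps a single ISO date column rather than separate day, month, and year fields. In MySQL the insert-or-increment step is commonly written as a single INSERT ... ON DUPLICATE KEY UPDATE statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_views ("
    " post_id INTEGER NOT NULL,"
    " view_date TEXT NOT NULL,"          # ISO date, e.g. 2014-07-11
    " views INTEGER NOT NULL DEFAULT 0,"
    " PRIMARY KEY (post_id, view_date))"  # one row per post per day
)

def record_view(conn, post_id, view_date):
    """Create the day's row if it is missing, then increment its counter."""
    conn.execute(
        "INSERT OR IGNORE INTO daily_views (post_id, view_date) VALUES (?, ?)",
        (post_id, view_date),
    )
    conn.execute(
        "UPDATE daily_views SET views = views + 1 "
        "WHERE post_id = ? AND view_date = ?",
        (post_id, view_date),
    )

def top_posts(conn, date_prefix, limit=10):
    """Most viewed posts for a day ('2014-07-11') or a month ('2014-07')."""
    return conn.execute(
        "SELECT post_id, SUM(views) AS total FROM daily_views "
        "WHERE view_date LIKE ? "
        "GROUP BY post_id ORDER BY total DESC LIMIT ?",
        (date_prefix + "%", limit),
    ).fetchall()

for _ in range(3):
    record_view(conn, 1, "2014-07-11")
record_view(conn, 2, "2014-07-11")
record_view(conn, 1, "2014-07-10")

print(top_posts(conn, "2014-07-11"))  # [(1, 3), (2, 1)]
print(top_posts(conn, "2014-07"))     # [(1, 4), (2, 1)]
```

Monthly totals are summed from the daily rows here, so a separate monthly table is only needed if that sum itself becomes too slow.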