Current situation
I have a desktop application (C++ Win32), and I wish to track users' usage analytics anonymously (actions, clicks, usage time, etc.)
The tracking is done via designated web services for specific actions (install, uninstall, click) and everything is written by my team and stored on our DB.
The need
Now we're adding more usage types and events with a variety of data, so we need to define the services.
Instead of having tons of different web services for each action, I want to have a single generic service for all usage types, that is capable of receiving different data types.
For example:
"button_A_click" event, has data with 1 field: {window_name (string)}
"show_notification" event, has data with 3 fields: {source_id (int), user_action (int), index (int)}
Question
I'm looking for an elegant & convenient way to store this sort of diverse data, so later I could query it easily.
The alternatives I can think of:
Storing the different data for each usage type as one field of JSON/XML object, but it would be extremely hard to pull data and write queries for those fields
Having extra N data fields for each record, but it seems very wasteful.
Any ideas for this sort of model? Maybe something like Google Analytics? Please advise.
Technical: The DB is MySQL running under phpMyAdmin.
Disclaimer:
There is a similar post, which brought to my attention services like DeskMetrics and Tracker bird, as well as the option of embedding Google Analytics in a native C++ application, but I'd rather the service be my own, and I'd like to better understand how to design this sort of model.
Thanks!
This seems like a database normalization problem.
I am also going to assume that you also have a table named events where all events will be stored.
Additionally, I am going to assume you have the following data attributes (for simplicity's sake): window_name, source_id, user_action, index
To achieve normalization, we will need the following tables:
events
data_attributes
attribute_types
This is how each of the tables should be structured:
mysql> describe events;
+------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+----------------+
| id | int(11) unsigned | NO | PRI | NULL | auto_increment |
| event_type | varchar(255) | YES | | NULL | |
+------------+------------------+------+-----+---------+----------------+
mysql> describe data_attributes;
+-----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+----------------+
| id | int(11) unsigned | NO | PRI | NULL | auto_increment |
| event_id | int(11) | YES | | NULL | |
| attribute_type | int(11) | YES | | NULL | |
| attribute_name | varchar(255) | YES | | NULL | |
| attribute_value | int(11) | YES | | NULL | |
+-----------------+------------------+------+-----+---------+----------------+
mysql> describe attribute_types;
+-------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+----------------+
| id | int(11) unsigned | NO | PRI | NULL | auto_increment |
| type | varchar(255) | YES | | NULL | |
+-------+------------------+------+-----+---------+----------------+
The idea is that you will have to populate attribute_types with all possible types you can have. Then, for each new event, you will add an entry in the events table and corresponding entries in the data_attributes table to map that event to one or more attribute types with the appropriate values.
Example:
"button_A_click" event, has data with 1 field: {window_name "Dummy Window Name"}
"show_notification" event, has data with 3 fields: {source_id: 99, user_action: 44, index: 78}
would be represented as:
mysql> select * from attribute_types;
+----+-------------+
| id | type |
+----+-------------+
| 1 | window_name |
| 2 | source_id |
| 3 | user_action |
| 4 | index |
+----+-------------+
mysql> select * from events;
+----+-------------------+
| id | event_type |
+----+-------------------+
| 1 | button_A_click |
| 2 | show_notification |
+----+-------------------+
mysql> select * from data_attributes;
+----+----------+----------------+-------------------+-----------------+
| id | event_id | attribute_type | attribute_name | attribute_value |
+----+----------+----------------+-------------------+-----------------+
| 1 | 1 | 1 | Dummy Window Name | NULL |
| 2 | 2 | 2 | NULL | 99 |
| 3 | 2 | 3 | NULL | 44 |
| 4 | 2 | 4 | NULL | 78 |
+----+----------+----------------+-------------------+-----------------+
To write a query for this data, you can use the COALESCE function in MySQL to get the value for you without having to check which of the columns is NULL.
Here's a quick example I hacked up:
SELECT events.event_type as `event_type`,
attribute_types.type as `attribute_type`,
COALESCE(data_attributes.attribute_name, data_attributes.attribute_value) as `value`
FROM data_attributes,
events,
attribute_types
WHERE data_attributes.event_id = events.id
AND data_attributes.attribute_type = attribute_types.id
Which yields the following output:
+-------------------+----------------+-------------------+
| event_type | attribute_type | value |
+-------------------+----------------+-------------------+
| button_A_click | window_name | Dummy Window Name |
| show_notification | source_id | 99 |
| show_notification | user_action | 44 |
| show_notification | index | 78 |
+-------------------+----------------+-------------------+
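To read one event back as a single record rather than one row per attribute, the same join can be folded into a dictionary in application code. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and fetch_event is a hypothetical helper name, not part of the schema above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the tables mirror the schema above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER PRIMARY KEY, event_type TEXT);
CREATE TABLE attribute_types (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE data_attributes (
    id INTEGER PRIMARY KEY,
    event_id INTEGER,
    attribute_type INTEGER,
    attribute_name TEXT,
    attribute_value INTEGER
);
INSERT INTO events VALUES (1, 'button_A_click'), (2, 'show_notification');
INSERT INTO attribute_types VALUES
    (1, 'window_name'), (2, 'source_id'), (3, 'user_action'), (4, 'index');
INSERT INTO data_attributes VALUES
    (1, 1, 1, 'Dummy Window Name', NULL),
    (2, 2, 2, NULL, 99),
    (3, 2, 3, NULL, 44),
    (4, 2, 4, NULL, 78);
""")


def fetch_event(conn, event_id):
    """Return (event_type, {attribute: value}) for one event, or None."""
    rows = conn.execute(
        """
        SELECT e.event_type, t.type,
               COALESCE(d.attribute_name, d.attribute_value)
        FROM data_attributes AS d
        JOIN events AS e ON d.event_id = e.id
        JOIN attribute_types AS t ON d.attribute_type = t.id
        WHERE e.id = ?
        """,
        (event_id,),
    ).fetchall()
    if not rows:
        return None
    # Every row carries the same event_type; fold the attributes into a dict.
    return rows[0][0], {name: value for _, name, value in rows}


notification = fetch_event(conn, 2)
```

The explicit JOIN ... ON form is equivalent to the comma-separated FROM list with join conditions in the WHERE clause used in the query above.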
EDIT: Bugger! I read C#, but I see you are using C++. Sorry about that. I leave the answer as-is as its principle could still be useful. Please regard the examples as pseudo-code.
You can define a custom class/structure that you use with an array. Then serialize this data and send to the WebService. For example:
[Serializable()]
public class ActionDefinition {
    public string ID;
    public ActionType Action; // define an enum with the possible actions
    public List<string> Fields; // or a list of 'some class' if you need more complex fields
}
List<ActionDefinition> AnalyticsCollection = new List<ActionDefinition>();
// ...
SendToWS(Serialize(AnalyticsCollection));
Now you can dynamically add as many events as you want with the needed flexibility.
On the server side you can simply parse the data:
List<ActionDefinition> AnalyticsCollection = Deserialize(GetWS());
foreach (ActionDefinition ad in AnalyticsCollection) {
switch (ad.Action) {
//.. check for each action type
}
}
I would suggest adding security mechanisms such as a checksum. I imagine the de/serializer would be pretty custom in C++, so perhaps a simple Base64 encoding can do the trick, and it can be transported as ASCII text.
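The serialize, checksum, and Base64 steps could look roughly like the sketch below. It is written in Python rather than C++ purely for illustration, and the wire format (a CRC32 in hex, a colon, then Base64-encoded JSON) and the names encode_events and decode_events are assumptions made up for this example, not an established protocol. Note that CRC32 only detects accidental corruption; guarding against deliberate tampering would need a keyed hash such as an HMAC.

```python
import base64
import json
import zlib


def encode_events(events):
    """Serialize events to JSON, prepend a CRC32, and Base64-encode the body."""
    payload = json.dumps(events, separators=(",", ":")).encode("utf-8")
    checksum = zlib.crc32(payload)
    body = base64.b64encode(payload).decode("ascii")
    return f"{checksum:08x}:{body}"


def decode_events(message):
    """Reverse encode_events, rejecting messages whose checksum does not match."""
    checksum_hex, body = message.split(":", 1)
    payload = base64.b64decode(body)
    if zlib.crc32(payload) != int(checksum_hex, 16):
        raise ValueError("checksum mismatch")
    return json.loads(payload.decode("utf-8"))


# Hypothetical batch of events, using the two examples from the question.
events = [
    {"action": "button_A_click", "fields": {"window_name": "Dummy Window Name"}},
    {"action": "show_notification",
     "fields": {"source_id": 99, "user_action": 44, "index": 78}},
]
message = encode_events(events)
```

Because each event carries its own field dictionary, new event types can be added on the client without changing the service's interface.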
You could make a description table in which you declare, for each event, what each param means. Then you have a main table in which you only store the event's name and param1 etc. An admin tool would be very easy: you go through all events and describe them using the table where each event is declared. E.g. for your event button_A_click you insert into the description table:
Name Param1
button_A_Click WindowTitle
So you can group your events or select only one event ..
This is how I would solve it.
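A minimal sketch of that layout, assuming a fixed maximum of three params and using Python's sqlite3 in place of MySQL, might look like this. The table names event_descriptions and event_log and the helper describe_event are hypothetical, chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per event type, declaring what each param slot means.
CREATE TABLE event_descriptions (
    name TEXT PRIMARY KEY, param1 TEXT, param2 TEXT, param3 TEXT
);
-- The main table: only the event name and the raw param values.
CREATE TABLE event_log (
    id INTEGER PRIMARY KEY, name TEXT, param1 TEXT, param2 TEXT, param3 TEXT
);
INSERT INTO event_descriptions VALUES
    ('button_A_click', 'WindowTitle', NULL, NULL),
    ('show_notification', 'source_id', 'user_action', 'index');
INSERT INTO event_log (name, param1, param2, param3) VALUES
    ('button_A_click', 'Dummy Window Name', NULL, NULL),
    ('show_notification', '99', '44', '78');
""")


def describe_event(conn, log_id):
    """Pair each logged param value with the label declared for its event."""
    row = conn.execute(
        """
        SELECT l.name, d.param1, l.param1, d.param2, l.param2, d.param3, l.param3
        FROM event_log AS l
        JOIN event_descriptions AS d ON d.name = l.name
        WHERE l.id = ?
        """,
        (log_id,),
    ).fetchone()
    if row is None:
        return None
    name, *pairs = row
    labels, values = pairs[0::2], pairs[1::2]
    # Skip param slots that the description table leaves undeclared.
    return name, {label: value for label, value in zip(labels, values)
                  if label is not None}
```

The trade-off is that all params share one column type (text here), so numeric values have to be cast when querying.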
I need some help with the mysql statements for inserting and updating rows in a new table based on the contents of another table. I am going to use this in automated perl code, but the mysql statements themselves are what I am having trouble with.
My first table named PROFILE looks something like this:
+----------+---------------------------+
| ID | NAME |
+----------+---------------------------+
| 0 | Default profile |
| 04731470 | Development profile |
| 87645420 | Core Base |
| a41401a0 | Core Test |
| ba0e3000 | Development profile child |
| e37fe780 | Test2 |
+----------+---------------------------+
The second called DEPLOYMENT has these columns (and no rows yet):
+------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| PROF_ID | char(36) | NO | PRI | NULL | |
| NAME | varchar(60) | NO | | NULL | |
| ID | tinyint(4) | NO | MUL | NULL | |
+------------+-------------+------+-----+---------+-------+
ID.PROFILE is the foreign key for PROF_ID.DEPLOYMENT and I want all of the values for ID.PROFILE to go in PROF_ID.DEPLOYMENT. Then I want the NAME.DEPLOYMENT and ID.DEPLOYMENT fields to be set based on the words found in the NAME.PROFILE field.
The following shows what I want to do as far as the insert statements goes, but these failed due to "ERROR 1242 (21000): Subquery returns more than 1 row":
INSERT INTO DEPLOYMENT(PROF_ID,NAME,ID) VALUES((select ID from PROFILE where NAME like '%core%'),'Core','2');
INSERT INTO DEPLOYMENT(PROF_ID,NAME,ID) VALUES((select ID from PROFILE where NAME like '%development%'),'Dev','3');
INSERT INTO DEPLOYMENT(PROF_ID,NAME,ID) VALUES((select ID from PROFILE where NAME not like '%development%' and not like '%core%'),'Default','1');
I'm not sure where to start on the update part of this but the ID.DEPLOYMENT and NAME.DEPLOYMENT fields should change as above if the text in the NAME.PROFILE fields changes with any of the words above.
This is the resulting DEPLOYMENT table I am looking for.
+----------+---------------+----+
| PROF_ID | NAME | ID |
+----------+---------------+----+
| 0 | Default | 1 |
| 04731470 | Dev | 3 |
| 87645420 | Core | 2 |
| a41401a0 | Core | 2 |
| ba0e3000 | Dev | 3 |
| e37fe780 | Default | 1 |
+----------+---------------+----+
Then I want statements to update if any of the NAME.PROFILE information changes.
Sorry if this is confusing, I wasn't sure how to explain and I am still learning mysql. Any help is appreciated.
Just get rid of the values keyword, basically:
INSERT INTO DEPLOYMENT(PROF_ID,NAME,ID)
select ID, 'Core','2'
from PROFILE
where NAME like '%core%';
INSERT INTO DEPLOYMENT(PROF_ID,NAME,ID)
select ID, 'Dev', '3'
from PROFILE
where NAME like '%development%';
INSERT INTO DEPLOYMENT(PROF_ID,NAME,ID)
select ID, 'Default', '1'
from PROFILE
where NAME not like '%development%' and NAME not like '%core%';
By the way, you could combine these into one statement, using conditional expressions:
INSERT INTO DEPLOYMENT(PROF_ID,NAME,ID)
select ID,
(case when NAME like '%core%' then 'Core'
when NAME like '%development%' then 'Dev'
else 'Default'
end),
(case when NAME like '%core%' then '2'
when NAME like '%development%' then '3'
else '1'
end)
from PROFILE;
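The combined statement can also be re-run to cover the update part of the question, provided existing rows are overwritten instead of causing a duplicate-key error. The sketch below is an illustration only: it uses Python's sqlite3 with SQLite's INSERT OR REPLACE as a stand-in; in MySQL the counterpart would be REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE. The helper name sync_deployment is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROFILE (ID TEXT PRIMARY KEY, NAME TEXT NOT NULL);
CREATE TABLE DEPLOYMENT (
    PROF_ID TEXT PRIMARY KEY, NAME TEXT NOT NULL, ID INTEGER NOT NULL
);
INSERT INTO PROFILE VALUES
    ('0', 'Default profile'),
    ('04731470', 'Development profile'),
    ('87645420', 'Core Base'),
    ('a41401a0', 'Core Test'),
    ('ba0e3000', 'Development profile child'),
    ('e37fe780', 'Test2');
""")

# Same CASE logic as the combined statement above; OR REPLACE overwrites
# a DEPLOYMENT row whose PROF_ID already exists.
SYNC_SQL = """
INSERT OR REPLACE INTO DEPLOYMENT (PROF_ID, NAME, ID)
SELECT ID,
       CASE WHEN NAME LIKE '%core%' THEN 'Core'
            WHEN NAME LIKE '%development%' THEN 'Dev'
            ELSE 'Default' END,
       CASE WHEN NAME LIKE '%core%' THEN 2
            WHEN NAME LIKE '%development%' THEN 3
            ELSE 1 END
FROM PROFILE
"""


def sync_deployment(conn):
    """Rebuild DEPLOYMENT from PROFILE and return its rows ordered by PROF_ID."""
    conn.execute(SYNC_SQL)
    return conn.execute(
        "SELECT PROF_ID, NAME, ID FROM DEPLOYMENT ORDER BY PROF_ID"
    ).fetchall()


initial = sync_deployment(conn)

# Simulate a profile being renamed, then re-sync.
conn.execute("UPDATE PROFILE SET NAME = 'Core Test2' WHERE ID = 'e37fe780'")
after_rename = sync_deployment(conn)
```

Running the sync on a schedule (or after each change to PROFILE) keeps DEPLOYMENT in step; rows for profiles deleted from PROFILE would still need separate cleanup.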
I'm creating an app using CakePHP and have hit a mental barrier when trying to figure out a permission system for the app. I've narrowed it down to a couple different methods, and I'm looking for some information about which would be a) most easily implemented and b) most efficient (obviously there can be trade-off between these two).
The app has many different models, but for simplification I'll just use User, Department, and Event. I want to be able to individually control CRUD permissions for each user, on each model.
Cake ACLs
Though poorly documented, I've got somewhat of an idea of how the ACL system works, and considered creating AROs as follows:
[1]user
create
read
update
delete
[2]department
...
etc. This would require users being in many different groups, and from what I've seen, Cake doesn't easily support this. Is there possibly a better way to do this, or is ACL not suitable for this situation?
Permission Flags in DB
This one is pretty straightforward, obviously having a flag in the user's record for
create_users, read_users, etc. With 4-5 models, this would mean 16-20 fields for permissions, which made me consider either using bit masks, or using a joined table. Is one of these better than the other? Which one is faster with less overhead?
Overall, I guess I really want to know what approach makes the most sense in the scale of the application from an efficiency and ease-of-development standpoint. I'm also open to other suggestions of how to go about this, if you have experience from a past project. Thanks in advance!
This is generally how I set up permissions - you have actions that can be performed, roles that can perform those actions and users who have roles. The examples I've put here are based on what you've requested though I think you'll find it rare you have a user who can do nothing but "create new user records" or "update department records".
actions
id varchar(50)
description varchar(200)
+-------------------+----------------------------------------------+
| id | description |
+-------------------+----------------------------------------------+
| USER_CREATE | Allow the user to create USERS records. |
| USER_DELETE | Allow the user to delete USERS records. |
| USER_READ | Allow the user to read USERS records. |
| USER_UPDATE | Allow the user to update USERS records. |
| DEPARTMENT_CREATE | Allow the user to create DEPARTMENT records. |
| ................. | ............................................ |
+-------------------+----------------------------------------------+
roles
id unsigned int(P)
description varchar(50)
+----+--------------------+
| id | description |
+----+--------------------+
| 1 | Manage users |
| 2 | Manage departments |
| .. | .................. |
+----+--------------------+
roles_actions
id unsigned int(P)
role_id unsigned int(F roles.id)
action_id varchar(50)(F actions.id)
+----+---------+-------------------+
| id | role_id | action_id |
+----+---------+-------------------+
| 1 | 1 | USER_CREATE |
| 2 | 1 | USER_DELETE |
| 3 | 1 | USER_READ |
| 4 | 1 | USER_UPDATE |
| 5 | 2 | DEPARTMENT_CREATE |
| 6 | 2 | DEPARTMENT_DELETE |
| .. | ....... | ................. |
+----+---------+-------------------+
users
id unsigned int(P)
username varchar(32)(U)
password varchar(123) // Hashed, like my potatoes
...
+----+----------+----------+-----+
| id | username | password | ... |
+----+----------+----------+-----+
| 1 | bob | ******** | ... |
| 2 | april | ******** | ... |
| 3 | grant | ******** | ... |
| .. | ........ | ........ | ... |
+----+----------+----------+-----+
users_roles
id unsigned int(P)
user_id unsigned int(F users.id)
role_id unsigned int(F roles.id)
+----+---------+---------+
| id | user_id | role_id |
+----+---------+---------+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| .. | ....... | ....... |
+----+---------+---------+
To determine if a user has a particular permission you could execute a query like this:
SELECT COUNT( roles_actions.id )
FROM users
LEFT JOIN users_roles ON users.id = users_roles.user_id
LEFT JOIN roles_actions ON users_roles.role_id = roles_actions.role_id
WHERE users.id = '<user.id>'
AND roles_actions.action_id = '<action.id>'
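Wrapped in application code, the check might look like the following sketch. It assumes Python's sqlite3 in place of MySQL, a trimmed-down version of the tables above, and a hypothetical helper named user_can; in a CakePHP app the same query would live in a component or model method instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actions (id TEXT PRIMARY KEY, description TEXT);
CREATE TABLE roles (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE roles_actions (id INTEGER PRIMARY KEY, role_id INTEGER, action_id TEXT);
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE);
CREATE TABLE users_roles (id INTEGER PRIMARY KEY, user_id INTEGER, role_id INTEGER);

INSERT INTO roles VALUES (1, 'Manage users'), (2, 'Manage departments');
INSERT INTO roles_actions (role_id, action_id) VALUES
    (1, 'USER_CREATE'), (1, 'USER_DELETE'), (1, 'USER_READ'), (1, 'USER_UPDATE'),
    (2, 'DEPARTMENT_CREATE'), (2, 'DEPARTMENT_DELETE');
INSERT INTO users VALUES (1, 'bob'), (2, 'april'), (3, 'grant');
INSERT INTO users_roles (user_id, role_id) VALUES (1, 1), (2, 2);
""")


def user_can(conn, user_id, action_id):
    """Return True if any of the user's roles grants the given action."""
    (count,) = conn.execute(
        """
        SELECT COUNT(ra.id)
        FROM users_roles AS ur
        JOIN roles_actions AS ra ON ur.role_id = ra.role_id
        WHERE ur.user_id = ? AND ra.action_id = ?
        """,
        (user_id, action_id),
    ).fetchone()
    return count > 0
```

For example, user_can(conn, 1, 'USER_CREATE') checks whether bob may create user records. Since the check runs on every protected request, caching a user's action list in the session is a common way to avoid repeating the query.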
It might sound silly, but I'm just curious.
I have a table named posts:
+----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| title | varchar(50) | YES | | NULL | |
| body | text | YES | | NULL | |
| created | datetime | YES | | NULL | |
| modified | datetime | YES | | NULL | |
+----------+------------------+------+-----+---------+----------------+
The values:
+----+-----------------------+----------------------------------------+---------------------+---------------------+
| id | title | body | created | modified |
+----+-----------------------+----------------------------------------+---------------------+---------------------+
| 2 | A title once again!!! | And the post body follows. Tralalalala | 2013-06-03 13:13:44 | 2013-06-05 09:36:51 |
| 3 | Title strikes back | This is really exciting! Not. | 2013-06-03 13:13:46 | NULL |
| 11 | Tomcat | Tommy boy!!! FFF | 2013-06-04 16:33:22 | 2013-06-04 16:48:40 |
| 12 | FFD | dsfdsf | 2013-06-04 16:48:56 | 2013-06-04 16:55:50 |
| 13 | fdf | dfdsf | 2013-06-04 16:57:47 | 2013-06-05 09:36:54 |
| 14 | GGD | dsfdsf | 2013-06-04 17:02:33 | 2013-06-04 17:02:33 |
| 15 | GG# | dsfdsfff322 | 2013-06-05 09:36:20 | 2013-06-05 09:36:28 |
+----+-----------------------+----------------------------------------+---------------------+---------------------+
Let's say I want to search for rows that contain the value Th (not case sensitive) regardless of the field. This is like making a quick search function.
Normally I would do something like : SELECT * FROM posts WHERE title LIKE '%Th%' OR body LIKE '%Th%'
I did not include the other fields because obviously they are not gonna accept those values.
I wanna know if there's a shortcut to this? Like SELECT * FROM posts LIKE '%Th%'.
Please advise. Thanks.
Using plain old SQL you need to specify all the column names you wish to include.
If you want more search-box-like behavior, I'd suggest looking at MySQL's fulltext functions; see:
http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
The SQL language is based on the presumption of the schema being known. Thus, there is no "search any column" type of functionality. How would it work against non-text columns? What about columns of different collations? Aside from the language not having a feature, specifying the columns expresses your intent to the next developer and that as much as anything should be an overriding consideration.
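Other answers have covered that you need to specify all the columns. Here is an alternative formulation that is a bit shorter. It uses concat_ws rather than concat because both columns are nullable: concat returns NULL if any argument is NULL, whereas concat_ws skips NULL arguments.
SELECT *
FROM posts
WHERE concat_ws(' ', title, body) LIKE '%Th%'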
If you are looking for an exact match, then you can do:
select *
from posts
where 'Th' in (title, body)
No, there is no shortcut that avoids a WHERE clause specifying the columns. The query engine cannot know which columns to filter on unless you name them in the WHERE clause.
If you want a custom shortcut - you can write a function which takes a single parameter (the search string) and returns the required fields.
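Such a function could be sketched in application code as follows. This is an illustration under stated assumptions: it uses Python's sqlite3 in place of MySQL, search_posts is a hypothetical name, and the searchable columns come from a hard-coded allow-list so that only the search term, never a column name, is taken from user input.

```python
import sqlite3

# Only these text columns are searched; they are never taken from user input.
SEARCHABLE_COLUMNS = ("title", "body")


def search_posts(conn, term, columns=SEARCHABLE_COLUMNS):
    """Return posts where any searchable column contains the term."""
    where = " OR ".join(f"{col} LIKE ?" for col in columns)
    pattern = f"%{term}%"
    sql = f"SELECT id, title, body FROM posts WHERE {where} ORDER BY id"
    # One bound parameter per column keeps the term out of the SQL text.
    return conn.execute(sql, [pattern] * len(columns)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
INSERT INTO posts VALUES
    (2, 'A title once again!!!', 'And the post body follows. Tralalalala'),
    (3, 'Title strikes back', 'This is really exciting! Not.'),
    (11, 'Tomcat', 'Tommy boy!!! FFF'),
    (12, 'FFD', 'dsfdsf');
""")

matches = search_posts(conn, "Th")
matching_ids = [row[0] for row in matches]
```

Any % or _ characters inside the term still act as LIKE wildcards; escaping them would be an extra step if literal matching is required.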
I'm afraid there isn't.
Not sure what your use case is... does this alternative approach work for your use case?
mysql -u{user} -p{password} -h{hostname} {database_name} -B -e "{query}" | grep "{search_string}"
It connects to the database and runs the specified query, returns query results in new lines, fields separated by tab stop. Then use Unix utility grep to filter returned rows.
First of all, I'd like to say that I do see this is a foolish way of doing things - and that I can live without what I'm asking, however it'd be nice to see if this is possible.
Okay, imagine that I have these tables - they're not what I really have but it should still show the same problem:
mutex: (table, row_id)
events: (id, title, location)
files: (id, title, filesize)
groups: (id, name)
groups_to_content: (group_id, table, row_id)
At this current moment in time, if I want to return details about all events and files that a particular user has created, I can do that with a view that is absurdly complicated yet quick. That bit, I have no problem with.
However, the bit that I do have a problem with is groups. In the current view that I have, if a file/event hasn't been assigned to a group, then the group_id will of course be NULL. If I have assigned an event to a group (via groups_to_content), its group_id will of course be populated:
+---------+----------+------------+--------------------+
| table | row_id | group_id | title |
+---------+----------+------------+--------------------+
| event | 1 | NULL | some event |
| event | 2 | NULL | some event |
| event | 3 | 1 | grouped event |
+---------+----------+------------+--------------------+
However, I'd like each grouped event to appear once per owning group (multiple group ownership is possible in the system) and once more with group_id as NULL, like so:
+---------+----------+------------+--------------------+
| table | row_id | group_id | title |
+---------+----------+------------+--------------------+
| event | 1 | NULL | some event |
| event | 2 | NULL | some event |
| event | 3 | NULL | grouped event |
| event | 3 | 1 | grouped event |
+---------+----------+------------+--------------------+
Is this at all possible?
I'm having some trouble with an advanced SQL query, and it's been a long time since I've worked with SQL databases. We use MySQL.
Background:
We will be working with two tables:
"Transactions Table"
table: expire_history
+---------------+-----------------------------+------+-----+-------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-----------------------------+------+-----+-------------------+-------+
| m_id | int(11) | NO | PRI | 0 | |
| m_a_ordinal | int(11) | NO | PRI | 0 | |
| a_expired_date| datetime | NO | PRI | | |
| a_state | enum('EXPIRED','UNEXPIRED') | YES | | NULL | |
| t_note | text | YES | | NULL | |
| t_updated_by | varchar(40) | NO | | | |
| t_last_update | timestamp | NO | | CURRENT_TIMESTAMP | |
+---------------+-----------------------------+------+-----+-------------------+-------+
"Information Table"
table: information
+---------------------+---------------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+---------------+------+-----+---------------------+-------+
| m_id | int(11) | NO | PRI | 0 | |
| m_a_ordinal | int(11) | NO | PRI | 0 | |
| a_type | varchar(15) | YES | MUL | NULL | |
| a_class | varchar(15) | YES | MUL | NULL | |
| a_state | varchar(15) | YES | MUL | NULL | |
| a_publish_date | datetime | YES | | NULL | |
| a_expire_date | date | YES | | NULL | |
| a_updated_by | varchar(20) | NO | | | |
| a_last_update | timestamp | NO | | CURRENT_TIMESTAMP | |
+---------------------+---------------+------+-----+---------------------+-------+
We have a set of fields in one table that describe the record. Each record is comprised of an m_id (the person) and an ordinal (a person can have multiple records). So for instance, my m_id could be 1, and I could have multiple ordinals (1, 2, 3, 4, etc.), each with their own individual set of data. The m_id and the m_a_ordinal comprise a composite key in the "information" table, and the m_id, m_a_ordinal, and a_expired_date fields in the "transactions" table comprise a composite key as well.
Essentially when we expire a record, the a_state field in the information table is updated to expired. At the same time, a record is created in the transactions table with the m_id, m_a_ordinal, and a_expired_date. We've found in the past that people get impatient and can click a button twice, so through some previous help I've managed to narrow down the most recent transaction for each expired record using the following query:
SELECT e1.m_id, e1.m_a_ordinal, e1.a_expired_date, e1.t_note, e1.t_updated_by
FROM expire_history e1
INNER JOIN (SELECT m_id, m_a_ordinal, MAX(a_expired_date) AS a_expired_date
FROM expire_history GROUP BY m_id, m_a_ordinal) e2
ON (e2.m_id = e1.m_id AND e2.m_a_ordinal = e1.m_a_ordinal AND e2.a_expired_date = e1.a_expired_date)
WHERE e2.a_expired_date > '2008-05-15 00:00:00' ORDER BY e1.a_expired_date;
Seems simple enough, right?
Let's add some complexity. Each record in the "information" table has a "natural expiration date" as well. The original developer of our software, however, didn't code it to change the state of the record to "expired" once it has reached its natural expiration date. It also does not write a transaction to the transaction table once it has expired (which I understand, because this table only keeps records of ones that were expired by a person, as opposed to automagically). Also, when a record is expired manually, the original expiration date does not change. This is why this is so complicated :P~~.
Essentially I need to build a report that shows all aspects of expiration, whether it was expired manually, or naturally.
This report should take the data from the query above and combine it with another query on the "information" table that says: if a_expire_date <= CURDATE, show the record, except if the record exists in (query above from expire_history), in which case show the record from (query on expire_history).
A rough structure of the raw logic is as follows:
for x in record_total
if (m_id, m_a_ordinal) exists in expire_history
display (m_id, m_a_ordinal, a_expired_date, a_state)
else if (m_id, m_a_ordinal) exists in information AND a_expire_date <= CURDATE
display (m_id, m_a_ordinal, a_expire_date, a_state)
end if
x++
I hope that this is concise enough.
Thanks for any help you can provide!
SELECT I.m_id, I.m_a_ordinal,
coalesce(e1.a_expired_date, I.a_expire_date) as Expire_DT,
coalesce(e1.t_note, 'insert related item column') as Note,
coalesce(e1.t_updated_by, I.a_updated_by) as Updated_By
FROM information I
LEFT JOIN
(SELECT m_id, m_a_ordinal, MAX(a_expired_date) AS a_expired_date
FROM expire_history GROUP BY m_id, m_a_ordinal) e2
ON (e2.m_id = I.m_id
AND e2.m_a_ordinal = I.m_a_ordinal)
LEFT JOIN expire_history e1
ON (e1.m_id = e2.m_id
AND e1.m_a_ordinal = e2.m_a_ordinal
AND e1.a_expired_date = e2.a_expired_date)
WHERE coalesce(e2.a_expired_date, I.a_expire_date) > '2008-05-15 00:00:00'
ORDER BY Expire_DT;
The syntax may be off a bit, as I don't have time to test, but I hope you can get the gist of it.
Again, what coalesce does is simply return the first non-NULL value in a series of values. If you're only dealing with two values, IFNULL may work as well.
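To see the fallback behaviour of coalesce on a small data set, here is a sketch that assumes Python's sqlite3 in place of MySQL, only the columns needed for the report, and made-up sample rows. It joins each information record to its most recent expire_history row when one exists and falls back to the natural expiration date otherwise. The 'expired naturally' note is a placeholder chosen for this example, and the a_expire_date <= CURDATE condition from the question is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE information (
    m_id INTEGER, m_a_ordinal INTEGER, a_expire_date TEXT, a_updated_by TEXT,
    PRIMARY KEY (m_id, m_a_ordinal)
);
CREATE TABLE expire_history (
    m_id INTEGER, m_a_ordinal INTEGER, a_expired_date TEXT,
    t_note TEXT, t_updated_by TEXT,
    PRIMARY KEY (m_id, m_a_ordinal, a_expired_date)
);
-- Hypothetical sample data: record (1, 1) was expired manually twice
-- (a double click); the other records have no manual expiry.
INSERT INTO information VALUES
    (1, 1, '2008-06-01', 'sys'),
    (1, 2, '2008-07-01', 'sys'),
    (2, 1, '2008-01-01', 'sys');
INSERT INTO expire_history VALUES
    (1, 1, '2008-05-20 10:00:00', 'first click', 'bob'),
    (1, 1, '2008-05-20 10:00:05', 'double click', 'bob');
""")

# Dates are stored as ISO-8601 text here, so string comparison orders them.
REPORT_SQL = """
SELECT i.m_id, i.m_a_ordinal,
       COALESCE(e1.a_expired_date, i.a_expire_date) AS expire_dt,
       COALESCE(e1.t_note, 'expired naturally') AS note,
       COALESCE(e1.t_updated_by, i.a_updated_by) AS updated_by
FROM information AS i
LEFT JOIN (SELECT m_id, m_a_ordinal, MAX(a_expired_date) AS a_expired_date
           FROM expire_history GROUP BY m_id, m_a_ordinal) AS e2
       ON e2.m_id = i.m_id AND e2.m_a_ordinal = i.m_a_ordinal
LEFT JOIN expire_history AS e1
       ON e1.m_id = e2.m_id AND e1.m_a_ordinal = e2.m_a_ordinal
      AND e1.a_expired_date = e2.a_expired_date
WHERE COALESCE(e2.a_expired_date, i.a_expire_date) > ?
ORDER BY expire_dt
"""

report = conn.execute(REPORT_SQL, ("2008-05-15 00:00:00",)).fetchall()
```

Both joins are LEFT JOINs so that records without any manual expiry survive the join and pick up their values from the information table.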