Why does the Facebook Robyn algorithm require unique dates? - regression

My team and I have been exploring the Robyn algorithm to do Marketing Mix Modeling on our dataset. The dataset is monthly-level data of the promotional activity for each customer. The data looks like this:
| Id         | Date     | Revenue | Channels… |
|------------|----------|---------|-----------|
| Customer 1 | Jan-2021 |         |           |
| …          |          |         |           |
| Customer 1 | Dec-2021 |         |           |
| Customer 2 | Feb-2021 |         |           |
| …          |          |         |           |
| Customer 2 | Dec-2021 |         |           |
In this way we have over 1,000 customers and their monthly channel activity. We have been able to create models using linear regression to get the impact of each channel. When we tried to run this data through Robyn, we got a duplicate date error. Does this mean we have to run the Robyn algorithm for each customer separately? Then we would have only 12 data points per model, and getting daily or weekly data is also not possible for us. Is there any way to run this kind of data through Robyn? Also, why does Robyn restrict us to unique dates even though it uses ridge regression internally, which would not be affected by dates? Isn't having more data points better?

You are receiving this duplicate date error because, unfortunately, Robyn does not handle panel datasets at the moment. You can aggregate the different customer segments in your dataset and run the modeling at a total level, which Robyn supports.
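If the panel data happens to sit in a SQL table shaped like the one in the question, the aggregation to a total level could look roughly like this sketch (table and column names are made up for illustration; add one SUM per media channel):

-- Rough sketch: collapse the customer-level panel into one row per month,
-- which is the single-series shape Robyn expects. Names are hypothetical.
SELECT
    report_month,
    SUM(revenue)        AS revenue,
    SUM(channel1_spend) AS channel1_spend,
    SUM(channel2_spend) AS channel2_spend
FROM customer_monthly_activity
GROUP BY report_month
ORDER BY report_month;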


MySQL - Multiple Rows or JSON

I'm building an app in Laravel and have a design question regarding my MySQL db.
Currently I have a table which defines the skills for all the default characters in my game. Because the traits are pulled from a pool of skills and each character has a variable number of them, one of my tables looks something like this:
+----+--------+---------+------------+
| ID | CharID | SkillID | SkillScore |
+----+--------+---------+------------+
| 1  | 1      | 15      | 200        |
| 2  | 1      | 16      | 205        |
| 3  | 1      | 12      | 193        |
| 4  | 2      | 15      | 180        |
+----+--------+---------+------------+
Note the variable number of rows for any given CharID. With my Base Characters entered, I'm at just over 300 rows.
My issue is storing users' copies of their (customized) characters. I don't think storing 300+ rows per user makes sense. Should I store this data in a JSON blob in another table? Should I be looking at a NoSQL solution like Mongo? Appreciate the guidance.
NB: The entire app centers around using the character's different skills. Mostly reporting from them, but users will also be able to update their SkillScore (likely a few times a week).
PS: Should I consider breaking each character out into its own table and tracking users' characters that way? Users won't be able to add/remove skills from characters, only update them.
TIA.
Your pivot table looks good to me.
I'd consider dropping the ID column (unless you need it), and using a composite primary key:
PRIMARY KEY (CharID, SkillID)
Primary keys are indexed, so you will get efficient lookups.
As for your other suggestions, if you store this in a JSON column, you'll lose the ability to perform joins, and will therefore end up executing more queries.
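For illustration, here is a rough sketch of that layout in MySQL (the table name character_skills is made up here):

-- Pivot table with a composite primary key instead of a surrogate ID
-- (MySQL/InnoDB; the table name is hypothetical).
CREATE TABLE character_skills (
    CharID     INT UNSIGNED NOT NULL,
    SkillID    INT UNSIGNED NOT NULL,
    SkillScore INT NOT NULL,
    PRIMARY KEY (CharID, SkillID)  -- indexed, so per-character lookups stay cheap
);

-- Fetching one character's skills uses the leftmost column of the key:
SELECT SkillID, SkillScore
FROM character_skills
WHERE CharID = 1;

With InnoDB the primary key is also the clustering key, so all of a character's skill rows are stored together, which keeps those lookups efficient.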

SSRS - calculations from different datasets

I am new to SSRS and need help completing this.
I have an SSRS report with two different datasets, Dataset 1 and Dataset 2. In Dataset 2 I have to use a calculation in one of the rows, which requires values from both Dataset 2 and Dataset 1. Please see the attached image for the layout of the report and other details. I would appreciate your help with the orange highlighted fields.
It is generally better to perform these calculations in the query if possible, but it is not impossible to pull items from a dataset other than the one bound to the tablix.
Depending on exactly how your datasets are set up, you may be able to use the Lookup function. This assumes you have a one-to-one relationship between the datasets. You can also sort of trick the function into working for datasets that don't have an explicit one-to-one relationship.
It's a little hard to tell your dataset structure from the information provided; I have a feeling your screenshot doesn't accurately depict it. Assuming your datasets are structured something more like this:
+---------------+------------+-------+
| Category      | Date       | Value |
+---------------+------------+-------+
| Gross Revenue | 2017-08-01 | GR8   |
| Gross Revenue | 2017-09-01 | GR9   |
| Gross Revenue | 2017-10-01 | GR10  |
| Profit        | 2017-08-01 | P8    |
| Profit        | 2017-09-01 | P9    |
| Profit        | 2017-10-01 | P10   |
+---------------+------------+-------+
and similar for Dataset 2, you should be able to use something like this to access the value from the other dataset:
=Lookup(Fields!Date.Value & "Cash Flow Rate", Fields!Date.Value & Fields!Category.Value, Fields!Value.Value, "Dataset2")
In SSRS, each tablix object can only have one dataset, so what you want to do is not possible using two different datasets directly. I'd suggest doing all the calculations at the query level in your Dataset 2.

Using an SQL View to dynamically place field data in buckets

I have a complex(?) SQL query that I need to build. We have an application that captures a simple data set for multiple clients:
ClientID | AttributeName | AttributeValue | TimeReceived
---------|---------------|----------------|---------------------
002      | att1          | 123.98         | 23:02:00 02-03-2017
003      | att2          | 987.2          | 23:02:00 02-03-2017
I need to be able to return a single record per client that looks something like this:
Attribute | Hour_1 | Hour_2 | Hour_x
----------|--------|--------|-------
att1      | 120.67 |        |
att2      | 10     | 89.3   |
The hours are to be determined by a time provided to the query. If the time was 11:00 on 02-03-2017, then Hour 1 would be from 10:00-11:00 on 02-03-2017, and Hour 2 from 9:00-10:00 on 02-03-2017. Attributes will be allocated to these hourly buckets based on the hour/date in their timestamp (not all buckets will have data). There will be a limit on the number of hours allocated in a single query. In summary, there are possibly 200-300 attributes and hourly blocks of up to 172 hours. To be honest, I am not really sure where to start building a query like this. Any guidance appreciated.
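One common way to get that shape is conditional aggregation: one CASE expression per hour bucket, with the bucket index computed from the supplied cut-off time. Below is a rough sketch in SQL Server syntax, not a finished solution; the table name is hypothetical, the cut-off is assumed to fall on a whole hour, and in practice the 172 bucket columns would be generated dynamically (and, since a plain view cannot take parameters, the query would be wrapped in a table-valued function or stored procedure rather than a view):

-- Rough sketch: pivot attribute values into hourly buckets counted back
-- from a supplied cut-off time (assumed to be on a whole hour).
DECLARE @EndTime datetime = '2017-03-02 11:00:00';
DECLARE @Hours   int      = 3;   -- number of buckets in this example

SELECT
    ClientID,
    AttributeName,
    -- DATEDIFF(HOUR, ...) counts clock-hour boundaries, so rows stamped
    -- 10:00-10:59 land in Hour_1, 09:00-09:59 in Hour_2, and so on.
    SUM(CASE WHEN DATEDIFF(HOUR, TimeReceived, @EndTime) = 1 THEN AttributeValue END) AS Hour_1,
    SUM(CASE WHEN DATEDIFF(HOUR, TimeReceived, @EndTime) = 2 THEN AttributeValue END) AS Hour_2,
    SUM(CASE WHEN DATEDIFF(HOUR, TimeReceived, @EndTime) = 3 THEN AttributeValue END) AS Hour_3
FROM ClientAttributes            -- hypothetical table name
WHERE TimeReceived >= DATEADD(HOUR, -@Hours, @EndTime)
  AND TimeReceived <  @EndTime
GROUP BY ClientID, AttributeName;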

Many to many relationship with different data types

I am trying to create a database for different types of events. Each event has arbitrary, user-created properties of different types, for example "number of guests", "special song to play", "time the clown arrives". Not every event has a clown, but one user could still have different events with a clown. My basic concept is
propID | name   | type
-------|--------|-------
1      | #guest | number
2      | clown  | time
and another table with every event and a unique eventID. The problem is that a simple approach like
eventID | propID | value
--------|--------|------
1       | 1      | 20
1       | 2      | 10:00
does not really work because of the different data types.
Now I have thought about some possible solutions, but I don't really know which one is best, or whether there is an even better solution:
1. I store all values as strings and use the datatype in the property table. I think this is called EAV and is not considered good practice.
2. There are only a limited number of meaningful data types, which could lead to a table like this:
eventID | propID | stringVal | timeVal | numberVal
--------|--------|-----------|---------|----------
1       | 1      | null      | null    | 20
1       | 2      | null      | 10:00   | null
3. Use a separate table per possible data type, like:
propDateEvent:
eventID | propId | value
--------|--------|------
1       | 2      | 10:00
propNumberEvent:
eventID | propId | value
--------|--------|------
1       | 1      | 20
Somehow I think every solution has its ups and downs. #1 feels like the simplest but least robust. #3 seems like the cleanest solution, but it gets pretty complicated if I want to add, e.g., a priority for the properties per event.
All the options you propose are variations on entity/attribute/value or EAV. The basic concept is that you store entities (in your case events), their attributes (#guest, clown), and the values of those attributes as rows, not columns.
There are lots of EAV questions on Stack Overflow, discussing the benefits and drawbacks.
Your 3 options provide different ways of storing the data - but you don't address the ways in which you want to retrieve that data, or verify the data you're about to store. This is the biggest problem with EAV.
How will you enforce the rule that all events must have "#guests" as a mandatory field (for instance)? How will you find all events that have at least 20 guests and no clown booked? How will you show a list of events between 2 dates, ordered by date and number of guests?
If those requirements don't matter to you, EAV is fine. If they do, consider using a document to store this user-defined data (JSON or XML). MySQL can query those documents natively, you can enforce business logic much more easily, and you won't have to write horribly convoluted queries for even the simplest business cases.
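For example, here is a rough sketch of the JSON-column approach in MySQL 5.7+ (the table and property names are made up for illustration):

-- Rough sketch: user-defined event properties stored in a JSON column.
CREATE TABLE events (
    event_id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    event_date DATE NOT NULL,
    properties JSON NOT NULL,
    PRIMARY KEY (event_id)
);

INSERT INTO events (event_date, properties)
VALUES ('2021-06-01', '{"guests": 20, "clown_arrival": "10:00"}');

-- Events with at least 20 guests and no clown booked, newest first:
SELECT event_id, event_date
FROM events
WHERE JSON_EXTRACT(properties, '$.guests') >= 20
  AND JSON_EXTRACT(properties, '$.clown_arrival') IS NULL
ORDER BY event_date DESC;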

Group transactions in one stored procedure, or separate into two stored procedures?

I have a complex web site that handles online games with real money. I use a double-entry database design for my transactions; a simple example is as follows:
John deposits $5
John receives 5000 credits
John uses those 5000 credits to play a game.
The transaction in the database looks as follows:
trans_id | account_id   | trans_type | date | amount
---------|--------------|------------|------|-------
1        | John(PayPal) | Debit      | date | -5.00
2        | System       | Credit     | date | 5.00
3        | SystemGame   | Debit      | date | -5000
4        | JohnGame     | Credit     | date | 5000
I wrote a stored procedure with a transaction in it that inserts transactions 1 and 2, the Debit from John's PayPal account, and the Credit to our System account.
My question is, should I also include the other transactions where John has money transferred from our SystemGame account to his Game account? Or should I have a stored procedure for each group of transactions? All 4 transactions occur simultaneously; John is credited immediately after depositing the $5.
Also, should I separate the transaction tables for game credits from the real money transaction table?
I think this is what you want, but I recommend doing it in two different transactions to separate in-game money from real money.
The only reason is that if, in the future, you need to change how in-game money or real money is managed, you can do it separately.
A transaction must be atomic: all the steps must be completed. If one fails, everything is rolled back.
If those 4 steps you mention must be all completed at once, they must be in a single transaction.
So, for me, your current approach is OK.
However, if you want to split things into 2 stored procedures, you can manage the transactions from the language (PHP, C#) you use.
Check this: Transactions in MySQL - Unable to Roll Back and this: PHP + MySQL transactions examples
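For what it's worth, here is a rough sketch of the single-procedure, single-transaction version in MySQL; the procedure, table, and column names are hypothetical, and the account identifiers simply mirror the example above:

-- Rough sketch: all four ledger rows written atomically in one transaction.
DELIMITER //
CREATE PROCEDURE record_deposit(IN p_amount  DECIMAL(10,2),
                                IN p_credits DECIMAL(10,2))
BEGIN
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;    -- any failure undoes all four rows
        RESIGNAL;
    END;

    START TRANSACTION;
    INSERT INTO transactions (account_id, trans_type, trans_date, amount)
    VALUES ('John(PayPal)', 'Debit',  NOW(), -p_amount),
           ('System',       'Credit', NOW(),  p_amount),
           ('SystemGame',   'Debit',  NOW(), -p_credits),
           ('JohnGame',     'Credit', NOW(),  p_credits);
    COMMIT;
END //
DELIMITER ;

If you later split this into two procedures (real money vs. game credits), the same atomicity can be preserved by opening the transaction in the calling code, as the answer above suggests.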