Reinforcement Learning with a DQN approach on stock prices

I have programmed a reinforcement learning model with a DQN approach that is supposed to make buy and sell decisions based on stock prices.
For training I use two stock price series: one with an upward trend and one with a downward trend. The time period for both is one year (100,000 data points).
As the observation I use the price data of the last 1,000 data points.
For training I first collect 100 episodes (one episode is one run through a complete price series, where the series (upward or downward trend) is chosen randomly). Per episode I get about 1,000 actions (buy, sell, skip).
Then the training takes place with a batch size of 64.
The problem is that the model specializes in one of the price series and earns a good reward there. On the other series, however, it performs very badly and I get a negative reward.
It seems that the model does not try to optimize the average profit across both series (upward and downward trend).
As the reward I simply take the profit or loss I make per trade. I have set the discount factor to 1.0.
Does anyone have an idea what the problem could be?
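For reference, here is a minimal sketch of the kind of training loop I mean (made-up names and a toy network, not my real code): episodes are drawn at random from both series and all transitions go into one shared replay buffer, so every batch of 64 mixes experience from the up- and down-trending series.

```python
# Minimal sketch of my setup (assumed names, toy network, shortened series).
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

WINDOW = 1000      # observation = last 1,000 prices
N_ACTIONS = 3      # buy, sell, skip
GAMMA = 1.0        # discount factor; I currently have this at 1.0

q_net = nn.Sequential(nn.Linear(WINDOW, 128), nn.ReLU(), nn.Linear(128, N_ACTIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4)
replay = deque(maxlen=100_000)

def trade_reward(prices, t, action):
    # Placeholder for "profit or loss per trade": one-step P&L of a long/short.
    if action == 0:                      # buy
        return prices[t + 1] - prices[t]
    if action == 1:                      # sell
        return prices[t] - prices[t + 1]
    return 0.0                           # skip

def run_episode(prices, epsilon=0.1):
    """One run through a complete price series, storing transitions."""
    for t in range(WINDOW, len(prices) - 1):
        obs = prices[t - WINDOW:t]
        if random.random() < epsilon:
            action = random.randrange(N_ACTIONS)
        else:
            with torch.no_grad():
                action = int(q_net(torch.tensor(obs, dtype=torch.float32)).argmax())
        replay.append((obs, action, trade_reward(prices, t, action),
                       prices[t - WINDOW + 1:t + 1]))

def train_step(batch_size=64):
    batch = random.sample(list(replay), batch_size)
    obs, actions, rewards, next_obs = map(np.array, zip(*batch))
    q = q_net(torch.tensor(obs, dtype=torch.float32))
    q = q.gather(1, torch.tensor(actions).unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_q = q_net(torch.tensor(next_obs, dtype=torch.float32)).max(1).values
        target = torch.tensor(rewards, dtype=torch.float32) + GAMMA * next_q
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Each episode picks the up- or down-trending series at random, so the replay
# buffer (and every batch) contains experience from both regimes.
uptrend = np.cumsum(np.random.normal(0.01, 1.0, 5_000)) + 100    # 100,000 points in my real data
downtrend = np.cumsum(np.random.normal(-0.01, 1.0, 5_000)) + 100
for _ in range(2):                       # 100 episodes in my real setup
    run_episode(random.choice([uptrend, downtrend]))
    for _ in range(50):
        train_step()
```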

Related

Using Poisson Regression to Estimate Rate of Purchase Among Products of Different "Size" (Non-Integers)

An organization is interested in modelling the sales rate of cases of product sold each week. The product is a luxury item, so distributions of sales tend to be small and right-skewed. A typical month (4 weeks) of sales might look like {0, 1, 1, 4}.
While we were originally developing the analysis, this seemed like an obvious application of a GLM -- specifically, Poisson regression to estimate the mean sales rate.
However, the sales team has recently come back and mentioned that they actually sell the product in many smaller sizes, such as 750-mL bottles and 187-mL samples. I could easily convert the sales into equivalent units (a case contains 9 L of product), but this would result in some non-integer sales units. If the previous 4-week sales distribution had all been 750-mL bottles, for example, the new distribution would look like {0, 0.0833, 0.0833, 0.333}.
I would like to be able to model the sales rate in terms of a common unit (a case, or 9 L) and I thought I could use an offset term to do this, but I've run into difficulties whenever there are zero products sold (the offset term is also zero).
My understanding is that the non-integer values preclude the direct use of a Poisson likelihood to model these data (without some sort of transformation). I could simply try a normal linear model, but the sales data are still discrete (i.e., they can only occupy a handful of values determined by the volume of product and number of units sold). I still feel like a discrete model would be more appropriate, but am stumped as to how to account for the different "sizes" of product appearing in the data without simply running a separate model for each product.
Have you ever handled data like these in a similar fashion, and how did you make this accommodation?
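To make the conversion concrete, here is roughly what the data and one candidate model might look like (the column names are invented, and the Tweedie family is just one option I have seen suggested for non-negative, non-integer responses, not something I have settled on):

```python
# Sketch of the unit conversion plus one candidate model (invented column names).
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

CASE_LITRES = 9.0

# One month of 750-mL bottle sales, as in the example above.
sales = pd.DataFrame({
    "week": [1, 2, 3, 4],
    "units_sold": [0, 1, 1, 4],
    "unit_litres": 0.75,
})

# 1 bottle = 0.75 / 9 ≈ 0.0833 case equivalents, giving {0, 0.0833, 0.0833, 0.333}.
sales["case_equiv"] = sales["units_sold"] * sales["unit_litres"] / CASE_LITRES

# Tweedie with 1 < var_power < 2 is compound Poisson-gamma: a point mass at
# zero plus a continuous positive part, so exact zeros and fractions are fine.
model = smf.glm(
    "case_equiv ~ 1",                           # intercept only here; add predictors as needed
    data=sales,
    family=sm.families.Tweedie(var_power=1.5),  # log link by default
)
print(model.fit().summary())
```

Converting the response this way also seems to sidestep my offset problem, since the bottle size enters through the conversion itself rather than through log(exposure).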

When ranking is more important than fitting the value of y in a regression model

Let's say you have a model that predicts the purchases of a specific user over a specific period of time.
A model that predicts whether or not a user will buy seems to work well when users are sorted by that predicted probability.
However, when a model that predicts the purchase amount is built and users are sorted by the predicted amount, it does not reach the expected performance.
For me, it is important to predict that A will pay more than B. Matching the purchase amount is not important.
What metrics and models can be used in this case?
I am using lightgbm regression as a base.
There is a large variance in the purchase amount. Most users spend nothing during a given period, but purchasers spend from a minimum of $1,000 to a maximum of $100,000.
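For reference, here is a rough sketch of the direction I am considering (synthetic data; the Tweedie objective and Spearman rank correlation are just candidates, not settled choices):

```python
# Judge the model on how well it *orders* users by spend (Spearman rank
# correlation) rather than on squared error; synthetic, zero-heavy data.
import numpy as np
import lightgbm as lgb
from scipy.stats import spearmanr
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 10))
buys = X[:, 0] + rng.normal(size=n) > 1.5                      # most users spend nothing
spend = np.where(buys, np.exp(7 + 0.5 * X[:, 1] + rng.normal(size=n)), 0.0)

X_tr, X_te, y_tr, y_te = train_test_split(X, spend, random_state=0)

# Tweedie objective: mass at zero plus a skewed positive part.
model = lgb.LGBMRegressor(objective="tweedie", tweedie_variance_power=1.3)
model.fit(X_tr, y_tr)

rho, _ = spearmanr(model.predict(X_te), y_te)
print(f"Spearman rank correlation on the test set: {rho:.3f}")
```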

How to deal with different subsampling time in economic datasets for deep learning?

I am building a deep learning model for macroeconomic prediction. However, the indicators vary widely in their sampling frequency, ranging from minutes to annual.
Dataframe example
The picture shows 'Treasury Rates (DGS1-20)', which is sampled daily, and 'Inflation Rate (CPALT...)', which is sampled monthly. These features are essential for training the model, and dropping the NaN rows would leave too little data.
I've read some books and articles about how to deal with missing data, including downsampling to monthly time frames, replacing the NaNs with -1, interpolating between the last and next value, etc. But the methods I have read about mostly deal with datasets where about 10% of the values are missing, whereas in my case the monthly sampled 'Inflation (CPI)' column is more than 90% missing once it is combined with the daily 'Treasury Rate' data.
I was wondering whether there is any workaround for handling missing values, particularly for economic data where the sampling frequency varies so widely. Thank you.
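To illustrate, here is a small sketch with invented series of the kind of alignment I am considering -- forward-filling the latest monthly value onto the daily index, or resampling the daily series down to monthly before merging:

```python
# Align a daily series (Treasury rate) with a monthly one (CPI) explicitly,
# instead of treating the monthly gaps as missing values. Invented data.
import numpy as np
import pandas as pd

days = pd.date_range("2020-01-01", "2020-12-31", freq="D")
daily = pd.DataFrame(
    {"treasury_rate": np.random.default_rng(0).normal(1.5, 0.1, len(days))},
    index=days,
)

months = pd.date_range("2020-01-31", "2020-12-31", freq="M")
monthly = pd.DataFrame({"cpi": np.linspace(256, 260, len(months))}, index=months)

# Option 1: upsample CPI to daily -- each day carries the most recently
# published monthly value (backward-looking, so no look-ahead leakage).
merged_daily = pd.merge_asof(daily, monthly, left_index=True, right_index=True)

# Option 2: downsample the daily rate to monthly before joining.
merged_monthly = daily.resample("M").last().join(monthly)

print(merged_daily.tail())
print(merged_monthly.tail())
```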

Mixed effects model

I'm having a hard time finding a mixed-effects model that will fit my data.
My data has the following structure:
Participants have been divided into two disjoint groups
Each participant performs multiple trials
There are 3 types of trials
All participants perform the same trials
During each trial, we collect a metric every 20 ms.
We want to account for differences between the groups and between trial types.
So far, I've tried
Target ~ Time*Trial*Group + (1 + Trial + Time | Participant) + (1 | Participant:Group:Trial)
but the model's NLL value is very low (-230K)
Would appreciate any help,
Thank you
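In case it clarifies the structure, here is a roughly comparable specification on synthetic data in Python's statsmodels (a sketch only; MixedLM supports a single grouping level, so the (1 | Participant:Group:Trial) term is not reproduced here):

```python
# Synthetic data with the same structure: 2 groups, 3 trial types,
# one metric sample every 20 ms per participant and trial.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for p in range(20):                                  # 20 participants
    group = "A" if p < 10 else "B"                   # two disjoint groups
    for trial in ["t1", "t2", "t3"]:                 # three trial types
        for time in np.arange(0.0, 1.0, 0.02):       # a sample every 20 ms
            target = rng.normal((p % 3) + (0.5 if group == "B" else 0.0) + time, 1.0)
            rows.append((f"P{p:02d}", group, trial, time, target))
df = pd.DataFrame(rows, columns=["Participant", "Group", "Trial", "Time", "Target"])

model = smf.mixedlm(
    "Target ~ Time * Trial * Group",     # fixed effects: main effects + interactions
    data=df,
    groups=df["Participant"],            # random effects grouped by participant
    re_formula="~ Time + Trial",         # random intercept and slopes per participant
)
print(model.fit().summary())
```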

Fitting a warehouse pricing tariff into MySQL tables

Creating the right database structure from a manual tariff
I have been assigned a rather challenging database design and thought someone may be able to give me a few pointers to help get going. We currently have a warehouse goods-in and goods-out system, and now we would like to use the data to calculate storage charges.
The database already holds the following: Goods date in, Goods date out, Consignment weight, Number of pieces, Dimensions, Description of goods, Storage container type (if applicable). The data is held in MySQL, which may not be suitable for the tariff structure below.
Here is the charging structure for Bands 1-4. We have about 12 bands, depending on customer size and importance. All the other bands are derivatives of the following:
BAND 1
On arrival in our facility
€0.04 per kilo + €4.00 per consignment for general cargo
€0.07 per kilo for MAGAZINES – NO STORAGE CHARGE
STORAGE CHARGES AFTER 5 DAYS
€4.00 per intact pallet max size 120x100x160cm (Standard warehouse wooden pallet)
€6.50 per cubic metre on loose cargo or out of gauge cargo.
CARGO DELIVERED IN SPECIFIC CONTAINERS
20FT PALLET ONLY - €50.00
40FT PALLET ONLY - €20.00
BAND 2
€0.04 per kilo, no minimum charge
STORAGE CHARGES AFTER 6 DAYS
€2.50 per cubic metre
CONTAINERS
20FT PALLET ONLY - €50.00
40FT PALLET ONLY - €20.00
BAND 3
€0.03 per kilo + €3.00 per consignment up to 2000kg
€0.02 per kilo + €2.00 per consignment over 2000kg
STORAGE CHARGES AFTER 5 DAYS
€4.00 per pallet max size 120x100x160cm
€0.04 per kilo loose cargo
BAND 4
€5.00 per pallet
STORAGE CHARGES AFTER 4 DAYS
€5.00 per pallet max size 120x100x160cm
My thoughts so far are to capture the charging band on arrival of the freight and then try to fit the tariff into a table with some normalisation, such as container type.
Has anyone had experience of this type of manual-to-system conversion?
Probably the algorithm for computing the tariff is too messy to do in SQL. So, let's approach your question from a different point of view.
1. Build the algorithm in your client language (Java/PHP/VB/...).
2. As you are doing step 1, think about what data is needed -- perhaps a 2-column array of "days" and "Euros"? Maybe something involving "kilos"? Maybe there are multiple patterns -- days and/or kilos?
3. Build the table or tables necessary to store those arrays.
4. Decide how to indicate that kilos is irrelevant -- perhaps by leaving out any rows in the kilo table? Or an extra column that gives size/weight?
My point is that the algorithm needs to drive the task; the database is merely a persistent store.
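For concreteness, here is a sketch of that idea in client code (invented names, Band 1 only). A factor left as None/NULL simply does not apply, which is one way to indicate that kilos is irrelevant for a given rule:

```python
# One record per tariff rule; in MySQL this would be one row per rule,
# fetched with SELECT ... ORDER BY band. Names and layout are illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class TariffRule:
    band: int
    item_type: str                        # 'general', 'magazines', 'pallet', 'loose', '20ft', '40ft'
    free_days: Optional[int] = None       # storage charges start after this many days
    per_kilo: Optional[float] = None
    per_consignment: Optional[float] = None
    per_pallet: Optional[float] = None
    per_cubic_metre: Optional[float] = None
    flat_charge: Optional[float] = None

# Mirrors the Band 1 tariff above.
BAND_1 = [
    TariffRule(1, "general",   per_kilo=0.04, per_consignment=4.00),
    TariffRule(1, "magazines", per_kilo=0.07),                      # no storage charge
    TariffRule(1, "pallet",    free_days=5, per_pallet=4.00),
    TariffRule(1, "loose",     free_days=5, per_cubic_metre=6.50),
    TariffRule(1, "20ft",      flat_charge=50.00),
    TariffRule(1, "40ft",      flat_charge=20.00),
]
```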
Here's another approach. Instead of having columns for days, kilos, etc, just have a JSON string of whatever factors are needed. In the client code, decode the JSON, then have suitable IF statements to act on kilos if present, ELSE ... etc.
Again, the database is just a persistent store; the client is driving the format by what is convenient to it.
In either implementation, there would be a column for the type of item involved, plus the Band. The SELECT would have ORDER BY Band. Note that nothing hard-codes the number 12; any number of bands could be implemented.
Performance? Fetching all ~12 rows and stepping through them -- this should not be a performance problem. If you had a thousand bands, you might notice a slight delay.
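And here is a sketch of the JSON variant (again with invented field names): each row carries a JSON string of whatever factors apply, and the client decodes it and works through plain IF statements:

```python
# Decode a JSON blob of charging factors and apply whichever are present.
# Field names are illustrative; this covers only storage-type rules.
import json

def storage_charge(rule_json: str, *, kilos: float, consignments: int,
                   pallets: int, cubic_metres: float, days_stored: int) -> float:
    factors = json.loads(rule_json)
    if days_stored <= factors.get("free_days", 0):   # no charge inside the free period
        return 0.0
    charge = 0.0
    if "per_kilo" in factors:
        charge += factors["per_kilo"] * kilos
    if "per_consignment" in factors:
        charge += factors["per_consignment"] * consignments
    if "per_pallet" in factors:
        charge += factors["per_pallet"] * pallets
    if "per_cubic_metre" in factors:
        charge += factors["per_cubic_metre"] * cubic_metres
    if "flat_charge" in factors:
        charge += factors["flat_charge"]
    return charge

# e.g. the Band 2 loose-cargo row: storage after 6 days at €2.50 per cubic metre.
row = '{"free_days": 6, "per_cubic_metre": 2.50}'
print(storage_charge(row, kilos=500, consignments=1, pallets=0,
                     cubic_metres=3.2, days_stored=10))   # 3.2 m3 x 2.50 = 8.0
```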