I implemented multi-agent PPO in RLlib with a custom environment. It learns and works well, except for speed. I suspect an underutilized CPU may be the issue, so I want to know what ray/tune/perf/cpu_util_percent measures. Does it measure only the rollout workers, or is it averaged over the learner as well? And what might be the cause? (All my runs average 13% CPU usage.)
Run on GCP:
Ray 2.0
Python 3.9
Torch 1.12
head: n1-standard-8 with 1 V100 GPU
2 workers: c2-standard-60
num_workers: 120 # a worker here is a rollout worker, not a machine (num_workers = num_rollout_workers)
num_envs_per_worker: 1
num_cpus_for_driver: 8
num_gpus: 1
num_cpus_per_worker: 1
num_gpus_per_worker: 0
train_batch_size: 12000
sgd_minibatch_size: 3000
I tried a smaller batch size (4096) with fewer workers (10), and a larger batch size (480000); all resulted in 10-20% CPU usage.
I cannot share the code.
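For what it's worth, the resources requested by this config exactly match the cluster's capacity, so low utilization is not a scheduling shortfall; one plausible reading is that the rollout workers are scheduled but mostly idle (e.g. waiting on the learner or on env stepping). A back-of-envelope sketch, using only the machine sizes and config values listed above:

```python
# CPU accounting for the setup above (machine sizes from the question).

# Cluster capacity
head_cpus = 8             # n1-standard-8
worker_machine_cpus = 60  # c2-standard-60
cluster_cpus = head_cpus + 2 * worker_machine_cpus

# CPUs requested by the trial
num_rollout_workers = 120
cpus_per_worker = 1       # num_cpus_per_worker
cpus_for_driver = 8       # num_cpus_for_driver
requested = num_rollout_workers * cpus_per_worker + cpus_for_driver

print(cluster_cpus, requested)  # 128 128 -> every CPU is reserved
```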
I am training a Deep CNN on a very unbalanced data set for a binary classification problem. I have 90% 0's and 10% 1's. To penalize the misclassification of 1, I am using a class_weight that was determined by sklearn's compute_class_weight(). In the validation tuple passed to the fit_generator(), I am using a sample_weight that was computed by sklearn's compute_sample_weight().
The network seems to be learning fine but the validation accuracy continues to be 90% or 10% after every epoch. How can I solve this data unbalance issue in Keras considering the steps I have already taken to overcome it?
Picture of fit_generator: fit_generator()
Picture of log outputs: log outputs
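For reference, sklearn's 'balanced' heuristic computes each class weight as n_samples / (n_classes * n_class_samples). A minimal pure-Python sketch of that formula for the 90/10 split described above (the counts are illustrative):

```python
from collections import Counter

# Illustrative labels matching the 90%/10% split: 900 negatives, 100 positives
y = [0] * 900 + [1] * 100

counts = Counter(y)
n_samples, n_classes = len(y), len(counts)

# Same formula compute_class_weight('balanced', ...) uses:
# weight_c = n_samples / (n_classes * count_c)
class_weight = {c: n_samples / (n_classes * n) for c, n in counts.items()}
print(class_weight)  # minority class gets weight 5.0, majority ~0.56
```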
It's very strange that your val_accuracy jumps from 0.9 to 0.1 and back. Is your learning rate right? Try lowering it even more.
And my advice: also track an f1 metric.
How did you split the data - do the train and test sets have the same class ratios?
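To see why plain accuracy misleads at a 90/10 split and why f1 helps, here is a small sketch (pure Python, made-up predictions): a degenerate model that always predicts the majority class scores 90% accuracy but an F1 of 0 on the minority class.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score for a single class, from true/false positives and negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100  # degenerate model: always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                         # 0.9 - looks fine
print(f1_for_class(y_true, y_pred, 1))  # 0.0 - reveals the problem
```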
I trained FCN32 from scratch on my data; unfortunately, I am getting a black image as output. Here is the loss curve.
I am not sure whether this training loss curve is normal, or whether I have done something wrong.
I would really appreciate experts' ideas on this:
Why is the output a black image?
Is the network overfitting?
Should I change the lr_mult value in the Deconvolution layer from 0 to some other value?
Thanks a lot
Edited:
I changed the lr_mult value in the Deconvolution layer from 0 to 3, and the following shows the solver:
test_interval: 1000 #1000000
display: 100
average_loss: 100
lr_policy: "step"
stepsize: 100000
gamma: 0.1
base_lr: 1e-7
momentum: 0.99
iter_size: 1
max_iter: 500000
weight_decay: 0.0005
I got the following train-loss curve, and again I am getting a black image. I do not know what the mistake is or why it is behaving like this; could someone please share some ideas? Thanks
There is an easy way to check whether you are overfitting on the training data or just did something wrong in the algorithm: predict on the training data and look at the output. If it is very similar or equal to the desired output, you are overfitting and will probably have to apply dropout and weight regularization.
If the output is also black on the training data, your labels or your optimization metric are probably wrong.
Should I change lr_mult value in Deconvolution layer, from 0 to any other value?
lr_mult = 0 means this layer does not learn (source, source 2). If you want that layer to learn, you should set it to a positive value. Depending on your initialization, this may very well be why the image is black.
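Concretely, Caffe scales each layer's learning rate by its lr_mult (effective LR = base_lr * lr_mult), so a zero multiplier freezes the layer's weights entirely. A toy sketch of that scaling (not Caffe's actual solver code; the numbers are made up):

```python
def sgd_step(weight, grad, base_lr, lr_mult):
    """One plain SGD update with Caffe-style per-layer LR scaling (toy sketch)."""
    return weight - base_lr * lr_mult * grad

w = 0.5
print(sgd_step(w, grad=1.0, base_lr=0.125, lr_mult=0))  # 0.5   (layer frozen)
print(sgd_step(w, grad=1.0, base_lr=0.125, lr_mult=3))  # 0.125 (layer learns)
```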
I want to split my data into 3 partitions for regression: 70% training, 15% validation, and 15% test. Python's scikit-learn provides a way to do that only for training and testing, via cross_validation.train_test_split. Any ideas?
Use cross_validation.train_test_split twice.
First split (70, 30) => (training, validation_test); then split that 30% as (50, 50) => (validation, test).
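A sketch of the same two-step split done by hand with plain Python index arithmetic (in modern scikit-learn the function lives in sklearn.model_selection.train_test_split, but nothing here requires it):

```python
import random

data = list(range(1000))  # stand-in for your samples
random.seed(0)
random.shuffle(data)

# Step 1: 70% train, 30% held out
cut = int(0.7 * len(data))
train, held_out = data[:cut], data[cut:]

# Step 2: split the held-out 30% in half -> 15% validation, 15% test
mid = len(held_out) // 2
validation, test = held_out[:mid], held_out[mid:]

print(len(train), len(validation), len(test))  # 700 150 150
```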
This is for a new feature on http://cssfingerprint.com (see /about for general info).
The feature looks up the sites you've visited in a database of site demographics, and tries to guess what your demographic stats are based on that.
All my demographics are in 0..1 probability format, not ratios or absolute numbers or the like.
Essentially, you have a large number of data points that each pull you towards their own demographics. However, just taking the average is poor, because adding in a lot of generic data drags the number down.
For example, suppose you've visited sites S0..S50. All except S0 are 48% female; S0 is 100% male. If I'm guessing your gender, I want to have a value close to 100%, not just the 49% that a straight average would give.
Also, consider that for most demographics (i.e. everything other than gender), the average is not 50%. For example, the average probability of having kids 0-17 is ~37%. The more a given site's demographics differ from this average (e.g. maybe it's a site for parents, or for child-free people), the more it should count in my guess of your status.
What's the best way to calculate this?
For extra credit: what's the best way to calculate this that is also cheap and easy to do in MySQL?
ETA: I think that something approximating what I want is Φ(AVG(z-score ^ 2, sign preserved)). But I'm not sure if this is a good weighting function.
(Φ is the standard normal distribution function - http://en.wikipedia.org/wiki/Standard_normal_distribution#Definition)
A good framework for these kinds of calculations is Bayesian inference. You have a prior distribution over the demographics - e.g. 50% male, 37% childless, etc. Preferably, you would have it multivariately: 10% male, childless, 0-17, Caucasian, ...; but you can start one-at-a-time.
Starting from this prior, each site contributes new information about the likelihood of a demographic category, and you get the posterior estimate that informs your final guess. Under some independence assumptions, the updating formula is as follows:
posterior odds = (prior odds) * (site likelihood ratio),
where odds = p/(1-p) and the likelihood ratio is a multiplier modifying the odds after visiting the site. There are various formulas for it, but in this case I would just compute the odds for the site's population and for the general population and take their ratio.
For example, for a site that has 35% of its visitors in the "under 20" agegroup, which represents 20% of the population, the site likelihood ratio would be
LR = (0.35/0.65) / (0.2/0.8) = 2.154
so visiting this site would raise the odds of being "under 20" 2.154-fold.
A site that is 100% male would have an infinite LR, but you would probably want to limit it somewhat by, say, using only 99.9% male. A site that is 50% male would have an LR of 1, so it would not contribute any information on gender distribution.
Suppose you start knowing nothing about a person - his or her odds of being "under 20" are 0.2/0.8 = 0.25. Suppose the first site has an LR=2.154 for this outcome - now the odds of being "under 20" becomes 0.25*(2.154) = 0.538 (corresponding to the probability of 35%). If the second site has the same LR, the posterior odds become 1.16, which is already 54%, etc. (probability = odds/(1+odds)). At the end you would pick the category with the highest posterior probability.
There are loads of caveats with these calculations - for example, the assumption of independence likely being wrong, but it can provide a good start.
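The updating rule above is easy to check in a few lines; this sketch reproduces the "under 20" numbers from the example (prior 20%, two sites with the same LR of about 2.154):

```python
def update(prob, likelihood_ratio):
    """One Bayesian update: posterior odds = prior odds * LR, back to probability."""
    odds = prob / (1 - prob) * likelihood_ratio
    return odds / (1 + odds)

# Site with 35% of visitors "under 20" vs 20% in the general population
lr = (0.35 / 0.65) / (0.20 / 0.80)
print(round(lr, 3))  # 2.154

p = 0.20             # prior: 20% of the population is under 20
p = update(p, lr)
print(round(p, 2))   # 0.35 after the first site
p = update(p, lr)
print(round(p, 2))   # 0.54 after a second, identical site
```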
The naive Bayesian formula for your case looks like this (MySQL user variables are written with @):
SELECT probability
FROM (
    SELECT @apriori := CAST(@apriori * ratio / (@apriori * ratio + (1 - @apriori) * (1 - ratio)) AS DECIMAL(30, 30)) AS probability,
           @step := @step + 1 AS step
    FROM (
        SELECT @apriori := 0.5,
               @step := 0
    ) vars,
    (
        SELECT 0.99 AS ratio
        UNION ALL SELECT 0.48
        UNION ALL SELECT 0.48
        UNION ALL SELECT 0.48
        UNION ALL SELECT 0.48
        UNION ALL SELECT 0.48
        UNION ALL SELECT 0.48
        UNION ALL SELECT 0.48
    ) q
) q2
ORDER BY step DESC
LIMIT 1
Quick 'n' dirty: get a male score by multiplying the male probabilities, and a female score by multiplying the female probabilities; predict the larger. (Actually, don't multiply; sum the logs of the probabilities instead.) I think this is a maximum likelihood estimator if you make the right (highly unrealistic) independence assumptions.
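A sketch of that log-score idea on the gender example from earlier in the thread: S0 is 100% male (clipped to 0.999 to avoid log(0)), and the other 50 sites are 48% female, i.e. 52% male.

```python
import math

EPS = 1e-3
p_male = [0.999] + [0.52] * 50  # per-site share of male visitors

def clip(p):
    # Keep probabilities away from 0 and 1 so the logs stay finite
    return min(max(p, EPS), 1 - EPS)

male_score = sum(math.log(clip(p)) for p in p_male)
female_score = sum(math.log(clip(1 - p)) for p in p_male)

print(male_score > female_score)  # True -> predict male
```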
The standard formula for calculating the weighted mean is given in this question and this question
I think you could look into these approaches and then work out how you calculate your weights.
In your gender example above you could adopt a set of weights {1, ..., 0, ..., 1}: a linear decrease from 1 to 0 as the gender value goes from 0% male to 50%, then a corresponding increase back up to 1 at 100%. If you want the effect skewed in favour of the outlying values, you can easily come up with an exponential or trigonometric function that provides a different set of weights; a normal distribution curve will also do the trick.
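One concrete instance of that linear weighting, applied to the earlier gender example (weight w(p) = 2 * |p - 0.5|, so a 50% site gets weight 0 and a 0% or 100% site gets weight 1; the site probabilities are illustrative):

```python
# Per-site probability of "male": S0 is 100% male, the other 50 sites 52%
p_male = [1.0] + [0.52] * 50

def weight(p):
    # Linear ramp: 0 at p = 0.5, rising to 1 at p = 0.0 or p = 1.0
    return 2 * abs(p - 0.5)

num = sum(weight(p) * p for p in p_male)
den = sum(weight(p) for p in p_male)
print(round(num / den, 2))  # 0.68, vs. a plain average of ~0.53
```

The near-50% sites contribute almost nothing, so the single strongly-male site dominates the estimate, which is the behaviour the question asks for.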