Accurate car fuel consumption with OBD-II

I want to calculate the fuel consumption of a car quite accurately with a Raspberry Pi.
The car I use is a Citroen and does not have many available PIDs for OBD-II.
Here are the available PIDs (in decimal): 1, 3-7, 11-15, 17, 19-21, 18, 31-33, 35, 46, 48, 49, 51, 60, 64, 66, 68, 71, 73, 74, 76-80
So I use a method based on MAF (mass air flow) estimation, described here: http://www.lightner.net/obd2guru/IMAP_AFcalc.
I get roughly 20% error on the average consumption. But I'm more interested in the instantaneous consumption, especially during accelerations and decelerations, and there this method doesn't work well: when I use engine braking I read >2 L/100 km instead of 0 (probably because there is still air flow but no fuel flow), and consumption increases with speed more slowly than the dashboard shows.
I think I need a live estimation of the air/fuel ratio.
I would like to obtain values similar to those displayed on the dashboard, but can't find how to do it. Does the on-board computer (OBC) use sensors that are not available via OBD-II?
Two questions:
Can I get a better consumption estimate with the data I have over OBD-II?
Is there a way to get other data via OBD-II, or by some other means? (After all, the OBC manages to do it...)
Thanks!
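For reference, the MAF-based calculation I am doing boils down to something like this (a minimal sketch; the stoichiometric AFR of 14.7 and the fuel density of 0.74 kg/L are assumptions for petrol):

STOICH_AFR = 14.7      # assumed stoichiometric air/fuel ratio for petrol
FUEL_DENSITY = 0.74    # assumed fuel density in kg/L

def maf_l_per_100km(maf_g_per_s, speed_kmh):
    """Instantaneous consumption estimated from MAF alone."""
    fuel_g_per_s = maf_g_per_s / STOICH_AFR                     # fuel mass flow
    fuel_l_per_h = fuel_g_per_s * 3600 / (FUEL_DENSITY * 1000)  # volume flow
    if speed_kmh <= 0:
        return float("inf")  # stationary: L/100 km is undefined
    return fuel_l_per_h / speed_kmh * 100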

All vehicles are continually trimming the fuel rate based on many factors (mostly the #1 O2 sensor) to try to hit 0.5% lean of stoichiometric while meeting the power demand. So you're right: that method is correct in principle, but naive.
I bet you'd go a long way towards accuracy by logging the commanded air-fuel equivalence ratio as well. Please ping me and let me know how it goes.
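Something like the sketch below might be a starting point: it reuses the MAF-based estimate above but scales the stoichiometric AFR by lambda, the commanded equivalence ratio (PID 0x44, which is the 68 in your decimal list). The constants are the same petrol assumptions as before.

STOICH_AFR, FUEL_DENSITY = 14.7, 0.74   # same petrol assumptions as above

def lambda_corrected_l_per_100km(maf_g_per_s, speed_kmh, equiv_ratio):
    """MAF-based estimate scaled by the commanded air-fuel equivalence ratio
    (lambda): actual AFR ~ STOICH_AFR * lambda, so fuel flow ~ MAF / (STOICH_AFR * lambda)."""
    fuel_l_per_h = maf_g_per_s / (STOICH_AFR * equiv_ratio) * 3600 / (FUEL_DENSITY * 1000)
    return float("inf") if speed_kmh <= 0 else fuel_l_per_h / speed_kmh * 100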

Related

Fuel consumption data via OBD2 is wrong - can you help me out?

So I am trying to get real-time fuel consumption data from my car (2021 Kia Sorento PHEV) via OBD2. I've read up on the topic and it seems simple enough.
Fuel consumption in litres per hour (PID 5E hex / 94 dec, "Engine fuel rate") divided by speed in km/h, times 100, gives litres per 100 km.
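In code, what I am computing is roughly this (the readings are hypothetical but of the right magnitude):

def l_per_100km(fuel_rate_l_per_h, speed_kmh):
    """PID 0x5E (engine fuel rate, L/h) divided by speed (km/h), scaled to L/100 km."""
    if speed_kmh <= 0:
        return float("inf")   # stationary: instantaneous L/100 km is undefined
    return fuel_rate_l_per_h / speed_kmh * 100

print(l_per_100km(fuel_rate_l_per_h=9.0, speed_kmh=50.0))   # ~18 L/100 km, the kind of value I see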
The problem is: the results are absurd. When I coast around town at ~50 km/h and the gauge cluster reads an instantaneous fuel consumption of ~3-4 L/100 km, the OBD2 data suggests a usage of ~17-21 L/100 km.
I've also calculated the fuel rate in L/h manually using MAP, AFR, etc. from the OBD-II port, and I arrive at the same value for litres per hour and therefore the same absurd instantaneous consumption values.
OBD2 Bluetooth dongles and popular apps like "Car Scanner" or Torque also report this insanely high instantaneous consumption.
So I am asking you: is there some alternative formula for fuel consumption that I (and the developers of all those Android apps) am not aware of?
Thanks :)
Instantaneous consumption can show some "wild" results.
Top Gear's Richard Hammond made reference to this in one series when he pointed out he was getting 99mpg going downhill.
If you want an accurate check of fuel consumption, the most accurate method I know of is to "brim" the tank, drive, then "brim" the tank again. You then have the distance travelled and the fuel used, and from those the consumption.

Does any H2O algorithm support multi-label classification?

Does the deep learning model in H2O support multi-label classification problems, or does any other H2O algorithm?
Original response variable (tags):
apps, email, mail
finance,freelancers,contractors,zen99
genomes
gogovan
brazil,china,cloudflare
hauling,service,moving
ferguson,crowdfunding,beacon
cms,naytev
y,combinator
in,store,
conversion,logic,ad,attribution
After mapping them onto the keys of a dictionary, the response variable looks like this:
[74]
[156, 89]
[153, 13, 133, 40]
[150]
[474, 277, 113]
[181, 117]
[15, 87, 8, 11]
Thanks
No, H2O only contains algorithms that learn to predict a single response variable at a time. You could turn each unique combination into a single class and train a multi-class model that way, or predict each class with a separate model.
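For the "separate model per class" route (binary relevance), a rough sketch with H2O's Python API; the toy data, the GBM estimator and its parameters are just placeholders:

import h2o
import pandas as pd
from h2o.estimators import H2OGradientBoostingEstimator

h2o.init()

# Hypothetical toy data: feature columns plus a list-of-tags column.
df = pd.DataFrame({
    "feature_1": [0.2, 1.5, 0.7, 2.1],
    "feature_2": [3, 1, 2, 0],
    "tags": [["apps", "email"], ["finance", "zen99"], ["apps"], ["finance"]],
})

all_labels = sorted({t for tags in df["tags"] for t in tags})
for label in all_labels:                                  # one 0/1 column per label
    df["has_" + label] = df["tags"].apply(lambda tags: int(label in tags))

hf = h2o.H2OFrame(df.drop(columns=["tags"]))              # H2O frames can't hold list cells
features = ["feature_1", "feature_2"]

models = {}
for label in all_labels:                                  # binary relevance: one model per label
    target = "has_" + label
    hf[target] = hf[target].asfactor()                    # classification, not regression
    model = H2OGradientBoostingEstimator(ntrees=50, min_rows=1)   # min_rows=1 only because the toy set is tiny
    model.train(x=features, y=target, training_frame=hf)
    models[label] = model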
Any algorithm that creates a model that gives you "finance,freelancers,contractors,zen99" for one set of inputs, and "cms,naytev" for another set of inputs is horribly over-fitted. You need to take a step back and think about what your actual question is.
But in the meantime, here is one idea: train some word embeddings (or use pre-trained ones) on your answer words. You could then average the vectors for each set of values, and hope this gives you a good numeric representation of the "topic". You then need to turn your, say, 100-dimensional averaged word vector into a single number (PCA comes to mind). And now you have a single number that you can give to a machine learning algorithm, and that it can predict.
You still have a problem: having predicted a number, how do you turn that number back into a 100-dim vector, from there into a topic, and from there into topic words? Tricky, but maybe not impossible.
(As an aside, if you turn the above "single number" into a factor and have the machine learning model do a categorization, i.e. predict the most similar topic to those it has seen before... you've basically gone full circle and will get a model identical to the one you started with, with too many classes.)
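A rough sketch of the averaging idea, with random stand-in vectors where your trained embeddings would go:

import numpy as np
from sklearn.decomposition import PCA

DIM = 100
rng = np.random.default_rng(0)
vocab = ["apps", "email", "mail", "finance", "freelancers", "contractors",
         "zen99", "cms", "naytev", "genomes"]
embeddings = {w: rng.normal(size=DIM) for w in vocab}   # stand-ins for trained embeddings

def average_vector(tags):
    vecs = [embeddings[t] for t in tags if t in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

tag_sets = [["apps", "email", "mail"],
            ["finance", "freelancers", "contractors", "zen99"],
            ["cms", "naytev"],
            ["genomes"]]
X = np.vstack([average_vector(ts) for ts in tag_sets])

# Collapse each averaged embedding to one scalar: a single-number "topic" response.
y = PCA(n_components=1).fit_transform(X).ravel()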

SARSA and Q-learning (reinforcement learning) don't converge to the optimal policy

I have a question about my own project for testing reinforcement learning techniques. First let me explain the purpose. I have an agent which can take 4 actions during 8 steps. At the end of these eight steps, the agent can be in one of 5 possible victory states. The goal is to find the minimum-cost path. To reach these 5 victory states (which have different cost values: 50, 50, 0, 40, 60), the agent does not take the same path (it's like a graph). The blue states are the fail states (sorry for the quality), and there the episode is stopped.
[state-transition diagram omitted]
The genuinely optimal path is: DCCBBAD
Now my question: I don't understand why with SARSA & Q-learning (mainly Q-learning) the agent finds a path, but not the optimal one, after 100,000 iterations (always DACBBAD/DACBBCD). Sometimes when I run it again the agent does find the good path (DCCBBAD). So I would like to understand why it sometimes finds it and sometimes doesn't, and whether there is something to look at in order to stabilize my agent.
Thank you a lot,
Tanguy
TL;DR:
Set your epsilon so that you explore a bunch for a large number of episodes. E.g. Linearly decaying from 1.0 to 0.1.
Set your learning rate to a small constant value, such as 0.1.
Don't stop your algorithm based on number of episodes but on changes to the action-value function.
More detailed version:
Q-learning is only guaranteed to converge under the following conditions:
You must visit all state-action pairs infinitely often.
The sum of the learning rates over all timesteps must be infinite: Σ_t α_t = ∞.
The sum of the squares of the learning rates over all timesteps must be finite: Σ_t α_t² < ∞.
To hit 1, just make sure your epsilon does not decay to a low value too early. Make it decay very, very slowly, and perhaps never all the way to 0. You can try something like ε_t = 1/t, too.
To hit 2 and 3, you must take care of 1, so that every state-action pair accumulates infinitely many learning-rate updates, and you must also pick your learning rate so that the sum of its squares is finite. That basically means α ≤ 1. If your environment is deterministic, you should try α = 1. Deterministic here means that taking an action a in a state s always leads to the same next state s', for every state and action in your environment. If your environment is stochastic, you can try a low value, such as 0.05-0.3.
Maybe check out https://youtu.be/wZyJ66_u4TI?t=2790 for more info.
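To make that concrete, here is a minimal tabular Q-learning sketch with the suggested schedules; the env.reset()/env.step() interface is a hypothetical stand-in for your 8-step task:

import random
from collections import defaultdict

def q_learning(env, actions, episodes=100_000, alpha=0.1, gamma=1.0, tol=1e-6):
    """Tabular Q-learning with linearly decaying epsilon, a small constant
    learning rate, and a stopping test on changes to the action-value function."""
    Q = defaultdict(float)                                 # Q[(state, action)]
    for ep in range(episodes):
        eps = max(0.1, 1.0 - ep / (0.8 * episodes))        # linear decay 1.0 -> 0.1
        state, done = env.reset(), False
        biggest_update = 0.0
        while not done:
            if random.random() < eps:                      # explore
                action = random.choice(actions)
            else:                                          # exploit
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)    # hypothetical env API
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
            update = alpha * (reward + gamma * best_next - Q[(state, action)])
            Q[(state, action)] += update
            biggest_update = max(biggest_update, abs(update))
            state = next_state
        if eps <= 0.1 and biggest_update < tol:            # stop on convergence, not episode count
            break
    return Q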

deep autoencoder training, small data vs. big data

I am training a deep autoencoder (for now 5 layers of encoding and 5 layers of decoding, using leaky ReLU) to reduce the dimensionality of the data from about 2000 dimensions to 2. I can train my model on 10k data points, and the outcome is acceptable.
The problem arises when I use bigger data (50k to 1M). Using the same model with the same optimizer, dropout, etc. does not work, and the training gets stuck after a few epochs.
I am trying to do some hyperparameter search on the optimizer (I am using Adam), but I am not sure whether this will solve the problem.
Should I look for something else to change or check? Does the batch size matter in this case? Should I solve the problem by fine-tuning the optimizer? Should I play with the dropout ratio? ...
Any advice is very much appreciated.
p.s. I am using Keras. It is very convenient. If you do not know about it, then check it out: http://keras.io/
I would have the following questions when trying to find a cause of the problem:
1) What happens if you change the size of the middle layer from 2 to something bigger? Does it improve the performance of the model trained on >50k training set?
2) Are 10k training examples and test examples randomly selected from 1M dataset?
My guess is that your model is simply not able to compress and then reconstruct your 50k-1M data using just 2 dimensions in the middle layer. So it's easier for the model to fit its parameters on 10k data points, and the middle-layer activations are more meaningful in that case, but for >50k data the activations are close to random noise.
After some investigation, I have realized that the layer configuration I am using is somehow ill-suited to the problem, and this seems to cause at least part of the problem.
I have been using a sequence of layers for encoding and decoding. The layer sizes were chosen to decrease linearly, for example:
input: 1764 (dims)
hidden1: 1176
hidden2: 588
encoded: 2
hidden3: 588
hidden4: 1176
output: 1764 (same as input)
However, this seems to work only occasionally, and it is sensitive to the choice of hyperparameters.
I tried to replace this with exponentially decreasing layer sizes for encoding (and the reverse for decoding), so:
1764, 128, 16, 2, 16, 128, 1764
In this case the training seems to happen more robustly. I still have to run a hyperparameter search to see whether this configuration is sensitive, but a few manual trials seem to show that it is robust.
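For reference, the exponentially tapered layout in Keras looks roughly like this (a sketch; Dense layers with LeakyReLU and Adam/MSE are just the particular choices made here):

from keras.layers import Dense, Input, LeakyReLU
from keras.models import Model

def build_autoencoder(input_dim=1764, sizes=(128, 16, 2)):
    inp = Input(shape=(input_dim,))
    x = inp
    for n in sizes:                        # encoder: 1764 -> 128 -> 16 -> 2
        x = Dense(n)(x)
        x = LeakyReLU()(x)
    encoded = x
    for n in reversed(sizes[:-1]):         # decoder: 2 -> 16 -> 128
        x = Dense(n)(x)
        x = LeakyReLU()(x)
    out = Dense(input_dim)(x)              # linear output back to 1764 dims
    autoencoder = Model(inp, out)
    autoencoder.compile(optimizer="adam", loss="mse")
    return autoencoder, Model(inp, encoded)

autoencoder, encoder = build_autoencoder()
# autoencoder.fit(X_train, X_train, epochs=50, batch_size=256, validation_split=0.1)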
I will post an update if I encounter some other interesting points.

Should calls to kernels be encoded to fit the "GPU layout", or the "algorithm layout"?

I'm just learning about CUDA/OpenCL, but I have a conceptual question. Suppose I am writing an algorithm to do a breadth-first search on a graph. Suppose my target device is a GPU with only 2 work-groups of 2 processing elements (i.e., 4 cores). Intuitively, I guess the search could be done in parallel by keeping an array of "nodes to visit". For each pass, each node is visited in parallel and added to the next "nodes to visit" array. For example, this graph could spawn the following search:
start: must visit A
parallel pass 1: must visit B, C, D
parallel pass 2: must visit E, F, G, H, I, J, K, L, M
parallel pass 3: must visit N, O, P
parallel pass 4: must visit Q
parallel pass 5: done
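In plain Python, as host-side pseudocode, that pass-by-pass idea looks like this (the adjacency dict is a hypothetical graph of that shape):

def bfs_passes(graph, start):
    """Level-synchronous BFS: each pass (conceptually one kernel launch)
    visits the current frontier and builds the next "nodes to visit" list."""
    visited = {start}
    frontier = [start]
    passes = []
    while frontier:
        passes.append(frontier)
        next_frontier = []
        for node in frontier:              # on the GPU: one work-item per node
            for neigh in graph.get(node, []):
                if neigh not in visited:
                    visited.add(neigh)
                    next_frontier.append(neigh)
        frontier = next_frontier
    return passes

# For a graph of the shape above, passes would be:
# [['A'], ['B', 'C', 'D'], ['E', ..., 'M'], ['N', 'O', 'P'], ['Q']]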
A way to do it, I suppose, would be to call an ND-range kernel 5 times (example in OpenCL):
clEnqueueNDRangeKernel(queue, kernel, 1, 0, 1, 1, 0, NULL, NULL);
clEnqueueNDRangeKernel(queue, kernel, 1, 0, 1, 3, 0, NULL, NULL);
clEnqueueNDRangeKernel(queue, kernel, 1, 0, 1, 9, 0, NULL, NULL);
clEnqueueNDRangeKernel(queue, kernel, 1, 0, 1, 3, 0, NULL, NULL);
clEnqueueNDRangeKernel(queue, kernel, 1, 0, 1, 1, 0, NULL, NULL);
(Of course, this is hard-coded for this case, so to avoid that I guess I could keep a counter of nodes to visit.) This sounds wrong, though: it doesn't fit the layout of the GPU at all. For example, on the third call you are using 1 work-group of 9 work-items, when your GPU has 2 work-groups of 2 work-items...!
Another way I can see would be the following (example in OpenCL):
while (work not complete...)
clEnqueueNDRangeKernel(queue, kernel, 1, 0, 2, 2, 0, NULL, NULL);
This would call "clEnqueueNDRangeKernel" continuously, in a way that perfectly fits the GPU layout, until it receives a signal that the work is done. But this time, the IDs received by the kernel wouldn't fit the layout of the algorithm!
What is the right way to do this? Am I missing something?
Your question is one of the more non-obvious and interesting ones.
IMO, you should implement a "pure", easy-to-analyze algorithm if you want your application to run on different hardware. If someone else comes across your implementation, at least it will be easy to tweak.
Otherwise, if you put performance first, hard-code every single piece of software to achieve optimal performance on a single target platform. Whoever works with your code later will have to learn the hardware peculiarities anyway.
From what I can infer from your question, I'd say that the short answer is both.
I say that because the way you want to solve a problem is always linked to how you posed that problem.
The best illustration is the number of sorting algorithms that exist. From the pure sorting point of view ("I need my data sorted"), the problem has been solved for a long time. However, that hasn't stopped researchers from investigating new algorithms, because they keep adding constraints and/or new input to the problem: "I need my data sorted as fast as possible" (the reason why algorithms are categorized with big-O notation), "I need my data sorted as fast as possible knowing that the data structure is...", "knowing that there is an X% chance that...", "I don't care about speed but I care about memory", etc.
Now your problem seems to be: I want a breadth-first search algorithm that runs efficiently on GPUs (this I'm guessing; why learn OpenCL/CUDA otherwise?).
This simple sentence hides a lot of constraints. For instance you have to take into account that:
it takes a lot of time to send the data across the PCIe bus (for discrete GPUs);
global memory access latency is high;
threads work in lockstep (the exact number varies between vendors);
latency is hidden with throughput; and so on.
Note also that this is not necessarily the same as "I want a parallel breadth-first search algorithm" (which could run on a CPU, again with different constraints).
A quick Google search with the keywords "breadth first search parallel GPU" returns, among others, these articles that seem promising (I only went through the abstracts):
Efficient Parallel Graph Exploration on Multi-Core CPU and GPU (Use CPU and GPU)
An Effective GPU Implementation of Breadth-First Search
Scalable GPU Graph Traversal (From NVIDIA)