How to create a bar chart with a specific time frame on x, % of efficiency on y and columns width depending on duration? - bar-chart

I am studying the utilization of machines and would like to make a special bar chart.
My data set I created is made like this"
Program Name, Start Timestamp, Finish Timestamp, % of efficiency
I manually added a date column and duration thinking it would be useful in the future.
The chart I need would ideally have "Time" on x axis, "%" on y axis and the column width would vary depending on the duration. This would allow me to visually see gaps between programs and identify how to optimize the machine schedule.
The possibility to make a date filter would help.
A time filter to drill in the data could be nice too.
I can use R, Python, Excel, PowerBI ...
Chart Needed
Tried scripting in R, the closest I get is on excel but with a huge table which is not practical and complex to reproduce in the future.

Related

Anylogic: How to create an objective function using values of two dataset (for optimization experiment)?

In my Anylogic model I have a population of agents (4 terminals) were trucks arrive at, are being served and depart from. The terminals have two parameters (numberOfGates and servicetime) which influence the departures per hour of trucks leaving the terminals. Now I want to tune these two parameters, so that the amount of departures per hour is closest to reality (I know the actual departures per hour). I already have two datasets within each terminal agent, one with de amount of departures per hour that I simulate, and one with the observedDepartures from the data.
I already compare these two datasets in plots for every terminal:
Now I want to create an optimization experiment to tune the numberOfGates and servicetime of the terminals so that the departure dataset is the closest to the observedDepartures dataset. Does anyone know how to do create a(n) (objective) function for this optimization experiment the easiest way?
When I add a variable diff that is updated every hour by abs( departures - observedDepartures) and put root.diff in the optimization experiment, it gives me the eq(null) is not allowed. Use isNull() instead error, in a line that reads the database for the observedDepartures (see last picture), but it works when I run the simulation normally, it only gives this error when running the optimization experiment (I don't know why).
You can use the absolute value of the sum of the differences for each replication. That is, create a variable that logs the | difference | for each hour, call it diff. Then in the optimization experiment, minimize the value of the sum of that variable. In fact this is close to a typical regression model's objectives. There they use a more complex objective function, by minimizing the sum of the square of the differences.
A Calibration experiment already does (in a more mathematically correct way) what you are trying to do, using the in-built difference function to calculate the 'area between two curves' (which is what the optimisation is trying to minimise). You don't need to calculate differences or anything yourself. (There are two variants of the function to compare either two Data Sets (your case) or a Data Set and a Table Function (useful if your empirical data is not at the same time points as your synthetic simulated data).)
In your case it (the objective function) will need to be a sum of the differences between the empirical and simulated datasets for the 4 terminals (or possibly a weighted sum if the fit for some terminals is considered more important than for others).
So your objective is something like
difference(root.terminals(0).departures, root.terminals(0).observedDepartures)
+ difference(root.terminals(1).departures, root.terminals(1).observedDepartures)
+ difference(root.terminals(2).departures, root.terminals(2).observedDepartures)
+ difference(root.terminals(3).departures, root.terminals(2).observedDepartures)
(It would be better to calculate this for an arbitrary population of terminals in a function but this is the 'raw shape' of the code.)
A Calibration experiment is actually just a wizard which creates an Optimization experiment set up in a particular way (with a UI and all settings/code already created for you), so you can just use that objective in your existing Optimization experiment (but it won't have a built-in useful UI like a Calibration experiment). This also means you can still set this up in the Personal Learning Edition too (which doesn't have the Calibration experiment).

Can I find price floors and ceilings with cuda

Background
I'm trying to convert an algorithm from sequential to parallel, but I am stuck.
Point and Figure Charts
I am creating point and figure charts.
Decreasing
While the stock is going down, add an O every time it breaks through the floor.
Increasing
While the stock is going up, add an X every time it breaks through the ceiling.
Reversal
If the stock reverses direction, but the change is less than a reversal threshold (3 units) do nothing. If the change is greater than the reversal threshold, start a new column (X or O)
Sequential vs Parallel
Sequentially, this is pretty straight forward. I keep a variable for the floor and ceiling. If the current price breaks through the floor or ceiling, or changes more than the reversal threshold, I can take the appropriate action.
My question is, is there a way to find these reversal point in parallel? I'm fairly new to thinking in parallel, so I'm sorry if this is trivial. I am trying to do this in CUDA, but I have been stuck for weeks. I have tried using the finite difference algorithms from NVidia. These produce local max / min but not the reversal points. Small fluctuations produce numerous relative max / min, but most of them are trivial because the change is not greater than the reversal size.
My question is, is there a way to find these reversal point in parallel?
one possible approach:
use thrust::unique to remove periods where the price is numerically constant
use thrust::adjacent_difference to produce 1st difference data
use thrust::adjacent_difference on 1st difference data to get the 2nd difference data, i.e the points where there is a change in the sign of the slope.
use these points of change in sign of slope to identify separate regions of data - build a key vector from these (e.g. with a prefix sum). This key vector segments the price data into "runs" where the price change is in a particular direction.
use thrust::exclusive_scan_by_key on the 1st difference data, to produce the net change of the run
Wherever the net change of the run exceeds a threshold, flag as a "reversal"
Your description of what constitutes a reversal may also be slightly unclear. The above method would not flag a reversal on certain data patterns that you might classify as a reversal. I suspect you are looking beyond a single run as I have defined it here. If that is the case, there may be a method to address that as well - with more steps.

mysql query to vb.net line chart

I'm creating a trend (line chart) in vb.net populating with data from a mysql database. Currently, I have x-axis as datetime and y-axis as just a test value (integer). As in most HMI/SCADA trends you can pause, go back, fwd (scroll) in the trend (I just have buttons). I have my y-axis interval changeable. So, what I do is look at the Id from the DB as min/max. Then when I want to scroll left/right I just increment/decrement the min/max and then use these new values for my where clause in the sql query. All of this works very well. It seems to have very little overhead and is very fast/responsive. So far I have tested with over 10 million rows and don't see any noticeable performance difference. However, at some point in time my Id (being auto increment) will run out. I know it's a large number but depending on my poll rate it could fill up in just a few years.
So, does anyone have a better approach to this? Basically I'm trying to eliminate the need to query more points than what I can view in the chart at one time. Also, if I have millions of rows I don't want to load these points if I don't have to. I also need to have this somewhat future proof. Right now I just don't feel comfortable with it.

Rescale data to be sinuosoidal

I have some time series data I'm looking at in Python that I know should follow a sine2 function, but for various reasons doesn't quite fit it. I'm taking an FFT of it and it has a fairly broad frequency spread, when it should be a very narrow single frequency. However, the errors causing this are quite consistent--if I take data again it matches very closely to the previous data set and gives a very similar FFT.
So I've been trying to come up with a way I can rescale the time axis of the data so that it is at a single frequency, and then apply this same rescaling to future data I collect. I've tried various filtering techniques to smooth the data or to cut frequencies from the FFT without much luck. I've also tried fitting a frequency varying sine2 to the data, but haven't been able to get a good fit (if I was able to, I would use the frequency vs time function to rescale the time axis of the original data so that it has a constant frequency and then apply the same rescaling to any new data I collect).
Here's a small sample of the data I'm looking at (the full data goes for a few hundred cycles). And the resulting FFT of the full data
Any suggestions would be greatly appreciated. Thanks!

Statistical Process Control Charts in SQL Server 2008 R2

I'm hoping you can point me in the right direction.
I'm trying to generate a control chart (http://en.wikipedia.org/wiki/Control_chart) using SQL Server 2008. Creating a basic control chart is easy enough. I'd just calculate the mean and standard deviations and then plot them.
The complex bit (for me at least) is that I would like the chart to reset the mean and the control limits when a step change is identified.
Currently I'm only interested in a really simple method of identifying a step change, 5 points appearing consecutively above or below the mean. There are more complex ways of identifying them (http://en.wikipedia.org/wiki/Western_Electric_rules) but I just want to get this off the ground first.
The process I have sort of worked out is:
Aggregate and order by month and year, apply row numbers.
Calculate overall mean
Identify if each data item is higher, lower or the same as the mean, tag with +1, -1 or 0.
Identify when their are 5 consecutive data items which are above or below the mean (currently using a cursor).
Recalculate the mean if 5 points are above or 5 points are below the mean.
Repeat until end of table.
Is this sort of process possible in SQL server? It feels like I maybe need a recursive UDF but recursion is a bit beyond me!
A nudge in the right direction would be much appreciated!
Cheers
Ok, I ended up just using WHILE loops to iterate through. I won't post full code but the steps were:
Set up a user defined table data type in order to pass data into a stored procedure parameter.
Wrote accompanying stored procedure that uses row numbers and while loops to iterate along each data value in the input table and then uses the current row number to do set based processing on a subset of the input data (to check if following 5 points are above/below mean and recalculate the mean and standard deviations when this flag is tripped).
Outputs table with original values, row numbers, months, mean values, upper control limit and lower control limit.
I've also got one up and running that works based on full Nelson rules and will also state which test the data has failed.
Currently it's only been used by me as I develop it further so I've set up an Excel sheet with some VBA to dynamically construct a SQL string which it passes to a pivot table as the command text. That way you can repeatedly ping the USP with different data sets and also change a few of the other parameters on how the procedure runs (such as adjusting control limits and the like).
Ultimately I want to be able to pass the resulting data to Business Objects reports and dashboards that we're working on.