Can and should I normalize, center, and standardize my imagery dataset? - deep-learning

I am working with a data set for medicine and applying it to a CNN. But I have two questions:
First, can I apply more than one preprocessing technique to my data (normalize, center, and standardize)? For example, centering my data (by subtracting the mean) and then normalizing or standardizing it.
Second, in case it's a good idea to apply more than one preprocessing technique, is there any particular order I should consider in applying these processes? For example, would it be okay if I normalize the data and then center it, or should I first center it and then normalize it?
I am particularly interested in normalizing and centering my data. My intuition tells me that I should first center the data and then normalize it, but I'm not sure, and I don't even know if it's correct to apply more than one of these techniques to the same dataset.
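To make the three terms in the question concrete, here is a minimal sketch in plain Python. It assumes that "normalize" means min-max scaling to [0, 1], "center" means subtracting the mean, and "standardize" means subtracting the mean and dividing by the standard deviation; other definitions exist, so treat these as assumptions. The function names and the sample pixel values are made up for illustration.

```python
from statistics import pstdev


def mean(xs):
    return sum(xs) / len(xs)


def center(xs):
    """Subtract the mean so the values average to zero."""
    m = mean(xs)
    return [x - m for x in xs]


def min_max_normalize(xs):
    """Rescale the values linearly so they span [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def standardize(xs):
    """Subtract the mean and divide by the (population) standard deviation."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


# Hypothetical intensity values, not taken from any real dataset.
pixels = [12.0, 40.0, 75.0, 130.0, 255.0]

normalize_only = min_max_normalize(pixels)
center_then_normalize = min_max_normalize(center(pixels))
normalize_then_center = center(min_max_normalize(pixels))
standardized = standardize(pixels)
```

Under these definitions the order matters. Min-max scaling after centering gives the same values as min-max scaling alone, because the subtracted constant cancels in (x - min) / (max - min), so the result is no longer centered. Scaling first and centering afterwards yields values with zero mean and a range of 1. Standardizing already includes the centering step.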

Related

SSRS Charts: solution for better assignment of colors to the legend

If you have a line chart in SSRS with many lines, it is nearly impossible to identify which line belongs to which item in the legend, as the colors are nearly the same. Is there a better solution?
[Image: bad example of line chart legend]
Some suggestions that may help:
Group some of the values into an Other group. It looks like you have
some values that come and go, or don't run for the full timeframe of the
report; lumping these into an Other group will mean fewer legend items.
Move the legend to the bottom of the chart. This can sometimes make
the legend easier to see; this is not a good option when you have a lot
more legend items than what you have now.
Use more than one chart; one chart for each line is possible. This
may be a good option for you. Use more than one chart, and only
display certain values in each. Perhaps you have some natural
grouping in the data that isn’t obvious from what you have provided
in the question. If you do, use that to separate the values into
different charts.
Use a different color theme. The theme you are using now would leave
any color-blind person wondering what was in the chart at all.
Make the chart larger. You just never know, this may work.
Use a column chart rather than a line chart. The bars are wider, and
can be easier to see. Plus, with the way your values come and go, it
may be a better way to visualize the data.
Limit the timeframe of the data being displayed. Having less data may
make this look better, but that may defeat the purpose of the report.
Still, it’s an option.
Good luck.
All good ideas by R Richards. I often end up with charts looking like yours. The first thing I do is ask: is this of any use to the end user? If not, I'll try to rationalise the chart. Some of the ideas in the earlier answer are things I try, but you can also try the following without reducing the amount of data in the chart.
Simply make the lines thicker. It's much easier to identify the colours with thicker lines.
Add tooltips to the data points so that the user can hover over the
lines and get info about the line and/or point.
Use a custom palette. The default palette does not have many colours in it (7, I think), so colours are repeated. Creating a custom palette with more colours will make it easier to identify each line. It also means that, if you can ensure the order of series in your data, you can produce consistent charts where a colour always represents a specific business object.
If you have breaks in the data, change the chart to use an average
to give you a continuous line. I think your x axis has to be set as
a time type for this to work, I can't remember off the top of my
head.
Here's a before and after the first two ideas were applied to a sample chart I built.
If you think you need to reduce the data, group the lines with smaller values together and then add a drill-down chart to show these lines.

igraph: layout by node attribute

Is there a way to weigh layout according to node attributes in igraph? In other words, how to get nodes that share the same characteristics (but do not have edges between them) cluster more closely together?
While many layout functions can take edge weight into account, the nodes that I want to be closer to each other do not have edges between them. An example of such a situation is a bipartite graph. Using layouts such as fruchterman.reingold is not very informative, as vertices of the two different types are interspersed. However, I do not want it to be as extreme as the layout.bipartite option either, as it would be rather messy when there are lots of vertices. What I wish for is a layout that is somewhere between these two, with vertices of the same type on one side, clustered according to certain attributes, and with edges between the two types.
Any idea or suggestion will be greatly appreciated. Thanks!
igraph layouts are simply matrices with 2 columns and N rows, so you can easily re-use one layout with another graph as long as the two graphs share the same number of nodes. You can make use of this here: create a graph where you connect the nodes that you want to be placed close to each other, calculate the layout using this graph, and then plot your original graph with the layout you have calculated.
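The helper-graph step described above can be sketched in plain Python as follows. This is only an illustration under assumptions: the function name `helper_edges`, the sample edge list and the `node_type` attribute are all hypothetical, and connecting every pair of nodes that share an attribute value is just one possible way to choose the extra edges.

```python
from itertools import combinations


def helper_edges(edges, attribute):
    """Return the original edges plus one extra edge between every pair
    of nodes that share the same attribute value.

    `attribute[i]` is the attribute value of node i.
    """
    groups = {}
    for node, value in enumerate(attribute):
        groups.setdefault(value, []).append(node)

    existing = {tuple(sorted(e)) for e in edges}
    extra = [
        pair
        for members in groups.values()
        for pair in combinations(members, 2)
        if pair not in existing
    ]
    return list(edges) + extra


# Hypothetical bipartite graph: nodes 0-2 are type "a", nodes 3-4 are type "b".
edges = [(0, 3), (1, 3), (1, 4), (2, 4)]
node_type = ["a", "a", "a", "b", "b"]

# Edge list for the helper graph that is used only to compute the layout.
layout_edges = helper_edges(edges, node_type)
```

With python-igraph, for instance, one could build a graph from `layout_edges`, compute a layout on it with a method such as `layout_fruchterman_reingold()`, and then pass that layout when plotting the original graph, since both graphs have the same number of nodes.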

Fastest way to draw a grid

I need to draw a (possibly large) grid of squares. I wonder what sort of layout is the fastest to render.
each square positioned absolutely
div for each row, filled with floating squares
an actual table
some other?
Not sure there will be much difference in rendering time - they all use a similar amount of code.
The absolute positioning is likely to create the most css, so I'd personally avoid that one.
Is it for showing tabular data or just decoration? If the former, use a table.
If the latter, could you achieve it with a tiled graphic as the background of a single cell instead?
Also: A javascript loop to create them is likely to keep your code tidier than writing each square manually.

Equally distribute objects across a bezier curve

Can somebody walk me through how this madness works:
http://www.youtube.com/watch?v=KL8QLLmUvbg
Specifically I'm interested in equally distributing a given number of squares along a path. I'm also wondering if this would work with multiple line segments-- this is one curved segment and I need a solution to distribute objects across one big line with multiple curves in it.
Basically I'm trying to make a tail that realistically follows a character.
Thanks
First, a Bezier spline is a curve parametrized by t. However, t is not arc length along the curve. So the procedure is this:
Calculate the length of the bezier curve.
Find the t values that divide the curve into N equal length segments.
However these two steps are tricky.
The first has a closed-form solution only for quadratic Beziers. (You can find the solution here.)
Otherwise you use a subdivide and approximate approach, or a numerical integration approach (and in some sense these are equivalent - I'd go the numerical integration approach as this has better provable behavior at the cost of slightly trickier implementation, but you may or may not care about that.)
The second is basically a "guess a t value and improve it" approach (using the same style of calculation at each step as step 1). I'd implement this using a secant-style search, as I suspect the derivatives required to use a Newton's method search would be too expensive to calculate.
Once you've got the positions of the objects, you need to use the curve tangent and normal to create a local reference frame for each object. This allows the objects to sit nicely in the path of the curve, rather than all having the same orientation. Note that this only works nicely in 2D - in 3D you can still get some weird behavior with object orientation.
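Here is a rough sketch of the two steps for a single 2D cubic Bezier segment, in plain Python. Note the simplifications: it approximates the arc length with a sampled polyline (the subdivide-and-approximate route) and finds each t by looking it up in the cumulative length table with linear interpolation, rather than the secant search suggested above. All function names and the `samples` parameter are made up for this example.

```python
import math
from bisect import bisect_left


def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    a, b, c, d = u * u * u, 3 * u * u * t, 3 * u * t * t, t * t * t
    return (a * p0[0] + b * p1[0] + c * p2[0] + d * p3[0],
            a * p0[1] + b * p1[1] + c * p2[1] + d * p3[1])


def bezier_tangent(p0, p1, p2, p3, t):
    """First derivative of the cubic Bezier curve at parameter t."""
    u = 1.0 - t
    a, b, c = 3 * u * u, 6 * u * t, 3 * t * t
    return (a * (p1[0] - p0[0]) + b * (p2[0] - p1[0]) + c * (p3[0] - p2[0]),
            a * (p1[1] - p0[1]) + b * (p2[1] - p1[1]) + c * (p3[1] - p2[1]))


def equally_spaced(p0, p1, p2, p3, count, samples=200):
    """Return `count` (x, y, angle) tuples spaced evenly by arc length."""
    # Step 1: approximate the cumulative arc length with a polyline.
    ts = [i / samples for i in range(samples + 1)]
    pts = [bezier_point(p0, p1, p2, p3, t) for t in ts]
    lengths = [0.0]
    for a, b in zip(pts, pts[1:]):
        lengths.append(lengths[-1] + math.hypot(b[0] - a[0], b[1] - a[1]))
    total = lengths[-1]

    # Step 2: for each target length, find the t value that reaches it.
    placed = []
    for k in range(count):
        target = total * k / (count - 1) if count > 1 else 0.0
        i = min(max(bisect_left(lengths, target), 1), samples)
        seg = lengths[i] - lengths[i - 1]
        frac = (target - lengths[i - 1]) / seg if seg > 0 else 0.0
        t = min(1.0, max(0.0, ts[i - 1] + frac * (ts[i] - ts[i - 1])))
        x, y = bezier_point(p0, p1, p2, p3, t)
        dx, dy = bezier_tangent(p0, p1, p2, p3, t)
        # The tangent direction gives the object's rotation in 2D.
        placed.append((x, y, math.atan2(dy, dx)))
    return placed


# Hypothetical control points for a single curved segment.
squares = equally_spaced((0, 0), (0, 1), (1, 1), (1, 0), count=5)
```

Each returned tuple holds a position and a rotation angle taken from the tangent, which is enough to orient an object along the curve in 2D. Raising `samples` makes the length table more accurate at the cost of more evaluations.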
You can start by looking into how a bezier curve is calculated. Wikipedia has some nice animations with the explanation and this link has some as3 code.
But if you're trying to create a tail, there are simpler ways of doing that, like using a following behaviour or a physics library.
I ended up creating a following behavior system like Daniel recommended, for simplicity's sake. But to elaborate on Michael's awesome answer, I stumbled onto this tutorial which details the spline technique.
http://gamedev.tutsplus.com/tutorials/implementation/create-a-glowing-flowing-lava-river-using-bezier-curves-and-shaders/

Design ideas for displaying large amounts of data in an html table

I have an html table that literally has like 30 columns of data, and I'm having a hard time framing it in such a way that it can be visible without massive left/right scrolling.
One thing I was wondering is if anyone has ever seen anything clever with column headers? Some of them just can't be abbreviated down enough, but the column header is something like "Interview" and the value is numeric (lots of wasted space for the header alone). Granted, I could try and name these columns like INT or whatever, but there are lots of similarly named columns that it could become confusing.
Maybe some sort of auto collapsing columns based on mouse movement? Not sure.. I just need some creative suggestions on how to display this data!
Most likely the user will have a devil of a time comprehending 30 columns of data, regardless of scrolling.
I would recommend showing the most fundamental columns (things like name, description, identifying numbers -- core stuff, hopefully there are only 10 of them or fewer), and then letting the user toggle on or off whatever columns they need. A bit like Google Squared.
Use jQuery and CSS to accomplish this in a clean fashion. There may also be JavaScript UI libraries that do this for you (jQuery UI, YUI, others...)
Create images for the column names and rotate the text in the image 90 degrees. You can then have long names with equally small column widths.
Josh
I agree with the answer from ferocious: toggling columns is a good idea. Also, depending on the data, I would recommend only having a few columns displayed, and when the user clicks on the row they are interested in, it moves to a new page dedicated to the data in that record. This will work for some types of data and not for others.