Controlling which columns appear on hover in a Plotly choropleth map

I am plotting data using the Plotly Express choropleth function. I want the hover box to show only the top line (the value of the country1 column) when hovering over the map.
I am trying to eliminate the "Remittances (Million USD)=101 to 300" and "iso_alpha=USA" lines.
The code is:
import pandas as pd
import plotly.express as px
df = pd.read_csv('remi_top10.csv')
fig = px.choropleth(df, locations="iso_alpha",
                    color='Remittances (Million USD)',  # remittances received in July 2022
                    hover_name="country1",  # column to add to hover information
                    title="Remittances Received During the Month of July 2022",
                    #color_continuous_scale=px.colors.sequential.Plasma,
                    #color_discrete_sequence=px.colors.sequential.Plasma_r,
                    color_discrete_sequence=["red", "navy", "blue", "goldenrod", "magenta", "purple", "pink", "green"],
                    width=1000,
                    height=550)
fig.update_traces(  #hovertemplate=None,
    # hovertext = '%{country1}',
    #hoverinfo='skip' #all, skip, none
)
fig.update_layout(
    font_family="Courier New",
    font_color="blue",
    title_font_family="Times New Roman",
    title_font_color="green",
    legend_title_font_color="green",
    #title_x=0.5 #to center title
)
fig.update_xaxes(title_font_family="Arial")
fig.show()
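One way to get there, sketched under assumptions: Plotly Express accepts a dict for hover_data, and a column mapped to False is left out of the hover box. The column names below are taken from the question; HIDDEN_HOVER_COLUMNS and build_map are hypothetical names introduced for illustration, and the sketch has not been run against remi_top10.csv.

```python
# Columns mapped to False are left out of the hover box.
HIDDEN_HOVER_COLUMNS = {
    "iso_alpha": False,
    "Remittances (Million USD)": False,
}


def build_map(df):
    """Return a choropleth whose hover box keeps only the country1 line."""
    # Imported here so the mapping above can be inspected without Plotly.
    import plotly.express as px

    return px.choropleth(
        df,
        locations="iso_alpha",
        color="Remittances (Million USD)",
        hover_name="country1",
        hover_data=HIDDEN_HOVER_COLUMNS,
    )
```

With the question's df, build_map(df).show() should leave just the bold country1 line in the hover box.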

Niftis being plotted skewed

When I plot single images they appear skewed, but they don't look that way when I open them in 3DSlicer or another viewer. I'm not sure if there's something I should be adjusting that I'm not aware of. This is how I converted from DICOM:
from glob import glob
import dicom2nifti
import matplotlib.pyplot as plt
import nibabel as nib
from nilearn import plotting  # assumption: plotting is nilearn.plotting
# Convert both DICOM series to compressed NIfTI files
dicom2nifti.convert_directory(path_to_dicom_before, path_to_dicom_before_converted, compression=True, reorient=True)
dicom2nifti.convert_directory(path_to_dicom_post, path_to_dicom_post_converted, compression=True, reorient=True)
print(glob(path_to_dicom_before_converted + '*.nii.gz'))
nii_before = nib.load(glob(path_to_dicom_before_converted + '*.nii.gz')[0])
nii_after = nib.load(glob(path_to_dicom_post_converted + '*.nii.gz')[0])
nii_before_data = nii_before.get_fdata()
nii_after_data = nii_after.get_fdata()
fig, ax = plt.subplots(figsize=[10, 5])
plotting.plot_img(nii_before, cmap='gray', axes=ax)
plt.show()
fig, ax = plt.subplots(figsize=[10, 5])
plotting.plot_img(nii_after, cmap='gray', axes=ax)
plt.show()
plt.imshow(nii_before_data[100], cmap='bone')
plt.axis('off')
plt.show()
Affine of the first:
[[-3.19454312e-01 7.17869774e-02 3.95075195e-02 6.01478424e+01]
[ 5.83867840e-02 2.97792435e-01 -2.28872180e-01 1.27874863e+02]
[ 4.69673797e-02 1.18071720e-01 5.53225577e-01 1.12181287e+03]
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00]]
As explained in this answer, nii_before_data[100] selects row 100 with all columns and all slices, not a single slice. You also need to plot the pixel array nii_before_data rather than the whole NIfTI image nii_before, which also contains other data (header, affine).
You can try:
nii_before = nib.load(glob(path_to_dicom_before_converted + '*.nii.gz')[0])
nii_after = nib.load(glob(path_to_dicom_post_converted + '*.nii.gz')[0])
nii_before_data = nii_before.get_fdata()
nii_after_data = nii_after.get_fdata()
## Same goes for nii_after_data
if len(nii_before_data.shape) == 3:
    # 3D volume: show every slice along the third axis
    for slice_number in range(nii_before_data.shape[2]):
        plt.imshow(nii_before_data[:, :, slice_number])
        plt.show()
if len(nii_before_data.shape) == 4:
    # 4D volume: show every slice of every frame
    for frame in range(nii_before_data.shape[3]):
        for slice_number in range(nii_before_data.shape[2]):
            plt.imshow(nii_before_data[:, :, slice_number, frame])
            plt.show()
If you can provide a sample NIfTI image, the solution can be made more precise for your data.
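One further thing that may be worth checking, offered as a guess rather than a diagnosis: the affine in the question implies non-cubic voxels, and plt.imshow draws square pixels by default, so a plane that includes the third axis can look squashed. The sketch below derives the voxel edge lengths from the column norms of the affine's upper-left 3x3 block, with the values copied from the question; voxel_sizes is a hypothetical helper name.

```python
import math

# Upper-left 3x3 block of the affine quoted in the question.
AFFINE_3X3 = [
    [-0.319454312, 0.0717869774, 0.0395075195],
    [0.0583867840, 0.297792435, -0.228872180],
    [0.0469673797, 0.118071720, 0.553225577],
]


def voxel_sizes(matrix):
    """Return the voxel edge length along each array axis (column norms)."""
    return tuple(
        math.sqrt(sum(matrix[row][col] ** 2 for row in range(3)))
        for col in range(3)
    )


dx, dy, dz = voxel_sizes(AFFINE_3X3)
print(f"voxel sizes: {dx:.3f}, {dy:.3f}, {dz:.3f}")

# For nii_before_data[100] the image rows run along axis 1 and the columns
# along axis 2, so the pixel height/width ratio for imshow's `aspect`
# argument would be dy / dz.
aspect_for_row_slice = dy / dz
```

If the sizes differ, passing the ratio as plt.imshow(..., aspect=aspect_for_row_slice) is one way to draw the plane with physically proportioned pixels.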

Plotly Express: Prevent bars from stacking when Y-axis categories have the same name

I'm new to plotly.
Working with:
Ubuntu 20.04
Python 3.8.10
plotly==5.10.0
I'm building a comparative graph using a horizontal bar chart: different instruments measuring the same chemical compounds. I want to be able to do an at-a-glance, head-to-head comparison of the measured values across all machines.
The problem is that if a compound has the same name on different instruments, Plotly stacks the data bars into a single bar with segment markers. I very much want each bar to appear individually. Is there a way to prevent Plotly Express from automatically stacking the common bars?
Examples:
CODE
import plotly.graph_objects as go
gobardata = []
for blended_name in _df[:20].blended_name:  # should always be unique
    ##################################
    # Unaltered compound names
    compound_names = [str(c) for c in _df[_df.blended_name == blended_name]["injcompound_name"].tolist()]
    # Random number added to end of compound_names to make every string unique
    # compound_names = ["{} ({})".format(str(c),random.randint(0, 1000)) for c in _df[_df.blended_name == blended_name]["injcompound_name"].tolist()]
    ##################################
    deltas = _df[_df.blended_name == blended_name]["delta_rettime"].to_list()
    # One horizontal bar trace per instrument
    gobardata.append(
        go.Bar(
            name=blended_name,
            x=deltas,
            y=compound_names,
            orientation='h',
        ))
fig = go.Figure(data=gobardata)
fig.update_traces(width=1)
fig.update_layout(
    bargap=1,
    bargroupgap=.1,
    xaxis_title="Delta Retention Time (Expected - actual)",
    yaxis_title="Instrument name(Injection ID)"
)
fig.show()
What I'm getting (Using actual, but repeated, compound names)
What I want (Adding random text to each compound name to make it unique)
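The random suffix in the commented-out line can collide, since two bars may draw the same number. A deterministic alternative, sketched here in plain Python, is to suffix each compound with the instrument it came from; unique_labels is a hypothetical helper and the sample names are made up.

```python
def unique_labels(instrument, compounds):
    """Suffix each compound with its instrument so labels differ across traces."""
    return [f"{compound} ({instrument})" for compound in compounds]


# Made-up sample: the same compound measured on two instruments.
labels_a = unique_labels("GC-1", ["benzene", "toluene"])
labels_b = unique_labels("GC-2", ["benzene"])
print(labels_a + labels_b)
```

Inside the loop in the question this would read compound_names = unique_labels(blended_name, compound_names), assuming a compound appears at most once per instrument.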
OK. Figured it out. This is probably pretty kludgy, but it consistently works.
Basically...
Use go.FigureWidget...
...with make_subplots having a common x-axis...
...controlling the height of each subplot based on number of bars.
Every bar in each subplot is added as an individual trace...
...using a dictionary matching bar name to a common color.
The y-axis labels for each subplot are a list containing the machine name as [0], followed by blank placeholders ('') so the length of the y-axis list matches the number of bars.
And manually manipulating the legend so each bar name appears only once.
from plotly.subplots import make_subplots
import plotly.graph_objects as go
# Get lists of total data
all_compounds = list(_df.injcompound_name.unique())
blended_names = list(_df.blended_name.unique())
#################################################################
# The heights of each subplot have to be set when fig is created.
# fig has to be created before adding traces.
# So, create a list of dfs, and use these to calculate the subplot heights
dfs = []
subplot_height_multiplier = 20
subplot_heights = []
for blended_name in blended_names:
    df = _df[(_df.blended_name == blended_name)]#[["delta_rettime", "injcompound_name"]]
    dfs.append(df)
    subplot_heights.append(df.shape[0] * subplot_height_multiplier)
chart_height = sum(subplot_heights)  # Prep for the height of the overall chart.
chart_width = 1000
# Make the figure
fig = make_subplots(
    rows=len(blended_names),
    cols=1,
    row_heights=subplot_heights,
    shared_xaxes=True,
)
# Create the color dictionary to match a color to each compound
# (CSS_chart_color_list() is my own helper, not shown here; it returns a list of colors)
_CSS_color = CSS_chart_color_list()
colors = {}
for compound in all_compounds:
    try:
        colors[compound] = _CSS_color.pop()
    except IndexError:
        # Probably ran out of colors, so just reuse
        _CSS_color = CSS_chart_color_list()
        colors[compound] = _CSS_color.pop()
rowcount = 1
for df in dfs:
    # Add bars individually to each subplot
    for label, labeldf in df.groupby('injcompound_name'):
        fig.add_trace(
            go.Bar(x=labeldf.delta_rettime,
                   y=[labeldf.blended_name.iloc[0]] + [""] * (len(labeldf.delta_rettime) - 1),
                   name=label,
                   marker={'color': colors[label]},
                   orientation='h',
                   ),
            row=rowcount,
            col=1,
        )
    rowcount += 1  # move on to the next subplot
# Set figure to FigureWidget
fig = go.FigureWidget(fig)
# Adding individual traces creates redundancies in the legend.
# This removes redundancies from the legend
names = set()
fig.for_each_trace(
    lambda trace:
        trace.update(showlegend=False)
        if (trace.name in names) else names.add(trace.name))
fig.update_layout(
    height=chart_height,
    width=chart_width,
    title_text="∆ of observed RT to expected RT",
    showlegend=True,
)
fig.show()
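The color dictionary and the legend clean-up above are both plain bookkeeping and can be sketched without Plotly. This is an illustration rather than the code from the answer: PALETTE stands in for whatever CSS_chart_color_list() returns, and the compound names are made up.

```python
from itertools import cycle

PALETTE = ["red", "navy", "goldenrod"]  # assumed stand-in palette


def assign_colors(names, palette):
    """Map each name to a color, reusing the palette once it runs out."""
    return dict(zip(names, cycle(palette)))


def first_occurrence_flags(trace_names):
    """Return True for the first trace carrying each name, False for repeats."""
    seen = set()
    flags = []
    for name in trace_names:
        flags.append(name not in seen)
        seen.add(name)
    return flags


colors = assign_colors(["A", "B", "C", "D"], PALETTE)
flags = first_occurrence_flags(["A", "B", "A", "C", "B"])
print(colors, flags)
```

Each flag could be passed as showlegend when the matching go.Bar is created, which would avoid the second pass over the traces.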

Running timeseries graphing function in Rmd producing cluttered x-axis labels (not present in test code)

I have a folder of .csv timeseries files that I want to graph and knit into a clean HTML document. I have ggplot code that produces the plot I want from a single timeseries.csv. However, when I put the bones of that ggplot code in a function and call it from a for loop over all of the timeseries.csv files, I get some plots with quite different formatting.
Plot generated with my test ggplot code:
Plot generated with function and for loop:
Changes I'm trying to make to the ugly Rmd plot:
Nicely space the x-axis tick marks to whole mins (i.e. "11:14:00", "11:15:00")
Connect the data points (solved with subbing geom_line() with geom_path())
Example Rmd code below. Please note that the graphs produced by this example still have nice formatting; I'm not sure how to reproduce the problem short of posting a 500-row dataframe. I also don't know how to post my Rmd code without SO interpreting its formatting commands, so I added quotation marks (") around my header formatting and at the end of the code chunks to disable it.
Edits and Updates
I am getting a persistent error: geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
As suggested by the commenters, I tried removing plot() and calling createChlDiffPlot() directly, and also replacing plot() with print(). Both produce the same ugly plots as before.
Replaced geom_line() with geom_path(). The points are now connected! x-axis cluttering is still there.
Time variable is reading as hms num
Many thanks for any help on this!
```
---
title: "Chl Filtration"
output:
flexdashboard::flex_dashboard:
theme: yeti
orientation: rows
editor_options:
chunk_output_type: console
---
```{r setup}
library(flexdashboard)
library(dplyr)
library(ggplot2)
library(hms)
library(ggthemes)
library(readr)
library(data.table)
#### Example Data
df1 <- data.frame(Time = as_hms(c("11:22:33","11:22:34","11:22:35","11:22:38","11:23:00","11:23:01","11:23:02")),
Chl_ug_L_Up = c(0.2,0.1,0.25,-0.2,-0.3,-0.15,0.1),
Chl_ug_L_Down = c(0.5,0.4,0.3,0.2,0.1,0,-0.1))
df2 <- data.frame(Time = as_hms(c("08:02:33","08:02:34","08:02:35","08:02:40","08:02:42","08:02:43","08:02:49")),
Chl_ug_L_Up = c(-0.2,-0.1,-0.25,0.2,0.3,0.15,-0.1),
Chl_ug_L_Down = c(-0.1,0,0.1,0.2,0.3,0.4,0.1))
data_directory = "./" # data folder in R project folder in the real deal
output_directory = "./" # output graph directory in R project folder
write_csv(df1, file.path(data_directory, "SO_example_df1.csv"))
write_csv(df2, file.path(data_directory, "SO_example_df2.csv"))
#### Function to create graphs
createChlDiffPlot = function(aTimeSeriesFile, aFileName, aGraphOutputDirectory, aType)
{
aFile_Mod = aTimeSeriesFile %>% # %<>% needs library(magrittr), which is not loaded above
select(Time, Chl_ug_L_Up, Chl_ug_L_Down) %>%
mutate(Chl_diff = Chl_ug_L_Up - Chl_ug_L_Down)
one_plot = ggplot(data = aFile_Mod, aes(x = Time, y = Chl_diff)) + # tried adding 'group = 1' in aes to connect points
geom_path(size = 1, color = "green") +
geom_point(color = "green") +
theme_gdocs() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.title = element_blank()) +
labs(x = "", y = "Chl Difference", title = paste0(aFileName, " - ", "Filtration"))
one_graph_name = paste0(sub("\\.csv$", "", aFileName), "_", aType, ".pdf") # strip only a trailing .csv
ggsave(one_graph_name, one_plot, device = "pdf", path = aGraphOutputDirectory, dpi = 600, width = 7, height = 5, units = "in")
return(one_plot)
}
"``` ### remove the quotes when running example
Plots - After Velocity Adjustment
=====================================" ### remove quotes when running example
```{r, fig.width=13.5, fig.height=5}
all_files_Filtration = list.files(data_directory, pattern = "\\.csv$") # pattern is a regex; escape the dot
# Loop to plot function
for(file in 1 : length(all_files_Filtration))
{
file_name = all_files_Filtration[file]
one_file = fread(file.path(data_directory, file_name))
# plot the time series agains
plot(createChlDiffPlot(one_file, file_name, output_directory, "Velocity_Paired"))
}
"``` #remove quotes when running example
```
I finally figured it out.
1) Replacing geom_line() with geom_path() connected the data points when rendered in Rmd.
2) df1$Time was formatted as a difftime object. When I looked at the dataframe in the global environment, Time :hmsnum 11:11:09 .... This made me think my format was ok, but when I ran class(df1$Time) I got [1] "hms" "difftime". With a quick google I found out difftime objects are not quite the same as hms, and my original time was generated by subtracting times. I added a conversion into my mutate function:
select(Time, Chl_ug_L_Up, Chl_ug_L_Down) %>%
mutate(Chl_diff = Chl_ug_L_Up - Chl_ug_L_Down,
Time = as_hms(Time)) # convert difftime object to hms
I think ggplot has some auto-formatting for hms variables, which is why the difftime variable was producing ugly, crowded x-axes.

Straight line appearing in seaborn barplot

I have a line appearing on one bar of my bar plot.
This is what it looks like:
My code is below:
import matplotlib.pyplot as plt
import seaborn as sns
sns.barplot(x='Police force sent NRM referral for Crime Recording', y='Total', data=dfForces1)
sns.despine()
plt.xticks(rotation=90)
import matplotlib.ticker as ticker
ax_data = sns.barplot(x= 'Police force sent NRM referral for Crime Recording', y = 'Total', data=dfForces1) # change as per how you are plotting, just for an example
ax_data.yaxis.set_major_locator(ticker.MultipleLocator(40)) # it would have a tick frequency of 40, change 40 to the tick-frequency you want.
ax_data.yaxis.set_major_formatter(ticker.ScalarFormatter())
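One possible cause, stated as a guess since the plot is not reproduced here: sns.barplot aggregates rows that share an x value (the mean by default) and draws an error bar on that bar, which shows up as a vertical line. A quick way to check is to look for x values that occur on more than one row. The sketch below uses plain Python and made-up sample values; repeated_categories is a hypothetical helper.

```python
from collections import Counter


def repeated_categories(values):
    """Return the categories that occur on more than one row, with counts."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n > 1}


# Made-up sample standing in for the police force column.
forces = ["Force A", "Force B", "Force A", "Force C"]
print(repeated_categories(forces))
```

The same call should work on dfForces1['Police force sent NRM referral for Crime Recording']. If it reports repeats, summing those rows before plotting, or turning error bars off (errorbar=None in recent seaborn releases, ci=None in older ones), may remove the line.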

R: getting google finance JSON data into a dataframe

I am trying to get google finance JSON data into a dataframe.
I tried:
library(jsonlite)
dat1 <- fromJSON("http://www.google.com/finance/info?q=NSE:%20AAPL,MSFT,TSLA,AMZN,IBM")
dat1
However I get an error:
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
parse error: trailing garbage
Thank you for any help.
I could not replicate your error using fromJSON because of proxy issues on my side, but the following works using httr:
require(jsonlite)
require(httr)
#Set your proxy setting if needed
#set_config(use_proxy(url='hostname',port= port,username="",password=""))
url.name = "http://www.google.com/finance/info?q=NSE:%20AAPL,MSFT,TSLA,AMZN,IBM"
url.get = GET(url.name)
#parsing the content as json results in similar error as you encountered
#url.content = content(url.get,type="application/json")
#Error in parseJSON(txt) : parse error: trailing garbage
# " : "0.57" ,"yld" : "2.46" } ,{ "id": "358464" ,"t" : "MSFT"
# (right here) ------^
#read content as html text
url.content = content(url.get, as="text")
#remove html tags
clean.text = gsub("<.*?>", "", url.content)
#remove residual text
clean.text = gsub("\\n|\\//","",clean.text)
DF = fromJSON(clean.text)
head(DF[,1:10],5)
# id t e l l_fix l_cur s ltt lt lt_dts
#1 22144 AAPL NASDAQ 92.51 92.51 92.51 1 4:00PM EDT May 11, 4:00PM EDT 2016-05-11T16:00:02Z
#2 358464 MSFT NASDAQ 51.05 51.05 51.05 1 4:00PM EDT May 11, 4:00PM EDT 2016-05-11T16:00:02Z
#3 12607212 TSLA NASDAQ 208.96 208.96 208.96 1 4:00PM EDT May 11, 4:00PM EDT 2016-05-11T16:00:02Z
#4 660463 AMZN NASDAQ 713.23 713.23 713.23 1 4:00PM EDT May 11, 4:00PM EDT 2016-05-11T16:00:02Z
#5 18241 IBM NYSE 148.95 148.95 148.95 2 6:59PM EDT May 11, 6:59PM EDT 2016-05-11T18:59:12Z
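The same clean-up idea, sketched in Python for comparison: the parse error and the gsub calls above suggest the payload is valid JSON preceded by a // marker, so discarding everything before the first bracket lets a standard parser accept it. The sample string is a shortened, hypothetical stand-in modelled on the fields shown above, since the endpoint itself may no longer respond.

```python
import json

# Hypothetical response body: a "//" prefix followed by a JSON array.
raw = (
    '\n// [ { "id": "22144", "t": "AAPL", "e": "NASDAQ", "l": "92.51" },\n'
    '{ "id": "358464", "t": "MSFT", "e": "NASDAQ", "l": "51.05" } ]\n'
)


def parse_prefixed_json(text):
    """Parse JSON after discarding any text before the first '[' or '{'."""
    starts = [i for i in (text.find("["), text.find("{")) if i != -1]
    if not starts:
        raise ValueError("no JSON array or object found")
    return json.loads(text[min(starts):])


quotes = parse_prefixed_json(raw)
print([quote["t"] for quote in quotes])
```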
I got the below code from here. Let me know if this helps. On a side note, I would also recommend netfonds. Netfonds is the only source I've found that provides intra-day tick level data for both historical prices and the open book. I posted some additional links below for pulling the Netfonds data if you're interested.
http://www.blackarbs.com/blog/3/22/2015/how-to-get-free-intraday-stock-data-from-netfonds
http://www.onestepremoved.com/free-stock-data/
""" googlefinance
This module provides a Python API for retrieving stock data from Google Finance.
"""
import urllib.request
from datetime import date
_month_dict = {
    'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4,
    'May': 5, 'Jun': 6, 'Jul': 7, 'Aug': 8,
    'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
# Google doesn't like Python's user agent...
_USER_AGENT = 'Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11'
def __request(symbol):
    # Python 3 port: the original subclassed urllib.FancyURLopener (Python 2 only)
    url = 'http://google.com/finance/historical?q=%s&output=csv' % symbol
    req = urllib.request.Request(url, headers={'User-Agent': _USER_AGENT})
    with urllib.request.urlopen(req) as response:
        return response.read().decode('utf-8').strip().strip('"')
def get_historical_prices(symbol, start_date=None, end_date=None):
    """
    Get historical prices for the given ticker symbol.
    Returns a nested list. Fields are Date, Open, High, Low, Close, Volume.
    """
    # Skip the CSV header row, then split each remaining row into fields
    price_data = [data.split(',') for data in __request(symbol).split('\n')[1:]]
    for quote in price_data:
        quote[0] = _format_date(quote[0])
    return price_data
def _format_date(datestr):
    """ Change datestr from Google format ('20-Jul-12') to the format Yahoo uses ('2012-07-20')
    """
    parts = datestr.split('-')
    day = int(parts[0])
    month = _month_dict[parts[1]]
    year = int('20' + parts[2])
    return date(year, month, day).strftime('%Y-%m-%d')
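The date conversion in _format_date above can also be sketched with the standard library's own parser instead of a hand-written month table. This is an alternative illustration, not part of the quoted module; format_date is a hypothetical name.

```python
from datetime import datetime


def format_date(datestr):
    """Convert a date such as '20-Jul-12' to '2012-07-20'."""
    # %b matches abbreviated month names in the current locale (English assumed).
    # %y maps 00-68 to 2000-2068 and 69-99 to 1969-1999, unlike the
    # fixed '20' prefix used in the module above.
    return datetime.strptime(datestr, "%d-%b-%y").strftime("%Y-%m-%d")


print(format_date("20-Jul-12"))
```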
If the Google Finance endpoint returns newline-delimited JSON, the solution in R should be:
library(jsonlite)
dat1 <- stream_in(url("http://www.google.com/finance/info?q=NSE:%20AAPL,MSFT,TSLA,AMZN,IBM"))
But it seems the endpoint is not accepting such requests (any more?):
HTTP status was '403 Forbidden'
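For reference, this is what "newline-delimited JSON" means in practice, sketched in Python with a made-up two-line sample: each line is a complete JSON document, which is the format jsonlite's stream_in expects. parse_ndjson is a hypothetical helper name.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON: one document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Made-up sample with one JSON object per line.
sample = '{"t": "AAPL", "l": "92.51"}\n{"t": "MSFT", "l": "51.05"}\n'
records = parse_ndjson(sample)
print(records)
```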