I'm connecting to BigQuery to get data for a Sankey diagram in Tableau. However, I am getting this information from 2 different datasets: "audience exited" and "audience entered". I am using the user IDs and the original timestamps to join the 2 datasets. However, the timestamps are in a datetime format, and the times do not coincide across datasets, given that a user can exit an audience at 2 a.m. and only enter a new audience at 4 a.m. Hence, I am using FORMAT_DATETIME to remove the time from the original timestamps, e.g. from "2021/07/07 23:32" to "2021-Jul-07", as shown in the SQL code below:
SELECT `audience_exited`.`active_customers` AS `active_customers`,
`audience_exited`.`audience_key` AS `audience_key`,
FORMAT_DATETIME("%Y-%b-%d",auience_exited`.`original_timestamp`) AS `original_timestamp`,
`audience_exited`.`received_at` AS `received_at`,
`audience_exited`.`user_id` AS `user_id`,
`audience_entered`.`active_customers` AS `active_customers__audience_entered_`,
`audience_entered`.`audience_key` AS `audience_key__audience_entered_`,
FORMAT_DATETIME("%Y-%b-%d",`audience_entered`.`original_timestamp`) AS `original_timestamp__audience_entered_`,
`audience_entered`.`received_at` AS `received_at__audience_entered_`,
`audience_entered`.`user_id` AS `user_id__audience_entered_`,
"audience_key" AS Vizside
FROM `dial-a-delivery-ke.personas_personas_prod`.`audience_exited` `audience_exited`
FULL JOIN `dial-a-delivery-ke.personas_personas_prod`.`audience_entered` `audience_entered` ON ((`audience_exited`.`user_id` = `audience_entered`.`user_id`) AND (`audience_exited`.`original_timestamp` = `audience_entered`.`original_timestamp`))
I get the following error when I run it in Tableau:
An error occurred while communicating with the data source
Error Code: 015CFBE6 The Google BigQuery service was unable to compile
the query. Syntax error: Expected ")" but got identifier . at [5:46]
I do not know what to make of this, since everything looks fine to me. Can you please help me with this error?
Try the code below. In your query, the first FORMAT_DATETIME call is missing the opening backtick on the table alias (auience_exited`.`original_timestamp`), which is what triggers the "Expected ")" but got identifier" syntax error; the alias is also misspelled (auience_exited instead of audience_exited). Both are fixed here:
SELECT `audience_exited`.`active_customers` AS `active_customers`,
`audience_exited`.`audience_key` AS `audience_key`,
FORMAT_DATETIME("%Y-%b-%d",`audience_exited`.`original_timestamp`) AS `original_timestamp`,
`audience_exited`.`received_at` AS `received_at`,
`audience_exited`.`user_id` AS `user_id`,
`audience_entered`.`active_customers` AS `active_customers__audience_entered_`,
`audience_entered`.`audience_key` AS `audience_key__audience_entered_`,
FORMAT_DATETIME("%Y-%b-%d",`audience_entered`.`original_timestamp`) AS `original_timestamp__audience_entered_`,
`audience_entered`.`received_at` AS `received_at__audience_entered_`,
`audience_entered`.`user_id` AS `user_id__audience_entered_`,
"audience_key" AS Vizside
FROM `dial-a-delivery-ke.personas_personas_prod`.`audience_exited` `audience_exited`
FULL JOIN `dial-a-delivery-ke.personas_personas_prod`.`audience_entered` `audience_entered` ON ((`audience_exited`.`user_id` = `audience_entered`.`user_id`) AND (`audience_exited`.`original_timestamp` = `audience_entered`.`original_timestamp`))
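One thing worth noting: even with FORMAT_DATETIME in the SELECT list, the join above still compares the raw timestamps, so a 2 a.m. exit and a 4 a.m. entry on the same day will not match. A minimal sketch of joining on the calendar date instead, using BigQuery's DATE() function on the same tables (column list shortened here):

-- Sketch: join on the calendar day so that an exit at 2 a.m. matches an
-- entry at 4 a.m. on the same date.
SELECT audience_exited.user_id,
       DATE(audience_exited.original_timestamp) AS exit_date,
       DATE(audience_entered.original_timestamp) AS entry_date
FROM `dial-a-delivery-ke.personas_personas_prod`.`audience_exited` AS audience_exited
FULL JOIN `dial-a-delivery-ke.personas_personas_prod`.`audience_entered` AS audience_entered
  ON audience_exited.user_id = audience_entered.user_id
 AND DATE(audience_exited.original_timestamp) = DATE(audience_entered.original_timestamp)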
Let me start by saying that I have exhausted all the options I could come up with on my own, and researched each one to a visible dead end.
I have a typical MySQL forum database that includes a post table with about 880,000 rows. The post table contains a column for IP address, and my end goal is to create a bubble map of the world based on the geolocation of each post. Even better would be to split the posts per month and create an animation of post frequency around the world for the past 8 years.
Because this is a personal project and accuracy of IP geolocation is not important, I have ruled out the paid APIs that can batch-convert IP addresses to geolocations. I found various questions on Stack Overflow that linked to a website with databases of IP geolocations: https://dev.maxmind.com/geoip/geoip2/geolite2/
My initial plan was to load this database onto my forum server and use my experience with MySQL to create a new table with just: postid, date (as a Unix timestamp), latitude, longitude, city, country. Then I would export this table to R and generate all the maps and charts I could ever want. However, the geolocation database is more than 3 million rows across two tables, and my dead forum is on a simple shared hosting plan that doesn't allow LOAD DATA. I tried all the solutions in these questions, with no luck:
How to import CSV file to MySQL table
access denied for load data infile in MySQL
LOAD DATA INFILE within PHPmyadmin
PHPMyAdmin saying: The used command is not allowed with this MySQL version
So my next idea was to export the relevant columns from my post table to .csv or .xml, then upload those to my account at iacademy3.oracle.com. However, I'm not experienced with Oracle, and the only method I knew was the Data Load/Unload UI in the Data Workshop. The 177MB XML file failed to upload with the following error:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00222: error received from SAX callback function
Error loading XML.
The 34MB .csv file failed to upload on two attempts with this error:
Failure of Web Server bridge:
No backend server available for connection: timed out after 10 seconds or idempotent set to OFF or method not idempotent.
Now I'm out of ideas. On a post-by-post basis, it's a simple query to look up the post IP in the geolocation database and get the latitude and longitude. But when working with millions of rows, I don't know how to get to my end result.
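For illustration, the per-post lookup I have in mind is something like this (the geolite2_blocks table and its columns are only my guesses; the GeoLite2 CSVs store networks in CIDR form, so they would first need converting into integer start/end ranges):

-- Hypothetical lookup: convert the post's IPv4 address to an integer and
-- find the GeoLite2 block whose integer range contains it.
SELECT p.postid,
       p.date,
       g.latitude,
       g.longitude
FROM post AS p
JOIN geolite2_blocks AS g
  ON INET_ATON(p.ip) BETWEEN g.network_start AND g.network_end;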
Any advice on new approaches or help with my dead ends would be greatly appreciated.
We'll generate some IP addresses, geolocate them, and plot them:
library(iptools)
library(rgeolocate)
library(tidyverse)
Generate a million (way too uniformly distributed) random IPv4 addresses:
ips <- ip_random(1000000)
And, geolocate them:
system.time(
rgeolocate::maxmind(
ips, "~/Data/GeoLite2-City.mmdb", c("longitude", "latitude")
) -> xdf
)
## user system elapsed
## 5.016 0.131 5.217
5s for 1m IPv4s. 👍🏼
Now, due to the uniformity, the bubbles will be stupidly small, so just for this example we'll round the coordinates up a bit:
xdf %>%
mutate(
longitude = (longitude %/% 5) * 5,
latitude = (latitude %/% 5) * 5
) %>%
count(longitude, latitude) -> pts
And, plot them:
ggplot(pts) +
geom_point(
aes(longitude, latitude, size = n),
shape=21, fill = "steelblue", color = "white", stroke=0.25
) +
ggalt::coord_proj("+proj=wintri") +
ggthemes::theme_map() +
theme(legend.justification = "center") +
theme(legend.position = "bottom")
You can see what I mean about "too uniform". But you have "real" IPv4s, so you should be good to go.
Consider using scale_size_area(), but, honestly, consider not plotting IPv4s on a geo-map at all. I do internet-scale research for a living and the accuracy claims leave much to be desired. I rarely go below country-level attribution for that reason (and we pay for "real" data).
I have an SSIS package which uses a SQL command to get data from a Progress database. Every time I execute the query, it throws this specific error:
ERROR [HY000] [DataDirect][ODBC Progress OpenEdge Wire Protocol driver][OPENEDGE]Internal error -1 (buffer too small for generated record) in SQL from subsystem RECORD SERVICES function recPutLONG called from sts_srtt_t:::add_row on (ttbl# 4, len/maxlen/reqlen = 33/32/33) for . Save log for Progress technical support.
I am running the following query:
Select max(ROWID) as maxRowID from TableA
GROUP BY ColumnA,ColumnB,ColumnC,ColumnD
I've had the same error.
After changing the startup parameters -SQLTempStorePageSize and -SQLTempStoreBuff to 24 and 3000 respectively, the problem was solved.
I think that for you the values need to be raised to 40 and 20000.
You can find more information here. The names of the parameters in that article are a bit different from those in my database; it depends on the Progress version which is used.
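For reference, database startup parameters like these are typically passed when starting the broker with proserve (or collected in a .pf parameter file); a sketch with a hypothetical database name:

# Hypothetical example: start the database with larger SQL temp-store settings.
proserve mydb -SQLTempStorePageSize 40 -SQLTempStoreBuff 20000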
I am new to the ST_ functions in MySQL and I think I am missing something. I am trying to save a POLYGON in MySQL; the problem is that when I use ST_GEOMFROMTEXT with polygon coordinates taken from the Google Maps JavaScript API, it returns the error: Invalid GIS data provided to function st_geometryfromtext.
I've read a lot on the Internet, but it mostly says that this is a version problem; the thing is that I am on a recent version (5.7.19).
These are the queries I've tried:
# WORKS
SELECT ST_GEOMFROMTEXT('POLYGON((13.517837674890684 76.453857421875,13.838079936422464 77.750244140625,14.517837674890684 79.453857421875,13.517837674890684 76.453857421875,13.517837674890684 76.453857421875))');
# ALL BELOW RETURN THE ERROR
SELECT ST_GEOMFROMTEXT('POLYGON((19.4254572621497 -99.17182445526123, 19.42574056861496 -99.16570901870728, 19.421551629818985 -99.16558027267456, 19.421288552764135 -99.17210340499878))');
SELECT ST_GEOMFROMTEXT('POLYGON((-99.17182445526123 19.4254572621497, -99.16570901870728 19.42574056861496, -99.16558027267456 19.421551629818985, -99.17210340499878 19.421288552764135 ))');
SELECT ST_GEOMFROMTEXT('POLYGON((19.4249108840002 -99.17023658752441, 19.424951356518726 -99.16802644729614, 19.423393157277722 -99.16796207427979, 19.423393157277722 -99.17019367218018))')
Does anyone know why the queries above are failing? Thank you all.
Please try these queries -
SELECT ST_GEOMFROMTEXT('POLYGON((19.4254572621497 -99.17182445526123, 19.42574056861496 -99.16570901870728, 19.421551629818985 -99.16558027267456, 19.421288552764135 -99.17210340499878, 19.4254572621497 -99.17182445526123))');
SELECT ST_GEOMFROMTEXT('POLYGON((-99.17182445526123 19.4254572621497, -99.16570901870728 19.42574056861496, -99.16558027267456 19.421551629818985, -99.17210340499878 19.421288552764135, -99.17182445526123 19.4254572621497 ))');
SELECT ST_GEOMFROMTEXT('POLYGON((19.4249108840002 -99.17023658752441, 19.424951356518726 -99.16802644729614, 19.423393157277722 -99.16796207427979, 19.423393157277722 -99.17019367218018, 19.4249108840002 -99.17023658752441))')
Basically, the polygon ring needs to be "closed": its last coordinate pair must repeat its first. Your working query above ends with the same point it starts with; the failing ones do not, so each corrected query simply appends the starting point at the end.
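As a minimal illustration with made-up coordinates, note that the last point repeats the first:

SELECT ST_GEOMFROMTEXT('POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))');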
When using the Invantive Query Tool to request the table GLTransactionlines on Exact Online, my query times out.
When I select a single column, the query returns no data. Specifically, I would like to know from what table I can request my transaction lines.
I have used the following query:
select division_code
, gltransaction_date
, gltransaction_journal_code_attr
, glaccount_code_attr
, amount_value
, glaccount_balancetype_attr
from gltransactionlines
where glaccount_balancetype_attr = 'W';
local export results as "${rptoutpath}\TransactionsPLlsc.xlsx" format xlsx
When I select *, the Invantive Query Tool returns that there are too many columns in GLTransactionLines.
The exact error is:
De externe server heeft een fout geretourneerd: (401) Niet gemachtigd. (In English: "The remote server returned an error: (401) Unauthorized.")
It occurs after ten minutes. When I run DebugView alongside, it shows me that the following URL does not return:
Load Exact Online data using URL 'https://start.exactonline.nl/Docs/XMLDownload.aspx?Topic=gltransactions&Params_details=1&Params_documents=0&_Division_=1362280'
When I try to export another Exact Online table, it works. And sometimes fetching the GLTransactionLines works too.
It seems that the XML API for GL transaction lines is slow or malfunctioning in your environment. Please contact your supplier about this. As an alternative, you might want to switch to the REST API, which contains similar data, for example:
select *
from TransactionLines
where financialyear = 2016
and financialperiod = 12
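If select * on TransactionLines still hits the column limit, listing only the columns you need should avoid it; a sketch in the same style as your original query (I am assuming these column names exist on TransactionLines):

select division
,      financialyear
,      financialperiod
,      glaccountcode
,      amountdc
from   TransactionLines
where  financialyear = 2016
and    financialperiod = 12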
I'm having a problem in the Django admin. I'm using version 1.5.5.
I have a Booth model with a foreign key to my AreaLayout model, which in turn chains back through a few more models with foreign keys and many-to-many keys. My model code can be seen on pastebin. The first sign of the problem in the admin is that AreaLayouts are duplicated in the select dropdown of the Booth admin. The MultipleObjectsReturned error (traceback) is then raised when I try to save a new booth. I traced this back to the SQL query that Django creates to fetch the AreaLayout list:
SELECT `venue_arealayout`.`id`, `venue_arealayout`.`name`, `venue_arealayout`.`description`, `venue_arealayout`.`area_id`, `venue_arealayout`.`additional_notes`
FROM `venue_arealayout`
INNER JOIN `venue_area` ON (`venue_arealayout`.`area_id` = `venue_area`.`id`)
INNER JOIN `venue_venue` ON (`venue_area`.`venue_id` = `venue_venue`.`id`)
INNER JOIN `venue_venue_venue_type` ON (`venue_venue`.`id` = `venue_venue_venue_type`.`venue_id`)
INNER JOIN `venue_venuetype` ON (`venue_venue_venue_type`.`venuetype_id` = `venue_venuetype`.`id`)
WHERE (`venue_arealayout`.`id` = 66 )
This query produces a duplicate row when I run it directly in MySQL. Removing the final two JOINs returns a single row (the desired result), whereas removing only the last join still produces the duplicate.
I tried running the query with SELECT * in place of the specific fields, and the two result rows are almost identical. The difference is that the venue in question has multiple venue types, and I get one row for each of them. Is there any way I can tell Django not to include those joins for these queries, or a way to get distinct results as far as AreaLayouts go?
I think you're being caught by the bug reported in ticket 11707. One of the comments mentions that it can cause a MultipleObjectsReturned exception.
All I can suggest is that you stop using limit_choices_to. Your models are fairly complex, so I can't immediately see which one is causing the problem.
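If you do need to keep limit_choices_to, deduplicating the choices is a possible workaround: applying .distinct() to the field's queryset makes Django emit the same query with SELECT DISTINCT, which collapses the duplicate rows because the selected AreaLayout columns are identical across them. A sketch of the resulting SQL, based on the query above:

SELECT DISTINCT `venue_arealayout`.`id`, `venue_arealayout`.`name`, `venue_arealayout`.`description`, `venue_arealayout`.`area_id`, `venue_arealayout`.`additional_notes`
FROM `venue_arealayout`
INNER JOIN `venue_area` ON (`venue_arealayout`.`area_id` = `venue_area`.`id`)
INNER JOIN `venue_venue` ON (`venue_area`.`venue_id` = `venue_venue`.`id`)
INNER JOIN `venue_venue_venue_type` ON (`venue_venue`.`id` = `venue_venue_venue_type`.`venue_id`)
INNER JOIN `venue_venuetype` ON (`venue_venue_venue_type`.`venuetype_id` = `venue_venuetype`.`id`)
WHERE (`venue_arealayout`.`id` = 66)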