What is the fromage option in the Indeed XML Job Search API? - json

Here's the link to the API: https://ads.indeed.com/jobroll/xmlfeed
You might need to log in to see it, but here's the raw text. fromage is one of the options, and I've been trying to figure out precisely what it does, to no avail:
st Site type. To show only jobs from job boards use "jobsite". For jobs from direct employer websites use "employer".
jt Job type. Allowed values: "fulltime", "parttime", "contract", "internship", "temporary".
start Start results at this result number, beginning with 0. Default is 0.
limit Maximum number of results returned per query. Default is 10
fromage Number of days back to search.
highlight Setting this value to 1 will bold terms in the snippet that are also present in q. Default is 0.
filter Filter duplicate results. 0 turns off duplicate job filtering. Default is 1.
latlong If latlong=1, returns latitude and longitude information for each job result. Default is 0.
co Search within country specified. Default is us. See below for a complete list of supported countries.

Never mind, I figured it out by just tweaking the query in the URL. It acts as "days ago posted". For example, if you need jobs that were posted in the last 24 hours, you use 1. If you need jobs from the last week, you use 7. It's useful for keeping the query fresh and up to date.
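As a rough illustration of the above, here is a minimal sketch that builds a query string with fromage set. The parameter names q, fromage, limit and start come from the list in the question; BASE_URL is a hypothetical placeholder and not the real endpoint, and build_search_url is a made-up helper name.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the endpoint given in the API docs.
BASE_URL = "https://api.example.com/jobsearch"


def build_search_url(query, days_back, limit=10, start=0):
    """Build a search URL restricted to jobs posted in the last `days_back` days."""
    params = {
        "q": query,
        "fromage": days_back,  # 1 = last 24 hours, 7 = last week
        "limit": limit,
        "start": start,
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("python developer", days_back=1))
print(build_search_url("python developer", days_back=7))
```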


Read data that is already in output and write back to the output

I have requirement to read the data that is already in the output and join the data to input and write back the data to the same output. This build is scheduled every day.
Input:
ID  Refresh_Date
1   6/8/2022
2   6/8/2022
3   6/8/2022
Historical (Output):
ID  Order Date  Order Closure  Age
1   6/6/2022    6/7/2022       1
2   6/7/2022
3   6/7/2022
4   6/7/2022
The input data will be refreshed with new orders every day, so I have to join the input to the historical data and find the closure date and the time it took to close the order.
The result of the join should be saved as Historical again.
I tried using incremental computation but the output in read mode is always giving me empty dataset.
Your intuition to use the @incremental decorator is correct.
It sounds like your problem is related to the mode in which you are accessing the current dataframe. Check out the documentation on incremental modes of inputs and outputs; in particular, the default mode is added, while you'd probably want to use current or previous for your implementation, as these are the modes that give you access to the data currently within the dataset.
Also, the documentation on the incremental decorator is very helpful overall for understanding how to make incremental computation work for you. Have a look at the different parameters you can pass to your decorator, in particular snapshot_inputs, as it may affect how you access the input dataset as well.

Trying to pull Max date less than Date on the row

I know this is a tough one, but I'm basically trying to say: give me a service call and its completion date, then give me the max date of all service calls whose date is less than the date of the service call I'm inquiring about.
The end result I'm looking for is to be able to say whether there was another service call on this piece of equipment within the previous 30 days.
As you can see in the image, for Asset 50698, service call 579032 has a date of 11/9/2020, and the call below it was on 10/22/2020, which is less than 30 days earlier. I want to find a way to count how many service calls I have where this has occurred. Is this possible?
I think you're looking for a context operator: In, ForEach or ForAll (In in this case).
Add a variable "MaxAssetDate" and assign it a formula similar to the following, based on your column headers.
=Max([Service Call Completion Date] In ([Asset ID];[Service Call])) In ([Asset ID])
Then add this as a column. Provided you have a prompt filtering for a given asset or "date", this column will then show the max date for each service call of the same asset ID. Then add a new variable, ServiceCallDaysDiff, using DatesBetween() with "MaxAssetDate", ServiceCallCompletionDate and DayPeriod: =DatesBetween([ServiceCallCompletionDate];[MaxAssetDate];DayPeriod). You should get a number 0-X. Then add a filter: if the number is between 1 and 30, show those records and hide the rest, or apply whatever logic is needed.
Now, if you're dealing with hundreds of thousands of records this isn't ideal, as you're putting all the processing on the Webi engine when it ideally would occur as an object in the database layer. However, if you only have a few thousand records this should be manageable.
To add a count of service calls...
Add a variable, ServiceCallsCount:
=Sum(Sum(If([ServiceCallDaysDiff]=0;0;1)) In ([AssetID]))
This will count the non-zero day differences. Note this will extend beyond 30 days, so if you want to limit it to 30 days, adjust the If statement to zero out those not between 1 and 30.
This is but one approach: there may be simpler ways.
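For comparison, the underlying logic can be sketched outside Webi in plain Python. This is only an illustration, not a Webi formula: it compares each call with the immediately preceding call on the same asset, which is what the question describes, rather than with the overall max date per asset. The function name and the earlier call's ID (570001) are made up; the asset, call 579032 and the two dates come from the question.

```python
from collections import defaultdict
from datetime import date


def find_repeat_calls(calls, window_days=30):
    """Return (asset_id, call_id, gap_days) for every call that follows
    another call on the same asset by 1..window_days days.

    `calls` is an iterable of (asset_id, call_id, completion_date).
    """
    by_asset = defaultdict(list)
    for asset_id, call_id, completed in calls:
        by_asset[asset_id].append((completed, call_id))

    repeats = []
    for asset_id, items in by_asset.items():
        items.sort()  # oldest call first
        # Pair each call with the one directly before it on the same asset.
        for (prev_date, _), (cur_date, cur_call) in zip(items, items[1:]):
            gap = (cur_date - prev_date).days
            if 1 <= gap <= window_days:
                repeats.append((asset_id, cur_call, gap))
    return repeats


calls = [
    (50698, 570001, date(2020, 10, 22)),  # hypothetical call ID
    (50698, 579032, date(2020, 11, 9)),
]
repeats = find_repeat_calls(calls)
print(repeats)       # [(50698, 579032, 18)]
print(len(repeats))  # 1
```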

Tableau's functions - how to find an equivalent to IF EXISTS

I'm creating a Tableau Dashboard with 'buttons' which are coloured red or green based on certain criteria and what is selected in the filters. The filters are just a way to select different offices in different regions and when selecting an office the buttons should change colour depending on whether the targets for the different metrics have been hit for that office or not.
The navigation buttons on Tableau won't accommodate this so I've made a work around. For each 'button' I've created a worksheet with just the text of the metric name on the Label mark and a calculated field on the colour mark. I've then added the worksheet to the Dashboard and added an action to go to the corresponding metric dashboard when the 'button' is clicked on.
The issue I'm having is the conditional colouring of one of these metrics. This metric is based on stock levels. For each office there are multiple categories of stock types, each with a corresponding target, with multiple 'bins' in each category. I want the button to turn red if ANY of the combined total of stock in the bins for one category is over the target for that category for that office.
To try and type it logically-
For the currently filtered data: IF EXISTS(FOR EACH OFFICE( FOR EACH CATEGORY: [SUM(BinValue)< CategoryTarget])) THEN 'Green' ELSE 'Red'
I've tried to translate that logic into Tableau's functions in a calculated field and have the following:
SUM(INT({INCLUDE [Category]:Min([CategoryTarget])} > {INCLUDE [Category]:SUM(BinValue)}))
This colouring is correct when I add the Office Name and Category pills to the worksheet to test my logic however when I remove the pills the colouring isn't correct. Something seems to be going wrong when I try to sum the number of categories that are within target levels over all offices and targets.
I've tried so many iterations of the following functions and have been going around in circles for days now:
INCLUDE, EXCLUDE, FIXED, IF, SUM, INT
If anyone knows how to do this properly or even just a different way of being able to conditionally colour buttons on a dashboard I would be incredibly grateful.
The structure of my data is as follows with some dummy data as an example:
Region  SubRegion  Office      Category  Bin   BinValue  CategoryTarget
North   NorthWest  Manchester  Toys      B123  30        50
North   NorthWest  Manchester  Toys      B456  40        50
So for a Stock Level metric selecting any of ALL/North/NorthWest/Manchester filter options should flag as red due to the total of the bins in one category in an office being higher than the target amount for that category for that office.
I've updated my calculated field however I'm still having issues with the grouping showing as true/false correctly.
This is what it is now:
MAX( {INCLUDE Category, Office:Sum(BinValue)} > {INCLUDE Category, Office:MIN(CategoryTarget)} )
With True showing as Red and False Green (we want to be below target hence the green).
When working on the example to showcase the issue I managed to get it working.
I ended up using the following logic:
max({EXCLUDE [Bin]:SUM([Bin Value])} > [Category Target])
This meant that even if most of the Offices in the filter were within their stock level targets, if there was one with stock levels over target the 'button' showed as red.
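To make that behaviour concrete, here is a small Python sketch of the same idea, using the dummy rows from the question. It is not Tableau code: it assumes that excluding Bin leaves Office and Category as the grouping, sums the bins per group, and flags red if any group exceeds its target. The function name is hypothetical.

```python
from collections import defaultdict


def button_colour(rows):
    """Return "Red" if any (Office, Category) total exceeds its target."""
    totals = defaultdict(int)
    targets = {}
    for row in rows:
        key = (row["Office"], row["Category"])
        totals[key] += row["BinValue"]          # like SUM over the bins
        targets[key] = row["CategoryTarget"]    # same target on every row of a group
    # "max of booleans" behaves like any(): one breach is enough.
    over_target = any(totals[key] > targets[key] for key in totals)
    return "Red" if over_target else "Green"


rows = [
    {"Office": "Manchester", "Category": "Toys", "Bin": "B123",
     "BinValue": 30, "CategoryTarget": 50},
    {"Office": "Manchester", "Category": "Toys", "Bin": "B456",
     "BinValue": 40, "CategoryTarget": 50},
]
print(button_colour(rows))  # Red, because 30 + 40 = 70 > 50
```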
I published the example I've used anyway in case it helps others in the future.
Link to the Tableau Public dashboard:
https://public.tableau.com/views/ConditionalColouring/Dashboard1?:language=en-GB&:useGuest=true&:display_count=y&:origin=viz_share_link
Thank you very much for the help!
To work with logical conditions, such as testing whether a condition holds for any (or every) record in a group of data rows, it helps to understand that Tableau treats the boolean value "True" as greater than the boolean value "False".
Once you get comfortable with that idea, you can use the functions MAX() (or MIN()) to test whether a condition holds for any record (or for every record, respectively). So MAX(False, False, True, False) is True.
So to tell if any records have an actual value below their target, test MAX([Actual Value] < [Target Value])
You can then combine this idea with dimensions on the viz (or LOD calcs if necessary) to group the data records appropriately before testing your conditions. If you work with the same conditions repeatedly, this type of calculation can be very useful for defining sets that get used in multiple places.
One technical caveat: if your condition test ever evaluates to NULL, those null values are ignored by MIN() and MAX(), just as they are by other aggregation functions. So, for example, you could test whether every record satisfies a condition using MIN() and get a possibly misleading result if all the non-null values are True (so MIN() reports True): MIN(TRUE, TRUE, NULL, TRUE) = TRUE. If your condition can evaluate to NULL, and you don't want to ignore nulls but instead treat them as, say, False, then you can use the IFNULL() function to provide a default value for your condition.
As an example, MIN(IFNULL([Actual Value] > [Target Value], FALSE)) returns True only if every record has a value above its target, treating any records with missing values or targets as failing the condition - i.e. not exceeding the target. The choice of whether to have a default value for a condition, and what it should be, are problem dependent of course. If your data does not have null values, you don't have this complication to consider.
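The same behaviour can be mimicked in a few lines of Python, which may help when reasoning about it. This is only an analogy with made-up helper names (agg_max, agg_min, if_null), not Tableau syntax; None stands in for NULL.

```python
def agg_max(values):
    """MAX over booleans, ignoring nulls: True if ANY non-null value is True."""
    non_null = [v for v in values if v is not None]
    return max(non_null) if non_null else None


def agg_min(values):
    """MIN over booleans, ignoring nulls: True only if EVERY non-null value is True."""
    non_null = [v for v in values if v is not None]
    return min(non_null) if non_null else None


def if_null(value, default):
    """Replace a null with a default, like IFNULL()."""
    return default if value is None else value


print(agg_max([False, False, True, False]))  # True
print(agg_min([True, True, None, True]))     # True (the null is ignored)

# Treat nulls as failing the condition before aggregating.
with_default = [if_null(v, False) for v in [True, True, None, True]]
print(agg_min(with_default))                 # False
```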
Although you have provided very little data, I think this is the calculated field you require:
IF { FIXED [Region], [Sub-Region], [Office], [Category] : SUM([Bin Value])}
> {FIXED [Region], [Sub-Region], [Office], [Category] : MIN([Category Target])} THEN 'RED' ELSE 'GREEN' END
This is based on the assumption that, for every group of region/sub-region/office/category, the target value is the same in each row within the group. Therefore MAX, AVG, etc. will all work in place of the MIN used in the calculation.
See I added two rows in your data
and result

Querying https://musicbrainz.org for all artists

How can I query for all artists who were born after 1720 and died before 1900 on https://musicbrainz.org?
I need to retrieve their IDs and some information about them.
Is it possible to get data in JSON format?
For those who don't want to read a long post, here is everything the OP asked for in only one query:
http://musicbrainz.org/ws/2/artist/?query=begin:[1720 TO 1900] AND end:[1720 TO 1900] AND type:"person"&fmt=json
This should return perfect results, and has got to be the best answer possible.
- all artists, born after 1720 and dead before 1900, in json format, which retrieves their IDs, and lots of information about them...
The explanation and thought process:
Since Brian's currently accepted answer includes a link to the API documentation, I can say it is technically complete, but I don't consider pointing to the spec the best possible answer, and it can be greatly improved.
Firstly it is easy to return json by adding the json format parameter.
&fmt=json
Secondly, while I don't reckon there were many boy bands back in the day, given that the OP is asking about births and deaths we may conclude they are interested only in people rather than groups or other types of artists.
AND type:"person"
At that point, as Brian suggests, you could make another call for each end date and then filter the results, keeping only those who died by 1900.
If you did this, you would need far more than the 180 searches the accepted answer suggests: one for each combination of birth year and death year, from 1720 to 1720 all the way through 1900 to 1900. My math stinks, but that is thousands of searches.
What makes this an even worse search is that dates are sometimes written with only the year and sometimes with year, month and day. For example, if you search for begin 1929 and end 1990, you would not get any results for this artist because of the full birthday:
ex:
id "2b8a16a9-468f-49b0-93ea-5e6726f41643" type "Person" life-span
begin "1929-11-10"
end "1990"
ended true
Therefore in order to get any good results using only the year you would need to add the fuzzy search syntax
musicbrainz.org/ws/2/artist/?query=begin:1960~ AND end:1990~ AND type:"person"&fmt=json
But this does nothing to solve the bigger problem of the sheer number of searches suggested. Knowing the search is Lucene-based, I decided to learn some Lucene and realized there is a range syntax.
Therefore you can do all of the above with one query:
http://musicbrainz.org/ws/2/artist/?query=begin:[1720 TO 1900] AND end:[1720 TO 1900] AND type:"person"&fmt=json
PS: I recommend quoting or even URL-encoding your parameter values to prevent breakage.
For example, leaving the quotes off the begin and end numerals in the example above causes no problem, but leaving them off the type value will fail.
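As a sketch of the URL-encoding advice, the query above could be assembled with Python's standard library so that spaces, brackets and quotes are escaped for you. The function name artist_search_url is hypothetical, and the snippet only builds the URL; it does not send a request.

```python
from urllib.parse import urlencode

BASE_URL = "http://musicbrainz.org/ws/2/artist/"


def artist_search_url(year_from, year_to):
    """Build the single range query from above with URL-encoded parameters."""
    lucene_query = (
        f"begin:[{year_from} TO {year_to}] "
        f"AND end:[{year_from} TO {year_to}] "
        'AND type:"person"'
    )
    # urlencode escapes spaces, brackets, colons and quotes.
    return BASE_URL + "?" + urlencode({"query": lucene_query, "fmt": "json"})


print(artist_search_url(1720, 1900))
```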
First, MusicBrainz only returns XML, as far as I know, so you'll have to convert the results to JSON.
To answer your question, it doesn't look like you'll be able to get the data you want in a single call. (The following is based off the XML Web Service Search documentation.)
This call will retrieve all artists who were born in a given year:
http://musicbrainz.org/ws/2/artist/?query=begin:1720
I believe you'd need to write 180 calls (one for each year between 1720 and 1900) to get the data you need. You'd also need to manually filter out artists who died after 1900, by looking at the <end> node within <life-span>. This is because the end field will only get you artists who died in a specific year.

Paging Results: Handling invalid passed data

Have a simple page that pulls results from MySQL and displays them in a table. I have enabled paging on the results, and allowed the user to set the number of results being displayed per page. I am passing two querystring values to handle this: 'page' and 'count'.
I am then taking these values to calculate the LIMIT's of my MySQL query, using the SQL_CALC_FOUND_ROWS directive and following that with a call to SELECT FOUND_ROWS(); to get the total number of results. This all works nicely.
Now, I want to validate the querystring values. As I am storing the possible "correct" values for the results/page value of 'count' in an array, I simply check that the passed 'count' value is in that array, and if not set it to the default value. For the 'page' value, I am having a bit of a mental block... in order to determine if there are any results for the passed 'page', meaning it is "correct", I need to go to the database and find the result count first, but since I only want to go to the db once, I need to include the LIMIT's, which are based on the passed 'page' value... chicken and egg. I have a couple thoughts on how to solve this:
Run the query as coded above, and if the (('page' - 1) * 'count') result is greater than the value returned from SELECT FOUND_ROWS();, re-run the query with new LIMIT's set to 0, count.
Get the full result set, verify that the passed page is correct, then do another pull from the database with the LIMIT values.
I'd rather not go back to the database at all, but as I mentioned, having a mental block on this rather common issue.
Thanks,
Paul
I ended up using the first solution above -
Run the query as coded above, and if the (('page' - 1) * 'count') result is greater than the value returned from SELECT FOUND_ROWS();, re-run the query with new LIMIT's set to 0, count.
It's not perfect in that a second database pull is required for cases where the passed page value is bad, but given that is an unexpected case only triggered by the intentional passage of bad data on the part of the user, it's acceptable. If anyone else has a better solution, I'd be happy to re-open the question.
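A minimal sketch of that validation flow, written in Python for illustration: the allowed counts, the default and all function names are assumptions, and no database is involved. Note that the check uses "greater than or equal" rather than "greater than", since an offset equal to the total row count also returns no rows.

```python
ALLOWED_COUNTS = (10, 25, 50)  # hypothetical list of valid results-per-page values
DEFAULT_COUNT = 10


def parse_positive_int(raw, default):
    """Convert a querystring value to an int >= 1, falling back to `default`."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    return value if value >= 1 else default


def limit_clause(raw_page, raw_count):
    """Return (offset, count) for `LIMIT offset, count` from raw querystring values."""
    count = parse_positive_int(raw_count, DEFAULT_COUNT)
    if count not in ALLOWED_COUNTS:
        count = DEFAULT_COUNT
    page = parse_positive_int(raw_page, 1)
    return (page - 1) * count, count


def needs_rerun(offset, found_rows):
    """True when the requested page lies past the last row, so the query
    should be re-run with LIMIT 0, count."""
    return offset > 0 and offset >= found_rows


offset, count = limit_clause("3", "25")
print(offset, count)            # 50 25
print(needs_rerun(offset, 40))  # True: page 3 does not exist with 40 rows
```

The first query still runs with the requested LIMIT; only when needs_rerun() is true after reading FOUND_ROWS() does the second query happen.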