Duplicate @id with json - json

I have placed a JSON script in the head section with "@type" AutomotiveBusiness, describing the business with the unique "@id" https://URL/#AutomotiveBusiness. Normally, in this case it will be visible on all of the subpages, and it is.
Now I want to add breadcrumbs to the pages in the following way, so I can specify that a subpage belongs to https://URL/#AutomotiveBusiness:
@type ListItem
position 1
item
@id https://URL/#AutomotiveBusiness
@type ListItem
position 2
item
@type AboutPage
The problem is that when I do this, the head section data for the subpage disappears when checked in the Google Structured Data Testing Tool. I understand that the same rule applies here as for IDs in CSS.
I am wondering what I can do so that the subpage is clearly assigned to https://URL/#AutomotiveBusiness in the breadcrumbs.

The issue arises because entities with the same ID are considered to be about the same thing, so they are merged.
By reusing the business's ID in the trail, you are saying that an item in the breadcrumb is also the AutomotiveBusiness.
Change the IDs in the breadcrumb trail so that they are the URLs of the WebPages in the trail.
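A minimal sketch of that advice, written in Python only so the structure can be built and checked. The page URLs (https://URL/ and https://URL/about/) and the names "Home" and "About" are assumptions for illustration; only https://URL/#AutomotiveBusiness comes from the question.

```python
import json

BUSINESS_ID = "https://URL/#AutomotiveBusiness"

# The business is described once, under its own identifier.
business = {
    "@context": "https://schema.org",
    "@type": "AutomotiveBusiness",
    "@id": BUSINESS_ID,
}

# Each breadcrumb item is identified by the URL of the page it points to
# (hypothetical URLs), never by the business identifier.
breadcrumbs = {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
        {
            "@type": "ListItem",
            "position": 1,
            "item": {"@type": "WebPage", "@id": "https://URL/", "name": "Home"},
        },
        {
            "@type": "ListItem",
            "position": 2,
            "item": {"@type": "AboutPage", "@id": "https://URL/about/", "name": "About"},
        },
    ],
}

# Collect the identifiers used in the trail and confirm none collides
# with the business entity, so nothing gets merged into it.
breadcrumb_ids = [entry["item"]["@id"] for entry in breadcrumbs["itemListElement"]]
assert BUSINESS_ID not in breadcrumb_ids

# Serialized form, ready to be placed in a script element of type application/ld+json.
print(json.dumps([business, breadcrumbs], indent=2))
```

With distinct identifiers the testing tool should see the business and the breadcrumb pages as separate entities.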

How to scrape text based on a specific link with BeautifulSoup?

I'm trying to scrape text from a website, but specifically only the text that's linked to with one of two specific links, and then additionally scrape another text string that follows shortly after it.
The second text string is easy to scrape because it includes a unique class I can target, so I've already gotten that working, but I haven't been able to successfully scrape the first text (with the one of two specific links).
I found this SO question ( Find specific link w/ beautifulsoup ) and tried to implement variations of that, but wasn't able to get it to work.
Here's a snippet of the HTML code I'm trying to scrape. This pattern recurs repeatedly over the course of each page I'm scraping:
<em>[女孩]</em> 寻找2003年出生2004年失踪贵州省黔西南布依族苗族自治州贞丰县珉谷镇锅底冲 黄冬冬289179
The two parts I'm trying to scrape and then store together in a list are the two Chinese-language text strings.
The first of these, 女孩, which means female, is the one I haven't been able to scrape successfully.
This is always preceded by one of these two links:
forum.php?mod=forumdisplay&fid=191&filter=typeid&typeid=19 (Female)
forum.php?mod=forumdisplay&fid=191&filter=typeid&typeid=15 (Male)
I've tested a whole bunch of different things, including things like:
gender_containers = soup.find_all('a', href = 'forum.php?mod=forumdisplay&fid=191&filter=typeid&typeid=19')
print(gender_containers.get_text())
But for everything I've tried, I keep getting errors like:
ResultSet object has no attribute 'get_text'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
I think that I'm not successfully finding those links to grab the text, but my rudimentary Python skills thus far have failed me in figuring out how to make it happen.
What I want to have happen ultimately is to scrape each page such that the two strings in this code (女孩 and 寻找2003年出生2004年失踪贵州省...)
<em>[女孩]</em> 寻找2003年出生2004年失踪贵州省黔西南布依族苗族自治州贞丰县珉谷镇锅底冲 黄冬冬289179
...are scraped as two separate variables so that I can store them as two items in a list and then iterate down to the next instance of this code, scrape those two text snippets and store them as another list, etc. I'm building a list of lists in which I want each row/nested list to contain two strings: the gender (女孩 or 男孩) and then the longer string, which has a lot more variation.
(But currently I have working code that scrapes and stores that, I just haven't been able to get the gender part to work.)
It sounds like you could use an attribute = value CSS selector with the $ (ends with) operator.
If there can only be one occurrence per page:
soup.select_one("[href$='typeid=19'], [href$='typeid=15']").text
This is assuming those typeid=19 or typeid=15 only occur at the end of the strings of interest. The "," between the two in the selector is to allow for matching on either.
You could additionally handle possibility of not being present as follows:
from bs4 import BeautifulSoup
html = '''<em>[女孩]</em> 寻找2003年出生2004年失踪贵州省黔西南布依族苗族自治州贞丰县珉谷镇锅底冲 黄冬冬289179'''
soup = BeautifulSoup(html, 'html.parser')
# select_one returns None when nothing matches, so check before reading .text
match = soup.select_one("[href$='typeid=19'], [href$='typeid=15']")
gender = match.text if match is not None else 'Not found'
print(gender)
Multiple values:
genders = [item.text for item in soup.select("[href$='typeid=19'], [href$='typeid=15']")]
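As a rough sketch of how this could produce the list of lists the question asks for: the markup below is an assumed reconstruction, since the snippet in the question no longer shows its links. The `title` class, the `thread-*.html` URLs, and the second row are hypothetical stand-ins for the "unique class" the asker already targets.

```python
from bs4 import BeautifulSoup

# Assumed markup: each gender label is a link ending in typeid=19 or typeid=15,
# followed by a title link with a (hypothetical) class of "title".
html = '''
<em>[<a href="forum.php?mod=forumdisplay&amp;fid=191&amp;filter=typeid&amp;typeid=19">女孩</a>]</em>
<a class="title" href="thread-1.html">寻找2003年出生2004年失踪贵州省黔西南布依族苗族自治州贞丰县珉谷镇锅底冲 黄冬冬289179</a>
<em>[<a href="forum.php?mod=forumdisplay&amp;fid=191&amp;filter=typeid&amp;typeid=15">男孩</a>]</em>
<a class="title" href="thread-2.html">Example title 2</a>
'''

soup = BeautifulSoup(html, 'html.parser')

# select() returns every match in document order, so get_text() is called
# on each element rather than on the result list itself.
gender_selector = "[href$='typeid=19'], [href$='typeid=15']"
genders = [a.get_text() for a in soup.select(gender_selector)]
titles = [a.get_text() for a in soup.select("a.title")]

# Pair each gender with the title that follows it: one nested list per entry.
rows = [list(pair) for pair in zip(genders, titles)]
for row in rows:
    print(row)
```

Pairing with zip assumes every gender link has exactly one title after it; if some entries can lack a gender link, selecting a common parent element per entry and searching inside it would be safer.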
Try the following code.
from bs4 import BeautifulSoup
data='''<em>[女孩]</em> 寻找2003年出生2004年失踪贵州省黔西南布依族苗族自治州贞丰县珉谷镇锅底冲 黄冬冬289179'''
soup=BeautifulSoup(data,'html.parser')
print(soup.select_one('em').text)
Output:
[女孩]

Cesium - Modify infobox contents

I have n polygons with ids "test-1-1", "test-1-2" .... "test-1-n" which represent a single logical entity. The format of the id can be generalized as <entity_name>-<entity_id>-<i>, where i is added to distinguish the ids of multiple polygons.
My question is this: I want to display only "test" when any of these polygons is clicked. Currently the id of the selected polygon is displayed in the info-box.
Is there a Cesium way to do this? I would prefer not to manipulate the strings at runtime.
A Cesium Entity has three fields of interest to the InfoBox (the thing that pops up when an Entity is selected).
entity.id - Each entity in a dataSource is required to have a unique id (a GUID will be auto-generated if no ID is supplied at creation). It is an arbitrary string and does not need to be human-friendly.
entity.name - This is the human-friendly name of the Entity. It does not need to be unique, you may have as many duplicate names as you like. It is half a line or less of plain text (not HTML).
entity.description - This is a sandboxed HTML description of the entity, and can span multiple paragraphs or include tables and other styling.
The InfoBox will attempt to show entity.name on its title bar by default, and will only fall back to show entity.id in the title bar if name is missing (because name is optional, id is not).
The body of the InfoBox only appears below the title bar if entity.description is set (otherwise only the bar is shown). The description is rendered with a sandboxed iframe (to offer some resistance to cross-site scripting for apps that display user-supplied entity descriptions).
I have n polygons with ids "test-1-1", "test-1-2" .... "test-1-n" ...
For this case, I would keep the existing ids, and set name to be the string you wish to see in the InfoBox popup. Multiple entities can have the same name but not the same id.

Spring bean comma separating values, but I want to overwrite

Alright, so I'm pretty new to Spring, but I was asked to resolve a bug. So in our application, we have a page that queries a database based on an id. However, not all entries are unique to the id. The id and date pair, on the other hand, do define unique entries.
So this page takes in an id. If there is only a single entry related to this id, everything works fine. However, if there are multiple entries, the page displays a radio button selection of the various dates that pertain to that id. We use something like:
<form:radiobutton id="loadDate" path="loadDate" value="${date}" label="${date}" />
Later on the same page, we want to display the data for that option. As part of it, we display the date of that selection:
<form:input id="aiLoadDate" path="loadDate" maxlength="22" size="22" class="readonly" readonly="true"/>
The problem is that when this happens, the variable (or bean? I'm not quite sure about Spring yet..) loadDate (a string) ends up being the same date twice, separated by a comma. I'm guessing the problem here is the path="loadDate" that is common to both lines.
Instead of appending the date to the already existing one like a CSV, I'd like it to overwrite the current entry instead. Is there a way to do this?
Spring is not the direct cause of your problem. When an HTML form is submitted, each element appears in the request as a name=value pair. If two or more elements in the form have the same name (the name attribute, not the id), the request carries one value per element with that name, and bound to a single string property those values show up as value,value.
Option 1: stop using an input as a display element. Just display the date in a span (or div, paragraph, or whatever). If you want the look of an input box (border, etc.), use CSS to create a class that has the look you want and attach the class to the span (or div, paragraph, etc.) in which you display the date.
Option 2: continue using an input as a display element. Disabled input elements are not added to the request when the form is submitted. In the form:input, set disabled="true".
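The duplicated-name behaviour can be seen without Spring at all. The following sketch parses two hypothetical request bodies with Python's standard library; the field values are made up, and only the loadDate name comes from the question. On the wire the browser sends one name=value pair per control, and the comma-separated string typically appears once those values are joined into a single string property.

```python
from urllib.parse import parse_qs

# Hypothetical body when both the radio button and the read-only input
# are submitted under the same name.
body_both = "id=42&loadDate=2012-01-05&loadDate=2012-01-05"
params_both = parse_qs(body_both)
print(params_both["loadDate"])            # ['2012-01-05', '2012-01-05']
print(",".join(params_both["loadDate"]))  # 2012-01-05,2012-01-05

# Hypothetical body when the display input is disabled (Option 2) or is
# replaced by a span (Option 1): only the radio button contributes a value.
body_single = "id=42&loadDate=2012-01-05"
params_single = parse_qs(body_single)
print(",".join(params_single["loadDate"]))  # 2012-01-05
```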

Ontology: OWL - Creating connections between classes

I've got an ontology written in OWL with Protégé, but I can't find a way to create relations between classes. Of course, there is the "subclass" relation, but I want to define my own relations. So I have a class hierarchy (which consists of "subclass" relations), but I want to create a relation, e.g. "has_Relation", to connect two classes.
My aim is to write a Java program in which I can get the information "which class is the parent class of a class?" and "to which class is there a has_Relation connection?"
(I am not talking about individuals - I'm just talking about classes)
Thank you very much for your help in advance!
Best Regards
Natan
The simplest way to do this is to use an annotation property. In Protégé, select the class you want to relate to another class, then click the + beside "Annotations" in the Annotations tab. Then add the has_Relation property with the second button on the top left of the window. Then select the Entity IRI tab and the Classes subtab, select the other class you want to relate to and you're done.
However, you should rather not do this if has_Relation is an object property or a datatype property. If such is the case, you can use "punning", that is, you can make new individuals in the Individuals tab with the same names as the classes you want to relate. Then you relate them as if they were normal individuals. Note that this is allowed and valid in OWL 2 DL.
a bit late, but:
You can also go to the tabs menu and activate the Object Properties tab
(Window > Tabs > Object Properties).
Then you can create your own object property and assign its domain and range to whichever classes you want (in the Description area of the individual property).

Should I use one form on page or generate a form per item?

I want to display a list of items. Each item would have an edit and a delete icon next to it.
For obvious reasons I want to trigger the delete action with HTTP POST.
With jQuery, I would bind links to trigger form.submit.
However I'm not sure if I should generate a form next to each item or use just one form.
Below are pros and cons of two approaches as I see them.
Form Per Item:
easy to generate;
no need to fiddle in JS to set action and input value.
Single Form:
makes more sense semantically;
requires client JS to set hidden input;
requires client JS to set form action (e.g. id + '/delete/').
What is there to add? What is the preferred pattern in modern HTML apps?
I have used checkboxes in the past. This is better for usability, and each checked checkbox can pass its own ID to the form processing script.
The main disadvantage I see in having a single form enclosing all list elements is that you can end up with a huge POST if the list is long. As an advantage, you could mark multiple elements for deletion (checkboxes, for instance) and perform a single delete request.
I'd go for either
A single form for each list element. This would make deletion of multiple elements impossible, but would keep POST sizes minimal.
Using a single form, but in a way that doesn't include all the list elements. For instance, having a delete only form with a single hidden element in it, into which you would put all the id's marked for deletion with JS manipulation.
As a side note, you could also skip forms and perform the needed interactions through ajax. This would improve user experience notably. Take into account that forms would still be needed to provide fallback mechanisms in case it was required.
In the end, I decided to go with AJAX via jQuery.ajax.
The reason is that semantically I don't even have forms; I have buttons.
Therefore, jQuery is an easier solution, as it lets me keep the posting logic in one place (as opposed to scattering it across HTML and JS).
I assigned a row class to each semantic row and put the corresponding database ID in an HTML5 data attribute called data-item-id on each row.
<div class="row" data-item-id="{{ product.id }}">
<!-- ... -->
<img class="delete-btn" src="/img/delete.png" alt="Delete">
</div>
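Then I have something along the lines of
$('.delete-btn').click(function () {
    // Find the row this button belongs to and read its database ID
    var row = $(this).closest('.row');
    var id = row.data('item-id');
    // Ask the server to delete the item
    $.ajax({
        url: id + '/delete/',
        type: 'POST'
    });
    // Hide the row right away, without waiting for the response
    row.fadeOut().slideUp();
    return false;
});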
in my $() load handler.
This solution scales beautifully across the whole codebase because you only have to set row class and data-item-id attribute and the buttons will “just work”.