How to create a query selector out of a selected element? - google-chrome

In google chrome, especially now with custom elements, it became very cumbersome to select and element by hand nowadays, even though the browser knows the whole path to it already. Or is there a way that leads to a query selected for an element that I'm inspecting?
Situation:
What chrome can tell me:
What chrome is unable to create for me AFAIK:

While building an chrome extension I have found a need to uniquely locate an element when returning to a page. To do this I needed to create a query string for a selected element (custom context menu click)
While searching for a solution I found this unanswered question.
As I could not find an off the shelf solution or API to do the task I wrote the following function. It is untested in the wild, is very rough and ready (using poor node traversing techniques). I posted it in this state lest I forget and this question remains unanswered.
Create Query String For Element
A function to build a query string that will uniquely locate an element from a reference of the element.
const querytStr = createQueryStringForElement(myElement); // return string or undefined
if (querytStr) {
const element = document.querySelector(queryStr);
console.log(element === myElement); // expected result true
}
If the function fails to create a query that uniquely locates an element it returns undefined. else it returns the query string.
Example results
"#editor > div.ace_scroller > div.ace_content > div.ace_layer.ace_text-layer > div.ace_line:nth-child(45) > span.ace_punctuation.ace_operator"
"#buttons" // A UI container
"#buttons > div.buttons" // A sub UI container
"#buttons > div.buttons:nth-child(2)" // A button element by position
"#buttons > div.buttons:nth-child(3)" // A button element by position
How it works
The code assumes that the page is well formed (ids must be unique).
The query string will try to start with an id eg "#elementId" but if an element has no id the query will use the tag and class names. eg "div.my-class".
The tag and class name may not uniquely identify the element. To check if the query is unique, the query string is used to query the DOM from the elements parent.
If needed the query string will use the elements position to refine the query "div.my-class:nth-child(2)". Unfortunately this makes the resultant query string insensitive to changes in element order.
The query string is built up along each parent until it finds an element with an id or there are no more parents.
The final step uses the query to see if the query finds the correct element returning the query if successful.
The code
function createQueryStringForElement(element) {
const getElementSel = element => {
const tName = element.tagName.toLowerCase();
var i = 0, str = element.id ? "#" + element.id : sel = tName;
if (str.includes("#")) { return str}
str += element.classList.length ? "." + [...element.classList.values()].join(".") : "";
if (element.parentElement) {
const res = element.parentElement.querySelector(str);
if (res !== element) {
while (i < element.parentElement.children.length) {
if (element.parentElement.children[i] === element) {
i > 0 && (str += ":nth-child(" + (i + 1) + ")" );
break;
}
i++;
}
}
}
return str;
}
const queryPath = [];
const original = element;
do {
const subQuery = getElementSel(element);
queryPath.push(subQuery);
if (subQuery[0] === "#") { break }
element = element.parentElement;
} while (element);
const query = queryPath.reverse().join(" > ");
try {
const els = document.querySelector(query);
if (els === original) { return query }
} catch(e) { }
}

Related

How to replace words in Google doc while maintaing text style?

I want to scan words in a Google doc from left to right and replace the first occurrences of some keywords with a URL or a bbcode like tag wrapper around them.
I cannot use findText API because it's not simple regex finding but complex pattern matching involving lots of if else conditions involving business logic.
Here is how I want to solve this
let document = DocumentApp.getActiveDocument().getBody();
let paragraph = document.getParagraphs()[0];
let contents = paragraph.getText();
// makeAllTheNecessaryReplacemens has all the business logic to identify which keywords need to changed
let newContents = makeAllTheNecessaryReplacemens(contents);
paragraph.setText(newContents);
The problem here is that text style gets wiped out and also makeAllTheNecessaryReplacemens cannot add hyperlinks to string text.
Please suggest a way to do this.
Proposed function
/**
* This is a wrapper around the attribute functions
* this allows setting one attribute at a time
* based of a complete attribute object obtained
* from another element. This makes it far more
* reliable.
*/
const attributeKey = {
FONT_SIZE : (o,s,e,a) => o.setFontSize(s,e,a),
STRIKETHROUGH : (o,s,e,a) => o.setStrikethrough(s,e,a),
FOREGROUND_COLOR : (o,s,e,a) => o.setForegroundColor(s,e,a),
LINK_URL : (o,s,e,a) => o.setLinkUrl(s,e,a),
UNDERLINE : (o,s,e,a) => o.setUnderline(s,e,a),
BOLD : (o,s,e,a) => o.setBold(s,e,a),
ITALIC : (o,s,e,a) => o.setItalic(s,e,a),
BACKGROUND_COLOR : (o,s,e,a) => o.setBackgroundColor(s,e,a),
FONT_FAMILY : (o,s,e,a) => o.setFontFamily(s,e,a)
}
/**
* Replace textToReplace with replacementText
* Will reatain formatting and hyperlinks
*/
function replaceTextPlus(textToReplace, replacementText) {
// Initializing
let body = DocumentApp.getActiveDocument().getBody();
let searchResult = body.findText(textToReplace);
while (searchResult != null) {
// Getting info about result
let foundElement = searchResult.getElement();
let start = searchResult.getStartOffset();
let end = searchResult.getEndOffsetInclusive();
// This returns a complete attributes object
// Many attributes have null as a value
let attributes = foundElement.getAttributes(start);
// Replacing text
foundElement.deleteText(start, end);
foundElement.insertText(start, replacementText);
// Setting new end index
let newEnd = start + replacementText.length - 1
// Set attributes for new text skipping over null values
// This requires the constant defined at the top.
for (let a in attributes) {
if (attributes[a] != null) {
attributeKey[a](foundElement, start, newEnd, attributes[a]);
}
}
// Modifies the actual searchResult so that the next findText
// starts at the NEW end index.
try {
let rangeBuilder = DocumentApp.getActiveDocument().newRange();
rangeBuilder.addElement(foundElement, start, newEnd);
searchResult = rangeBuilder.getRangeElements()[0];
} catch (e){
Logger.log("End of Document")
return null
}
// searches for next result
searchResult = body.findText(textToReplace, searchResult);
}
}
Extending the findText API
This function relies on the findText API, but it adds in a few more steps.
Find the text.
Get the element containing the text.
Get the start and end indices of the text.
Get the attributes of the text (font, color, hyperlink etc)
Replace the text.
Update the end index.
Use the old attributes to update the new text.
You call it like this:
replaceTextPlus("Bing", "Google")
replaceTextPlus("occurrences", "happenings")
replaceTextPlus("text", "prefixedtext")
How to set the formatting and link attributes.
This relies on the attributes object that gets returned from getAttributes. Which looks something like this:
{
FOREGROUND_COLOR=#ff0000,
LINK_URL=null,
FONT_SIZE=null,
ITALIC=true,
STRIKETHROUGH=null,
FONT_FAMILY=null,
BOLD=null,
UNDERLINE=true,
BACKGROUND_COLOR=null
}
I tried to use setAttributes but it was very unreliable. Using this method almost always resulted in some formatting loss.
To fix this I make an object attributeKey that wraps all the different functions for setting individual attributes, so that they can be called from this loop:
for (let a in attributes) {
if (attributes[a] != null) {
attributeKey[a](foundElement, start, newEnd, attributes[a]);
}
}
This allows null values to be skipped which seems to have solved the unreliability problem. Perhaps the update buffer gets confused with many values.
Limitations
This function gets the formatting of the first character of the found word. If the same work has different formatting within itself. For example, "Hello" (Mixed normal with bold and italic), the replacement word will have the formatting of the first letter. This could potentially be fixed by identifying the word and iterating over every single letter.
References
Text class
Body class
DocumentApp
Element Interface
Attribute Enum

How to get all tags inside a specific tag using xpath

I need to get all the links in the table on this page : https://www.sahibinden.com/en/cars/used
I am using this library: https://www.npmjs.com/package/xpath-html
The issue is that when I print results it only gives me the first link.
let results = xpathT.fromPageSource(data).findElement("//tbody[#class='searchResultsRowClass']//a");
console.log("HERE NOSEDSS");
console.log(results);
console.log("The href value is:", results[1].getAttribute("href"));
How do I get all the links?
According to the doc you have to use findElements with a s. If I modify a bit your XPath, you can have something like this :
const nodes = xpath.fromPageSource(html).findElements("//a[#class='classifiedTitle']");
console.log("Number of ads found:", nodes.length);
console.log("Link[0]:", nodes[0].getAttribute("href"));
console.log("Link[1]:", nodes[1].getAttribute("href"));
...
Of course it's better if you write a loop to print all the links. Something like :
for (let x = 0; x < nodes.length; x++) {
console.log(nodes[x].getAttribute("href"));
}

Warning messages with EZAPI EzDerivedColumn and input columns

When adding a derived column to a data flow with ezAPI, I get the following warnings
"Add stuff here.Inputs[Derived Column Input].Columns[ad_zip]" on "Add
stuff here" has usage type READONLY, but is not referenced by an
expression. Remove the column from the list of available input
columns, or reference it in an expression.
I've tried to delete the input columns, but either the method is not working or I'm doing it wrong:
foreach (Microsoft.SqlServer.Dts.Pipeline.Wrapper.IDTSInputColumn100 col in derFull.Meta.InputCollection[0].InputColumnCollection)
{
Console.WriteLine(col.Name);
derFull.DeleteInputColumn(col.Name);
}
I have the following piece of code that fixes the problem.
I got it from a guy called Daniel Otykier. So he is propably the one that should be credited for it... Unlesss he got it from someone else :-)
static public void RemoveUnusedInputColumns(this EzDerivedColumn component)
{
var usedLineageIds = new HashSet<int>();
// Parse all expressions used in new output columns, to determine which input lineage ID's are being used:
foreach (IDTSOutputColumn100 column in component.GetOutputColumns())
{
AddLineageIdsFromExpression(column.CustomPropertyCollection, usedLineageIds);
}
// Parse all expressions in replaced input columns, to determine which input lineage ID's are being used:
foreach (IDTSInputColumn100 column in component.GetInputColumns())
{
AddLineageIdsFromExpression(column.CustomPropertyCollection, usedLineageIds);
}
var inputColumns = component.GetInputColumns();
// Remove all input columns not used in any expressions:
for (var i = inputColumns.Count - 1; i >= 0; i--)
{
if (!usedLineageIds.Contains(inputColumns[i].LineageID))
{
inputColumns.RemoveObjectByIndex(i);
}
}
}
static private void AddLineageIdsFromExpression(IDTSCustomPropertyCollection100 columnProperties, ICollection<int> lineageIds)
{
int lineageId = 1;
var expressionProperty = columnProperties.Cast<IDTSCustomProperty100>().FirstOrDefault(p => p.Name == "Expression");
if (expressionProperty != null)
{
// Input columns used in expressions are always referenced as "#xxx" where xxx is the integer lineage ID.
var expression = expressionProperty.Value.ToString();
var expressionTokens = expression.Split(new[] { ' ', ',', '(', ')' });
foreach (var c in expressionTokens.Where(t => t.Length > 1 && t.StartsWith("#") && int.TryParse(t.Substring(1), out lineageId)))
{
if (!lineageIds.Contains(lineageId)) lineageIds.Add(lineageId);
}
}
}
Simple but not 100% Guaranteed Method
Call ReinitializeMetaData on the base component that EzApi is extending:
dc.Comp.ReinitializeMetaData();
This doesn't always respect some of the customizations and logic checks that EzAPI has, so test it carefully. For most vanilla components, though, this should work fine.
100% Guaranteed Method But Requires A Strategy For Identifying Columns To Ignore
You can set the UsageType property of those VirtualInputColumns to the enumerated value DTSUsageType.UT_IGNORED using EzApi's SetUsageType wrapper method.
But! You have to do this after you're done modifying any of the other metadata of your component (attaching other components, adding new input or output columns, etc.) since each of these triggers the ReinitializeMetaData method on the component, which automatically sets (or resets) all UT_IGNORED VirtualInputColumn's UsageType to UT_READONLY.
So some sample code:
// define EzSourceComponent with SourceColumnToIgnore output column, SomeConnection for destination
EzDerivedColumn dc = new EzDerivedColumn(this);
dc.AttachTo(EzSourceComponent);
dc.Name = "Errors, Go Away";
dc.InsertOutputColumn("NewDerivedColumn");
dc.Expression["NewDerivedColumn"] = "I was inserted!";
// Right here, UsageType is UT_READONLY
Console.WriteLine(dc.VirtualInputCol("SourceColumnToIgnore").UsageType.ToString());
EzOleDbDestination d = new EzOleDbDestination(f);
d.Name = "Destination";
d.Connection = SomeConnection;
d.Table = "dbo.DestinationTable";
d.AccessMode = AccessMode.AM_OPENROWSET_FASTLOAD;
d.AttachTo(dc);
// Now we can set usage type on columns to remove them from the available inputs.
// Note the false boolean at the end.
// That's required to not trigger ReinitializeMetadata for usage type changes.
dc.SetUsageType(0, "SourceColumnToIgnore", DTSUsageType.UT_IGNORED, false);
// Now UsageType is UT_IGNORED and if you saved the package and viewed it,
// you'll see this column has been removed from the available input columns
// ... and the warning for it has gone away!
Console.WriteLine(dc.VirtualInputCol("SourceColumnToIgnore").UsageType.ToString());
I was having exactly your problem and found a way to solve it. The problem is that the EzDerivedColumn has not the PassThrough defined in it's class.
You just need to add this to the class:
private PassThroughIndexer m_passThrough;
public PassThroughIndexer PassThrough
{
get
{
if (m_passThrough == null)
m_passThrough = new PassThroughIndexer(this);
return m_passThrough;
}
}
And alter the ReinitializeMetadataNoCast() to this:
public override void ReinitializeMetaDataNoCast()
{
try
{
if (Meta.InputCollection[0].InputColumnCollection.Count == 0)
{
base.ReinitializeMetaDataNoCast();
LinkAllInputsToOutputs();
return;
}
Dictionary<string, bool> cols = new Dictionary<string, bool>();
foreach (IDTSInputColumn100 c in Meta.InputCollection[0].InputColumnCollection)
cols.Add(c.Name, PassThrough[c.Name]);
base.ReinitializeMetaDataNoCast();
foreach (IDTSInputColumn100 c in Meta.InputCollection[0].InputColumnCollection)
{
if (cols.ContainsKey(c.Name))
SetUsageType(0, c.Name, cols[c.Name] ? DTSUsageType.UT_READONLY : DTSUsageType.UT_IGNORED, false);
else
SetUsageType(0, c.Name, DTSUsageType.UT_IGNORED, false);
}
}
catch { }
}
That is the strategy used by other components. If you want to see all the code you can check my EzApi2016#GitHub. I'm updating the original code from Microsoft to SQL Server 2016.

How to find specific value in a large object in node.js?

Actually I've parsed a website using htmlparser and I would like to find a specific value inside the parsed object, for example, a string "$199", and keep tracking that element(by periodic parsing) to see the value is still "$199" or has changed.
And after some painful stupid searching using my eyes, I found the that string is located at somewhere like this:
price = handler.dom[3].children[3].children[3].children[5].children[1].
children[3].children[3].children[5].children[0].children[0].raw;
So I'd like to know whether there are methods which are less painful? Thanks!
A tree based recursive search would probably be easiest to get the node you're interested in.
I've not used htmlparser and the documentation seems a little thin, so this is just an example to get you started and is not tested:
function getElement(el,val) {
if (el.children && el.children.length > 0) {
for (var i = 0, l = el.children.length; i<l; i++) {
var r = getElement(el.children[i],val);
if (r) return r;
}
} else {
if (el.raw == val) {
return el;
}
}
return null;
}
Call getElement(handler.dom[3],'$199') and it'll go through all the children recursively until it finds an element without an children and then compares it's raw value with '$199'. Note this is a straight comparison, you might want to swap this for a regexp or similar?

What's the fastest way to search a very long list of words for a match in actionscript 3?

So I have a list of words (the entire English dictionary).
For a word matching game, when a player moves a piece I need to check the entire dictionary to see if the the word that the player made exists in the dictionary. I need to do this as quickly as possible. simply iterating through the dictionary is way too slow.
What is the quickest algorithm in AS3 to search a long list like this for a match, and what datatype should I use? (ie array, object, Dictionary etc)
I would first go with an Object, which is a hash table (at least, storage-wise).
So, for every word in your list, make an entry in your dictionary Object and store true as its value.
Then, you just have to check if a given word is a key into your dictionary to know whether the word the user has choosen is valid or not.
This works really fast in this simple test (with 10,000,000 entries):
var dict:Object = {};
for(var i:int = 0; i < 10000000; i++) {
dict[i] = true;
}
var btn:Sprite = new Sprite();
btn.graphics.beginFill(0xff0000);
btn.graphics.drawRect(0,0,50,50);
btn.graphics.endFill();
addChild(btn);
btn.addEventListener(MouseEvent.CLICK,checkWord);
var findIt:Boolean = true;
function checkWord(e:MouseEvent):void {
var word:String;
if(findIt) {
word = "3752132";
} else {
word = "9123012456";
}
if(dict[word]) {
trace(word + " found");
} else {
trace(word + " not found");
}
findIt = !findIt;
}
It takes a little longer to build the dictionary, but lookup is almost instantaneous.
The only caveat is that you will have to consider certain keys that will pass the check and not necessarily be part of your words list. Words such as toString, prototype, etc. There are just a few of them, but keep that in mind.
I would try something like this with your real data set. If it works fine, then you have a really easy solution. Go have a beer (or whatever you prefer).
Now, if the above doesn't really work after testing it with real data (notice I've build the list with numbers cast as strings for simplicity), then a couple of options, off the top of my head:
1) Partition the first dict into a set of dictionaries. So, instead of having all the words in dict, have a dictionary for words that begin with 'a', another for 'b', etc. Then, before looking up a word, check the first char to know where to look it up.
Something like:
var word:String = "hello";
var dictKey:String = word.charAt(0);
// actual check
if(dict[dictKey][word]) {
trace("found");
} else {
trace("not found");
}
You can eventually repartition if necessary. I.e, make dict['a'] point to another set of dictionaries indexed by the first two characters. So, you'll have dict['a']['b'][wordToSearch]. There are a number of possible variations on this idea (you'd also have to come up with some strategy to cope with words of two letters, such as "be", for instance).
2) Try a binary search. The problem with it is that you'll first have to sort the list, upfront. You have to do it just once, as it doesn't make sense to remove words from your dict. But with millions of words, it might be rarther intensive.
3) Try some fancy data structures from open source libraries such as:
http://sibirjak.com/blog/index.php/collections/as3commons-collections/
http://lab.polygonal.de/ds/
But again, as I said above, I'd first try the easiest and simpler solution and check if it works against the real data set.
Added
A simple way to deal with keywords used for Object's built-in properties:
var dict:Object = {};
var keywordsInDict:Array = [];
function buildDictionary():void {
// let's assume this is your original list, retrieved
// from XML or other external means
// it contains "constructor", which should be dealt with
// separately, as it's a built-in prop of Object
var sourceList:Array = ["hello","world","foo","bar","constructor"];
var len:int = sourceList.length;
var word:String;
// just a dummy vanilla object, to test if a word in the list
// is already in use internally by Object
var dummy:Object = {};
for(var i:int = 0; i < len; i++) {
// also, lower-casing is a good idea
// do that when you check words as well
word = sourceList[i].toLowerCase();
if(!dummy[word]) {
dict[i] = true;
} else {
// it's a keyword, so store it separately
keywordsInDict.push(word);
}
}
}
Now, just add an extra check for built-in props in the checkWords function:
function checkWord(e:MouseEvent):void {
var word:String;
if(findIt) {
word = "Constructor";
} else {
word = "asdfds";
}
word = word.toLowerCase();
var dummy:Object = {};
// check first if the word is a built-in prop
if(dummy[word]) {
// if it is, check if that word was in the original list
// if it was present, we've stored it in keywordsInDict
if(keywordsInDict.indexOf(word) != -1) {
trace(word + " found");
} else {
trace(word + " not found");
}
// not a built-in prop, so just check if it's present in dict
} else {
if(dict[word]) {
trace(word + " found");
} else {
trace(word + " not found");
}
}
findIt = !findIt;
}
This isn't specific to ActionScript, but a Trie is a suitable data structure for storing words.