The webpage I'm scraping has the following content:
<html>
<head></head>
<body>
<span id="an_id">
<script type="text/javascript">
var foo = "";
$(function(){$('#bar').something(
{
data: [
{ title: 'x', description:'x_desc' },
{ title: 'y', description:'y_desc' }
],
whatever: 1
});}
);
</script>
</span>
</body>
</html>
I need to get the data in the array, so I'm interested in this:
[
{ title: 'x', description:'x_desc' },
{ title: 'y', description:'y_desc' }
]
I've been trying page.evaluate() and page.queryObjects(), but to no avail; I just don't understand them well enough.
Could anyone point me in the right direction?
Edit:
I hacked it together like this, but it's so nasty that it makes me want to throw up:
const page = await browser.newPage();
await page.goto('file://C:/tmp/test.htm');
await page.waitForSelector("#an_id")
const scriptContents = await page.$eval('#an_id', e => e.innerHTML);
console.log(`scriptContents: ${scriptContents}`)
const startDelimiter = "data:"
const endDelimiter = "whatever: 1"
const startDelimiterIndex = scriptContents.indexOf(startDelimiter) + startDelimiter.length;
const endDelimiterIndex = scriptContents.indexOf(endDelimiter);
const rawData = scriptContents.substring(startDelimiterIndex, endDelimiterIndex).trim().slice(0, -1);
console.log(`rawData: --${rawData}--`)
const data = eval(rawData);
console.log(`data:`)
console.log(data)
console.log(`the description of y: ${data[1].description}`)
await browser.close();
Which produces:
scriptContents:
<script type="text/javascript">
var foo = "";
$(function(){$('#bar').something(
{
data: [
{ title: 'x', description:'x_desc' },
{ title: 'y', description:'y_desc' }
],
whatever: 1
});}
);
</script>
rawData: --[
{ title: 'x', description:'x_desc' },
{ title: 'y', description:'y_desc' }
]--
data:
[ { title: 'x', description: 'x_desc' },
{ title: 'y', description: 'y_desc' } ]
the description of y: y_desc
There must be a better way :)
It seems like this should do it (untested):
await page.evaluate(() => {
let html = document.querySelector('#an_id script').innerHTML
return html.match(/\[.*\]/s)[0]
})
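The regular expression above returns the array as a string, so it still has to be turned into a real array. A minimal sketch of one way to do that in Node, assuming the array contains no nested brackets; `extractDataArray` is a hypothetical helper name, and the sample text is the script body from the question. Since the literal is JavaScript rather than JSON (unquoted keys, single quotes), `JSON.parse` would throw, so the sketch evaluates it with Node's built-in `vm` module instead of a bare `eval`:

```javascript
const vm = require("vm");

// Hypothetical helper: pull the first `data: [ ... ]` array literal out of
// inline script text and evaluate it into a real array.
function extractDataArray(scriptText) {
  // Lazy match up to the first closing bracket (assumes no nested arrays).
  const match = scriptText.match(/data:\s*(\[[\s\S]*?\])/);
  if (!match) return null;
  // Evaluate the literal in an empty context, with a short timeout.
  return vm.runInNewContext(`(${match[1]})`, {}, { timeout: 100 });
}

// Script body copied from the question.
const sampleScript = `
var foo = "";
$(function(){$('#bar').something(
{
data: [
{ title: 'x', description:'x_desc' },
{ title: 'y', description:'y_desc' }
],
whatever: 1
});}
);`;

const extracted = extractDataArray(sampleScript);
console.log(extracted[1].description); // y_desc
```

In Puppeteer the script text could come from `page.$eval('#an_id script', e => e.textContent)`. Note that `vm` is not a security boundary, so this is only reasonable for pages you trust.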
I'm using @aws-sdk/client-pinpoint to send an email to a verified user.
async sendEmail(body: any): Promise<void> {
const fromAddress = 'test@domain.com';
const toAddress = 'test@domain.com';
const projectId = 'XXX-XXXX-XXXX';
const subject = 'Amazon Pinpoint Test (AWS SDK for JavaScript in Node.js)';
const body_text = `Amazon Pinpoint Test (SDK for JavaScript in Node.js)`;
const charset = 'UTF-8';
const params = {
ApplicationId: projectId,
MessageRequest: {
Addresses: {
[toAddress]: {
ChannelType: 'EMAIL',
},
},
MessageConfiguration: {
EmailMessage: {
FromAddress: fromAddress,
SimpleEmail: {
Subject: {
Charset: charset,
Data: subject,
},
HtmlPart: {
Charset: charset,
Data: 'body_html',
},
TextPart: {
Charset: charset,
Data: body_text,
},
},
},
},
},
};
try {
const data = await this.pinpointClient.send(new SendMessagesCommand(params));
const { MessageResponse } = data;
if (!MessageResponse || !MessageResponse.Result) throw Error('Failed!');
const recipientResult = MessageResponse?.Result[toAddress];
if (recipientResult.StatusCode !== 200) {
throw new Error(recipientResult.StatusMessage);
} else {
console.log(recipientResult.MessageId);
}
} catch (err) {
console.log(err.message);
}
}
Everything works fine. But when I try to use a pre-defined template, the email is not sent and no errors are shown either. I can't work out how to pass the template name/ARN along with substitutions. Any idea how to achieve that?
Cheers!
Use TemplateConfiguration in the message request, next to MessageConfiguration:
TemplateConfiguration: {
EmailTemplate: {
'Name': 'template name',
'Version': 'latest'
}
}
async sendEmail(body: any): Promise<void> {
const fromAddress = 'test@domain.com';
const toAddress = 'test@domain.com';
const projectId = 'XXX-XXXX-XXXX';
const subject = 'Amazon Pinpoint Test (AWS SDK for JavaScript in Node.js)';
const body_text = `Amazon Pinpoint Test (SDK for JavaScript in Node.js)`;
const charset = 'UTF-8';
const params = {
ApplicationId: projectId,
MessageRequest: {
Addresses: {
[toAddress]: {
ChannelType: 'EMAIL',
},
},
MessageConfiguration: {
EmailMessage: {
FromAddress: fromAddress,
SimpleEmail: {
Subject: {
Charset: charset,
Data: subject,
},
HtmlPart: {
Charset: charset,
Data: 'body_html',
},
TextPart: {
Charset: charset,
Data: body_text,
},
},
},
},
TemplateConfiguration: {
EmailTemplate: {
'Name': 'template name',
'Version': 'latest'
}
}
},
};
try {
const data = await this.pinpointClient.send(new SendMessagesCommand(params));
const { MessageResponse } = data;
if (!MessageResponse || !MessageResponse.Result) throw Error('Failed!');
const recipientResult = MessageResponse?.Result[toAddress];
if (recipientResult.StatusCode !== 200) {
throw new Error(recipientResult.StatusMessage);
} else {
console.log(recipientResult.MessageId);
}
} catch (err) {
console.log(err.message);
}
}
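The question also asks about substitutions, which the code above does not show. A minimal sketch of how the request object might be built, assuming the template supplies the subject and body (so `SimpleEmail` is left out) and that message variables go in the `Substitutions` map of `EmailMessage`, where each value is an array of strings. `buildTemplatedEmailParams`, the template name, and the variable names are hypothetical:

```javascript
// Hypothetical helper: builds the input for SendMessagesCommand when the
// content comes from a stored email template.
function buildTemplatedEmailParams({ projectId, fromAddress, toAddress, templateName, substitutions }) {
  return {
    ApplicationId: projectId,
    MessageRequest: {
      Addresses: {
        [toAddress]: { ChannelType: "EMAIL" },
      },
      MessageConfiguration: {
        EmailMessage: {
          FromAddress: fromAddress,
          // Assumption: each substitution value must be an array of strings.
          Substitutions: Object.fromEntries(
            Object.entries(substitutions).map(([name, value]) => [name, [String(value)]])
          ),
        },
      },
      // Sibling of MessageConfiguration, not nested inside it.
      TemplateConfiguration: {
        EmailTemplate: { Name: templateName },
      },
    },
  };
}

const templatedParams = buildTemplatedEmailParams({
  projectId: "XXX-XXXX-XXXX",
  fromAddress: "test@domain.com",
  toAddress: "test@domain.com",
  templateName: "welcome-email", // hypothetical template name
  substitutions: { firstName: "Ada" }, // hypothetical template variable
});
console.log(JSON.stringify(templatedParams.MessageRequest.TemplateConfiguration));
```

The resulting object would be passed to `new SendMessagesCommand(templatedParams)` exactly as in the code above.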
I am trying to run a Google Apps Script with the WhatsApp Business API to send messages to my customers directly from Google Sheets. The script below runs fine, but every time I run it, it sends the message to all customers again, even if the same message was already sent to the same customer earlier.
Is there a way to add a column and update it automatically to record that the message has been sent to a customer, so that the script skips to the next one (just like in mail merge scripts)?
I have the code below and a screenshot of the sheet here
const WHATSAPP_ACCESS_TOKEN = "**My whatsapp token**";
const WHATSAPP_TEMPLATE_NAME = "**My template name**";
const LANGUAGE_CODE = "en";
const sendMessage_ = ({
recipient_number,
customer_name,
item_name,
delivery_date,
}) => {
const apiUrl = "**My api url**";
const request = UrlFetchApp.fetch(apiUrl, {
muteHttpExceptions: true,
method: "POST",
headers: {
Authorization: `Bearer ${WHATSAPP_ACCESS_TOKEN}`,
"Content-Type": "application/json",
},
payload: JSON.stringify({
messaging_product: "whatsapp",
type: "template",
to: recipient_number,
template: {
name: WHATSAPP_TEMPLATE_NAME,
language: { code: LANGUAGE_CODE },
components: [
{
type: "body",
parameters: [
{
type: "text",
text: customer_name,
},
{
type: "text",
text: item_name,
},
{
type: "text",
text: delivery_date,
},
],
},
],
},
}),
});
const { error } = JSON.parse(request.getContentText());
const status = error ? `Error: ${JSON.stringify(error)}` : `Message sent to ${recipient_number}`;
Logger.log(status);
};
const getSheetData_ = () => {
const [header, ...rows] = SpreadsheetApp.getActiveSheet().getDataRange().getDisplayValues();
const data = [];
rows.forEach((row) => {
const recipient = { };
header.forEach((title, column) => {
recipient[title] = row[column];
});
data.push(recipient);
});
return data;
};
const main = () => {
const data = getSheetData_();
data.forEach((recipient) => {
const status = sendMessage_({
recipient_number: recipient["Phone Number"].replace(/[^\d]/g, ""),
customer_name: recipient["Customer Name"],
item_name: recipient["Item Name"],
delivery_date: recipient["Delivery Date"],
});
});
};
In your situation, how about modifying main as follows?
From:
const data = getSheetData_();
To:
const temp = getSheetData_();
const { data, ranges } = temp.reduce((o, e, i) => {
if (e["Sent"] != "sent") {
o.data.push(e);
o.ranges.push(`E${i + 2}`);
}
return o;
}, { data: [], ranges: [] });
if (ranges.length > 0) {
SpreadsheetApp.getActiveSheet().getRangeList(ranges).setValue("sent");
}
With this modification, the script checks column "E" (header "Sent"), keeps only the rows whose value there is not "sent" as data, and then writes "sent" into column "E" of those rows. Note that "sent" is written before the messages are actually sent, so a row whose request later fails is still marked as sent.
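The filtering step can be tried outside Apps Script with plain objects. This is a minimal sketch of the same reduce, assuming rows shaped like the output of getSheetData_ and a "Sent" header in column E; `selectUnsent` is a hypothetical name and the sample rows are made up:

```javascript
// Hypothetical helper: splits rows into those still to send and the A1
// cell references that should be marked afterwards.
function selectUnsent(rows, sentHeader = "Sent", sentColumn = "E") {
  return rows.reduce(
    (acc, row, i) => {
      if (row[sentHeader] !== "sent") {
        acc.data.push(row);
        // +2 because sheet rows are 1-based and row 1 holds the header.
        acc.ranges.push(`${sentColumn}${i + 2}`);
      }
      return acc;
    },
    { data: [], ranges: [] }
  );
}

// Made-up rows: the first customer was already messaged.
const sheetRows = [
  { "Customer Name": "A", Sent: "sent" },
  { "Customer Name": "B", Sent: "" },
  { "Customer Name": "C", Sent: "" },
];

const unsent = selectUnsent(sheetRows);
console.log(unsent.ranges); // [ 'E3', 'E4' ]
```

In the real script, `unsent.ranges` would be handed to `getRangeList(...).setValue("sent")` as in the answer.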
Reference:
reduce()
Data is not saved to the database in a MERN app
Everything looks fine: I get status 200 and the alert "post created", but the data is not going into my database.
How can I debug this? I have tried every solution I could find. Please at least tell me the possible reasons for this behaviour; it would help me a lot.
Schema
const mongoose = require("mongoose");
const postSchema = new mongoose.Schema({
title: {
type: String,
required: true,
},
description: {
type: String,
required: true,
},
picture: {
type: String,
required: false,
},
username: {
type: String,
required: true,
},
category: {
type: String,
required: false,
},
createDate: {
type: Date,
default: Date.now,
},
})
const post = mongoose.model('post', postSchema);
module.exports = post;
server router
const express = require("express");
const router = express.Router();
const post = require("../schema/post-schema");
router.post('/create', async(req,res) => {
try{
console.log(req.body);
const { title, description, picture, username, category, createDate } = req.body;
const blogData = new post({
title:title,
description:description,
picture:picture,
username:username,
category:category,
createDate:createDate
});
console.log("user " + blogData);
const blogCreated = await blogData.save();
if(blogCreated)
{
return res.status(200).json({
message: "blog created successfully"
})
}
console.log("post "+post);
} catch(error){
res.status(500).json('Blog not saved '+ error)
}
})
module.exports = router;
client file
const initialValues = {
title: '',
description: '',
picture: 'jnj',
username: 'Tylor',
category: 'All',
createDate: new Date()
}
const CreateView = () => {
const [post, setPost] = useState(initialValues);
const history = useHistory();
const handleChange = (e) => {
setPost({...post, [e.target.name]:e.target.value});
}
const savePost = async() => {
try {
const {title, description, picture, username, category, createDate} = post;
console.log(post);
const res = await fetch('/create', {
method: "POST",
headers: {
"Content-Type":"application/json"
},
body: JSON.stringify({title, description, picture, username, category, createDate})
})
console.log(post);
console.log("res is "+ res);
if(res.status===200 )
{
window.alert("post created");
console.log(res);
}
}
catch(e){
console.log(`save post error ${e}`);
}
}
const classes = useStyles();
const url = "https://images.unsplash.com/photo-1543128639-4cb7e6eeef1b?ixid=MnwxMjA3fDB8MHxzZWFyY2h8Mnx8bGFwdG9wJTIwc2V0dXB8ZW58MHx8MHx8&ixlib=rb-1.2.1&w=1000&q=80";
return (
<>
<Box className={classes.container}>
<form method="POST">
<img src={url} alt="banner" className={classes.image}/>
<FormControl className={classes.form}>
<AddIcon fontSize="large" color="action" />
<InputBase placeholder="Title" className={classes.textfield} name="title" onChange={(e)=>handleChange(e)}/>
<Button variant="contained" color="primary" onClick={()=>savePost()}>Publish</Button>
</FormControl>
<TextareaAutosize aria-label="empty textarea" placeholder="Write here..." className={classes.textarea} name="description" onChange={(e)=>handleChange(e)}/>
</form>
</Box>
</>
)
}
export default CreateView;
I think the problem is that you only check for status 200, and your server returns status 200 in any case. Check the server side and make sure it returns code 400, or some other error code, on failure.
I have a list of URLs from http://books.toscrape.com:
let objArray =
[
{"Url": "books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"},
{"Url": "books.toscrape.com/catalogue/tipping-the-velvet_999/index.html"},
{"Url": "books.toscrape.com/catalogue/soumission_998/index.html"}
]
As you can see, all the links can be scraped in the same way.
I want to scrape the titles, prices and stock availability from the links above.
I also tried to loop through all of the URLs like this:
for (var i = 0; i < objArray.length; ++i) {
(async() => {
let browser;
try {
browser = await puppeteer.launch({
headless: false,
});
const page = await browser.newPage();
await page.goto(url);
const content = await page.content();
const $ = cheerio.load(content);
const Product_details = []
const instock = $(div[class="col-sm-6 product_main"] p[class="instockavailability"]).text();
const title = $(div[class="col-sm-6 product_main"] ).text();
const price = $(div[class="col-sm-6 product_main"] p[price_color]).text()
Product_details.push({
Stock: instock,
Title: title,
Price: price,
});
fs.writeFileSync("files.json", JSON.stringify(Product_details), "utf8")
console.log(Product_details)
}
The code above is not working. I want to get the product details, such as titles and prices.
You can separate each page logic into a function and try something like this:
(async () => {
let browser;
try {
browser = await puppeteer.launch({
headless: false,
});
const page = await browser.newPage();
const url = "http://books.toscrape.com/";
const Product_details = [];
await page.goto(url);
// getData appends to Product_details itself and returns nothing
await getData(page, Product_details);
while (await page.$('li[class="next"] a')) {
await Promise.all([
page.waitForNavigation(),
page.click('li[class="next"] a'),
]);
await getData(page, Product_details);
}
fs.writeFileSync("Details.json", JSON.stringify(Product_details), "utf8");
} catch (e) {
console.log('Error-> ', e);
} finally {
// close the browser on success as well as on failure
if (browser) await browser.close();
}
})();
async function getData(page, details) {
console.log(page.url());
const html = await page.content();
const $ = cheerio.load(html);
const statsTable = $('li[class="col-xs-6 col-sm-4 col-md-3 col-lg-3"]');
statsTable.each(function() {
const title = $(this).find('h3').text();
const Price = $(this).find('p[class="price_color"]').text();
details.push({
Title: title,
Price: Price
});
});
}
UPD: Answer for the latest edit of the question:
const objArray = [
{ Url: 'books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html' },
{ Url: 'books.toscrape.com/catalogue/tipping-the-velvet_999/index.html' },
{ Url: 'books.toscrape.com/catalogue/soumission_998/index.html' },
];
(async () => {
let browser;
try {
const Product_details = [];
for (const { Url } of objArray) {
browser = await puppeteer.launch({
headless: false,
});
const page = await browser.newPage();
await page.goto(`http://${Url}`);
const content = await page.content();
const $ = cheerio.load(content);
const instock = $('div[class="col-sm-6 product_main"] p[class="instock availability"]').text().trim();
const title = $('div[class="col-sm-6 product_main"] h1').text().trim();
const price = $('div[class="col-sm-6 product_main"] p[class="price_color"]').text().trim();
Product_details.push({
Stock: instock,
Title: title,
Price: price,
});
await browser.close();
}
console.log(Product_details);
fs.writeFileSync('files.json', JSON.stringify(Product_details), 'utf8');
} catch (e) {
console.log('Error-> ', e);
await browser.close();
}
})();
I want to group the returned JSON data by libelle. I ended up with the following
script:
$('.membre').select2({
placeholder: 'Select an item',
ajax: {
url: '/select2-autocomplete-ajax',
dataType: 'json',
delay: 250,
data: function (params) {
return {
membre_id: params.term // search term
};
},
processResults: function (data) {
return {
results: $.map(data, function (item) {
return {
text: item.libelle,
children: [{
id: item.id,
text: item.nom +' '+ item.prenom
}]
}
})
};
},
cache: true
}
});
Output :
Is there any way to make the grouping work properly without repeating the libelle?
JSON output :
[{"id":1,"libelle":"Laboratoire Arithm\u00e9tique calcul Scientifique et Applications","nom":"jhon","prenom":"M"},{"id":2,"libelle":"Laboratoire Arithm\u00e9tique calcul Scientifique et Applications","nom":"JHON","prenom":"jhon"}]
Seems you're looking for something like this https://select2.org/data-sources/formats#grouped-data
// Add this somewhere before the ajax
var groupBy = function(xs, key) {
return xs.reduce(function(rv, x) {
(rv[x[key]] = rv[x[key]] || []).push(x);
return rv;
}, {});
};
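To see what the grouped structure looks like, here is a minimal sketch that runs the same reduce-based grouping over the JSON from the question and maps it to Select2's grouped format (one entry per libelle, with its members as children). `toSelect2Groups` is a hypothetical helper name:

```javascript
// Same reduce-based grouping as in the helper above.
function groupBy(xs, key) {
  return xs.reduce(function (rv, x) {
    (rv[x[key]] = rv[x[key]] || []).push(x);
    return rv;
  }, {});
}

// Hypothetical helper: flat rows -> [{ text, children: [{ id, text }] }].
function toSelect2Groups(rows) {
  const grouped = groupBy(rows, "libelle");
  return Object.keys(grouped).map((libelle) => ({
    text: libelle,
    children: grouped[libelle].map((m) => ({
      id: m.id,
      text: `${m.nom} ${m.prenom}`,
    })),
  }));
}

// JSON output from the question.
const members = [
  { id: 1, libelle: "Laboratoire Arithm\u00e9tique calcul Scientifique et Applications", nom: "jhon", prenom: "M" },
  { id: 2, libelle: "Laboratoire Arithm\u00e9tique calcul Scientifique et Applications", nom: "JHON", prenom: "jhon" },
];

const groups = toSelect2Groups(members);
console.log(groups.length); // 1
console.log(groups[0].children.map((c) => c.text)); // [ 'jhon M', 'JHON jhon' ]
```

Both members share one libelle, so the result holds a single group with two children instead of two groups with one child each.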
processResults: function (data) {
return {
results: $.map(groupBy(data, 'libelle'), function (item,key) {
var children = [];
for(var k in item){
var childItem = item[k];
childItem.text = item[k].nom +' '+ item[k].prenom;
children.push(childItem);
}
return {
text: key,
children: children,
}
})
};