Save specific JSON from a specific request link - json

I want to scrape specific JSON from a specific request URL, "http://website.com/request/api", on a page.
I have to scroll to the bottom of the page to load all the articles (already coded). At each scroll, I would like to get the JSON corresponding to the articles that were just displayed.
So there are two problems:
The same URL, "http://website.com/request/api", also returns other JSON that is not useful to me (other elements of the page).
There are several JSON responses to collect and assemble.
For problem 1, I thought of adding a condition to my code to keep only the JSON beginning with a specific text, "Data : object".
For problem 2, I should be able to write the selected JSON responses to a file or a buffer, assembling them as I go.
Do you know how I could do it?
page.on('response', async (response) => {
  const request = response.request();
  if (request.url().includes('/api/graphql/')) {
    const text = await response.text();
    // Overwrites the file on every matching response
    fs.writeFileSync('./tmp/response.json', text);
    console.log(text);
  }
});

I have resolved the problem.
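page.on('response', async response => {
  // Only inspect XHR/fetch traffic
  const isXhr = ['xhr', 'fetch', 'json'].includes(response.request().resourceType());
  try {
    if (isXhr && response.url().includes('/api/graphql/')) {
      const resp = await response.buffer();
      // Keep only the responses that contain the expected marker text
      if (resp.includes('Data : object')) {
        // 'a+' appends, so every matching response ends up in the same file
        fs.writeFileSync('./tmp/response.json', resp, { flag: 'a+' });
      }
    }
  } catch (e) {
    // Responses whose body cannot be read are skipped
  }
});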
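One caveat with appending raw buffers: several JSON documents written back to back do not form a single valid JSON file. A minimal sketch of an alternative is to parse each body, keep the wanted ones in an array, and serialize the array once at the end. The `createCollector` helper and the "top-level data object" filter below are assumptions made for illustration, not part of Puppeteer or of the original code.

```javascript
// Hypothetical helper (not part of Puppeteer): collects parsed JSON payloads
// that pass a filter, so they can be written to disk once at the end.
function createCollector(isWanted) {
  const items = [];
  return {
    items,
    // Returns true when the text was valid JSON and passed the filter.
    add(text) {
      let parsed;
      try {
        parsed = JSON.parse(text);
      } catch (e) {
        return false; // not JSON at all
      }
      if (!isWanted(parsed)) return false;
      items.push(parsed);
      return true;
    },
    // Serializes everything collected so far as one JSON array.
    serialize() {
      return JSON.stringify(items, null, 2);
    },
  };
}

// Assumption: the wanted payloads carry a top-level "data" object.
const hasDataObject = (payload) =>
  payload !== null && typeof payload === 'object' &&
  payload.data !== null && typeof payload.data === 'object';

const collector = createCollector(hasDataObject);
collector.add('{"data": {"articles": [1, 2]}}'); // kept
collector.add('{"menu": []}');                   // ignored
console.log(collector.items.length);             // prints 1
```

Inside the response handler this would be called as collector.add(await response.text()), and once scrolling has finished the result of collector.serialize() could be written with a single fs.writeFileSync call.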

Related

I can't fill a request response using axios in state variable in React.js with Next.js

I'm working with React.js and I have the following problem:
import { useState } from "react";
import axios from "axios";
export default function Home() {
const [products, setProducts] = useState([]);
const ax = axios.create({ headers: { Accept: 'application/json' }});
function test() {
const res = ax.get("https://vtexstore.codeby.com.br/api/catalog_system/pub/products/search").then((response) => {
// expected the setProducts to be filled with the return of this request
setProducts(response.data);
});
}
test();
// and when I get here to see if the products have been filled, I get an empty array [ ]
console.log(products);
/*
as the products variable was not filled within the axios promise by setProducts,
there is no way to throw the products array here in the HTML to make a forEach or
a map to look cute together with the tags
*/
return (
<sup>how sad, with the product array empty, I can't put the data here '-'</sup>
);
}
See how the result comes out in the IDE console:
I'm in Visual Studio not knowing what to do. I'm new to React.js with Next.js and have been trying to solve this problem for a while, without success.
What can I do to bring the products to the HTML page?
UPDATE: Based on the answer below, I tried a possible workaround that seemed to point toward a solution:
ax.get("https://vtexstore.codeby.com.br/api/catalog_system/pub/products/search/", {})
.then((response) => setProducts(response.data))
.catch((error) => {
console.log(error); // AxiosError {message: 'Network Error', name: 'AxiosError', ...}
console.log(error.status); // undefined
console.log(error.code); // ERR_NETWORK
});
useEffect(() => {
console.log(products);
}, []);
and I'm getting the same error that I put in the comments of the first answer below:
However, when I replace setProducts with console.log to see whether it returns the same result, the output appears in the terminal where my Next.js application is running.
This code:
ax.get("https://vtexstore.codeby.com.br/api/catalog_system/pub/products/search/", {})
.then((response) => console.log(response.data.length)) // returns the length of the products array
returns this when I update my app:
NOTE: That's why I'm not able to understand my application in Next.js. I'm following all the correct guidelines, writing the code perfectly using axios and when I run the application on the website it gives a network error and doesn't show exactly the amount of products that were displayed in the terminal where my application is running.
I've already configured all the request headers correctly, enabling CORS to allow external requests to other APIs, and I still can't get the data returned to my application's page.
Wrap the code that fetches the products inside a useEffect hook:
useEffect(() => {
  const ax = axios.create({ headers: { Accept: 'application/json' } });
  function test() {
    ax.get("https://vtexstore.codeby.com.br/api/catalog_system/pub/products/search").then((response) => {
      // setProducts is filled with the return of this request
      setProducts(response.data);
      console.log(response.data);
    });
  }
  test();
}, []);
Then, in the component's return, you can map over the products array after checking that it is not null or undefined, like this:
{products && products.map(product=>{})}
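The guard can also be pulled out into a small plain function, which keeps the JSX short. This is only a sketch: `productLabels` is a hypothetical helper, and the `productName` field is an assumption about the shape of the API response.

```javascript
// Returns one label per product, or an empty array while nothing has loaded.
// Assumption: each product object has a "productName" field.
function productLabels(products) {
  if (!Array.isArray(products)) return [];
  return products.map((product) => product.productName);
}

console.log(productLabels(undefined));                   // nothing loaded yet
console.log(productLabels([{ productName: 'Shirt' }]));  // one product loaded
```

In the component's return, the result of productLabels(products) can then be mapped to list items without any further null checks.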

How can I loop through Nuxt.js generate routes to allow all my pages to render while using Woocommerce API?

Hello and thanks for the help in advance.
I'm trying to get my Nuxt app to loop through my WooCommerce API automatically so it can generate the pages without much manual work.
How do I get the loop to work? Right now, I get a Nuxt fatal error:
TypeError: Cannot read property 'forEach' of undefined
Screenshot of Error + Code
I'm using Woocommerce API and, as you can see in the screenshot above, the Woocommerce code is imported into this code I need help with using a standard import.
import WooCommerce from './woocommerce.js';
generate: {
routes() {
WooCommerce.get("products").then((response) => {
let totalPages = response.headers['x-wp-totalpages'];
let page = 1;
while(page <= totalPages) {
WooCommerce.get("products", page).then((response) => {
response.data.map(product => {
return '/product/' + product.slug
});
})
page++;
}
})
}
},
You are not returning any routes from your routes function. Because of that, Nuxt fails when it tries to iterate over them in a later step.
Assuming your way of accessing your API is correct, you only need to add an array, push your routes to it, and then return it.
I usually use async/await, which is why my code looks slightly different. I think it is a bit easier in this case.
// Declare the routes function asynchronous
async routes() {
  const productsResponse = await WooCommerce.get('products');
  // Header values are strings, so convert the page count to a number
  const totalPages = Number(productsResponse.headers['x-wp-totalpages']);
  // Add an array to collect your routes
  const routes = [];
  let page = 1;
  while (page <= totalPages) {
    const pagesResponse = await WooCommerce.get('products', page);
    // The 'map' function returns the routes for this page of products
    const productRoutes = pagesResponse.data.map((product) => {
      return '/product/' + product.slug;
    });
    // Push your routes to the created array
    routes.push(...productRoutes);
    page++;
  }
  // Return your routes
  return routes;
}
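To try the paging logic without a live store, the loop can be moved into a standalone function and fed a stand-in client. This is a sketch under assumptions: `collectProductRoutes` and `fakeClient` are hypothetical names, and the `get(endpoint, page)` call shape simply mirrors the question rather than any documented WooCommerce client signature.

```javascript
// Hypothetical helper: `client.get(endpoint, page)` mirrors the call shape used
// in the question and is assumed to resolve to { headers, data }.
async function collectProductRoutes(client) {
  const first = await client.get('products', 1);
  // Header values are strings; fall back to a single page if the header is missing.
  const totalPages = Number(first.headers['x-wp-totalpages']) || 1;
  const routes = first.data.map((product) => '/product/' + product.slug);
  // Page 1 is already loaded, so continue from page 2.
  for (let page = 2; page <= totalPages; page++) {
    const response = await client.get('products', page);
    routes.push(...response.data.map((product) => '/product/' + product.slug));
  }
  return routes;
}

// Stand-in client with two pages of fake products, used only for this demo.
const fakeClient = {
  async get(endpoint, page) {
    const pages = { 1: [{ slug: 'hat' }], 2: [{ slug: 'scarf' }] };
    return { headers: { 'x-wp-totalpages': '2' }, data: pages[page] };
  },
};

collectProductRoutes(fakeClient).then((routes) => console.log(routes));
```

Reusing the first response avoids requesting page 1 twice. In nuxt.config.js the routes function would then just return collectProductRoutes(WooCommerce).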

React constant with parameters using square brackets

I'm new to React.
I have the code below with a function, but when I run it, it returns an error:
TypeError: renderJson[item.node] is not a function.
How can I fix the renderJson function?
export const readItem = item => {
printlog(item);
return renderJson[item.node](item);
};
const renderJson = {
"heading": item => <h1>{item.map(item => readItem(item))}</h1>
};
If you're trying to create a single React functional component that takes a JSON, and outputs the items in the JSON as a header, it would be more like this:
// If you're getting this JSON from an external source using something like a GET request, put the request inside a "useEffect()" hook
const myJson = {
"heading": ["My First Header", "My Second Header"]
};
export const Header = () => {
console.log(myJson);
return <h1>{myJson.heading.map(header => header)}</h1>;
};
I apologize if this is a misinterpretation of your question. If it is, any additional details would be helpful.
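For what it's worth, the error message itself appears whenever item.node has no matching function in renderJson, so a lookup table usually needs a fallback for unknown node types. Below is a minimal sketch of that idea. It returns strings instead of JSX so it runs without React, and the `children` and `value` field names are assumptions, since the question does not show the shape of the items.

```javascript
// Lookup table from node type to renderer. Renderers return strings here so
// the sketch runs without React; with JSX they would return elements instead.
const renderers = {
  heading: (item) => '<h1>' + item.children.map((child) => renderItem(child)).join('') + '</h1>',
  text: (item) => item.value,
};

// Assumption: every item has a "node" type; "children" and "value" are
// hypothetical field names chosen for this example.
function renderItem(item) {
  const render = renderers[item.node];
  if (typeof render !== 'function') {
    // Unknown node type: render nothing instead of throwing a TypeError.
    return '';
  }
  return render(item);
}

const tree = {
  node: 'heading',
  children: [{ node: 'text', value: 'Hello' }, { node: 'image' }],
};
console.log(renderItem(tree)); // prints <h1>Hello</h1>
```

Logging the unknown item.node value in the fallback branch is a quick way to find out which node type is missing from the table.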

get post title after Infinite scroll finished

I managed to show all the posts on a site that has a load_more button for going to the next page, but something is missing.
I get this error:
e Error: Node is either not visible or not an HTMLElement
at ElementHandle._clickablePoint (/Users/minghann/Documents/productnation_scraper/node_modules/puppeteer/lib/ExecutionContext.js:331:13)
at <anonymous>
at process._tickCallback (internal/process/next_tick.js:188:7)
This doesn't happen if I don't load all the posts. It's hard to debug because I don't know which post is missing what. The full code is below:
const browser = await puppeteer.launch({
devtools: true
});
const page = await browser.newPage();
await page.goto("https://example.net");
await page.waitForSelector(".load_more_btn");
const load_more_exist = !!(await page.$(".load_more_btn"));
while (load_more_exist > 0) {
await page.click(".load_more_btn");
}
const posts = await page.$$(".post");
let result = [];
for (const post of posts) {
result = [
...result,
{
title: await post.$eval(".post_title a", e => e.innerText)
}
];
}
console.log(result);
browser.close();
There are multiple ways to do this, and the best approach is to combine the following two.
Look for Ajax
Wait for the Ajax request instead. Whenever you click Load More, the page makes a simple Ajax request to ?ajax-request=jnews. We can use .waitForRequest or .waitForResponse for this use case. Here is a working example:
await Promise.all([
page.waitForResponse(response => response.url().includes('?ajax-request=jnews') && response.status() === 200),
page.click(".load_more_btn")
])
Clean DOM and wait for new Element
Refer to these answers here and here.
Basically you can remove the dom elements that you collected, so next time you collect more data, there won't be any duplicates.
So, once you remove all current elements, such as those matched by document.querySelectorAll('.jeg_post'), you can simply do another page.waitForSelector('.jeg_post') later if you need to.
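Putting the first approach into a loop could look like the sketch below. `clickUntilGone` is a hypothetical helper, not a Puppeteer API; it relies only on page.$, page.click and page.waitForResponse. It assumes the site removes the button from the DOM once there are no more posts.

```javascript
// Hypothetical helper: clicks `selector` until it is no longer in the DOM,
// waiting after each click for a response accepted by `isWanted`.
// `page` is expected to be a Puppeteer Page; `maxClicks` is a safety limit.
async function clickUntilGone(page, selector, isWanted, maxClicks = 100) {
  let clicks = 0;
  while (clicks < maxClicks && (await page.$(selector)) !== null) {
    await Promise.all([
      page.waitForResponse(isWanted), // registered before the click fires
      page.click(selector),
    ]);
    clicks++;
  }
  return clicks;
}

// Example predicate, using the URL fragment mentioned above.
const isLoadMoreResponse = (response) =>
  response.url().includes('?ajax-request=jnews') && response.status() === 200;
```

It would be called as await clickUntilGone(page, '.load_more_btn', isLoadMoreResponse) before collecting the posts. If the site only hides the button instead of removing it, page.$ still finds it and the click fails with the "not visible" error from the question, so a visibility check would be needed in that case.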

nodejs piping stream after modifying data

I am learning about streaming with nodejs, I understand the examples shown in the request npm module;
request(url).pipe(fs.createWriteStream('./filename.json'))
But there are two parts of my problem.
Case 1:
function fetchSitemaps() {
return requestAsync(url).then(data => {
const $ = cheerio.load(data);
let urls = [];
$("loc").each((i, e) => urls.push($(e).text()));
fs.writeFileSync('./sitemaps.json', JSON.stringify(urls))
})
}
I want to convert the above from writeFileSync to createWriteStream, but how do I keep appending data to an array which is in JSON format?
Case 2:
function fetchLyricUrls() {
let sitemaps = JSON.parse(fs.readFileSync('./sitemaps.json'));
sitemaps.forEach((sitemap, i) => {
let fileName = i + '.json';
if(url_pat.exec(sitemap)) {
fileName = url_pat.exec(sitemap)[1] + '.json';
}
requestAsync(url).then(data => {
const $ = cheerio.load(data);
let urls = [];
$("loc").each((i, e) => urls.push($(e).text()));
return urls;
}).then(urls => {
let allUrls = [];
urls.map(u => {
return requestAsync(u).then(sm => {
const $ = cheerio.load(sm);
$("loc").each((i, e) => allUrls.push($(e).text()))
fs.writeFileSync('./lyrics.json', JSON.stringify(allUrls))
return allUrls;
});
});
});
});
}
The first part of the problem is the same: appending to JSON data using a write stream. But this time, I want to parse the HTML data and extract some text, and send only that text through the stream, not the HTML data as a whole.
So let's split up the answers
Case 1
First of all, I'd try to keep the data as a stream and not accumulate it. In essence, instead of loading the whole sitemap and then parsing it, I'd use something like xml-nodes so that the nodes arrive as a separate stream. Then my module scramjet would handle the transformation:
const fs = require('fs');
const request = require('request');
const xmlNodes = require('xml-nodes');
const cheerio = require('cheerio');
const scramjet = require('scramjet');
const writable = fs.createWriteStream('./sitemaps.json');
writable.write('[');
let first = 0;
request('http://example.com/sitemap.xml')
// this fetches your sitemap
.on('end', () => writable.end("]"))
// when the stream ends, this will end the sitemaps.json
.pipe(xmlNodes('loc'))
// this extracts your "loc" nodes
.pipe(new scramjet.DataStream())
// this creates a mappable stream
.map((nodeString) => cheerio('loc', nodeString).text())
// this extracts the text as in your question
.map((url) => (first++ ? ',' : '') + JSON.stringify(url))
// this makes sure that strings are nicely escaped
// and prepends them with a comma on every node, but first one
.pipe(writable, {end: false})
// and this will push all your entries to the writable stream
Case 2
Here you'll need to do something similar, although if case 1 is an intermediate step, I'd suggest storing the data as lines of JSON (one JSON value per line) rather than as an array. It would be easier to stream that way.
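A minimal sketch of the line-per-value idea, using only Node's built-in stream module rather than scramjet. `toJsonLines` and `parseJsonLines` are hypothetical helper names, and the example URLs are placeholders.

```javascript
const { Readable, Transform } = require('stream');

// Transform stream: takes any value in, emits one JSON-encoded line out.
function toJsonLines() {
  return new Transform({
    writableObjectMode: true,
    transform(value, _encoding, callback) {
      callback(null, JSON.stringify(value) + '\n');
    },
  });
}

// Reads the format back: one JSON value per non-empty line.
function parseJsonLines(text) {
  return text
    .split('\n')
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Demo: stream two placeholder URLs to stdout, one JSON string per line.
Readable.from(['http://example.com/a', 'http://example.com/b'])
  .pipe(toJsonLines())
  .pipe(process.stdout);
```

Because every line is complete on its own, new entries can be appended at any time (for example by piping into fs.createWriteStream with the 'a' flag) without rewriting brackets or commas, which is what makes this layout easier to stream than a single JSON array.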