In this blog post, we will write a scraper that extracts data from eBay. As we all know, eBay is an online auction website where people list their items for sale at auction.
As before, we will write two scripts: one to fetch the listing URLs and store them in a text file, and another to parse those links. The extracted data will be saved in JSON format for further processing.
We will use the Scraper API service to fetch the pages, which frees us from worrying about getting blocked or rendering dynamic websites, since the service takes care of all of that.
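As a quick illustration, a minimal Scraper API call is just an HTTP GET with your API key and the target URL passed as query parameters; the response body is the page HTML. This is only a sketch (the API key below is a placeholder; the real scripts read it from API_KEY.txt):

import requests

# Minimal sketch of a Scraper API request: the key and the target URL go in
# the query string, and the response body is the fetched page's HTML.
payload = {
    'api_key': 'YOUR_API_KEY',       # placeholder, use your own key
    'url': 'https://www.ebay.com/',  # any page you want to fetch
    'render': 'false',               # no JavaScript rendering needed for these pages
}
r = requests.get('http://api.scraperapi.com', params=payload, timeout=60)
print(r.status_code, len(r.text))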
The first script fetches the category listings, so let's get to it!
import requests
from bs4 import BeautifulSoup

if __name__ == '__main__':
    API_KEY = None
    links_file = 'links.txt'
    links = []

    # Read the Scraper API key from a local file
    with open('API_KEY.txt', encoding='utf8') as f:
        API_KEY = f.read()

    URL_TO_SCRAPE = 'https://www.ebay.com/b/Computer-Components-Parts/175673/bn_1643095'
    payload = {'api_key': API_KEY, 'url': URL_TO_SCRAPE, 'render': 'false'}
    r = requests.get('http://api.scraperapi.com', params=payload, timeout=60)

    if r.status_code == 200:
        html = r.text.strip()
        soup = BeautifulSoup(html, 'lxml')

        # Collect individual product links from the category listing
        entries = soup.select('.s-item a')
        for entry in entries:
            if 'p/' in entry['href']:
                listing_url = entry['href'].replace('&rt=nc#UserReviews', '')
                links.append(listing_url)

        # Store the collected links in a text file
        if len(links) > 0:
            with open(links_file, 'a+', encoding='utf8') as f:
                f.write('\n'.join(links))
            print('Links stored successfully.')
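The next script parses a single listing URL. If you want to process everything collected above, a small loop like the following sketch reads links.txt back and fetches each URL through Scraper API; the parsing code shown next would go inside the loop:

import requests

# Sketch: read back the URLs stored by the first script and fetch each one
# through Scraper API. The parsing logic from the next script goes inside
# this loop in place of the print statement.
with open('API_KEY.txt', encoding='utf8') as f:
    API_KEY = f.read()

with open('links.txt', encoding='utf8') as f:
    urls = [line.strip() for line in f if line.strip()]

for url in urls:
    payload = {'api_key': API_KEY, 'url': url, 'render': 'false'}
    r = requests.get('http://api.scraperapi.com', params=payload, timeout=60)
    if r.status_code == 200:
        print(f'Fetched {url} ({len(r.text)} bytes)')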
Next, let's write a parse script to extract the information for an individual listing. Note that we are not going to walk through the entire script line by line; we have already created plenty of tutorials about that, and you may check them here.
Just go through the parse script given below:
import requests
from bs4 import BeautifulSoup

if __name__ == '__main__':
    record = {}
    price = title = seller = image = None

    # Read the Scraper API key from a local file
    with open('API_KEY.txt', encoding='utf8') as f:
        API_KEY = f.read()

    URL_TO_SCRAPE = 'https://www.ebay.com/p/5034585650?iid=202781903791'
    payload = {'api_key': API_KEY, 'url': URL_TO_SCRAPE, 'render': 'false'}
    r = requests.get('http://api.scraperapi.com', params=payload, timeout=60)

    if r.status_code == 200:
        html = r.text
        soup = BeautifulSoup(html, 'lxml')

        # Product title
        title_section = soup.select('.product-title')
        if title_section:
            title = title_section[0].text.strip()

        # Seller name, with the surrounding boilerplate text stripped out
        seller_section = soup.select('.seller-persona')
        if seller_section:
            seller = seller_section[0].text.replace('Sold by', '').replace('Positive feedbackContact seller', '')
            seller = seller[:-6]

        # Price
        price_section = soup.select('.display-price')
        if price_section:
            price = price_section[0].text

        # Main product image
        image_section = soup.select('.vi-image-gallery__enlarge-link img')
        if image_section:
            image = image_section[0]['src']

        record = {
            'title': title,
            'price': price,
            'seller': seller,
            'image': image,
        }
        print(record)
When we run the script, it prints the following output:
{
    'title': 'AMD Ryzen 3 3200G - 3.6GHz Quad Core (YD3200C5FHBOX) Processor',
    'price': '$99.99',
    'seller': 'best_buy\xa0(698388)97.2% ',
    'image': 'https://i.ebayimg.com/images/g/ss8AAOSwsbhdmy2e/s-l640.jpg'
}
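Since the plan is to keep the extracted data as JSON for further processing, you can dump each record with the standard json module. A simple sketch (records.json is just an assumed file name):

import json

# Write the parsed record to a JSON file for later processing.
with open('records.json', 'w', encoding='utf8') as f:
    json.dump(record, f, ensure_ascii=False, indent=2)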
It's as simple as that.
Conclusion
In this blog post, you have learned how to extract eBay data easily using Scraper API in Python. You can extend this script to suit your requirements, for example by turning it into a price monitoring script.
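For instance, a very small price-monitoring check could compare the scraped price against a target and alert you when it drops. This is only a sketch built on the record from the parse script; the threshold and the alerting mechanism are up to you:

# Hypothetical price check on top of the record produced by the parse script.
target_price = 90.00  # alert threshold, pick whatever suits you
current_price = float(record['price'].replace('$', '').replace(',', ''))
if current_price <= target_price:
    print('Price alert:', record['title'], 'is now', record['price'])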
Writing data scrapers is a rewarding journey, but you can hit a wall if the website blocks your IPs, and as an individual you can't afford expensive proxies either. Retailgators offers an easy-to-use and affordable API that helps you scrape websites without any trouble. You don't need to worry about being blocked, because Scraper API uses proxies by default to access websites. You don't need to think about Selenium either, as Scraper API provides headless browser functionality. We have also written about how to use it.