I’m trying to do some web scraping on the URL: text to collect data on the move history of a chassis.
However, the time range selector on this page is unusual, and I’m finding it hard to drive from Python. For example, to collect the move history for a specific chassis over a custom time range by hand, say 01/1/23 - 12/31/23, I have to click through the selector multiple times (if this is unclear, please open the link above and try selecting a time range).
I don’t know whether I can achieve this with Selenium or whether I should use another tool.
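To put a rough number on "multiple times": if the picker is a calendar that moves one month per click (an assumption; I have not confirmed how the widget is built), the number of clicks needed to reach the start of my range is simply the month difference between the date shown on the page and the date I want. A minimal sketch, where `months_between` is a hypothetical helper name of my own and not something provided by the page or by Selenium:

```python
from datetime import date


def months_between(shown: date, target: date) -> int:
    """Signed number of calendar months from `shown` to `target`.

    Negative means `target` lies in an earlier month than `shown`.
    """
    return (target.year - shown.year) * 12 + (target.month - shown.month)


# The label on the page starts at 07/9/24; I want the range to start at 01/1/23.
shown_start = date(2024, 7, 9)
wanted_start = date(2023, 1, 1)

# Assumed: one click on a "previous month" control moves back one month.
clicks_back = -months_between(shown_start, wanted_start)
print(clicks_back)  # 18
```

If the widget turns out to accept typed dates or preset ranges, this counting is unnecessary.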
The HTML code:
<div class="date_wrapper">
<div id="daterange" class="datepickerrange">
<svg width="24" height="24" viewBox="0 0 24 24" fill="black" xmlns="http://www.w3.org/2000/svg">
<path d="M17 4H21C21.2652 4 21.5196 4.10536 21.7071 4.29289C21.8946 4.48043 22 4.73478 22 5V21C22 21.2652 21.8946 21.5196 21.7071 21.7071C21.5196 21.8946 21.2652 22 21 22H3C2.73478 22 2.48043 21.8946 2.29289 21.7071C2.10536 21.5196 2 21.2652 2 21V5C2 4.73478 2.10536 4.48043 2.29289 4.29289C2.48043 4.10536 2.73478 4 3 4H7V2H9V4H15V2H17V4ZM15 6H9V8H7V6H4V10H20V6H17V8H15V6ZM20 12H4V20H20V12Z"></path>
</svg>
<span>07/9/24 - 08/7/24</span> <i class="toggle-icon"></i>
</div>
</div>
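The only place the current range appears in this markup is the text of the `<span>`, so one way to check whether a change took effect is to read that label back and compare it with the range I asked for. Below is a minimal sketch of parsing and formatting such a label. The format (two-digit month, unpadded day, two-digit year) is an assumption based on the single label shown above, and `parse_range_label` and `format_range_label` are hypothetical helper names:

```python
from datetime import date, datetime
from typing import Tuple

# Assumed from the label "07/9/24 - 08/7/24"; strptime's %d also accepts "9".
LABEL_DATE_FORMAT = "%m/%d/%y"


def parse_range_label(label: str) -> Tuple[date, date]:
    """Split a label such as '07/9/24 - 08/7/24' into (start, end) dates."""
    parts = [part.strip() for part in label.split("-")]
    if len(parts) != 2:
        raise ValueError(f"unexpected date range label: {label!r}")
    start, end = (datetime.strptime(p, LABEL_DATE_FORMAT).date() for p in parts)
    return start, end


def format_range_label(start: date, end: date) -> str:
    """Render two dates in the same style as the label on the page."""
    def fmt(d: date) -> str:
        return f"{d.month:02d}/{d.day}/{d.strftime('%y')}"
    return f"{fmt(start)} - {fmt(end)}"


print(parse_range_label("07/9/24 - 08/7/24"))
print(format_range_label(date(2023, 1, 1), date(2023, 12, 31)))
```

With Selenium, the label text could be read with something like `driver.find_element(By.CSS_SELECTOR, '#daterange span').text` and passed to `parse_range_label`.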
I tried to do this using Selenium:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import time
import os
import pandas as pd
from selenium.webdriver.common.keys import Keys
from datetime import datetime
import csv
from bs4 import BeautifulSoup
chrome_options = Options()
# chrome_options.add_argument("--headless")
# chrome_options.add_argument("--disable-gpu")
download_directory = '/Users/temp/py_scraping_selenium_test'
# Have Chrome save downloads into the same directory used for the output file
prefs = {'download.default_directory': download_directory}
chrome_options.add_experimental_option('prefs', prefs)
filename = os.path.join(download_directory, 'chasis_info_test_1.csv')
driver = webdriver.Chrome(options=chrome_options)
# Navigate to the initial page
initial_url = 'https://dcli.com/track-a-chassis/?0-chassisType=vin&searchChassis=LJRC41269G1020818' # Update this URL
driver.get(initial_url)
# Set up an explicit wait (up to 600 seconds)
wait = WebDriverWait(driver, 600)
# Wait until the date range picker is present, then click it via JavaScript to ensure it is clicked
date_range_picker = wait.until(EC.presence_of_element_located((By.ID, 'daterange')))
driver.execute_script("arguments[0].click();", date_range_picker)
# Wait for the date picker to be available
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'input[name="startDate"]')))
# Locate the start and end date input fields
start_date_input = driver.find_element(By.CSS_SELECTOR, 'input[name="startDate"]') # Adjust selector as needed
end_date_input = driver.find_element(By.CSS_SELECTOR, 'input[name="endDate"]') # Adjust selector as needed
# Clear any existing values
start_date_input.clear()
end_date_input.clear()
# Set the desired start and end dates
start_date = '01/01/2023' # Adjust to your desired start date
end_date = '12/31/2023' # Adjust to your desired end date
start_date_input.send_keys(start_date)
end_date_input.send_keys(end_date)
# Submit the form or trigger the update
submit_button = driver.find_element(By.XPATH, '//button[text()="Search"]') # Adjust selector as needed
submit_button.click()
# Wait for the updated results to load
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'table')))
But I got a lot of errors…
What I expect is a table (saved as a CSV) of the move history of a specific chassis over a custom time range (01/1/23 - 12/31/23).
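To make the expected output concrete, here is a minimal sketch of the last step: turning the HTML of a results table into CSV text. It uses only the standard library (`html.parser` and `csv`) rather than BeautifulSoup, so it runs without extra installs. The sample table, its column names, and its cell values are placeholders I made up for illustration, not real data from the site, and `TableRows` and `table_html_to_csv` are hypothetical names:

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_html_to_csv(html: str) -> str:
    """Return the rows of the table(s) in `html` as CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Placeholder table: the headers and values are invented for this example.
sample_html = (
    "<table>"
    "<tr><th>Date</th><th>Event</th></tr>"
    "<tr><td>01/1/23</td><td>placeholder, event</td></tr>"
    "</table>"
)
print(table_html_to_csv(sample_html))
```

In the Selenium script, the input could come from `driver.find_element(By.CSS_SELECTOR, 'table').get_attribute('outerHTML')`, and the returned text could be written to `filename`. This assumes the results really are rendered as an HTML `<table>`, which I have not confirmed.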
Any help? Thanks in advance!!