
Selenium, XPath (2024-05-21)

1. Selenium

Selenium is a library that lets you control a web browser from code.

!pip install selenium

!pip install chromedriver_autoinstaller

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# Open Chrome and load the Google home page
driver = webdriver.Chrome()
driver.get('https://www.google.com')

# Find the search box by its name attribute, type the query, and press Enter
search = driver.find_element('name', 'q')
search.send_keys('미세먼지')  # search term: "fine dust"
search.send_keys(Keys.RETURN)
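
Typing into the search box is one way to run a search. Another is to open the results page directly by building the URL. The sketch below assumes Google's usual `/search?q=...` address, which is not taken from the code above, and `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed results-page address; not part of the original notes
GOOGLE_SEARCH_URL = 'https://www.google.com/search'


def build_search_url(query):
    """Return a results URL with the query percent-encoded (spaces become '+')."""
    return GOOGLE_SEARCH_URL + '?' + urlencode({'q': query})


print(build_search_url('fine dust'))

# With a running driver, the page could then be opened in one step:
# driver.get(build_search_url('미세먼지'))
```

Encoding the query with `urlencode` keeps non-ASCII terms and spaces valid inside the URL.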

 

 

2. Naver Webtoon

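!pip install bs4
from bs4 import BeautifulSoup
from selenium import webdriver

# Open the episode page in Chrome
driver = webdriver.Chrome()
driver.get('https://comic.naver.com/webtoon/detail?titleId=783053&no=134&week=tue')

# Parse the rendered HTML; naming the parser avoids BeautifulSoup's "no parser specified" warning
soup = BeautifulSoup(driver.page_source, 'html.parser')

# Each comment body sits in a <span class="u_cbox_contents">
comment_area = soup.find_all('span', {'class': 'u_cbox_contents'})
print('*********** Best comments ************')
for span in comment_area:
    comment = span.text.strip()
    print(comment)
    print('-' * 50)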
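
If the comment list is filled in by JavaScript after the page loads, `driver.page_source` read right after `driver.get` may not contain any comments yet, and the loop prints nothing. One way around this is to keep re-reading the page until the spans show up. The helper below is a plain-Python sketch of that polling idea. `wait_until` is a hypothetical name, not a Selenium or BeautifulSoup function.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns something truthy.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} seconds')
        time.sleep(poll_interval)


# Possible use with the scraper above (needs a running driver):
# comment_area = wait_until(
#     lambda: BeautifulSoup(driver.page_source, 'html.parser')
#     .find_all('span', {'class': 'u_cbox_contents'})
# )
```

An empty result list is falsy, so the helper keeps polling until at least one comment span is found or the timeout runs out.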

 

 

3. Instagram

 

!pip install chromedriver_autoinstaller
import time  # needed for the time.sleep() calls in the run section
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# Log in
def login(user_id, pw):
    # Locate the ID and password fields by absolute XPath
    input_id = driver.find_element('xpath', '/html/body/div[2]/div/div/div[2]/div/div/div[1]/section/main/article/div[2]/div[1]/div[2]/form/div/div[1]/div/label/input')
    input_pw = driver.find_element('xpath', '/html/body/div[2]/div/div/div[2]/div/div/div[1]/section/main/article/div[2]/div[1]/div[2]/form/div/div[2]/div/label/input')
    input_id.send_keys(user_id)
    input_pw.send_keys(pw)
    # Click the login button
    driver.find_element('xpath', '/html/body/div[2]/div/div/div[2]/div/div/div[1]/section/main/article/div[2]/div[1]/div[2]/form/div/div[3]/button/div').click()


# Search by hashtag
def search(hashtag):
    url = f'https://www.instagram.com/explore/tags/{hashtag}/'
    driver.get(url)


# Like and comment
# (this version only opens a post and types the comment; it does not click a like button or submit)
def like_and_comment(comment) :
    # Open a post from the hashtag results
    xpath = '/html/body/div[2]/div/div/div[2]/div/div/div[1]/div[1]/div[2]/section/main/article/div/div[2]/div/div[1]/div[1]/a'
    driver.find_element('xpath', xpath).click()

    # Click the comment box and type the comment passed in by the caller
    reply_xpath = '/html/body/div[8]/div[1]/div/div[3]/div/div/div/div/div[2]/div/article/div/div[2]/div/div/div[2]/section[3]/div/form/div/textarea'
    driver.find_element('xpath', reply_xpath).click()
    driver.find_element('xpath', reply_xpath).send_keys(comment)
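
The functions above locate elements with absolute XPaths copied from the browser. These stop matching as soon as the page gains or loses a wrapper element. A relative XPath that matches on an attribute is usually more robust. The sketch below shows the difference on a made-up page using the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath. The form markup and the `name="username"` attribute are assumptions for illustration, not Instagram's real HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified login form (not the real Instagram markup)
FORM = (
    '<form>'
    '<label><input name="username" /></label>'
    '<label><input name="password" /></label>'
    '</form>'
)


def make_page(wrapper_divs):
    """Build a tiny page with the form nested inside the given number of divs."""
    return ET.fromstring(
        '<html><body>'
        + '<div>' * wrapper_divs
        + FORM
        + '</div>' * wrapper_divs
        + '</body></html>'
    )


old_page = make_page(2)
new_page = make_page(3)  # layout change: one extra wrapper div

absolute = './body/div/div/form/label[1]/input'  # spells out every step from the root
relative = ".//input[@name='username']"          # any <input> with this name attribute

print(old_page.find(absolute) is not None)  # True: the path fits the old layout
print(new_page.find(absolute) is not None)  # False: the extra div breaks the path
print(new_page.find(relative) is not None)  # True: the attribute match still works
```

In Selenium the relative form would be written as `//input[@name='username']` and passed to `driver.find_element('xpath', ...)`, assuming the real page has such an attribute.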
    

# Run
driver = webdriver.Chrome()
url = 'https://www.instagram.com/'
driver.get(url)
driver.implicitly_wait(3) # when an element is not found right away, keep retrying for up to 3 seconds

user_id = 'YOUR_ID'      # placeholder: never publish real login details
pw = 'YOUR_PASSWORD'     # placeholder

login(user_id, pw)
time.sleep(4)

hashtag = '사과'  # "apple"
search(hashtag)
time.sleep(4)

comment = '안녕하세요! 잘 보고 갑니다!'  # "Hello! Thanks for the post!"
like_and_comment(comment)
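
Login details are safer kept out of the notebook itself, especially when the notebook is shared or posted. A minimal sketch, assuming the ID and password are stored in environment variables. The variable names `INSTAGRAM_ID` and `INSTAGRAM_PW` and the helper `load_credentials` are made up for this example.

```python
import os


def load_credentials(id_var='INSTAGRAM_ID', pw_var='INSTAGRAM_PW'):
    """Read the login ID and password from environment variables."""
    user_id = os.environ.get(id_var)
    password = os.environ.get(pw_var)
    if not user_id or not password:
        raise RuntimeError(f'Set {id_var} and {pw_var} before running the script')
    return user_id, password


# Possible use with the login function above (variables must be set beforehand):
# user_id, password = load_credentials()
# login(user_id, password)
```

Failing early with a clear message is easier to debug than a login that silently types empty values.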

 

 

4. Pixabay

!pip install chromedriver_autoinstaller
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from urllib.request import Request, urlopen

# Open the site
driver = webdriver.Chrome()
url = 'https://pixabay.com/'
driver.get(url)

# Search
xpath = '/html/body/div[1]/div[1]/div[1]/div[3]/div[1]/div/form/input'
search = driver.find_element('xpath', xpath)
search.send_keys('강아지')  # search term: "puppy"
search.send_keys(Keys.RETURN)

# Get the image URL from the src attribute of one result
img_xpath = '/html/body/div[1]/div[1]/div/div[2]/div[3]/div/div/div[4]/div[7]/div/a/img'
image_url = driver.find_element('xpath', img_xpath).get_attribute('src')
print('image_url', image_url)  # image_url https://cdn.pixabay.com/photo/2016/01/05/17/51/maltese-1123016_640.jpg

# Download the image and save it to a file
# (the request carries a browser-like User-Agent header)
request = Request(image_url, headers={'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'})
with open('dog.jpg', 'wb') as f:
    f.write(urlopen(request).read())
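
The download step above always writes to `dog.jpg`, so a second image would overwrite the first. A small sketch of one way to generalize it: take the file name from the image URL itself. The helper names `filename_from_url` and `download_image` are hypothetical, and the User-Agent string is the one already used above.

```python
import os
from urllib.parse import urlparse
from urllib.request import Request, urlopen

USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'


def filename_from_url(image_url):
    """Return the last path segment of the URL, ignoring any query string."""
    return os.path.basename(urlparse(image_url).path)


def download_image(image_url, directory='.'):
    """Download image_url into directory and return the path of the saved file."""
    request = Request(image_url, headers={'User-Agent': USER_AGENT})
    path = os.path.join(directory, filename_from_url(image_url))
    with urlopen(request) as response, open(path, 'wb') as f:
        f.write(response.read())
    return path


# Possible use with the image_url found above (needs network access):
# saved_path = download_image(image_url)
```

Opening both the response and the file in one `with` statement closes them even if the download fails partway.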

 
