[Python] Crawling Example (Feat. Daum News)

에코프로.AI

AI_HitchHiker 2024. 8. 21. 20:39

Daum News

Fetching the news list from Daum News
Link: https://news.daum.net

Fetching the full page HTML

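import requests

url = 'https://news.daum.net/'
response = requests.get(url, timeout=10)  # give up after 10 seconds instead of hanging

result = ''  # stays empty on failure so the later steps do not raise a NameError
if response.status_code == 200:
    result = response.text  # raw HTML of the page
    print(result)
else:
    print('Failed : ', response.status_code)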
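
The request step can also be wrapped in a small helper. This is only a minimal sketch, not the method used in this post: the name `fetch_html`, the `User-Agent` value, and the 10-second timeout are assumptions made for illustration. The `get` parameter is there so the function can be tried with a stand-in instead of a real network call.

```python
import requests

# Hypothetical header value (an assumption); some sites answer differently
# to the default User-Agent that requests sends.
DEFAULT_HEADERS = {'User-Agent': 'Mozilla/5.0 (compatible; example-crawler)'}


def fetch_html(url, get=requests.get, timeout=10):
    """Return the page HTML, or None when the status code is not 200."""
    response = get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    if response.status_code != 200:
        print('Failed : ', response.status_code)
        return None
    return response.text
```

With this helper, `fetch_html('https://news.daum.net/')` would return the HTML string on success and `None` otherwise, so the caller can check the result before parsing it.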

Converting to a BeautifulSoup object

from bs4 import BeautifulSoup

soup = BeautifulSoup(result, 'html.parser')
print(soup.prettify())

Getting the list of news items
- Each matched <li> tag holds the details of one news item (title, link, press outlet, category, and so on).

items = soup.select(".list_newsissue > li")
print(type(items))
print(len(items),'\r\n', items)
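
To see what the `>` in the selector does, here is a small sketch on made-up HTML. The markup only imitates the list structure and is not copied from the real Daum page; the variable names are also just for illustration. `>` matches direct children only, while a space matches any descendant.

```python
from bs4 import BeautifulSoup

# Made-up HTML (an assumption), shaped like a news list with one nested <li>.
sample_html = """
<ul class="list_newsissue">
  <li><strong class="tit_g"><a href="https://example.com/a">First headline</a></strong></li>
  <li><strong class="tit_g"><a href="https://example.com/b">Second headline</a></strong>
    <ul><li>nested item, not a direct child</li></ul>
  </li>
</ul>
"""

sample_soup = BeautifulSoup(sample_html, 'html.parser')

direct_items = sample_soup.select('.list_newsissue > li')  # direct children only
all_items = sample_soup.select('.list_newsissue li')       # every descendant <li>

print(len(direct_items))  # 2
print(len(all_items))     # 3
```

Using `>` keeps nested lists from being counted as extra news items.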

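Extracting the news title, link, press outlet, and category from each <li> tag
- The CSS selectors below may stop matching whenever the Daum site changes its markup.
  - In Chrome, press F12 and use the element picker in the 'Elements' panel on the right to find the current tags.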

lst = []
for news in items:
    title_tag = news.select_one('.tit_g > a')              # <a> holding both the title and the link
    title = title_tag.text.strip()                         # news title
    link = title_tag.attrs['href'].strip()                 # news link
    corp = news.select_one('.logo_cp').text.strip()        # press outlet
    cate = news.select_one('.txt_category').text.strip()   # category

    lst.append([title, link, corp, cate])

print(lst)
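
If the site changes and a selector no longer matches, `select_one` returns `None` and the loop above stops with an `AttributeError`. Below is a rough sketch of a more forgiving version. The helper names `text_or_empty` and `parse_item` and the sample HTML are made up for illustration and are not part of the original code.

```python
from bs4 import BeautifulSoup


def text_or_empty(parent, selector):
    """Return the stripped text of the first match, or '' when nothing matches."""
    tag = parent.select_one(selector)
    return tag.get_text(strip=True) if tag is not None else ''


def parse_item(news):
    """Build [title, link, corp, cate] for one <li>, tolerating missing parts."""
    anchor = news.select_one('.tit_g > a')
    title = anchor.get_text(strip=True) if anchor is not None else ''
    link = anchor.get('href', '').strip() if anchor is not None else ''
    return [title, link,
            text_or_empty(news, '.logo_cp'),
            text_or_empty(news, '.txt_category')]


# Made-up markup (an assumption): the second item has no category element.
sample_html = """
<ul class="list_newsissue">
  <li><strong class="tit_g"><a href="https://example.com/a"> Headline A </a></strong>
      <span class="logo_cp">Press A</span><span class="txt_category">Politics</span></li>
  <li><strong class="tit_g"><a href="https://example.com/b">Headline B</a></strong>
      <span class="logo_cp">Press B</span></li>
</ul>
"""

sample_items = BeautifulSoup(sample_html, 'html.parser').select('.list_newsissue > li')
rows = [parse_item(li) for li in sample_items]
print(rows)
```

A missing field ends up as an empty string instead of stopping the whole crawl, which makes it easier to notice which selector needs updating.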

Converting to a pandas DataFrame

import pandas as pd
df = pd.DataFrame(lst, columns = ['News title', 'News link', 'Press', 'Category'])
df

Saving as an .xlsx Excel file

df.to_excel('daumnews.xlsx', index=False)  # needs an Excel writer engine such as openpyxl installed
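
If no Excel engine is installed, saving as CSV is one alternative. This is a sketch only, swapping in `to_csv` for the `to_excel` call in the post. The sample rows, column names, and file name are stand-ins chosen for illustration. The `utf-8-sig` encoding writes a byte-order mark, which helps Excel recognize UTF-8 so that Korean text is not garbled.

```python
import os
import tempfile

import pandas as pd

# Stand-in rows (an assumption); in this tutorial they would come from `lst`.
sample_df = pd.DataFrame(
    [['Headline A', 'https://example.com/a', 'Press A', 'Politics']],
    columns=['title', 'link', 'press', 'category'],
)

# Hypothetical file name, written to the system temp directory.
csv_path = os.path.join(tempfile.gettempdir(), 'daumnews_sample.csv')

# 'utf-8-sig' prepends a byte-order mark to the file.
sample_df.to_csv(csv_path, index=False, encoding='utf-8-sig')

# Read the file back to confirm the round trip.
restored = pd.read_csv(csv_path, encoding='utf-8-sig')
print(restored)
```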

The end~

License: Attribution, NonCommercial, NoDerivatives
