Naver stock web parsing

Finance 2022. 12. 7. 20:18


I wanted to scrape the stock listings on Naver Finance automatically.

However, my previous post (see: https://engineerer.tistory.com/2) did not include any way to control the check boxes.

So I wrote new code that sets the desired check boxes, as shown below.

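The function takes an option for parsing surging stocks (급등), falling stocks (하락), or top-volume stocks (거래량), and the check boxes on the page are ticked automatically according to the option_list variable (selenium is used for this).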
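The option_list used in the listing that follows is a hard-coded list of 27 entries, where 'true' marks a check box to tick and None marks one to leave unticked. As a rough sketch, the same list could be generated from the 1-based check box numbers, and the boxes that need a click could be worked out separately from the browser calls. The helper names build_option_list and ids_to_click are hypothetical and are not part of the original code; the total of 27 is taken from the length of the list in the listing.

```python
def build_option_list(checked_ids, total=27):
    """Return a list shaped like option_list: 'true' for boxes to tick, None otherwise.

    checked_ids holds 1-based check box numbers (matching the "option1", "option2", ... element IDs).
    """
    return ['true' if i in checked_ids else None for i in range(1, total + 1)]


def ids_to_click(current, desired):
    """Return the 1-based numbers of the boxes whose current state differs from the desired one."""
    return [i + 1 for i, (now, want) in enumerate(zip(current, desired)) if now != want]


# Same selection as the hard-coded list in the listing: boxes 1, 7, 13, 14, 19, 20
option_list = build_option_list({1, 7, 13, 14, 19, 20})
```

Keeping the comparison in a plain function makes it possible to check the toggle logic without starting a browser.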

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup as BS

def dataframe_from_naver(종류):
    if 종류=='급등':
        # surging stocks
        url = "https://finance.naver.com/sise/sise_low_up.naver?sosok=1"
    elif 종류=='하락':
        # falling stocks
        url = "https://finance.naver.com/sise/sise_fall.naver?sosok=1"
    elif 종류=='거래량':
        # top-volume stocks
        url = "https://finance.naver.com/sise/sise_quant.naver?sosok=1"
    else:
        # without this branch, url would be undefined further down
        raise ValueError("종류 must be '급등', '하락', or '거래량'")


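    # create the Chrome options
    options = webdriver.ChromeOptions()
    # run headless so that no browser window is shown
    options.add_argument("--headless")

    ### option settings
    # recent Selenium 4 releases no longer accept the driver path as a
    # positional argument; chromedriver is located automatically instead
    driver = webdriver.Chrome(options=options)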
    driver.get(url)

    option_list = ['true', None, None, None, None, None,
                   'true', None, None, None, None, None,
                   'true', 'true', None, None, None, None,
                   'true', 'true', None, None, None, None,
                   None, None, None]
    for index, flag in enumerate(option_list):
        check = driver.find_element(By.ID, "option" + str(index + 1))
        if check.get_attribute('checked') != flag:
            check.click()

    button = driver.find_element(By.XPATH,'//a[@href="javascript:fieldSubmit()"]')
    button.click()


    ###

    # agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36"
    # response = requests.get(url=url, headers={"User-Agent": agent})
    # res_text = response.text

    res_text = driver.page_source
    driver.quit()  # release the browser process once the HTML has been captured
    parsed_res_text = BS(res_text, 'html.parser')

    # print(parsed_res_text)

    tables = parsed_res_text.find_all("table")


    # print(tables[1])

    # temp=tables[1].find_all('td', {"class":"no"})
    lines=tables[1].find_all('tr')

    항목=lines[0].find_all('th')

    # column names taken from the header row
    name = [l.text for l in 항목]
    # print(name)

    data = [[] for _ in range(12)]  # one list per table column
    종목코드=[]

    for l in lines:

        temp_data=l.find_all('td')
        for index2, l2 in enumerate(temp_data):
            # print(l2)
            # strip whitespace, signs, and the percent mark from the cell text
            # (note: removing "-" also drops the sign of negative values)
            text_temp = l2.text
            for ch in ("\t", "\n", "+", "-", "%"):
                text_temp = text_temp.replace(ch, "")
            # print(text_temp)
            if text_temp!="":
                data[index2].append(text_temp)

            # a cell that contains a link carries the stock code in its href
            temp_url=l2.find('a')
            if temp_url is not None:
                code_tmp = temp_url['href'].split("?code=")
                종목코드.append(code_tmp[1])



    # print(data)

    df=pd.DataFrame({"종목코드": 종목코드})
    for i,temp_name in enumerate(name):
        if i>0:
            temp_dataframe=pd.DataFrame({temp_name:data[i]})
            # print(temp_dataframe)
            df=pd.concat([df,temp_dataframe],axis=1)

    # drop rows whose 매도총잔량 is '0' so the ratio below never divides by zero
    index=df[df['매도총잔량']=='0'].index
    df.drop(index,inplace=True)
    # print(df)

    매수매도잔량비=df['매수총잔량'].str.replace(',', '').astype(float)/df['매도총잔량'].str.replace(',', '').astype(float)
    temp_dataframe=pd.DataFrame({'매수매도잔량비':매수매도잔량비})
    df = pd.concat([df, temp_dataframe], axis=1)
    # print(df)
    return df

if __name__ == "__main__":
    option = '급등'
    data=dataframe_from_naver(option)
    print(data)
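The listing above pulls the stock code out of each link by splitting the href on "?code=", which assumes that code is always the first and only query parameter. A slightly more defensive sketch, using only the standard library, could read the parameter by name instead. The function name stock_code_from_href is hypothetical, and the sample href is an assumed example of the link format rather than something copied from the page.

```python
from urllib.parse import urlparse, parse_qs


def stock_code_from_href(href):
    """Return the value of the 'code' query parameter, or None if it is absent."""
    values = parse_qs(urlparse(href).query).get("code")
    return values[0] if values else None


# The code stays a string, so leading zeros are preserved
code = stock_code_from_href("/item/main.naver?code=030350")
```

Returning None for links without a code parameter lets the caller skip them instead of failing with an IndexError.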

Result

As shown below, I was able to fetch the information I wanted.

      종목코드    등락률      종목명     현재가  ...      저가    매수총잔량    매도총잔량   매수매도잔량비
1   366030  29.03     공구우먼  14,000  ...  10,850  113,896  136,050  0.837163
2   290720  21.61     푸드나무  23,350  ...  19,200    2,364    1,861  1.270285
3   227610  21.21   아우딘퓨쳐스   2,200  ...   1,815   29,439   25,342  1.161668
4   030350  18.06   드래곤플라이   1,085  ...     919  183,059  219,154  0.835298
5   140070  17.11  서플러스글로벌   3,560  ...   3,040   26,385   24,442  1.079494
..     ...    ...      ...     ...  ...     ...      ...      ...       ...
95  204840   5.77     지엘팜텍     751  ...     710    8,652   16,274  0.531646
96  066910   5.69      손오공   2,230  ...   2,110   83,052  155,756  0.533219
97  032790   5.66     비엔지티   2,800  ...   2,650   17,788   20,464  0.869234
98  039740   5.61   한국정보공학   3,670  ...   3,475   43,952    6,980  6.296848
99  219130   5.60    타이거일렉  21,700  ...  20,550    3,973    6,874  0.577975

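* Bid/ask balance ratio (매수매도잔량비): I added 매수총잔량 / 매도총잔량, that is, total outstanding buy orders divided by total outstanding sell orders. Could it be meaningful?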
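As a small worked example of the ratio, the values scraped from the page are strings with thousands separators, so the commas have to be removed before dividing. The sketch below does this for a single pair of values; the function name bid_ask_ratio is hypothetical, and returning None for a zero denominator is an assumption that stands in for the row drop done in the listing.

```python
def bid_ask_ratio(bid_total, ask_total):
    """Compute 매수총잔량 / 매도총잔량 from comma-formatted strings.

    Returns None when the ask total is zero, since the ratio is undefined there.
    """
    bid = float(bid_total.replace(",", ""))
    ask = float(ask_total.replace(",", ""))
    if ask == 0:
        return None
    return bid / ask


# First row of the output above: 113,896 / 136,050
ratio = bid_ask_ratio("113,896", "136,050")
```

A ratio above 1 means more shares are waiting on the buy side than on the sell side, and a ratio below 1 means the opposite.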
