<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>느리지만 꾸준히</title>
    <link>https://slowin.tistory.com/</link>
    <description>기록</description>
    <language>ko</language>
    <pubDate>Thu, 18 Jun 2026 17:03:53 +0900</pubDate>
    <generator>TISTORY</generator>
    <ttl>100</ttl>
    <managingEditor>slowin</managingEditor>
    <image>
      <title>느리지만 꾸준히</title>
      <url>https://tistory1.daumcdn.net/tistory/7205707/attach/5143121f5e264accae1a06765455cb64</url>
      <link>https://slowin.tistory.com</link>
    </image>
    <item>
      <title>[AI Bootcamp Retrospective] Upstage AI Lab: Seven Months of Challenge and Growth, and Graduation</title>
      <link>https://slowin.tistory.com/69</link>
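      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;From last November until yesterday, for about seven months, I took part in the &lt;b&gt;Upstage AI Lab 6th cohort bootcamp&lt;/b&gt;. On Wednesday, June 18, 2025, I finally finished that journey. This post is a retrospective that records what I worried about, what I experienced, and what I learned along the way.&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;How I First Got Interested in AI&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;What first drew me to AI was the &lt;b&gt;Lee Sedol vs. AlphaGo&lt;/b&gt; match, which anyone in Korea will remember. I was working as a web developer at the time, and with the thought &quot;I should learn AI&quot; I bought a machine learning book on Keras (『케라스』) without any preparation. The moment I opened it I was thrown by content I could not understand, and the book went straight to the back of my desk.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;After that I kept watching &lt;b&gt;data engineering lectures&lt;/b&gt; and &lt;b&gt;AI content on YouTube&lt;/b&gt; from time to time, so the interest never went away, but I judged that &quot;this is still too low-level a technology&quot; and put off a serious attempt. Then &lt;b&gt;ChatGPT arrived&lt;/b&gt;, I felt firsthand how quickly the technology was maturing, and I decided to study AI seriously once more.&lt;/p&gt;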
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Choosing a Bootcamp&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;I did not have much time, so a &lt;b&gt;course I could start quickly&lt;/b&gt; and a &lt;b&gt;trustworthy institution&lt;/b&gt; mattered most. Two programs stood out by those criteria.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Boostcamp (NAVER Connect Foundation)&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Upstage AI Lab&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Boostcamp was my first choice, but the schedule did not work out, so I chose Upstage AI Lab, which started a month later.&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
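&lt;h2 data-ke-size=&quot;size26&quot;&gt;Early Bootcamp: Building the Foundations&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;At first I also considered &lt;b&gt;transferring into Korea National Open University (KNOU)&lt;/b&gt;, but after the first month of bootcamp classes (Python, linear algebra, statistics, and so on) I concluded that doing both at once would be too much and postponed KNOU until after finishing the bootcamp.&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;First Competition: Seoul Apartment Price Prediction (2024.12.23 ~ 2025.01.07)&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The biggest reason I chose the Upstage bootcamp was its &lt;b&gt;curriculum centered on competitions and projects&lt;/b&gt;.&lt;br /&gt;In the first competition, however, I &lt;b&gt;produced no results at all.&lt;/b&gt;&lt;/p&gt;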
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
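&lt;li&gt;A large gap between the local evaluation score (5000) and the submission score (45000)&lt;/li&gt;
&lt;li&gt;No baseline against which to compare experiments&lt;/li&gt;
&lt;li&gt;A validation set whose composition did not match the test set&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The biggest lesson from this trial and error was the &lt;b&gt;importance of a local evaluation environment&lt;/b&gt;. After the competition ended I rebuilt the validation set on my own to review what had gone wrong, and I learned even more in the process.&lt;/p&gt;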
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Second Project: MLOps (2025.01.16 ~ 2025.02.03)&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;I led the team on the MLOps project, where my experience as a web developer helped.&lt;br /&gt;The topic was &lt;b&gt;temperature prediction based on Seoul public weather data&lt;/b&gt;, and working with &lt;b&gt;Airflow&lt;/b&gt; and &lt;b&gt;MLflow&lt;/b&gt; for the first time let me go through the whole flow of &lt;b&gt;&quot;data collection &amp;rarr; training &amp;rarr; version management &amp;rarr; deployment&quot;&lt;/b&gt;.&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Third Competition: Document Type Classification (Computer Vision)&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;This was the most memorable competition, for the following reasons:&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
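&lt;li&gt;Final score: &lt;b&gt;0.9832&lt;/b&gt; (a new record, beating the best score of all previous cohorts)&lt;/li&gt;
&lt;li&gt;Experiments designed around a rigorous &lt;b&gt;hypothesis &amp;rarr; verification process&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;Step-by-step performance gains, with a clear understanding of the reason for and effect of each experiment&lt;/li&gt;
&lt;li&gt;Introduced the &lt;b&gt;YOLOv5&lt;/b&gt; object detection model to recognize document titles &amp;rarr; a sharp jump in performance&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;In the end the instructor even praised the work as &amp;ldquo;moving&amp;rdquo;, and the score from that competition came up from time to time during the rest of the course.&lt;/p&gt;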
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Other Projects&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Dialogue Summarization&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Scientific Knowledge QA&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Chemical Process Anomaly Detection&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Industry-partnered project&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Working through a variety of hands-on projects broadened my skills and gave me a feel for real-world work.&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Wrapping Up&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;To be honest, before the bootcamp started I felt a lot of &lt;b&gt;anxiety and fear&lt;/b&gt;. Could I really do well? That thought weighed on me. But approaching it with the mindset of &lt;b&gt;letting go of ambition and enjoying the moment&lt;/b&gt; helped a great deal.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;As a result, I was selected as an &lt;b&gt;outstanding graduate&lt;/b&gt; and finished the AI bootcamp successfully.&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;In Closing&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;AI is still hard and vast, but it has become a &lt;b&gt;technology that feels much closer than it did at the start&lt;/b&gt;. It all began with 'curiosity' and 'experiments I refused to give up on'. I hope the courage to begin reaches someone reading this.&lt;/p&gt;</description>
      <category>프로그래밍/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/69</guid>
      <comments>https://slowin.tistory.com/69#entry69comment</comments>
      <pubDate>Thu, 19 Jun 2025 14:46:19 +0900</pubDate>
    </item>
    <item>
      <title>Industry Partnership: Customer Service Agent Project</title>
      <link>https://slowin.tistory.com/68</link>
      <description>&lt;h1&gt;Building a Multi-Agent Knowledge Graph System for a Commerce Customer Service Chatbot&lt;/h1&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Overview&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;This was an industry-partnered project carried out in collaboration with &lt;b&gt;Future Work Lab&lt;/b&gt;. The team had four members, and I served as team lead while also handling &lt;b&gt;technical review and multi-agent system development&lt;/b&gt;.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The core goal was to develop a &lt;b&gt;multi-agent chatbot system specialized for commerce customer service&lt;/b&gt;. To that end we made heavy use of tools from the current LLM (large language model) ecosystem, such as &lt;b&gt;LangChain&lt;/b&gt;, &lt;b&gt;LangGraph&lt;/b&gt;, and &lt;b&gt;Neo4j&lt;/b&gt;.&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Project Period&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;2025.05.09 ~ 2025.06.05&lt;/b&gt;&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Roles and Responsibilities&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Team lead&lt;/b&gt;&lt;br /&gt;Set the project direction and coordinated collaboration among team members&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Tech lead&lt;/b&gt;&lt;br /&gt;Designed and reviewed the multi-agent architecture and the knowledge graph&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Multi-agent development&lt;/b&gt;&lt;br /&gt;Implemented the agent system with LangChain &amp;amp; LangGraph&lt;/li&gt;
&lt;/ul&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Tech Stack&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Multi-agent frameworks&lt;/b&gt;: LangChain, LangGraph, AutoGen&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Graph database&lt;/b&gt;: Neo4j&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Embedding Search&lt;/b&gt;: Vector Search (Neo4j)&lt;/li&gt;
&lt;/ul&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Key Work and Results&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;1. Building the Multi-Agent System&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Initial research and framework comparison&lt;/b&gt;&lt;br /&gt;Compared open-source frameworks (AutoGen, CrewAI, and others) and chose the &lt;b&gt;LangChain + LangGraph combination&lt;/b&gt; for its structural flexibility and state management&lt;/li&gt;
&lt;li&gt;&lt;b&gt;First prototype designed with AutoGen&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Implemented a multi-agent scenario based on a &lt;code&gt;hand off&lt;/code&gt; structure&lt;/li&gt;
&lt;li&gt;Used AutoGen to get a clear picture of how agents collaborate with each other&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;577&quot; data-origin-height=&quot;326&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/eRHWtJ/btsOuWU87Rg/PexARsyhjdmz7DdTKZU61K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/eRHWtJ/btsOuWU87Rg/PexARsyhjdmz7DdTKZU61K/img.png&quot; data-alt=&quot;Hand-off structure diagram&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/eRHWtJ/btsOuWU87Rg/PexARsyhjdmz7DdTKZU61K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FeRHWtJ%2FbtsOuWU87Rg%2FPexARsyhjdmz7DdTKZU61K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;547&quot; height=&quot;309&quot; data-origin-width=&quot;577&quot; data-origin-height=&quot;326&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Hand-off structure diagram&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
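&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough illustration of the hand-off idea, the sketch below passes a customer message from a triage agent to a specialist. It is plain Python, not AutoGen or LangGraph code; the agent names (&lt;code&gt;triage&lt;/code&gt;, &lt;code&gt;order_agent&lt;/code&gt;, &lt;code&gt;refund_agent&lt;/code&gt;), the &lt;code&gt;HandoffRunner&lt;/code&gt; class, and the keyword routing are all hypothetical and only stand in for decisions an LLM would make in the real system.&lt;/p&gt;
&lt;pre&gt;
```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Each agent returns (reply, next_agent); next_agent=None means no further hand-off.
AgentFn = Callable[[str], Tuple[str, Optional[str]]]


def triage(message: str) -> Tuple[str, Optional[str]]:
    """Hypothetical triage agent: decides which specialist should take over."""
    text = message.lower()
    if "refund" in text:
        return "Handing off to the refund agent.", "refund_agent"
    if "order" in text:
        return "Handing off to the order agent.", "order_agent"
    return "Could you tell me more about your issue?", None


def order_agent(message: str) -> Tuple[str, Optional[str]]:
    return "Order agent: looking up your order status.", None


def refund_agent(message: str) -> Tuple[str, Optional[str]]:
    return "Refund agent: starting the refund process.", None


@dataclass
class HandoffRunner:
    agents: Dict[str, AgentFn]
    max_hops: int = 5  # guard against agents handing off to each other forever
    trace: List[str] = field(default_factory=list)

    def run(self, message: str, start: str = "triage") -> str:
        current: Optional[str] = start
        reply = ""
        for _ in range(self.max_hops):
            if current is None:
                break
            self.trace.append(current)  # record who handled the message
            reply, current = self.agents[current](message)
        return reply


runner = HandoffRunner(
    {"triage": triage, "order_agent": order_agent, "refund_agent": refund_agent}
)
print(runner.run("I want a refund for my order"))
print(runner.trace)
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The &lt;code&gt;trace&lt;/code&gt; list makes the path of each hand-off visible, which is the kind of information that becomes hard to follow once several agents are involved.&lt;/p&gt;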
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Migration to LangGraph and improvements&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Used LangGraph's &lt;b&gt;explicit State management&lt;/b&gt; to visualize and implement complex workflows between agents&lt;/li&gt;
&lt;li&gt;Resolved the problem of unnecessary extra turns by optimizing the System Message&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1006&quot; data-origin-height=&quot;514&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cskg08/btsOttGKps9/zk9vCkPKiD6OKOnZXSdM80/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cskg08/btsOttGKps9/zk9vCkPKiD6OKOnZXSdM80/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cskg08/btsOttGKps9/zk9vCkPKiD6OKOnZXSdM80/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fcskg08%2FbtsOttGKps9%2Fzk9vCkPKiD6OKOnZXSdM80%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;674&quot; height=&quot;344&quot; data-origin-width=&quot;1006&quot; data-origin-height=&quot;514&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
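&lt;h3 data-ke-size=&quot;size23&quot;&gt;2. Building an Ontology-Based Knowledge Graph&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Learning and applying the GraphRAG approach&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Studied ontology design and construction with reference to a lecture example (&lt;code&gt;Academic QA System&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Experimented with a process for converting customer service work documents into a Neo4j graph structure&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Attempt at automating the Graph DB&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Designed a &lt;b&gt;Knowledge Graph generation pipeline&lt;/b&gt; that takes work instructions written as documents and turns them into a structured graph&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;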
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1645&quot; data-origin-height=&quot;880&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dvTah4/btsOuVvaCHy/DP1iFEEKnlX4et96BVK6b1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dvTah4/btsOuVvaCHy/DP1iFEEKnlX4et96BVK6b1/img.png&quot; data-alt=&quot;Document -&amp;amp;gt; LLM -&amp;amp;gt; Ontology&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dvTah4/btsOuVvaCHy/DP1iFEEKnlX4et96BVK6b1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdvTah4%2FbtsOuVvaCHy%2FDP1iFEEKnlX4et96BVK6b1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1645&quot; height=&quot;880&quot; data-origin-width=&quot;1645&quot; data-origin-height=&quot;880&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Document -&amp;gt; LLM -&amp;gt; Ontology&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;3. Experiments with Information Retrieval and Reasoning&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Embedding Similarity Search&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Embedded graph node information as vectors and implemented similarity-based search&lt;/li&gt;
&lt;li&gt;Single-node retrieval performed well, but we found that &lt;b&gt;multi-hop graph reasoning was difficult&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;674&quot; data-origin-height=&quot;1532&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/rYZH9/btsOtiyGBAB/mHUIGvWHIyAHkNlILGckO1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/rYZH9/btsOtiyGBAB/mHUIGvWHIyAHkNlILGckO1/img.png&quot; data-alt=&quot;Adding an embedding property for search&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/rYZH9/btsOtiyGBAB/mHUIGvWHIyAHkNlILGckO1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FrYZH9%2FbtsOtiyGBAB%2FmHUIGvWHIyAHkNlILGckO1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;286&quot; height=&quot;650&quot; data-origin-width=&quot;674&quot; data-origin-height=&quot;1532&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Adding an embedding property for search&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Limitations of conventional RAG&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Document-level retrieval falls short for &lt;b&gt;KGQA&lt;/b&gt;, which requires graph reasoning of the form A&amp;rarr;B&amp;rarr;C&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Exploring the feasibility of GNN-RAG&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Based on the paper &lt;b&gt;GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning&lt;/b&gt;,&lt;br /&gt;explored a new RAG structure that combines the &lt;b&gt;natural language understanding of LLMs with the structural reasoning of GNNs&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Status&lt;/b&gt;:&lt;br /&gt;We did not reach an actual implementation or a quantitative evaluation. The work ended at the stage of &lt;b&gt;confirming that the approach looks promising&lt;/b&gt; for more flexible structural exploration and for better LLM-based answer accuracy.&lt;/li&gt;
&lt;li&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1439&quot; data-origin-height=&quot;632&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bfiwDR/btsOuWHBVaL/C9dUZwB0xjTknhk5MQHAX1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bfiwDR/btsOuWHBVaL/C9dUZwB0xjTknhk5MQHAX1/img.png&quot; data-alt=&quot;GNN-RAG&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bfiwDR/btsOuWHBVaL/C9dUZwB0xjTknhk5MQHAX1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbfiwDR%2FbtsOuWHBVaL%2FC9dUZwB0xjTknhk5MQHAX1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1439&quot; height=&quot;632&quot; data-origin-width=&quot;1439&quot; data-origin-height=&quot;632&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;GNN-RAG&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
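&lt;p data-ke-size=&quot;size16&quot;&gt;The contrast between single-node similarity search and multi-hop reasoning can be sketched in a few lines. This is a toy in plain Python, not the project's Neo4j vector index; the node names, the three-dimensional embeddings, and the edges are made up for illustration.&lt;/p&gt;
&lt;pre&gt;
```python
import math
from typing import Dict, List, Tuple

# Hypothetical node embeddings (real ones would come from an embedding model).
NODE_VECTORS: Dict[str, List[float]] = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_fee": [0.1, 0.9, 0.0],
    "coupon_rules": [0.0, 0.2, 0.9],
}

# Hypothetical directed edges forming an A -> B -> C chain between nodes.
EDGES: Dict[str, List[str]] = {
    "refund_policy": ["shipping_fee"],
    "shipping_fee": ["coupon_rules"],
    "coupon_rules": [],
}


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], k: int = 1) -> List[Tuple[str, float]]:
    """Similarity search: good at finding a single entry node."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in NODE_VECTORS.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


def expand(start: str, hops: int) -> List[str]:
    """Follow edges outward from the entry node; similarity alone does not do this."""
    path = [start]
    frontier = [start]
    for _ in range(hops):
        frontier = [nxt for node in frontier for nxt in EDGES.get(node, [])]
        path.extend(n for n in frontier if n not in path)
    return path


entry, score = top_k([1.0, 0.0, 0.0])[0]
print(entry, round(score, 3))
print(expand(entry, hops=2))
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Here the similarity search only locates the entry node; reaching the node two hops away requires walking the edges, which is the part that embedding search by itself does not cover.&lt;/p&gt;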
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Key Issues and Insights&lt;/h2&gt;
&lt;table data-ke-align=&quot;alignLeft&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Issue&lt;/th&gt;
&lt;th&gt;Response and Insight&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
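&lt;td&gt;Complexity of sharing state between agents&lt;/td&gt;
&lt;td&gt;Used LangGraph's State Transition feature&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Repeated System Messages&lt;/td&gt;
&lt;td&gt;Resolved by optimizing the message template&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Limited extensibility of the schema structure&lt;/td&gt;
&lt;td&gt;A flexible ontology design is needed rather than a single fixed structure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Limits of graph-based search&lt;/td&gt;
&lt;td&gt;Confirmed the potential of a GNN-based reasoning structure (GNN-RAG)&lt;/td&gt;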
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
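&lt;h2 data-ke-size=&quot;size26&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Through this project I gained deep insight into &lt;b&gt;multi-agent system design&lt;/b&gt;, &lt;b&gt;ontology-based knowledge graph construction&lt;/b&gt;, and the &lt;b&gt;limits and extensibility of LLM-RAG systems&lt;/b&gt;. Above all, it was a valuable opportunity to explore &lt;b&gt;the possibility of a customer support system capable of knowledge reasoning&lt;/b&gt;, going beyond a simple chatbot.&lt;/p&gt;</description>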
      <category>프로그래밍/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/68</guid>
      <comments>https://slowin.tistory.com/68#entry68comment</comments>
      <pubDate>Mon, 9 Jun 2025 20:45:56 +0900</pubDate>
    </item>
    <item>
      <title>Anomaly Detection</title>
      <link>https://slowin.tistory.com/67</link>
      <description>&lt;h1&gt;What Is Anomaly Detection?&lt;/h1&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. Introduction: Why Does Anomaly Detection Matter?&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Anomaly detection&lt;/b&gt; is the process of finding data that deviates from normal patterns.&lt;br /&gt;Anomalous data is often an early sign of a problem, and sometimes it can be an opportunity.&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Everyday Examples&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Credit card fraud&lt;/b&gt;: a large payment made in a region the cardholder does not usually visit&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Server anomaly&lt;/b&gt;: a sudden surge in traffic&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Sensor failure&lt;/b&gt;: abnormal values measured continuously&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Role in Machine Learning&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Learn from normal data to establish the &lt;b&gt;normal range&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;Classify data that falls outside that range as &lt;b&gt;anomalies&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. Types of Anomaly Detection&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Supervised Learning&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Characteristics&lt;/b&gt;: anomalous/normal labels are available&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Techniques&lt;/b&gt;: classification models (logistic regression, decision trees, etc.)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Caveat&lt;/b&gt;: anomaly labels are hard to obtain&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Unsupervised Learning&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Characteristics&lt;/b&gt;: no labels&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Techniques&lt;/b&gt;: clustering, distance-based methods (k-NN, LOF, etc.)&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Use&lt;/b&gt;: the most widely used approach when anomaly labels are unavailable&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Semi-supervised Learning&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Characteristics&lt;/b&gt;: trained mostly on normal data only&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Techniques&lt;/b&gt;: Autoencoder, One-Class SVM&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Advantage&lt;/b&gt;: effective when anomalous data is scarce&lt;/li&gt;
&lt;/ul&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. Main Anomaly Detection Techniques&lt;/h2&gt;
&lt;table data-ke-align=&quot;alignLeft&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Example techniques&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Statistical&lt;/td&gt;
&lt;td&gt;Mean, standard deviation, Z-Score&lt;/td&gt;
&lt;td&gt;Flags values that exceed a threshold defined by statistical measures&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distance-based&lt;/td&gt;
&lt;td&gt;k-NN, LOF&lt;/td&gt;
&lt;td&gt;Detects anomalies from the distance to neighboring points&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Density-based&lt;/td&gt;
&lt;td&gt;DBSCAN&lt;/td&gt;
&lt;td&gt;Treats points in low-density regions as anomalies&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Machine learning&lt;/td&gt;
&lt;td&gt;Isolation Forest, One-Class SVM&lt;/td&gt;
&lt;td&gt;Learning-based anomaly detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deep learning&lt;/td&gt;
&lt;td&gt;Autoencoder, LSTM&lt;/td&gt;
&lt;td&gt;Learns complex patterns and detects deviations from them&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
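&lt;p data-ke-size=&quot;size16&quot;&gt;As a small example of the statistical row in the table, the sketch below estimates a mean and standard deviation from readings assumed to be normal and then flags new values whose Z-score exceeds a threshold. It uses only the Python standard library; the sensor readings and the threshold of 3 are made-up values, and in practice the threshold is a tuning choice rather than a fixed rule.&lt;/p&gt;
&lt;pre&gt;
```python
from statistics import mean, stdev
from typing import List, Tuple


def fit_normal_range(normal_values: List[float]) -> Tuple[float, float]:
    """Estimate mean and sample standard deviation from data assumed to be normal."""
    return mean(normal_values), stdev(normal_values)


def z_scores(values: List[float], mu: float, sigma: float) -> List[float]:
    """Distance of each value from the mean, measured in standard deviations."""
    return [(v - mu) / sigma for v in values]


def flag_anomalies(
    values: List[float], mu: float, sigma: float, threshold: float = 3.0
) -> List[bool]:
    """Mark a value as anomalous when it lies more than `threshold` sigmas away."""
    return [abs(z) > threshold for z in z_scores(values, mu, sigma)]


# Hypothetical temperature readings from a healthy sensor.
normal = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7]
mu, sigma = fit_normal_range(normal)

# New readings to check against the learned normal range.
new_readings = [20.0, 20.4, 25.0]
print(flag_anomalies(new_readings, mu, sigma))
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Fitting the statistics on normal data only, and scoring new values separately, keeps a single extreme value from inflating the standard deviation it is judged against.&lt;/p&gt;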
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. Anomaly Detection Model Development Flow&lt;/h2&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;&lt;b&gt;Data collection and preprocessing&lt;/b&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Handling missing values, normalization, feature selection, and so on&lt;/li&gt;
&lt;/ul&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; start=&quot;2&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;&lt;b&gt;Defining the anomaly criterion&lt;/b&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;For example: mean &amp;plusmn; 3 standard deviations, or a reconstruction error threshold&lt;/li&gt;
&lt;/ul&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; start=&quot;3&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;&lt;b&gt;Model selection and training&lt;/b&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Choose among unsupervised, semi-supervised, and supervised models&lt;/li&gt;
&lt;/ul&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; start=&quot;4&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;&lt;b&gt;Performance evaluation&lt;/b&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Precision, Recall, F1-score, AUC&lt;/li&gt;
&lt;/ul&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; start=&quot;5&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;&lt;b&gt;Practical considerations&lt;/b&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Imbalanced data, concept drift, and whether real-time processing is required&lt;/li&gt;
&lt;/ul&gt;
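&lt;p data-ke-size=&quot;size16&quot;&gt;The performance evaluation step can be made concrete with a short calculation. The sketch below computes precision, recall, and F1 from binary labels in plain Python (1 = anomaly); the label lists are invented for illustration, and a library such as scikit-learn would normally be used instead.&lt;/p&gt;
&lt;pre&gt;
```python
from typing import List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for label 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, fn


def precision_recall_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    tp, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0  # how many alerts were real
    recall = tp / (tp + fn) if tp + fn else 0.0  # how many anomalies were caught
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 2 real anomalies among 10 samples (imbalanced, as is typical).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
print(precision_recall_f1(y_true, y_pred))
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;In this toy case 8 of 10 predictions are correct, yet one of the two anomalies is missed and one of the two alerts is false, which is why accuracy alone says little on imbalanced data.&lt;/p&gt;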
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. Real-World Applications&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Finance: abnormal patterns in credit card transactions and remittance data are detected in real time to block fraud.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A bank raises an alert immediately when it finds abnormal transaction amounts or frequencies.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Manufacturing: sensor data from factory equipment and camera footage are analyzed to identify equipment failures and product defects early.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;When a sensor picks up an abnormal signal, predictive maintenance can be carried out to reduce production disruption.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Healthcare: abnormal signals in patients' vital signs and medical records are detected to find early signs of disease and to support patient monitoring.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Detecting abnormal patterns in electrocardiogram (ECG) data in real time can help prevent emergencies.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;IT/infrastructure: server logs and network traffic are analyzed to monitor signs of intrusion or errors in real time.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;An intrusion detection system (IDS) identifies abnormal patterns to warn of DDoS attacks and unauthorized access attempts, and is also used for failure prediction and response.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;6. Wrap-Up&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Open Datasets&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;a href=&quot;https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud&quot;&gt;Kaggle - Credit Card Fraud Detection&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Practical Cautions&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Weigh the cost of false positives and false negatives (FP/FN)&lt;/li&gt;
&lt;li&gt;Revisit the anomaly criterion on a regular cycle&lt;/li&gt;
&lt;li&gt;Make active use of domain knowledge&lt;/li&gt;
&lt;/ul&gt;</description>
      <category>프로그래밍/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/67</guid>
      <comments>https://slowin.tistory.com/67#entry67comment</comments>
      <pubDate>Fri, 2 May 2025 18:22:25 +0900</pubDate>
    </item>
    <item>
      <title>Scientific Knowledge Query IR Competition</title>
      <link>https://slowin.tistory.com/66</link>
      <description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. Introduction&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;This &lt;b&gt;Information Retrieval (IR) competition&lt;/b&gt; was about solving the task of &amp;ldquo;finding the best documents for a question and its preceding dialogue history, then generating an answer&amp;rdquo;. In just one week, from 2025-04-16 to 2025-04-22, I quickly formed hypotheses, iterated on experiments, and gradually pushed performance up. I ultimately reached a satisfying score of &lt;b&gt;0.8833&lt;/b&gt; and built up know-how across retrieval, classification, and re-ranking along the way.&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. Competition Goals &amp;amp; Scope&lt;/h2&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;Ultimate goal&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Build a pipeline for &lt;b&gt;accurate and fast document retrieval&lt;/b&gt; across different query types (scientific knowledge questions vs. general questions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;Constraints&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;1. Apply document retrieval only to science and knowledge questions&lt;br /&gt;2. Do not retrieve documents for any other question&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;Evaluation metric&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;A dedicated score provided by the competition (a mix of accuracy and qualitative evaluation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;Period&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;2025-04-16 ~ 2025-04-22 (7 days)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. Experiment Timeline&lt;/h2&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Experiment&lt;/th&gt;
&lt;th&gt;Core strategy&lt;/th&gt;
&lt;th&gt;Key change&lt;/th&gt;
&lt;th&gt;Score&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;1&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Function Call-based retrieval&lt;/td&gt;
&lt;td&gt;Tried to have the model pull slide information directly through function calls&lt;/td&gt;
&lt;td&gt;&lt;b&gt;0.3197&lt;/b&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;2&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Korean queries + simple classifier&lt;/td&gt;
&lt;td&gt;Switched from English to &lt;b&gt;Korean queries&lt;/b&gt; and classified questions as science/non-science&lt;/td&gt;
&lt;td&gt;&lt;b&gt;0.7424&lt;/b&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;3&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Fine-tuning a document-based classifier&lt;/td&gt;
&lt;td&gt;Fine-tuned Gemma-3-4B to classify question&amp;ndash;document relevance&lt;/td&gt;
&lt;td&gt;&lt;b&gt;0.7848&lt;/b&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;4&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Document chunking&lt;/td&gt;
&lt;td&gt;Split long documents into paragraph-level chunks; Recall &amp;uarr;&lt;/td&gt;
&lt;td&gt;&amp;mdash;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;5&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Introduced a &lt;b&gt;Re-Ranker&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;BM25 top-k &amp;rarr; Cross-Encoder re-ranking&lt;/td&gt;
&lt;td&gt;&amp;mdash;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;6&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Embedding model swap&lt;/td&gt;
&lt;td&gt;Applied jinaai/jina-embeddings-v3 with hybrid search&lt;/td&gt;
&lt;td&gt;&lt;b&gt;0.8833&lt;/b&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;7&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Question augmentation&lt;/td&gt;
&lt;td&gt;ReAct-style keyword enrichment&lt;/td&gt;
&lt;td&gt;&amp;mdash;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
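&lt;p data-ke-size=&quot;size16&quot;&gt;The post does not say how the keyword and embedding results were combined in the hybrid search. One common option is reciprocal rank fusion (RRF), sketched below in plain Python as an assumption rather than the method actually used; the document IDs, the two ranked lists, and the constant &lt;code&gt;k=60&lt;/code&gt; are illustrative.&lt;/p&gt;
&lt;pre&gt;
```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists: each document scores sum(1 / (k + rank))."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical top-3 results from a sparse (BM25) and a dense (embedding) retriever.
bm25_top = ["doc_a", "doc_b", "doc_c"]
dense_top = ["doc_b", "doc_d", "doc_a"]
print(reciprocal_rank_fusion([bm25_top, dense_top]))
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Because RRF works on ranks rather than raw scores, it does not need BM25 scores and cosine similarities to be on the same scale, which is one reason it is a frequent first choice for hybrid retrieval.&lt;/p&gt;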
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. Stage-by-Stage Retrospective&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;4.1 Experiment 1: Function Call&lt;/h3&gt;
&lt;blockquote data-ke-style=&quot;style1&quot;&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;ldquo;The answer is in the documents, so why is the function call not happening?&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Problem&lt;/b&gt;: the LLM had not learned the trigger token for function calls &amp;rarr; no call was made even when the documents contained the answer.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Lesson&lt;/b&gt;: without structured output control (forced tokens, tool descriptions), an LLM does not behave consistently.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1122&quot; data-origin-height=&quot;417&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/JA7q1/btsNATyGOtJ/hKhkL4XB1HnKTJz3JKmf60/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/JA7q1/btsNATyGOtJ/hKhkL4XB1HnKTJz3JKmf60/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/JA7q1/btsNATyGOtJ/hKhkL4XB1HnKTJz3JKmf60/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FJA7q1%2FbtsNATyGOtJ%2FhKhkL4XB1HnKTJz3JKmf60%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1122&quot; height=&quot;417&quot; data-origin-width=&quot;1122&quot; data-origin-height=&quot;417&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;4.2 Experiment 2: Korean Queries &amp;amp; Lightweight Classifier&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Switched query search to &lt;b&gt;Korean BM25&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;Applied retrieval only to &amp;ldquo;science questions&amp;rdquo;, selected with simple keyword rules.&lt;br /&gt;↳ &lt;b&gt;Error case&lt;/b&gt;: &amp;ldquo;How tall is the pyramid?&amp;rdquo; &amp;rarr; misclassified as non-science.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;440&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/beL34W/btsNAIdokmm/zzIeuP5gCKXHgcaq0scKk0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/beL34W/btsNAIdokmm/zzIeuP5gCKXHgcaq0scKk0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/beL34W/btsNAIdokmm/zzIeuP5gCKXHgcaq0scKk0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbeL34W%2FbtsNAIdokmm%2FzzIeuP5gCKXHgcaq0scKk0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1280&quot; height=&quot;440&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;440&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
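&lt;p data-ke-size=&quot;size16&quot;&gt;The failure mode in experiment 2 is easy to reproduce with a toy rule-based gate. The keyword list below is hypothetical (the post does not list the actual rules) and the questions are English stand-ins; the point is only that a knowledge question whose wording matches no keyword skips retrieval.&lt;/p&gt;
&lt;pre&gt;
```python
from typing import Iterable

# Hypothetical keyword list; the real rules are not given in the post.
SCIENCE_KEYWORDS = ("atom", "cell", "energy", "gravity", "planet")


def needs_retrieval(question: str, keywords: Iterable[str] = SCIENCE_KEYWORDS) -> bool:
    """Return True when the question looks like a science/knowledge query."""
    text = question.lower()
    return any(word in text for word in keywords)


def route(question: str) -> str:
    """Decide whether to run document search or answer without it."""
    return "search documents" if needs_retrieval(question) else "answer directly"


print(route("How does gravity affect a planet's orbit?"))
print(route("How tall is the pyramid?"))  # knowledge question, but no keyword matches
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A gate like this is cheap, but its recall depends entirely on the keyword list, which is one motivation for moving to a learned classifier.&lt;/p&gt;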
&lt;h3 data-ke-size=&quot;size23&quot;&gt;4.3 Experiment 3: Fine-tuning a Document-Based Classifier&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Fine-tuned &lt;b&gt;Gemma-3-4B&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;Prompt design: &amp;ldquo;Given question Q and document D, is the answer in D?&amp;rdquo; &amp;rarr; classification&lt;/li&gt;
&lt;li&gt;Verified the effect of fine-tuning.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;825&quot; data-origin-height=&quot;316&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bSgCAl/btsNBIi7yMz/BgKO6uZAFDKXYmEAm1XbA0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bSgCAl/btsNBIi7yMz/BgKO6uZAFDKXYmEAm1XbA0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bSgCAl/btsNBIi7yMz/BgKO6uZAFDKXYmEAm1XbA0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbSgCAl%2FbtsNBIi7yMz%2FBgKO6uZAFDKXYmEAm1XbA0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;825&quot; height=&quot;316&quot; data-origin-width=&quot;825&quot; data-origin-height=&quot;316&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1148&quot; data-origin-height=&quot;331&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bBCKkL/btsNARgO4fm/EFyUnTkEiKlwo5Z3YJffTK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bBCKkL/btsNARgO4fm/EFyUnTkEiKlwo5Z3YJffTK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bBCKkL/btsNARgO4fm/EFyUnTkEiKlwo5Z3YJffTK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbBCKkL%2FbtsNARgO4fm%2FEFyUnTkEiKlwo5Z3YJffTK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1148&quot; height=&quot;331&quot; data-origin-width=&quot;1148&quot; data-origin-height=&quot;331&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;4.4 Experiments 4 ~ 5: Chunking &amp;amp; Re-Ranking&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Long PDFs &amp;rarr; &lt;b&gt;chunking&lt;/b&gt; by Korean sentence splitting improved performance&lt;/li&gt;
&lt;li&gt;Re-ranked the top-10 results with a &lt;b&gt;Cross-Encoder&lt;/b&gt; &amp;rarr; precision improved noticeably.&lt;/li&gt;
&lt;li&gt;However, latency &amp;uarr;&lt;/li&gt;
&lt;/ul&gt;
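&lt;p data-ke-size=&quot;size16&quot;&gt;The re-ranking step can be sketched roughly as follows: take the first-stage top-k candidates, score each (query, document) pair, and sort again. The &lt;code&gt;overlap_score&lt;/code&gt; function is a stand-in assumption so the sketch runs without a model; in the experiment this role was played by a Cross-Encoder. The function names and sample documents are hypothetical.&lt;/p&gt;

```python
def overlap_score(query, document):
    """Stand-in scorer (assumption): fraction of query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words.intersection(doc_words)) / len(query_words)


def rerank(query, candidates, score_fn=overlap_score, top_k=10):
    """Re-score the first top_k candidates and return them best-first."""
    head = candidates[:top_k]
    # One scorer call per candidate: cost grows linearly with top_k.
    scored = [(score_fn(query, doc), doc) for doc in head]
    # list.sort is stable, so ties keep their first-stage order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


candidates = [
    "the moon orbits the earth",
    "plants make energy from sunlight",
    "how plants use sunlight for photosynthesis",
]
print(rerank("how do plants use sunlight", candidates))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Because every candidate needs its own scorer call, a deeper re-rank list trades latency for precision, which matches the latency increase noted above.&lt;/p&gt;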
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;255&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cajDSN/btsNARVpp3T/vcphbMkd3qx8K6BNdl8Jbk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cajDSN/btsNARVpp3T/vcphbMkd3qx8K6BNdl8Jbk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cajDSN/btsNARVpp3T/vcphbMkd3qx8K6BNdl8Jbk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcajDSN%2FbtsNARVpp3T%2FvcphbMkd3qx8K6BNdl8Jbk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1280&quot; height=&quot;255&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;255&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;238&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/brJTfs/btsNAWvsfP2/dRulZX2ubmzEeCTiMCwOC1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/brJTfs/btsNAWvsfP2/dRulZX2ubmzEeCTiMCwOC1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/brJTfs/btsNAWvsfP2/dRulZX2ubmzEeCTiMCwOC1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbrJTfs%2FbtsNAWvsfP2%2FdRulZX2ubmzEeCTiMCwOC1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1280&quot; height=&quot;238&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;238&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;328&quot; data-origin-height=&quot;343&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/mu5Ss/btsNAMNtaEl/dUZAMKBGBmHZ3pbyPKnYj0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/mu5Ss/btsNAMNtaEl/dUZAMKBGBmHZ3pbyPKnYj0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/mu5Ss/btsNAMNtaEl/dUZAMKBGBmHZ3pbyPKnYj0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fmu5Ss%2FbtsNAMNtaEl%2FdUZAMKBGBmHZ3pbyPKnYj0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;328&quot; height=&quot;343&quot; data-origin-width=&quot;328&quot; data-origin-height=&quot;343&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;4.6 Experiment 6: Swapping the Embedding Model&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Recognized the limits of the existing model's Korean representation &amp;rarr; switched to &lt;b&gt;jina-embeddings-v3&lt;/b&gt; (multilingual, 1024-dim)&lt;/li&gt;
&lt;li&gt;Reached a final score of &lt;b&gt;0.8833&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;4.7 Experiment 7: Question Augmentation&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Used a ReAct prompt to generate &amp;ldquo;question decomposition + related keywords&amp;rdquo; and diversify retrieval&lt;/li&gt;
&lt;li&gt;The score improvement was marginal&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;619&quot; data-origin-height=&quot;674&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/Iev99/btsNAK28A8b/8ugWldTPaeIv0zhx5svwVk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/Iev99/btsNAK28A8b/8ugWldTPaeIv0zhx5svwVk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/Iev99/btsNAK28A8b/8ugWldTPaeIv0zhx5svwVk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FIev99%2FbtsNAK28A8b%2F8ugWldTPaeIv0zhx5svwVk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;619&quot; height=&quot;674&quot; data-origin-width=&quot;619&quot; data-origin-height=&quot;674&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;4.8 Final Architecture&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;369&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b3p2sT/btsNAguuwkd/pOiUcW6Szwlso2ppITqUMk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b3p2sT/btsNAguuwkd/pOiUcW6Szwlso2ppITqUMk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b3p2sT/btsNAguuwkd/pOiUcW6Szwlso2ppITqUMk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb3p2sT%2FbtsNAguuwkd%2FpOiUcW6Szwlso2ppITqUMk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1280&quot; height=&quot;369&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;369&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. Summary of Techniques Used&lt;/h2&gt;
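&lt;table style=&quot;border-collapse: collapse; width: 99.7674%; height: 140px;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;Category&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;&lt;b&gt;Choice / Implementation&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;&lt;b&gt;Reason&lt;/b&gt;&lt;/td&gt;
&lt;/tr&gt;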
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;Retrieval&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;BM25 (&lt;b&gt;Elasticsearch&lt;/b&gt;) + Chroma&lt;/td&gt;
&lt;td&gt;Benefit of combining traditional (lexical) and embedding-based retrieval&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;Embedding&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;jina-embeddings-v3&amp;nbsp;&lt;/td&gt;
&lt;td&gt;Strong multilingual performance, including Korean&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;Re-Rank&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;cross-encoder/ms-marco-MiniLM-L-6-v2&lt;/td&gt;
&lt;td&gt;Lightweight + fast inference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;b&gt;Prompt Eng.&lt;/b&gt;&lt;/td&gt;
&lt;td&gt;Tool Calling&amp;nbsp;&lt;/td&gt;
&lt;td&gt;Consistent function calls&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
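&lt;p data-ke-size=&quot;size16&quot;&gt;One common way to combine a BM25 ranking with a vector-store ranking is reciprocal rank fusion (RRF). This post does not state which fusion rule was used, so RRF here is an assumption, as are the document IDs and the constant k = 60. A minimal sketch:&lt;/p&gt;

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_ranking = ["d1", "d2", "d3"]    # hypothetical lexical results
dense_ranking = ["d3", "d1", "d4"]   # hypothetical embedding results
print(reciprocal_rank_fusion([bm25_ranking, dense_ranking]))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Rank-based fusion avoids having to normalize BM25 scores against embedding similarities, which live on different scales.&lt;/p&gt;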
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;6. Results &amp;amp; Achievements&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Public Score: 0.8833&lt;/b&gt;&amp;nbsp;&lt;/li&gt;
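&lt;li&gt;Hands-on experience building a &lt;b&gt;RAG pipeline&lt;/b&gt;: retrieval, classification, re-ranking&lt;/li&gt;
&lt;li&gt;Confirmed the practical value of combining &lt;b&gt;chunking + multilingual embeddings&lt;/b&gt; for Korean IR.&lt;/li&gt;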
&lt;/ul&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;7. Regrets &amp;amp; Lessons Learned&lt;/h2&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;&lt;b&gt;Data scaling&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Fine-tuning data was limited, so there was a risk of overfitting ⟶ knowledge distillation could not be attempted.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Latency&amp;ndash;Quality Trade-off&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Tuning the re-ranker depth was difficult. The Elasticsearch score cutoff needs to be automated.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Stabilizing Tool Calling&lt;/b&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;System prompt design matters as much as raw performance. Function calls still fail with some probability.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;8. Next Steps&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Train a &lt;b&gt;Dense Retriever&lt;/b&gt;: replace BM25 with Korean ColBERT or Contriever-KR.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Adaptive Chunking&lt;/b&gt;: variable-size chunks based on document structure analysis (headings, lists).&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Re-Ranker Distillation&lt;/b&gt;: distill the knowledge of a larger cross-encoder into Tiny-BERT.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Strengthen RAG&lt;/b&gt;: improve answer faithfulness with Retrieval-Aware Generation (RAG-Fusion).&lt;/li&gt;
&lt;/ul&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
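&lt;h2 data-ke-size=&quot;size26&quot;&gt;9. Conclusion&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Although the period was short, rapidly repeating a &lt;b&gt;daily cycle of experimental exploration &amp;rarr; root-cause analysis &amp;rarr; strategy revision&lt;/b&gt; produced meaningful results. In particular, the project showed first-hand how much &lt;b&gt;hybrid search + swapping the embedding model&lt;/b&gt; can improve IR performance, and taught the lesson that &amp;ldquo;search quality depends on preprocessing (queries &amp;middot; documents) and post-processing (re-ranking).&amp;rdquo; Building on this experience, the next goal is to design a more in-depth &lt;b&gt;RAG system&lt;/b&gt;.&lt;/p&gt;</description>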
      <category>프로그래밍/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/66</guid>
      <comments>https://slowin.tistory.com/66#entry66comment</comments>
      <pubDate>Fri, 25 Apr 2025 20:02:59 +0900</pubDate>
    </item>
    <item>
      <title>Retrospective on a LangChain-based Bootcamp RAG Bot Project</title>
      <link>https://slowin.tistory.com/65</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;This post documents a project carried out during the bootcamp: building a LangChain-based document search and Q&amp;amp;A automation system (RAG, Retrieval-Augmented Generation).&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-pm-slice=&quot;1 1 []&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span&gt;Project Background and Purpose&lt;/span&gt;&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;Information provided during the bootcamp was scattered across multiple channels, which made it hard for users to find what they needed quickly. To help users understand complex administrative procedures and course information, we built a RAG-based chatbot that answers user questions immediately and accurately.&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc; background-color: #ffffff; color: #1f2328; text-align: start;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Project period:&lt;/b&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;2025.04.02 ~ 2025.04.08&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-pm-slice=&quot;1 3 []&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span&gt;Business Problem Definition and KPIs&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-spread=&quot;false&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span&gt;&lt;b&gt;Problem&lt;/b&gt;&lt;/span&gt;&lt;span&gt;: information is scattered across many places, which makes searching inconvenient&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;b&gt;Goal&lt;/b&gt;&lt;/span&gt;&lt;span&gt;: develop a chatbot that delivers internal document information quickly and accurately&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;b&gt;KPIs&lt;/b&gt;&lt;/span&gt;&lt;span&gt;: accuracy and usefulness of the responses to user questions&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-pm-slice=&quot;1 1 []&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span&gt;Main Work&lt;/span&gt;&lt;/h3&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;span&gt;1. Data Collection and Preprocessing&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-spread=&quot;false&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span&gt;Collected a variety of internal academy materials (timetables, lecture lists, statutes, Notion pages, etc.) through web crawling with Selenium.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;To keep document formats consistent, each team member developed a domain-specific DocsLoader and handled parsing and chunking.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;898&quot; data-origin-height=&quot;535&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cmmfps/btsNhjjUw32/oempj6uOH5bG47mOGUs2N1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cmmfps/btsNhjjUw32/oempj6uOH5bG47mOGUs2N1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cmmfps/btsNhjjUw32/oempj6uOH5bG47mOGUs2N1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fcmmfps%2FbtsNhjjUw32%2Foempj6uOH5bG47mOGUs2N1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;380&quot; height=&quot;226&quot; data-origin-width=&quot;898&quot; data-origin-height=&quot;535&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;span&gt;2. Embedding and Vectorization&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-spread=&quot;false&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span&gt;Established an embedding strategy tuned to each domain's characteristics and applied consistent embedding logic.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;Designed the Vector Store structure with future extensibility in mind, and stored the embedding vectors so they can be used as context for the LLM.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;span&gt;3. Router Design and Implementation&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-spread=&quot;false&quot; data-ke-list-type=&quot;disc&quot;&gt;
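&lt;li&gt;&lt;span&gt;Built routing logic that analyzes the user's question and automatically classifies it into the appropriate domain, such as lectures, documents, or schedules.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;To resolve the date ambiguity in questions such as &quot;What is tomorrow's timetable?&quot;, introduced additional prompting that makes the target date explicit.&lt;/span&gt;&lt;/li&gt;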
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1822&quot; data-origin-height=&quot;814&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/byRPbS/btsNhdEelgh/7xKnHfWivFUcdE6CEFgq00/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/byRPbS/btsNhdEelgh/7xKnHfWivFUcdE6CEFgq00/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/byRPbS/btsNhdEelgh/7xKnHfWivFUcdE6CEFgq00/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbyRPbS%2FbtsNhdEelgh%2F7xKnHfWivFUcdE6CEFgq00%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;508&quot; height=&quot;227&quot; data-origin-width=&quot;1822&quot; data-origin-height=&quot;814&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-spread=&quot;false&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span&gt;To prevent hallucination, implemented logic that always includes the current time in the context.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
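&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;A minimal sketch of the idea: put the current date into the prompt so that a relative expression such as &quot;tomorrow&quot; can be resolved against a concrete date. The function name &lt;code&gt;build_prompt&lt;/code&gt; and the prompt wording are hypothetical; the project's actual prompt is not reproduced in the text.&lt;/span&gt;&lt;/p&gt;

```python
from datetime import datetime, timedelta


def build_prompt(question, now):
    """Prefix the question with the current time and tomorrow's date."""
    tomorrow = now + timedelta(days=1)
    return (
        f"Current time: {now:%Y-%m-%d %H:%M}\n"
        f"Tomorrow's date: {tomorrow:%Y-%m-%d}\n"
        f"Question: {question}"
    )


# A fixed timestamp keeps the example reproducible; a live app would pass datetime.now().
print(build_prompt("What is tomorrow's timetable?", datetime(2025, 4, 7, 9, 30)))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;Passing the timestamp in as an argument, rather than reading the clock inside the function, also makes the date handling easy to test.&lt;/span&gt;&lt;/p&gt;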
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1160&quot; data-origin-height=&quot;1054&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bOZk3O/btsNhRmSHUy/2H7X4SML9rvaaF2Af6jbB1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bOZk3O/btsNhRmSHUy/2H7X4SML9rvaaF2Af6jbB1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bOZk3O/btsNhRmSHUy/2H7X4SML9rvaaF2Af6jbB1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbOZk3O%2FbtsNhRmSHUy%2F2H7X4SML9rvaaF2Af6jbB1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;409&quot; height=&quot;372&quot; data-origin-width=&quot;1160&quot; data-origin-height=&quot;1054&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1451&quot; data-origin-height=&quot;432&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bo3RNt/btsNfDYu4zA/i5OBW13FXOD8BoQMVkqSy0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bo3RNt/btsNfDYu4zA/i5OBW13FXOD8BoQMVkqSy0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bo3RNt/btsNfDYu4zA/i5OBW13FXOD8BoQMVkqSy0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbo3RNt%2FbtsNfDYu4zA%2Fi5OBW13FXOD8BoQMVkqSy0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1451&quot; height=&quot;432&quot; data-origin-width=&quot;1451&quot; data-origin-height=&quot;432&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;span&gt;4. User Interface&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-spread=&quot;false&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span&gt;Built a user-friendly interface with Streamlit so that users can easily enter questions and receive answers.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1522&quot; data-origin-height=&quot;1248&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/eHwb8t/btsNfhH5pWt/w0CvaaPs6VZQey5GZ0V7fK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/eHwb8t/btsNfhH5pWt/w0CvaaPs6VZQey5GZ0V7fK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/eHwb8t/btsNfhH5pWt/w0CvaaPs6VZQey5GZ0V7fK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FeHwb8t%2FbtsNfhH5pWt%2Fw0CvaaPs6VZQey5GZ0V7fK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;319&quot; height=&quot;262&quot; data-origin-width=&quot;1522&quot; data-origin-height=&quot;1248&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1508&quot; data-origin-height=&quot;1476&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/qKZN7/btsNhK88BfF/K3EcbomVzlytAK4Ep6WUc0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/qKZN7/btsNhK88BfF/K3EcbomVzlytAK4Ep6WUc0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/qKZN7/btsNhK88BfF/K3EcbomVzlytAK4Ep6WUc0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FqKZN7%2FbtsNhK88BfF%2FK3EcbomVzlytAK4Ep6WUc0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;321&quot; height=&quot;314&quot; data-origin-width=&quot;1508&quot; data-origin-height=&quot;1476&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-pm-slice=&quot;1 1 []&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span&gt;Limitations and Improvements&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-spread=&quot;false&quot; data-ke-list-type=&quot;disc&quot;&gt;
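&lt;li&gt;&lt;span&gt;The multi-turn conversation flow was not sufficiently implemented, and LangChain's chaining features were underused.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;Testing and evaluation were somewhat lacking, leaving room to improve response quality.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;The data must be expanded in both quality and quantity going forward; in particular, we came to appreciate the importance of securing accurate data.&lt;/span&gt;&lt;/li&gt;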
&lt;/ul&gt;
&lt;h3 data-pm-slice=&quot;1 1 []&quot; data-ke-size=&quot;size23&quot;&gt;&lt;span&gt;Conclusion and Future Plans&lt;/span&gt;&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;Through this project we saw the potential of LangChain and RAG first-hand and gained experience building a user-centered information system. In future projects we plan to adopt more refined multi-turn conversation and a systematic evaluation process to keep improving the user experience.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>프로그래밍/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/65</guid>
      <comments>https://slowin.tistory.com/65#entry65comment</comments>
      <pubDate>Thu, 10 Apr 2025 19:00:39 +0900</pubDate>
    </item>
    <item>
      <title>Upstage NLP Competition</title>
      <link>https://slowin.tistory.com/64</link>
      <description>&lt;h3 data-ke-size=&quot;size23&quot;&gt;Competition Overview&lt;/h3&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Competition name&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Dialogue&amp;nbsp;Summarization&amp;nbsp;|&amp;nbsp;Summarizing&amp;nbsp;everyday&amp;nbsp;conversations&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Schedule&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;2025.03.12 ~ 2025.03.24&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Team&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;One Team 6th&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;About the Competition&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: start;&quot;&gt;Develop a model that effectively generates summaries of everyday conversations using the provided data&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: start;&quot;&gt;The provided dataset consists solely of &quot;dialogues and summaries&quot;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: start;&quot;&gt;&lt;a href=&quot;https://github.com/cylnlp/dialogsum&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;https://github.com/cylnlp/dialogsum&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Evaluation metrics&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;ROUGE-1
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: left;&quot;&gt;Counts the overlapping&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;b&gt;unigrams&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: left;&quot;&gt;&amp;nbsp;between the model summary and the reference summary&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;ROUGE-2
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: left;&quot;&gt;Counts the overlapping&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;b&gt;bigrams&lt;/b&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: left;&quot;&gt;&amp;nbsp;between the model summary and the reference summary&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;ROUGE-L
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: left;&quot;&gt;Uses the longest common subsequence (LCS) to measure the longest in-order match. Because it does not fix n as n-gram metrics do and counts every word that appears in the same order, it allows a more flexible comparison.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: left;&quot;&gt;ROUGE-F1&lt;/span&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: left;&quot;&gt;The harmonic mean of ROUGE-Recall and ROUGE-Precision&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: left;&quot;&gt;Sentence&amp;nbsp;tokenization&lt;/span&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: start;&quot;&gt;Because of the characteristics of Korean data, sentences are tokenized before evaluation so that the ROUGE score is computed accurately&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;background-color: #ffffff; color: #374151; text-align: start;&quot;&gt;A Korean morphological analyzer splits each sentence into morphemes, the smallest meaningful units, and the ROUGE score is computed by comparing the model-generated sentence with the reference sentence&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
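&lt;p data-ke-size=&quot;size16&quot;&gt;The metrics above can be sketched in a few lines. This is a simplified illustration, not the competition's scorer: it splits on whitespace instead of using a Korean morphological analyzer, and the function names are hypothetical.&lt;/p&gt;

```python
from collections import Counter


def f1(overlap, pred_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def ngrams(tokens, n):
    """All contiguous n-token windows, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n(prediction, reference, n):
    """ROUGE-N F1 over whitespace tokens, using clipped n-gram overlap."""
    pred = Counter(ngrams(prediction.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    overlap = sum(min(count, ref[gram]) for gram, count in pred.items())
    return f1(overlap, sum(pred.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence via dynamic programming."""
    previous = [0] * (len(b) + 1)
    for token_a in a:
        current = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l(prediction, reference):
    """ROUGE-L F1: LCS length against each side's token count."""
    pred, ref = prediction.split(), reference.split()
    return f1(lcs_length(pred, ref), len(pred), len(ref))


pred = "the cat sat on the mat"
ref = "the cat lay on the mat"
print(rouge_n(pred, ref, 1), rouge_n(pred, ref, 2), rouge_l(pred, ref))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;In this toy pair one word differs, so ROUGE-2 drops more than ROUGE-1 or ROUGE-L: a single substitution breaks two bigrams but removes only one unigram match.&lt;/p&gt;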
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;180&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bjDtZQ/btsMWkxOfhO/mzQSYbTSaG2tKsf3ekpDt0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bjDtZQ/btsMWkxOfhO/mzQSYbTSaG2tKsf3ekpDt0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bjDtZQ/btsMWkxOfhO/mzQSYbTSaG2tKsf3ekpDt0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbjDtZQ%2FbtsMWkxOfhO%2FmzQSYbTSaG2tKsf3ekpDt0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;569&quot; height=&quot;80&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;180&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Problem Definition&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;511&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/br0xhl/btsMXVqjgSk/MTzzaG5BOwclTIeiszK5S1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/br0xhl/btsMXVqjgSk/MTzzaG5BOwclTIeiszK5S1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/br0xhl/btsMXVqjgSk/MTzzaG5BOwclTIeiszK5S1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbr0xhl%2FbtsMXVqjgSk%2FMTzzaG5BOwclTIeiszK5S1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;647&quot; height=&quot;258&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;511&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;278&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dxoXOM/btsMYxhVgJv/TJzwHWidyKZ1dSrmJ99hz0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dxoXOM/btsMYxhVgJv/TJzwHWidyKZ1dSrmJ99hz0/img.png&quot; data-alt=&quot;Chat GPT&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dxoXOM/btsMYxhVgJv/TJzwHWidyKZ1dSrmJ99hz0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdxoXOM%2FbtsMYxhVgJv%2FTJzwHWidyKZ1dSrmJ99hz0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;634&quot; height=&quot;138&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;278&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Chat GPT&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;278&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bTVEZA/btsMWZmEtb7/3UguTpKZG1hbS4J05XtSZK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bTVEZA/btsMWZmEtb7/3UguTpKZG1hbS4J05XtSZK/img.png&quot; data-alt=&quot;Claude&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bTVEZA/btsMWZmEtb7/3UguTpKZG1hbS4J05XtSZK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbTVEZA%2FbtsMWZmEtb7%2F3UguTpKZG1hbS4J05XtSZK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;635&quot; height=&quot;138&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;278&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Claude&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;278&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/I1eOq/btsMWtIxwXm/klNOeW9RbmYgf36VCghj01/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/I1eOq/btsMWtIxwXm/klNOeW9RbmYgf36VCghj01/img.png&quot; data-alt=&quot;Competition training data&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/I1eOq/btsMWtIxwXm/klNOeW9RbmYgf36VCghj01/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FI1eOq%2FbtsMWtIxwXm%2FklNOeW9RbmYgf36VCghj01%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;639&quot; height=&quot;139&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;278&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Competition training data&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;ChatGPT, Claude, and the actual training data each summarize in a different style&lt;/li&gt;
&lt;li&gt;Because the styles differ, it is hard to judge summary quality as simply good or bad&lt;/li&gt;
&lt;li&gt;Evaluation metrics such as ROUGE-1, ROUGE-2, and ROUGE-L are used&lt;/li&gt;
&lt;li&gt;A fundamental question still remains: 
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;color: #ee2323;&quot;&gt;&lt;b&gt;What is a good summary? What exactly are we trying to solve?&lt;/b&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
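&lt;p data-ke-size=&quot;size16&quot;&gt;To make the metrics above concrete, here is a minimal pure-Python sketch of ROUGE-1, ROUGE-2, and ROUGE-L F1 scores. It assumes whitespace tokenization and a single reference summary; the competition's official scorer may tokenize differently, so the helper names and the example sentences are illustrative only.&lt;/p&gt;

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def f1(overlap, pred_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(pred, ref, n):
    """ROUGE-N F1 using clipped n-gram counts."""
    pred_counts = Counter(ngrams(pred.split(), n))
    ref_counts = Counter(ngrams(ref.split(), n))
    overlap = sum(min(count, ref_counts[gram]) for gram, count in pred_counts.items())
    return f1(overlap, sum(pred_counts.values()), sum(ref_counts.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(pred, ref):
    """ROUGE-L F1 based on the longest common subsequence."""
    p, r = pred.split(), ref.split()
    return f1(lcs_length(p, r), len(p), len(r))


# Hypothetical example pair, not taken from the competition data.
prediction = "the cat sat on the mat"
reference = "the cat is on the mat"

print("ROUGE-1:", round(rouge_n(prediction, reference, 1), 4))
print("ROUGE-2:", round(rouge_n(prediction, reference, 2), 4))
print("ROUGE-L:", round(rouge_l(prediction, reference), 4))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;All three scores only count surface overlap with the reference, which is one reason two summaries written in different styles can score very differently even when both are reasonable.&lt;/p&gt;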
&lt;blockquote data-ke-style=&quot;style2&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;b&gt;What is a good summary?&lt;/b&gt;&lt;/span&gt;&lt;/blockquote&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Which matters more: preserving information, writing fluent sentences, or compressing down to the essentials?&lt;/li&gt;
&lt;li&gt;The criteria can change depending on what the user needs the summary for.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;EDA&lt;/span&gt;&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Dialogue / summary length distribution&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-26 오후 4.31.08.png&quot; data-origin-width=&quot;2446&quot; data-origin-height=&quot;614&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bm3Ebk/btsMYr9Ud93/XHcg30s3PXvvxvl98vBUxK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bm3Ebk/btsMYr9Ud93/XHcg30s3PXvvxvl98vBUxK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bm3Ebk/btsMYr9Ud93/XHcg30s3PXvvxvl98vBUxK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbm3Ebk%2FbtsMYr9Ud93%2FXHcg30s3PXvvxvl98vBUxK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;2446&quot; height=&quot;614&quot; data-filename=&quot;스크린샷 2025-03-26 오후 4.31.08.png&quot; data-origin-width=&quot;2446&quot; data-origin-height=&quot;614&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;&lt;u&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Train&lt;/span&gt;&lt;/u&gt;&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;max dialogue len&amp;nbsp; : 606&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;max summary len : 99&lt;/span&gt;&lt;b&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;u&gt;&lt;i&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Dev&lt;/span&gt;&lt;/i&gt;&lt;/u&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;max dialogue len&amp;nbsp; : 337&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;max summary len : 52&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;u&gt;&lt;i&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Test&lt;/span&gt;&lt;/i&gt;&lt;/u&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;max dialogue len&amp;nbsp; : 577&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The &lt;b&gt;Train set&lt;/b&gt; contains the longest dialogues (max 606), so the model can also learn from long contexts&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Summary length&lt;/b&gt;&lt;span&gt; is &lt;/span&gt;&lt;b&gt;100 or less&lt;/b&gt;&lt;span&gt; in every split, so the task calls for relatively short, concise summaries.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;The Test set comes without reference summaries; it is used to evaluate the quality of the generated summaries&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&lt;span&gt;Summary ratio (summary length relative to dialogue length)&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;mean &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.200244&lt;/span&gt;&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;std&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.060514&lt;/span&gt;&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;min &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;span style=&quot;color: #ee2323;&quot;&gt;0.038462&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;25%&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.161074&lt;/span&gt;&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;50%&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.192982&lt;/span&gt;&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;75%&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.229167&lt;/span&gt;&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;max&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;span style=&quot;color: #ee2323;&quot;&gt;0.793103&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The mean is about 0.200, so the ratio is low overall&lt;/li&gt;
&lt;li&gt;&lt;b&gt;The standard deviation is about 0.06&lt;/b&gt;&lt;span&gt;, so the distribution &lt;/span&gt;&lt;b&gt;does not vary much&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;The minimum is 0.038: a few summaries are abnormally short relative to their dialogues&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;The maximum is 0.79: a few summaries are almost as long as the original dialogue&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
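&lt;p data-ke-size=&quot;size16&quot;&gt;The statistics above can be reproduced with a few lines of standard-library Python. This is only a sketch: it assumes the ratio is summary length divided by dialogue length and that length is counted in whitespace tokens, neither of which is stated in the original analysis, and the toy pairs and thresholds are hypothetical.&lt;/p&gt;

```python
import statistics


def summary_ratio(dialogue, summary):
    """Summary length divided by dialogue length, in whitespace tokens (an assumption)."""
    return len(summary.split()) / len(dialogue.split())


def describe(ratios):
    """Mean, sample standard deviation, min, quartiles, and max of the ratios."""
    q1, median, q3 = statistics.quantiles(ratios, n=4, method="inclusive")
    return {
        "mean": statistics.mean(ratios),
        "std": statistics.stdev(ratios),
        "min": min(ratios),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(ratios),
    }


def flag_extremes(ratios, low, high):
    """Indices of pairs whose ratio falls outside the range [low, high]."""
    return [i for i, r in enumerate(ratios) if low > r or r > high]


# Hypothetical toy pairs: each dialogue has 10 tokens, summaries have 1, 2, 2, 3, 8 tokens.
pairs = [("w " * 10, "w " * k) for k in (1, 2, 2, 3, 8)]
ratios = [summary_ratio(dialogue, summary) for dialogue, summary in pairs]

stats = describe(ratios)
for name, value in stats.items():
    print(f"{name:5s} {value:.6f}")

# Hypothetical cut-offs for "too short" and "too long" summaries.
print("candidates for manual review:", flag_extremes(ratios, low=0.15, high=0.5))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Listing the extreme pairs first, rather than dropping them automatically, keeps the decision about outliers a manual one.&lt;/p&gt;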
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1854&quot; data-origin-height=&quot;806&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cnKrt4/btsMYqQJNQN/io0BzH3tCkk09RJObZwmkK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cnKrt4/btsMYqQJNQN/io0BzH3tCkk09RJObZwmkK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cnKrt4/btsMYqQJNQN/io0BzH3tCkk09RJObZwmkK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcnKrt4%2FbtsMYqQJNQN%2Fio0BzH3tCkk09RJObZwmkK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1854&quot; height=&quot;806&quot; data-origin-width=&quot;1854&quot; data-origin-height=&quot;806&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Open question: &lt;i&gt;&quot;Should extremely short summaries be treated as anomalous data?&quot;&lt;/i&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Speaker tokens&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Train&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;['#Person1', '#Person2', '#Person3', '#Person4', '#Person5', '#Person6', '#Person7']&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Dev&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;['#Person1', '#Person2', '#Person3', '#Person4']&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Test&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #ee2323;&quot;&gt;['#Person1', '#Person2', '#Person3']&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Overall, the &lt;b&gt;Train data contains the widest range of speakers (Person)&lt;/b&gt;, so the model can learn a variety of dialogue structures&lt;/li&gt;
&lt;li&gt;Dev and Test contain fewer speakers and simpler dialogue structures &amp;rarr; at evaluation time the model sees a narrower range of speaker configurations&lt;/li&gt;
&lt;li&gt;This can cause a &lt;b&gt;distribution shift between Train and Dev/Test&lt;/b&gt;, and may particularly affect multi-speaker summarization performance&lt;/li&gt;
&lt;/ul&gt;
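&lt;p data-ke-size=&quot;size16&quot;&gt;A speaker inventory like the one above can be collected with a regular expression. The sketch below assumes speaker tags have the form #PersonN# as in the sample dialogues; the toy splits are hypothetical stand-ins for the real files.&lt;/p&gt;

```python
import re

# Speaker tags look like "#Person1#", "#Person2#", ...
SPEAKER_TAG = re.compile(r"#Person(\d+)#")


def speaker_ids(dialogue):
    """Sorted speaker numbers that appear in one dialogue."""
    return sorted({int(number) for number in SPEAKER_TAG.findall(dialogue)})


def speakers_in_split(dialogues):
    """Union of speaker tags across every dialogue in a split."""
    ids = set()
    for dialogue in dialogues:
        ids.update(speaker_ids(dialogue))
    return [f"#Person{number}#" for number in sorted(ids)]


# Hypothetical toy splits.
train = [
    "#Person1#: Hi.\n#Person2#: Hello.\n#Person3#: Hey.",
    "#Person1#: Ready?\n#Person4#: Yes.",
]
test = ["#Person1#: Hi.\n#Person2#: Hello."]

train_speakers = speakers_in_split(train)
test_speakers = speakers_in_split(test)
train_only = [tag for tag in train_speakers if tag not in test_speakers]

print("train:", train_speakers)
print("test:", test_speakers)
print("only in train:", train_only)
```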
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Takeaways&lt;/i&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;To build a model that handles &lt;b&gt;multi-party dialogue&lt;/b&gt; well, &amp;rarr; the training strategy needs to make full use of the varied speaker structures in Train&lt;/li&gt;
&lt;li&gt;Evaluation (Dev/Test) is dominated by dialogues with only a few speakers, &amp;rarr; which makes it a more controlled setting for measuring how well summarization generalizes&lt;/li&gt;
&lt;/ul&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Special tokens&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Train :&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;['#Address#', '#CarNumber#', '#CardNumber#', '#DateOfBirth#', '#Email#', '#PassportNumber#', &lt;span style=&quot;color: #ee2323;&quot;&gt;'#Person#'&lt;/span&gt;, '#PhoneNumber#', '#SSN#']&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Dev :&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;['#Address#', '#DateOfBirth#', '#PhoneNumber#']&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;b&gt;&amp;nbsp;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Test :&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #ee2323;&quot;&gt;['#Address#', '#PassportNumber#', '#PhoneNumber#']&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;i&gt;&lt;u&gt;#SSN#&lt;span&gt;, &lt;/span&gt;#CardNumber#&lt;span&gt;, &lt;/span&gt;#CarNumber#&lt;/u&gt;&lt;/i&gt;&lt;span&gt; and several others appear only in Train &amp;rarr; &lt;/span&gt;&lt;b&gt;a possible domain gap between Train and Test&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;Test contains only some of the special token types, &amp;rarr; so the evaluation metrics may not fully reflect how well the model handles the full set of special tokens&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Why add the special tokens to the tokenizer?&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;1. A tokenizer splits text into &amp;ldquo;pieces&amp;rdquo;&lt;/b&gt;&lt;/p&gt;
&lt;pre id=&quot;code_1742975308777&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;text = &quot;My name is #Person# and my phone number is #PhoneNumber#.&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;What if the special tokens are not added?&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; &lt;span&gt;#Person#&lt;/span&gt; may be &lt;span&gt;&lt;b&gt;split into pieces&lt;/b&gt;&lt;/span&gt; like this&lt;/p&gt;
&lt;pre id=&quot;code_1742975324158&quot; class=&quot;bash&quot; data-ke-language=&quot;bash&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;['#', 'Person', '#'] or ['#Per', 'son#']&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;When that happens, the model &lt;/span&gt;&lt;b&gt;cannot recognize the token as a single meaningful unit&lt;/b&gt;&lt;/p&gt;
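&lt;blockquote data-ke-style=&quot;style3&quot;&gt;In natural language processing (NLP), &lt;b&gt;special tokens&lt;/b&gt; are reserved symbols that help a model understand and process input text effectively. They are used to make the structure of the text explicit or to carry a specific meaning.&lt;br /&gt;&lt;br /&gt;Roles of special tokens&lt;br /&gt;1.&amp;nbsp;Marking sentence boundaries: they help the model locate where a sentence starts and ends. In BERT, for example, the&amp;nbsp;[CLS]&amp;nbsp;token marks the start of the input and the&amp;nbsp;[SEP]&amp;nbsp;token separates sentences.&lt;br /&gt;2.&amp;nbsp;Signaling a specific task: they tell the model what kind of work is being done. For example, the&amp;nbsp;[MASK]&amp;nbsp;token stands for a hidden word in masked language modeling.&lt;br /&gt;3. Carrying a special meaning: they generalize specific information such as personal data so the model can treat it as a single unit. For example, tokens such as #Person# and #PhoneNumber# can stand in for names or phone numbers.&lt;br /&gt;&lt;br /&gt;Special tokens help an NLP model interpret input text more accurately, and they are particularly useful when adapting a model to a specific task or domain. Adding the appropriate special tokens to the tokenizer, and adjusting the model so that it recognizes them, is therefore an important step in building an effective NLP model.&lt;/blockquote&gt;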
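&lt;p data-ke-size=&quot;size16&quot;&gt;The effect described above can be shown with a toy tokenizer. This is a deliberately simplified regex-based sketch, not the subword tokenizer used in the experiments; the tokenize function and its special_tokens parameter are hypothetical. With Hugging Face Transformers, the equivalent step is typically tokenizer.add_special_tokens(...) followed by model.resize_token_embeddings(len(tokenizer)).&lt;/p&gt;

```python
import re


def tokenize(text, special_tokens=()):
    """Split text into word and punctuation pieces, keeping special tokens whole."""
    # Special tokens are tried first (longest first) so they survive as single pieces.
    protected = [re.escape(token) for token in sorted(special_tokens, key=len, reverse=True)]
    pattern = "|".join(protected + [r"\w+", r"[^\w\s]"])
    return re.findall(pattern, text)


text = "My name is #Person# and my phone number is #PhoneNumber#."

# Without registration, each tag is broken into '#', the word, and '#'.
plain = tokenize(text)

# With registration, each tag stays a single token.
with_special = tokenize(text, special_tokens=["#Person#", "#PhoneNumber#"])

print(plain)
print(with_special)
```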
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Dialogue analysis&lt;/b&gt;&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style3&quot;&gt;#Person1#: I got fooled! &lt;span style=&quot;color: #ee2323;&quot;&gt;ㅋㅋ..&lt;/span&gt; Totally fooled.&lt;br /&gt;#Person1#: See you soon! Bye! &lt;span style=&quot;color: #ee2323;&quot;&gt;Phew.. finally over. . .&lt;/span&gt; I can't stand that woman or her cheeky daughter.&lt;br /&gt;#Person2#: My husband's name is Lin, &lt;span style=&quot;color: #ee2323;&quot;&gt;L - I - N.&lt;/span&gt;&lt;br /&gt;#Person2#: I stayed in &lt;span style=&quot;color: #ee2323;&quot;&gt;L. A.&lt;/span&gt;, and the weather was clear all weekend. I spent most of the time at home, but on Christmas Day I went to see King Kong.&lt;br /&gt;#Person1#: I see. That will be 22 &lt;span style=&quot;color: #ee2323;&quot;&gt;pounds&lt;/span&gt;, then.&lt;br /&gt;#Person1#: It's roughly 200 &lt;span style=&quot;color: #ee2323;&quot;&gt;yuan&lt;/span&gt; per truck.&lt;br /&gt;#Person1#: &lt;span style=&quot;color: #ee2323;&quot;&gt;Wow!&lt;/span&gt;&lt;/blockquote&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20.4651%;&quot;&gt;&lt;span&gt;Heavy emotional expression&lt;/span&gt;&lt;/td&gt;
&lt;td style=&quot;width: 79.4186%;&quot;&gt;&lt;span&gt;Many emotional expressions such as &amp;lsquo;ㅋㅋ&amp;rsquo; (laughter), &amp;lsquo;Wow!&amp;rsquo;, and &amp;lsquo;Phew&amp;hellip;&amp;rsquo;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20.4651%;&quot;&gt;&lt;span&gt;Code-mixing&lt;/span&gt;&lt;/td&gt;
&lt;td style=&quot;width: 79.4186%;&quot;&gt;&lt;span&gt;English spellings and words (L-I-N, L.A.) appear alongside Korean text&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20.4651%;&quot;&gt;&lt;span&gt;Mixed units&lt;/span&gt;&lt;/td&gt;
&lt;td style=&quot;width: 79.4186%;&quot;&gt;&lt;span&gt;Several currency units, such as pounds and yuan, are mixed together&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 20.4651%;&quot;&gt;&lt;span&gt;Topic diversity&lt;/span&gt;&lt;/td&gt;
&lt;td style=&quot;width: 79.4186%;&quot;&gt;
&lt;div&gt;
&lt;div&gt;
&lt;div&gt;Topics vary widely: everyday emotions, going to the movies, location information, price negotiation, and so on&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Preprocessing example&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #ee2323;&quot;&gt;&amp;nbsp;#Person1#: I got fooled! ㅋㅋ.. Totally fooled.&lt;/span&gt;&amp;nbsp; -&amp;gt; &lt;span style=&quot;color: #ee2323;&quot;&gt;I got fooled! [웃음] . Totally fooled.&lt;/span&gt; (&lt;i&gt;[웃음]&lt;/i&gt; is a placeholder meaning [laughter])&lt;/p&gt;
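&lt;p data-ke-size=&quot;size16&quot;&gt;A rule like the one in the example can be written as a small regular-expression substitution. The pattern below is a hypothetical reconstruction, not the exact rule used in the project: it replaces a run of two or more ㅋ characters, plus any dots or spaces that follow, with the [웃음] placeholder, so its spacing differs slightly from the example above.&lt;/p&gt;

```python
import re

# Two or more "ㅋ" in a row, followed by any trailing dots or whitespace.
LAUGHTER = re.compile(r"ㅋ{2,}[.\s]*")


def normalize_laughter(utterance):
    """Replace Korean laughter runs with a single [웃음] placeholder."""
    return LAUGHTER.sub("[웃음] ", utterance).strip()


raw = "속았어! ㅋㅋ.. 완전 속았어."
print(normalize_laughter(raw))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Collapsing every variant (ㅋㅋ, ㅋㅋㅋ, ㅋㅋ..) into one placeholder keeps the emotional cue while shrinking the number of distinct surface forms the model has to learn.&lt;/p&gt;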
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Experiments&lt;/span&gt;&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;FP16&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;FP16 is a 16-bit floating-point format that represents data with fewer bits than FP32 (32-bit floating point)&lt;/li&gt;
&lt;li&gt;This reduces memory usage and helps speed up computation&lt;/li&gt;
&lt;li&gt;However, the lower precision can introduce numerical error, so &lt;b&gt;mixed precision&lt;/b&gt;, which combines FP32 and FP16, can be used&lt;/li&gt;
&lt;/ul&gt;
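&lt;p data-ke-size=&quot;size16&quot;&gt;The precision trade-off can be seen without a GPU. The sketch below uses the standard-library struct module to round values to IEEE 754 half precision; Python's built-in float (64-bit) stands in for the higher-precision accumulator, which is an assumption made purely for illustration and is not how a training framework implements mixed precision.&lt;/p&gt;

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def accumulate(values, cast):
    """Sum values while rounding every intermediate result with `cast`."""
    total = 0.0
    for value in values:
        total = cast(total + cast(value))
    return total


values = [0.001] * 1000  # the exact sum is 1.0

# Everything in FP16: inputs and the running total are both rounded.
half_sum = accumulate(values, to_fp16)

# Mixed idea: FP16 inputs, but the running total is kept in higher precision.
mixed_sum = sum(to_fp16(value) for value in values)

print("0.1 stored as FP16 :", to_fp16(0.1))
print("2049 stored as FP16:", to_fp16(2049.0))
print("sum, all FP16      :", half_sum)
print("sum, mixed         :", mixed_sum)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Keeping the accumulator in higher precision while storing the inputs in FP16 recovers most of the accuracy, which is the basic intuition behind mixed-precision training.&lt;/p&gt;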
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Open Model&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;300&quot; data-origin-height=&quot;168&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/90Zfp/btsMXyWy8qF/PzeMxOoAf3ASLqNEwqb7ek/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/90Zfp/btsMXyWy8qF/PzeMxOoAf3ASLqNEwqb7ek/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/90Zfp/btsMXyWy8qF/PzeMxOoAf3ASLqNEwqb7ek/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F90Zfp%2FbtsMXyWy8qF%2FPzeMxOoAf3ASLqNEwqb7ek%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;298&quot; height=&quot;167&quot; data-origin-width=&quot;300&quot; data-origin-height=&quot;168&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
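&lt;p data-ke-size=&quot;size18&quot;&gt;Summarization experiment with the Gemma-3 model&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Prompt:&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style3&quot;&gt;prompt:&lt;br /&gt;You are a model that concisely summarizes Korean dialogues.&lt;br /&gt;&lt;br /&gt;#Person1#: 더슨 씨, please take dictation for me.&lt;br /&gt;#Person2#: Yes, Director...&lt;br /&gt;#Person1#: This needs to go out to all employees as an internal memo by this afternoon. Are you ready?&lt;br /&gt;#Person2#: Yes, Director. Please go ahead.&lt;br /&gt;#Person1#: Attention all employees... Effective immediately, all office communication is restricted to email and official memos. The use of instant messaging programs by employees during working hours is strictly prohibited.&amp;hellip;&lt;br /&gt;#Person1#: They will have to change how they communicate. I don't want anyone using instant messaging in this office. It wastes too much time! Now, please continue with the memo. Where were we?&lt;br /&gt;#Person2#: This applies to internal and external communication.&lt;br /&gt;#Person1#: That's right. Any employee who continues to use instant messaging will first receive a warning and be placed on suspension. On a second offense, the employee will be dismissed. Any questions about this new policy can be directed to department heads.&lt;br /&gt;#Person2#: Is that all?&lt;br /&gt;#Person1#: Yes. Please type up this memo and distribute it to all employees before 4 p.m.&lt;/blockquote&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Output:&amp;nbsp;&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style3&quot;&gt;The director announced, through an internal memo to all employees, a new policy that bans instant messaging in all office communication (internal and external). Violations may lead to a warning, suspension, or dismissal. The memo must be distributed to all employees by 4 p.m.&lt;/blockquote&gt;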
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Even without any fine-tuning, prompt engineering alone was enough to produce a summary&lt;/li&gt;
&lt;li&gt;However, there were limits to getting output in the desired summary style&lt;/li&gt;
&lt;/ul&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Trying in-context learning&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;One-Shot&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style3&quot;&gt;You are a model that concisely summarizes Korean dialogues.&lt;br /&gt;Read the full dialogue and summarize its key content (main actions, requests, decisions, etc.) in one sentence.&lt;br /&gt;Always respond in JSON format with a summary key.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;Summary example:&lt;br /&gt;Dialogue:&lt;br /&gt;...&lt;br /&gt;&lt;br /&gt;Summary:&lt;br /&gt;...&lt;br /&gt;&lt;br /&gt;Dialogue:&lt;br /&gt;#Person1#: 더슨 씨, please take dictation for me.&lt;br /&gt;#Person2#: Yes, Director...&lt;br /&gt;#Person1#: This needs to go out to all employees as an internal memo by this afternoon. Are you ready?&lt;br /&gt;#Person2#: Yes, Director. Please go ahead.&lt;br /&gt;...&lt;br /&gt;Summary:&lt;/blockquote&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Output:&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style3&quot;&gt;#Person1# announces a new policy that restricts all office communication to email and official memos and bans instant messaging. Violations may lead to a warning, suspension, or dismissal. #Person2# asks about the scope of the policy, and #Person1# instructs that the memo be distributed by 4 p.m.&lt;/blockquote&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The output came out close to the desired style&lt;/li&gt;
&lt;/ul&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;u&gt;&lt;span style=&quot;color: #ee2323;&quot;&gt;Hypothesis: if so, would Few-Shot prompting produce even more suitable output?&lt;/span&gt;&lt;/u&gt;&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style3&quot;&gt;You are a model that concisely summarizes Korean dialogues.&lt;br /&gt;Read the full dialogue and summarize its key content (main actions, requests, decisions, etc.) in one sentence.&lt;br /&gt;Always respond in JSON format with a summary key.&lt;br /&gt;Summary examples:&lt;br /&gt;Dialogue 1&lt;br /&gt;&amp;hellip;&lt;br /&gt;Summary 1&lt;br /&gt;&amp;hellip;&lt;br /&gt;Dialogue 2&lt;br /&gt;&amp;hellip;&lt;/blockquote&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Output:&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style3&quot;&gt;더슨 씨 announces a new policy that bans instant messaging for all employees and warns that anyone who violates it may face a warning, suspension, or dismissal.&lt;/blockquote&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The output came out in an even more suitable style&lt;/li&gt;
&lt;/ul&gt;
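&lt;p data-ke-size=&quot;size16&quot;&gt;The One-Shot and Few-Shot prompts above share the same layout, so they can be assembled by one helper. This is a sketch under assumptions: build_prompt and parse_summary are hypothetical names, the instruction text is an English rendering of the prompt shown above, and the example pairs are placeholders. Passing one example gives the One-Shot prompt; passing several gives Few-Shot.&lt;/p&gt;

```python
import json

INSTRUCTION = (
    "You are a model that concisely summarizes Korean dialogues.\n"
    "Read the full dialogue and summarize its key content "
    "(main actions, requests, decisions, etc.) in one sentence.\n"
    "Always respond in JSON format with a summary key."
)


def build_prompt(examples, dialogue):
    """Assemble instruction, worked examples, and the dialogue to summarize."""
    parts = [INSTRUCTION, "", "Summary examples:"]
    for i, (example_dialogue, example_summary) in enumerate(examples, start=1):
        parts += [
            f"Dialogue {i}:",
            example_dialogue,
            f"Summary {i}:",
            json.dumps({"summary": example_summary}, ensure_ascii=False),
            "",
        ]
    parts += ["Dialogue:", dialogue, "Summary:"]
    return "\n".join(parts)


def parse_summary(model_output):
    """Read the summary value from a JSON reply; fall back to the raw text."""
    try:
        return json.loads(model_output)["summary"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return model_output.strip()


# Placeholder example pairs; real ones would come from the training set.
examples = [
    ("#Person1#: ...\n#Person2#: ...", "first reference summary"),
    ("#Person1#: ...\n#Person2#: ...", "second reference summary"),
]

prompt = build_prompt(examples, "#Person1#: Hello.\n#Person2#: Hi.")
print(prompt)
print(parse_summary('{"summary": "two people greet each other"}'))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The fallback in parse_summary matters in practice because a model does not always return valid JSON, even when the prompt asks for it.&lt;/p&gt;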
&lt;blockquote data-ke-style=&quot;style1&quot;&gt;&lt;span style=&quot;font-family: 'Noto Serif KR';&quot;&gt;A new question&lt;/span&gt;&lt;/blockquote&gt;
&lt;blockquote data-ke-style=&quot;style2&quot;&gt;Would it help to include similar dialogues as examples?&lt;br /&gt;How do we find similar dialogues?&lt;/blockquote&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;b&gt;Embedding&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #000000;&quot;&gt;&lt;span style=&quot;color: #000000; text-align: start;&quot;&gt;Used the `jinaai/jina-embeddings-v3` model&lt;/span&gt;&lt;/span&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;927&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dSBBoF/btsMXQv0yTJ/7QWyiRT5HUVlTVEsrW2HXk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dSBBoF/btsMXQv0yTJ/7QWyiRT5HUVlTVEsrW2HXk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dSBBoF/btsMXQv0yTJ/7QWyiRT5HUVlTVEsrW2HXk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdSBBoF%2FbtsMXQv0yTJ%2F7QWyiRT5HUVlTVEsrW2HXk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1280&quot; height=&quot;927&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;927&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Used cosine similarity over the embeddings to select similar dialogues as Few-Shot examples&lt;/li&gt;
&lt;li&gt;Found that this did not help performance much&lt;/li&gt;
&lt;/ul&gt;
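&lt;p data-ke-size=&quot;size16&quot;&gt;The retrieval step can be sketched in plain Python. The three-dimensional vectors and the dialogue names below are made-up stand-ins for real embeddings, and top_k_similar is a hypothetical helper; in the experiment the vectors came from the embedding model.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(query_vec, candidates, k=2):
    """Names of the k candidates whose vectors are closest to the query."""
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up embeddings for three training dialogues.
candidates = [
    ("office_memo", [0.9, 0.1, 0.0]),
    ("train_ticket", [0.1, 0.9, 0.1]),
    ("hr_policy", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.1, 0.0]  # embedding of the dialogue to summarize

print(top_k_similar(query, candidates, k=2))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The selected dialogues and their reference summaries would then be used as the Few-Shot examples in the prompt.&lt;/p&gt;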
&lt;blockquote data-ke-style=&quot;style1&quot;&gt;&lt;span style=&quot;font-family: 'Noto Serif KR';&quot;&gt;Is there a way to train the model on the summary style itself?&lt;/span&gt;&lt;/blockquote&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;DPO(Direct Preference Optimization)&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;347&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bVkICe/btsMYf26avm/EvGQ6kapUPjodxZSDtqEo0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bVkICe/btsMYf26avm/EvGQ6kapUPjodxZSDtqEo0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bVkICe/btsMYf26avm/EvGQ6kapUPjodxZSDtqEo0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbVkICe%2FbtsMYf26avm%2FEvGQ6kapUPjodxZSDtqEo0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1280&quot; height=&quot;347&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;347&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Summarization is not just a matter of shortening text. What to emphasize, what tone to use, and how to structure the sentences are all matters of &lt;b&gt;style&lt;/b&gt; that differ from person to person.&lt;/li&gt;
&lt;li&gt;DPO stands for &lt;b&gt;Direct Preference Optimization&lt;/b&gt;, that is, &lt;b&gt;optimizing directly on preferences&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: -apple-system, BlinkMacSystemFont, 'Helvetica Neue', 'Apple SD Gothic Neo', Arial, sans-serif; letter-spacing: 0px;&quot;&gt;Unlike conventional RLHF (Reinforcement Learning from Human Feedback), it is &lt;/span&gt;&lt;b&gt;a way to fine-tune the model directly on human preference data&lt;/b&gt;&lt;span style=&quot;font-family: -apple-system, BlinkMacSystemFont, 'Helvetica Neue', 'Apple SD Gothic Neo', Arial, sans-serif; letter-spacing: 0px;&quot;&gt;.&lt;/span&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;No reward model is needed&lt;/li&gt;
&lt;li&gt;Training needs only pairwise comparisons&lt;/li&gt;
&lt;li&gt;A simpler and more stable approach&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
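&lt;p data-ke-size=&quot;size16&quot;&gt;For a single preference pair, the DPO objective is the negative log-sigmoid of a scaled margin between the chosen and rejected summaries, each measured relative to a frozen reference model. The sketch below computes that loss from four sequence log-probabilities; the numbers and the beta value are invented for illustration, and this is not the training code used in the project.&lt;/p&gt;

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one pair, given sequence log-probabilities.

    Each argument is log p(summary | dialogue) under the policy being trained
    or under the frozen reference model.
    """
    chosen_reward = beta * (policy_chosen - ref_chosen)
    rejected_reward = beta * (policy_rejected - ref_rejected)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


# Policy identical to the reference: margin is 0, so the loss is log(2).
start = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Policy now prefers the chosen summary more than the reference does.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)

print("loss at start   :", round(start, 4))
print("loss after shift:", round(improved, 4))
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The loss falls as the policy raises the probability of the preferred summary and lowers that of the rejected one relative to the reference, which is how a style preference is learned without a separate reward model.&lt;/p&gt;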
&lt;p data-ke-size=&quot;size16&quot;&gt;Excerpt from the DPO training code&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1244&quot; data-origin-height=&quot;2048&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/doiSML/btsMXwdpx6W/zlDCqUwgnw9LfsR2Ek0aIk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/doiSML/btsMXwdpx6W/zlDCqUwgnw9LfsR2Ek0aIk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/doiSML/btsMXwdpx6W/zlDCqUwgnw9LfsR2Ek0aIk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdoiSML%2FbtsMXwdpx6W%2FzlDCqUwgnw9LfsR2Ek0aIk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;364&quot; height=&quot;599&quot; data-origin-width=&quot;1244&quot; data-origin-height=&quot;2048&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;문제발생&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style3&quot;&gt;#Person1#은 아홉시 반 기차를 타야 한다. #Person2#는 #Person1#이 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지 늦게까지&amp;hellip;.&lt;/blockquote&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;최대 토큰 길이까지 끝나지 않는 추론&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;335&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/boXJZy/btsMXn8IBZC/hOMeCylKXlWyRerLT5pqz1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/boXJZy/btsMXn8IBZC/hOMeCylKXlWyRerLT5pqz1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/boXJZy/btsMXn8IBZC/hOMeCylKXlWyRerLT5pqz1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FboXJZy%2FbtsMXn8IBZC%2FhOMeCylKXlWyRerLT5pqz1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1280&quot; height=&quot;335&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;335&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li data-end=&quot;825&quot; data-start=&quot;765&quot;&gt;&lt;b&gt;Append eos_token as well&lt;/b&gt; to decoder_input (helps stabilize training)&lt;/li&gt;
&lt;/ul&gt;
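&lt;p data-ke-size=&quot;size16&quot;&gt;A minimal sketch of the idea, assuming a causal-LM style setup where the prompt and the summary are concatenated into one sequence. The function name, the token IDs, and the prompt masking are illustrative assumptions, not the exact code used in this project.&lt;/p&gt;

```python
IGNORE_INDEX = -100  # label value conventionally skipped by the loss


def build_example(prompt_ids, summary_ids, eos_id):
    """Build (input_ids, labels) for one training example.

    All token IDs are hypothetical placeholders, not real vocabulary IDs.
    """
    # The input ends with EOS, so the model sees where the summary stops.
    input_ids = list(prompt_ids) + list(summary_ids) + [eos_id]
    # Only the summary tokens and EOS contribute to the loss;
    # prompt positions are masked out.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(summary_ids) + [eos_id]
    return input_ids, labels


input_ids, labels = build_example([5, 6, 7], [11, 12], eos_id=2)
print(input_ids)
print(labels)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Because EOS is part of the training target, the model has a chance to learn to emit it instead of repeating a phrase until the token limit.&lt;/p&gt;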
&lt;blockquote data-ke-style=&quot;style3&quot;&gt;요약:&lt;br /&gt;#Person1#은 #Person2#의 건강검진을 받고 담배를 끊도록 권했습니다.&lt;/blockquote&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;There was not enough time to train on a large amount of data,&lt;/li&gt;
&lt;li&gt;so this wrapped up as a meaningful first attempt&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Trying a Korean-language model&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;275&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/lBMSd/btsMYsnDZQb/AKzcKnETcDO3nnrebshFBk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/lBMSd/btsMYsnDZQb/AKzcKnETcDO3nnrebshFBk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/lBMSd/btsMYsnDZQb/AKzcKnETcDO3nnrebshFBk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FlBMSd%2FbtsMYsnDZQb%2FAKzcKnETcDO3nnrebshFBk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;568&quot; height=&quot;122&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;275&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;1349&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/1XVrw/btsMYhs8Zev/f5kdpuU2vml9yCaKn0NWk1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/1XVrw/btsMYhs8Zev/f5kdpuU2vml9yCaKn0NWk1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/1XVrw/btsMYhs8Zev/f5kdpuU2vml9yCaKn0NWk1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F1XVrw%2FbtsMYhs8Zev%2Ff5kdpuU2vml9yCaKn0NWk1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;505&quot; height=&quot;532&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;1349&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Data augmentation&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Using Solar Pro&lt;/b&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Use embeddings + cosine similarity&lt;/li&gt;
&lt;li&gt;Apply similar dialogues as few-shot examples&lt;/li&gt;
&lt;/ul&gt;
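&lt;p data-ke-size=&quot;size16&quot;&gt;The retrieval step can be sketched roughly as follows. The toy three-dimensional vectors stand in for real embeddings, and the names pick_few_shot and pool are hypothetical; the sketch also assumes no embedding is the zero vector.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pick_few_shot(query_vec, pool, k=2):
    """Return the k dialogues whose embeddings are closest to the query.

    pool is a list of (dialogue_text, embedding) pairs.
    """
    ranked = sorted(
        pool,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy embeddings for illustration only.
pool = [
    ("dialogue about trains", [1.0, 0.0, 0.0]),
    ("dialogue about health", [0.0, 1.0, 0.0]),
    ("dialogue about travel", [0.9, 0.1, 0.0]),
]
selected = pick_few_shot([1.0, 0.05, 0.0], pool, k=2)
print(selected)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;The selected dialogues and their summaries would then be placed in the prompt as few-shot examples before the dialogue to be augmented.&lt;/p&gt;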
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;1000&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/N4t4j/btsMWDj7IKX/Upit2OB2ryZb1IIFgW489K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/N4t4j/btsMWDj7IKX/Upit2OB2ryZb1IIFgW489K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/N4t4j/btsMWDj7IKX/Upit2OB2ryZb1IIFgW489K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FN4t4j%2FbtsMWDj7IKX%2FUpit2OB2ryZb1IIFgW489K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1280&quot; height=&quot;1000&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;1000&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;b&gt;Final conclusions&lt;/b&gt;&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style1&quot;&gt;&lt;span style=&quot;font-family: 'Noto Serif KR';&quot;&gt;&lt;span style=&quot;color: #333333; font-size: 16px; letter-spacing: 0px; font-family: 'Noto Serif KR', serif;&quot;&gt;What makes a good summary? What needs to be solved?&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;font-family: 'Noto Serif KR';&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/blockquote&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A summary is not simply a matter of shortening sentences.&lt;/li&gt;
&lt;li&gt;Conditions for a good summary
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span style=&quot;font-family: -apple-system, BlinkMacSystemFont, 'Helvetica Neue', 'Apple SD Gothic Neo', Arial, sans-serif; letter-spacing: 0px;&quot;&gt;The core topic must be clear: capture the most important content of the text without omission&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: -apple-system, BlinkMacSystemFont, 'Helvetica Neue', 'Apple SD Gothic Neo', Arial, sans-serif; letter-spacing: 0px;&quot;&gt;Remove unnecessary information: convey only the essentials, concisely and without filler&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: -apple-system, BlinkMacSystemFont, 'Helvetica Neue', 'Apple SD Gothic Neo', Arial, sans-serif; letter-spacing: 0px;&quot;&gt;Sentence structure must be clean: organize it naturally so the reader can follow easily&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-family: -apple-system, BlinkMacSystemFont, 'Helvetica Neue', 'Apple SD Gothic Neo', Arial, sans-serif; letter-spacing: 0px;&quot;&gt;Keep the summary style consistent: use a tone and format suited to the purpose (e.g., paper, blog, news)&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-end=&quot;795&quot; data-start=&quot;779&quot; data-ke-size=&quot;size16&quot;&gt;&lt;u&gt;&lt;b&gt;Directions for a solution&lt;/b&gt;&lt;/u&gt;&lt;/p&gt;
&lt;h4 data-end=&quot;827&quot; data-start=&quot;797&quot; data-ke-size=&quot;size20&quot;&gt;1.&amp;nbsp; &lt;b&gt;Training optimized for style is needed&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-end=&quot;956&quot; data-start=&quot;829&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li data-end=&quot;870&quot; data-start=&quot;829&quot;&gt;The same content can be expressed completely differently depending on the &lt;b&gt;summary style&lt;/b&gt;&lt;/li&gt;
&lt;li data-end=&quot;956&quot; data-start=&quot;871&quot;&gt;So run &lt;b&gt;DPO training&lt;/b&gt; or &lt;b&gt;SFT (Supervised Fine-tuning)&lt;/b&gt; matched to the desired style (concise, emotional, technical, etc.)&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-end=&quot;990&quot; data-start=&quot;958&quot; data-ke-size=&quot;size20&quot;&gt;2.&amp;nbsp; &lt;b&gt;Secure diversity through data augmentation&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-end=&quot;1060&quot; data-start=&quot;992&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li data-end=&quot;1027&quot; data-start=&quot;992&quot;&gt;Generate several summary versions per input &amp;rarr; learn diverse phrasings&lt;/li&gt;
&lt;li data-end=&quot;1060&quot; data-start=&quot;1028&quot;&gt;If human feedback is available &amp;rarr; reflect the preferred style with DPO&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-end=&quot;1092&quot; data-start=&quot;1062&quot; data-ke-size=&quot;size20&quot;&gt;3.&amp;nbsp; &lt;b&gt;Careful preprocessing and token configuration&lt;/b&gt;&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-end=&quot;1184&quot; data-start=&quot;1094&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li data-end=&quot;1142&quot; data-start=&quot;1094&quot;&gt;Prevent infinite repetition caused by a missing eos_token, incorrect padding, and similar mistakes&lt;/li&gt;
&lt;li data-end=&quot;1184&quot; data-start=&quot;1143&quot;&gt;Pay attention to start/end tokens, the attention mask, and so on during training&lt;/li&gt;
&lt;/ul&gt;
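&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough illustration of the padding and attention-mask point, the sketch below right-pads a batch and marks which positions are real tokens. The helper name pad_batch and the token IDs are hypothetical, and right padding is an assumption that fits training; it is not taken from the project code.&lt;/p&gt;

```python
def pad_batch(sequences, pad_id):
    """Right-pad token ID lists to a common length and build attention masks.

    Returns (input_ids, attention_mask); mask is 1 for real tokens, 0 for padding.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Each hypothetical sequence already ends with EOS (ID 2 here).
batch_ids, batch_mask = pad_batch([[5, 6, 2], [7, 2]], pad_id=0)
print(batch_ids)
print(batch_mask)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Keeping EOS inside the unmasked region and padding outside it is what lets the model tell a real end of sequence from filler.&lt;/p&gt;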
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>Programming/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/64</guid>
      <comments>https://slowin.tistory.com/64#entry64comment</comments>
      <pubDate>Wed, 26 Mar 2025 18:14:43 +0900</pubDate>
    </item>
    <item>
      <title>Upstage CV Competition Recap</title>
      <link>https://slowin.tistory.com/63</link>
      <description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;Upstage Computer Vision&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Competition overview&lt;/h3&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Competition name&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Image classification competition&lt;/b&gt; for document type classification&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Schedule (Timeline)&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Overall project period: February 14 (Fri) to February 25 (Tue)&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Team:&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Team name: 오합지존 Idle&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;About the competition&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;An image classification problem, one of the most important tasks in Computer Vision.&lt;/li&gt;
&lt;li&gt;An image classification competition for &lt;b&gt;document type classification&lt;/b&gt;.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Used across industries such as finance, healthcare, insurance, and logistics.&lt;/li&gt;
&lt;li&gt;Makes it possible to identify large volumes of document images and automate their handling.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Data overview&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;17 document classes in total&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Input:&lt;/b&gt; 3,140 images&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Output:&lt;/b&gt; predicted document class for each image (submitted as a CSV file)&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Expected technical outcomes&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Build a deep-learning-based image classification model&lt;/li&gt;
&lt;li&gt;Use backbone models and optimize performance&lt;/li&gt;
&lt;li&gt;Train on real data used in industry&lt;/li&gt;
&lt;li&gt;Room to apply a variety of techniques&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;My role&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Team lead / image preprocessing / modeling / environment setup&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&amp;nbsp;&lt;/h2&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Team rules&lt;/h2&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Prioritize individual growth &amp;amp; team growth over competition ranking.&lt;/li&gt;
&lt;li&gt;Share knowledge &amp;amp; insights with each other and grow together.&lt;/li&gt;
&lt;li&gt;Grow by repeating a rigorous hypothesis-testing process.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&amp;nbsp;&lt;/h3&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Building an environment for sustainable growth&lt;/h3&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Make use of meeting time&lt;/li&gt;
&lt;/ol&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Share what we learned&lt;/li&gt;
&lt;li&gt;Fill gaps in our understanding&lt;/li&gt;
&lt;li&gt;Keep records&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; start=&quot;2&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;TIL (Today I Learned)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 12.36.23.png&quot; data-origin-width=&quot;1310&quot; data-origin-height=&quot;710&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/kEm0E/btsMylqHYSa/UA5kLUISZFRXMrBQkjVIdK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/kEm0E/btsMylqHYSa/UA5kLUISZFRXMrBQkjVIdK/img.png&quot; data-alt=&quot;Writing up what we learned each day and gaining insights from teammates&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/kEm0E/btsMylqHYSa/UA5kLUISZFRXMrBQkjVIdK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FkEm0E%2FbtsMylqHYSa%2FUA5kLUISZFRXMrBQkjVIdK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;460&quot; height=&quot;249&quot; data-filename=&quot;스크린샷 2025-03-02 오후 12.36.23.png&quot; data-origin-width=&quot;1310&quot; data-origin-height=&quot;710&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Writing up what we learned each day and gaining insights from teammates&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;3214&quot; data-origin-height=&quot;1158&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/B00hZ/btsMA5TSAPj/7bIyNdNEK1B1EXJ7glORA0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/B00hZ/btsMA5TSAPj/7bIyNdNEK1B1EXJ7glORA0/img.png&quot; data-alt=&quot;Getting used to record-keeping for sustainable growth&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/B00hZ/btsMA5TSAPj/7bIyNdNEK1B1EXJ7glORA0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FB00hZ%2FbtsMA5TSAPj%2F7bIyNdNEK1B1EXJ7glORA0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;608&quot; height=&quot;219&quot; data-origin-width=&quot;3214&quot; data-origin-height=&quot;1158&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Getting used to record-keeping for sustainable growth&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; start=&quot;3&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Experience improving the model through rigorous hypothesis-testing steps&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 12.38.43.png&quot; data-origin-width=&quot;646&quot; data-origin-height=&quot;324&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bnJIZv/btsMzuf7Rsp/5rWO8CLr574or9OBDgT4r1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bnJIZv/btsMzuf7Rsp/5rWO8CLr574or9OBDgT4r1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bnJIZv/btsMzuf7Rsp/5rWO8CLr574or9OBDgT4r1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbnJIZv%2FbtsMzuf7Rsp%2F5rWO8CLr574or9OBDgT4r1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;367&quot; height=&quot;184&quot; data-filename=&quot;스크린샷 2025-03-02 오후 12.38.43.png&quot; data-origin-width=&quot;646&quot; data-origin-height=&quot;324&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; start=&quot;4&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Log experiments
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Use a Wandb team project&lt;/li&gt;
&lt;li&gt;Share experiment status among teammates in real time&lt;/li&gt;
&lt;li&gt;Visualize good models &amp;amp; parameters in real time and draw insights&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 12.39.55.png&quot; data-origin-width=&quot;712&quot; data-origin-height=&quot;306&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cRPmWr/btsMyu19s7N/PR3nEkV8uhn5ZK3LOJzH70/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cRPmWr/btsMyu19s7N/PR3nEkV8uhn5ZK3LOJzH70/img.png&quot; data-alt=&quot;Wandb team project&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cRPmWr/btsMyu19s7N/PR3nEkV8uhn5ZK3LOJzH70/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcRPmWr%2FbtsMyu19s7N%2FPR3nEkV8uhn5ZK3LOJzH70%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;470&quot; height=&quot;202&quot; data-filename=&quot;스크린샷 2025-03-02 오후 12.39.55.png&quot; data-origin-width=&quot;712&quot; data-origin-height=&quot;306&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Wandb team project&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;EDA&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. Target types&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2544&quot; data-origin-height=&quot;998&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b9l6wc/btsMBgucvqA/RuGID9R9QqJ4Vn9DMY5tD0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b9l6wc/btsMBgucvqA/RuGID9R9QqJ4Vn9DMY5tD0/img.png&quot; data-alt=&quot;17 document types, labeled 0 to 16&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b9l6wc/btsMBgucvqA/RuGID9R9QqJ4Vn9DMY5tD0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb9l6wc%2FbtsMBgucvqA%2FRuGID9R9QqJ4Vn9DMY5tD0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;715&quot; height=&quot;280&quot; data-origin-width=&quot;2544&quot; data-origin-height=&quot;998&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;17 document types, labeled 0 to 16&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Note: the data is imbalanced, with 40 to 100 samples per Target&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. What the images actually look like&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Training images: no particular transformations applied&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Test data: rotation, CutMix, Mixup, noise, brightness adjustment, texture changes, and more&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.29.12.png&quot; data-origin-width=&quot;3704&quot; data-origin-height=&quot;930&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cpquXM/btsMBjqRwhA/5GGby5hB0HMHCwoZHkxjuK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cpquXM/btsMBjqRwhA/5GGby5hB0HMHCwoZHkxjuK/img.png&quot; data-alt=&quot;Training data &amp;amp;amp; test images&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cpquXM/btsMBjqRwhA/5GGby5hB0HMHCwoZHkxjuK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcpquXM%2FbtsMBjqRwhA%2F5GGby5hB0HMHCwoZHkxjuK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;3704&quot; height=&quot;930&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.29.12.png&quot; data-origin-width=&quot;3704&quot; data-origin-height=&quot;930&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Training data &amp;amp; test images&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&amp;nbsp;&lt;/h2&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Preprocessing&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Preprocessing was done with three libraries&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. Albumentations&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. Augraphy&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;3. OpenCV&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;&lt;b&gt;Preprocessing with a variety of methods&lt;/b&gt;&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.31.45.png&quot; data-origin-width=&quot;1250&quot; data-origin-height=&quot;358&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/EB3xd/btsMArwlcIh/KDkqUtzVTFADKpwKvCkSnK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/EB3xd/btsMArwlcIh/KDkqUtzVTFADKpwKvCkSnK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/EB3xd/btsMArwlcIh/KDkqUtzVTFADKpwKvCkSnK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FEB3xd%2FbtsMArwlcIh%2FKDkqUtzVTFADKpwKvCkSnK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;685&quot; height=&quot;196&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.31.45.png&quot; data-origin-width=&quot;1250&quot; data-origin-height=&quot;358&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.32.28.png&quot; data-origin-width=&quot;1268&quot; data-origin-height=&quot;368&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bhHm6v/btsMBjxDGaA/hClEgMTTyimpMB8Mqyn8JK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bhHm6v/btsMBjxDGaA/hClEgMTTyimpMB8Mqyn8JK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bhHm6v/btsMBjxDGaA/hClEgMTTyimpMB8Mqyn8JK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbhHm6v%2FbtsMBjxDGaA%2FhClEgMTTyimpMB8Mqyn8JK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;702&quot; height=&quot;204&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.32.28.png&quot; data-origin-width=&quot;1268&quot; data-origin-height=&quot;368&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.32.39.png&quot; data-origin-width=&quot;1190&quot; data-origin-height=&quot;346&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/2DoGR/btsMAtVeISJ/1HruAtYjy3rLZp3YskfFJk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/2DoGR/btsMAtVeISJ/1HruAtYjy3rLZp3YskfFJk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/2DoGR/btsMAtVeISJ/1HruAtYjy3rLZp3YskfFJk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F2DoGR%2FbtsMAtVeISJ%2F1HruAtYjy3rLZp3YskfFJk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;657&quot; height=&quot;191&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.32.39.png&quot; data-origin-width=&quot;1190&quot; data-origin-height=&quot;346&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Mixup&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Mixup-like overlays occur with glass and reflective surfaces, so Mixup was applied between specific Targets&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.33.30.png&quot; data-origin-width=&quot;1014&quot; data-origin-height=&quot;346&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/W0w6S/btsMykS2Lya/MCK09RWQnG3xctlKh84q1K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/W0w6S/btsMykS2Lya/MCK09RWQnG3xctlKh84q1K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/W0w6S/btsMykS2Lya/MCK09RWQnG3xctlKh84q1K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FW0w6S%2FbtsMykS2Lya%2FMCK09RWQnG3xctlKh84q1K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;725&quot; height=&quot;247&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.33.30.png&quot; data-origin-width=&quot;1014&quot; data-origin-height=&quot;346&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Image size distribution&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Most images are &quot;443*591&quot;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Notably, many test images are landscape &quot;591*443&quot; because of rotation.&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&amp;nbsp;&lt;/h2&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;Hypothesis testing&lt;/h2&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;Hypothesis 1. Applying padding consistently will help during training and inference.&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: AppleSDGothicNeo-Regular, 'Malgun Gothic', '맑은 고딕', dotum, 돋움, sans-serif;&quot;&gt;- Since most images are portrait by default, all images were converted to portrait before modeling&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;830&quot; data-origin-height=&quot;364&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cXEHMz/btsMz8qefwF/u5WK8XzmChdgsADU69lqsk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cXEHMz/btsMz8qefwF/u5WK8XzmChdgsADU69lqsk/img.png&quot; data-alt=&quot;Left: original landscape image, right: image aligned to portrait&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cXEHMz/btsMz8qefwF/u5WK8XzmChdgsADU69lqsk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcXEHMz%2FbtsMz8qefwF%2Fu5WK8XzmChdgsADU69lqsk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;830&quot; height=&quot;364&quot; data-origin-width=&quot;830&quot; data-origin-height=&quot;364&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Left: original landscape image, right: image aligned to portrait&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
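&lt;p data-ke-size=&quot;size16&quot;&gt;A dependency-free sketch of the portrait conversion plus padding step, treating an image as a list of pixel rows. The function names, the clockwise rotation direction, the bottom/right padding, and the fill value are all assumptions made for illustration; the actual pipeline used the image libraries listed under Preprocessing.&lt;/p&gt;

```python
def to_portrait(image):
    """Rotate a landscape image 90 degrees clockwise; leave portrait images as-is.

    image is a list of rows, each row a list of pixel values.
    """
    height, width = len(image), len(image[0])
    if width > height:
        # Reversing the rows and transposing gives a clockwise rotation.
        return [list(row) for row in zip(*image[::-1])]
    return [list(row) for row in image]


def pad_to(image, target_h, target_w, fill=0):
    """Pad on the right and bottom up to (target_h, target_w).

    Assumes the image is no larger than the target in either dimension.
    """
    width = len(image[0])
    padded = [list(row) + [fill] * (target_w - width) for row in image]
    padded += [[fill] * target_w for _ in range(target_h - len(image))]
    return padded


landscape = [[1, 2, 3],
             [4, 5, 6]]          # 2 rows x 3 columns
portrait = to_portrait(landscape)  # 3 rows x 2 columns
square = pad_to(portrait, 4, 4)
print(portrait)
print(square)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Applying the same two functions at training and inference time is what keeps the input layout consistent.&lt;/p&gt;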
&lt;p data-ke-size=&quot;size16&quot;&gt;Experiment parameters&lt;/p&gt;
&lt;pre id=&quot;code_1740901131999&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;# Settings for the Hypothesis 1 run
{
    &quot;epochs&quot;: 2,
    &quot;batch_size&quot;: 100,
    &quot;learning_rate&quot;: 0.001,
    &quot;model&quot;: &quot;ResNet18&quot;,
}&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;font-family: 'Noto Sans Demilight', 'Noto Sans KR';&quot;&gt;Experiment results&lt;/span&gt;&lt;/p&gt;
&lt;pre id=&quot;code_1740901178533&quot; class=&quot;python&quot; data-ke-language=&quot;python&quot; data-ke-type=&quot;codeblock&quot;&gt;&lt;code&gt;F1 Score
Valid: 0.926
Public: 0.3549&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Conclusion:&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A public F1 score of 0.35 from portrait conversion plus padding alone was judged a meaningful result.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&lt;span style=&quot;color: #000000;&quot;&gt;Preprocessing that handles inputs consistently helps during training and inference.&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Hypothesis 2. Shrinking the image size will hurt performance through information loss.&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Experiment: train and infer at 1/10 of the original image size&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Conclusion:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 0.3549 -&amp;gt; 0.3340 (a drop of 0.0209)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Confirmed that information loss can hurt predictions&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Hypothesis 3. Adding rotated images to the training set will improve performance.&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Experiment A:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. Training images: 100% originals&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. Validation images: 3,400 shifted and rotated images&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Experiment B:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. Training images: 100% originals + 1,700 rotated images&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. Validation images: 3,400 shifted and rotated images&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Results:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Valid: 0.33 -&amp;gt; 0.58 (+0.2524)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Public: 0.35 -&amp;gt; 0.78 (+0.4262)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;- It is important to include the conditions of the images you will run inference on (noise, rotation, etc.) in the training data.&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
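&lt;blockquote data-ke-style=&quot;style2&quot;&gt;New finding (1)&lt;br /&gt;Misclassifications are concentrated on specific Targets.&lt;br /&gt;(Similar-looking documents are hard to tell apart.)&lt;br /&gt;&lt;br /&gt;New finding (2)&lt;br /&gt;The distribution of misclassified Targets changes from epoch to epoch.&lt;br /&gt;Stopping training at the right point may matter.&lt;br /&gt;Confirmed the potential of a voting strategy across per-epoch models at inference time.&lt;/blockquote&gt;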
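&lt;p data-ke-size=&quot;size16&quot;&gt;The voting idea from finding (2) could look roughly like the hard majority vote below. The predictions and epoch numbers are made-up placeholders, and this sketch was not part of the submitted pipeline; it only shows how checkpoints that err on different Targets can correct each other.&lt;/p&gt;

```python
from collections import Counter


def vote(per_epoch_predictions):
    """Majority vote over predictions from several epoch checkpoints.

    per_epoch_predictions holds one list of predicted labels per checkpoint,
    all in the same sample order.
    """
    voted = []
    for sample_preds in zip(*per_epoch_predictions):
        # most_common(1) picks the most frequent label; on a tie the label
        # seen first (from the earliest checkpoint) wins.
        voted.append(Counter(sample_preds).most_common(1)[0][0])
    return voted


# Hypothetical predictions for four test images.
epoch_preds = [
    [3, 7, 14, 3],  # checkpoint from epoch 8
    [3, 4, 14, 7],  # checkpoint from epoch 9
    [3, 7, 13, 7],  # checkpoint from epoch 10
]
final_preds = vote(epoch_preds)
print(final_preds)
```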
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;532&quot; data-origin-height=&quot;406&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/JBVoQ/btsMzOFCDzo/FIweHkWZKB81QIQlal1oc0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/JBVoQ/btsMzOFCDzo/FIweHkWZKB81QIQlal1oc0/img.png&quot; data-alt=&quot;Misclassifications concentrate on certain Targets&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/JBVoQ/btsMzOFCDzo/FIweHkWZKB81QIQlal1oc0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FJBVoQ%2FbtsMzOFCDzo%2FFIweHkWZKB81QIQlal1oc0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;246&quot; height=&quot;188&quot; data-origin-width=&quot;532&quot; data-origin-height=&quot;406&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Misclassifications concentrate on certain Targets&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;872&quot; data-origin-height=&quot;318&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/pgO7T/btsMyD5RJ5q/V1gG2e471AFo6UbJ0GZ1b1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/pgO7T/btsMyD5RJ5q/V1gG2e471AFo6UbJ0GZ1b1/img.png&quot; data-alt=&quot;Misclassified targets change from epoch to epoch&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/pgO7T/btsMyD5RJ5q/V1gG2e471AFo6UbJ0GZ1b1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FpgO7T%2FbtsMyD5RJ5q%2FV1gG2e471AFo6UbJ0GZ1b1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;622&quot; height=&quot;227&quot; data-origin-width=&quot;872&quot; data-origin-height=&quot;318&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Misclassified targets change from epoch to epoch&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
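&lt;p data-ke-size=&quot;size16&quot;&gt;The two findings above come down to counting, per true target, how often the model gets it wrong, and repeating that count for each epoch. Below is a minimal sketch of that count. The function name and the label values are hypothetical and only illustrate the idea; they are not the competition data or the code I actually ran.&lt;/p&gt;
&lt;pre data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;python&quot;&gt;&lt;code&gt;
```python
from collections import Counter


def misclassified_target_counts(y_true, y_pred):
    """Count, per true target, how many samples were predicted incorrectly."""
    return Counter(t for t, p in zip(y_true, y_pred) if t != p)


# Hypothetical labels for illustration only (not the competition data).
y_true = [3, 3, 7, 7, 7, 14, 14, 2]
y_pred = [3, 7, 3, 3, 7, 14, 3, 2]

counts = misclassified_target_counts(y_true, y_pred)

# Targets sorted by how often they were misclassified, most frequent first.
print(counts.most_common())
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Running the same count on the predictions saved at each epoch is one way to see whether the hardest targets shift over time.&lt;/p&gt;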
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Hypothesis 4. Training on more data will help.&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Experiment:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Augmented the rotated images from 1,700 to 3,400&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Conclusion: It helps improve the F1 score.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;620&quot; data-origin-height=&quot;314&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/enLiYj/btsMyta90HD/5bOVGQ3vIbaKejEsmkrDIk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/enLiYj/btsMyta90HD/5bOVGQ3vIbaKejEsmkrDIk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/enLiYj/btsMyta90HD/5bOVGQ3vIbaKejEsmkrDIk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FenLiYj%2FbtsMyta90HD%2F5bOVGQ3vIbaKejEsmkrDIk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;456&quot; height=&quot;231&quot; data-origin-width=&quot;620&quot; data-origin-height=&quot;314&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style2&quot;&gt;Then, if we keep increasing the data, will the F1 score keep improving?&lt;br /&gt;Experiment: augmented the rotated images from 3,400 to 6,800&lt;br /&gt;Result: there is a threshold.&lt;/blockquote&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.49.24.png&quot; data-origin-width=&quot;636&quot; data-origin-height=&quot;302&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/Denbk/btsMyvNCjmk/xz3QQ3X4vahfEQDcN3bVfK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/Denbk/btsMyvNCjmk/xz3QQ3X4vahfEQDcN3bVfK/img.png&quot; data-alt=&quot;A threshold exists.&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/Denbk/btsMyvNCjmk/xz3QQ3X4vahfEQDcN3bVfK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FDenbk%2FbtsMyvNCjmk%2Fxz3QQ3X4vahfEQDcN3bVfK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;512&quot; height=&quot;243&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.49.24.png&quot; data-origin-width=&quot;636&quot; data-origin-height=&quot;302&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;A threshold exists.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
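&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Hypothesis 5. Applying all image preprocessing plus data augmentation will improve performance.&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;From the experiments so far, I learned that image preprocessing and data augmentation both help improve model performance.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;So I expected that combining a variety of preprocessing steps with image augmentation would improve performance further.&lt;/p&gt;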
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.51.24.png&quot; data-origin-width=&quot;1062&quot; data-origin-height=&quot;396&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/raF9d/btsMzhVAyYr/8PsLejhrYtLC0knc6V1Vk0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/raF9d/btsMzhVAyYr/8PsLejhrYtLC0knc6V1Vk0/img.png&quot; data-alt=&quot;Experiment A&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/raF9d/btsMzhVAyYr/8PsLejhrYtLC0knc6V1Vk0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FraF9d%2FbtsMzhVAyYr%2F8PsLejhrYtLC0knc6V1Vk0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;541&quot; height=&quot;202&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.51.24.png&quot; data-origin-width=&quot;1062&quot; data-origin-height=&quot;396&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Experiment A&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Train F1: 0.95&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Valid F1: 0.78&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.52.04.png&quot; data-origin-width=&quot;1060&quot; data-origin-height=&quot;396&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bfbUTs/btsMyk6uvIG/0K6RHYIcp3toaL5ReNUiDK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bfbUTs/btsMyk6uvIG/0K6RHYIcp3toaL5ReNUiDK/img.png&quot; data-alt=&quot;Experiment B&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bfbUTs/btsMyk6uvIG/0K6RHYIcp3toaL5ReNUiDK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbfbUTs%2FbtsMyk6uvIG%2F0K6RHYIcp3toaL5ReNUiDK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;509&quot; height=&quot;190&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.52.04.png&quot; data-origin-width=&quot;1060&quot; data-origin-height=&quot;396&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Experiment B&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Applied image noise to all images&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Augmented data: 114,610 images&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Results:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Train F1: 0.99&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Valid F1: 0.98&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Public F1: 0.90&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Training needs images captured under conditions similar to the images being evaluated (the test images), and it needs a diverse image set.&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Hypothesis 6. Ensembling models to exploit the strengths of each will improve performance.&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Experiment: ResNet + &lt;span style=&quot;background-color: #ffffff; color: #111827; text-align: start;&quot;&gt;EfficientNet ensemble&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #111827; text-align: start;&quot;&gt;Results: &lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&lt;span style=&quot;background-color: #ffffff; color: #111827; text-align: start;&quot;&gt;- Performance improved: 0.90 -&amp;gt; 0.94&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&lt;span style=&quot;background-color: #ffffff; color: #111827; text-align: start;&quot;&gt;- An ensemble that exploits the strengths of each model is effective.&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&lt;span style=&quot;background-color: #ffffff; color: #111827; text-align: start;&quot;&gt;- Hard voting and soft voting did not differ much in performance.&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.55.31.png&quot; data-origin-width=&quot;646&quot; data-origin-height=&quot;262&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dZil5G/btsMAHZ3LvU/iDQwJGHXgUKlu7ieFemLlK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dZil5G/btsMAHZ3LvU/iDQwJGHXgUKlu7ieFemLlK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dZil5G/btsMAHZ3LvU/iDQwJGHXgUKlu7ieFemLlK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdZil5G%2FbtsMAHZ3LvU%2FiDQwJGHXgUKlu7ieFemLlK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;392&quot; height=&quot;159&quot; data-filename=&quot;스크린샷 2025-03-02 오후 4.55.31.png&quot; data-origin-width=&quot;646&quot; data-origin-height=&quot;262&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
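&lt;p data-ke-size=&quot;size16&quot;&gt;For reference, here is a minimal sketch of the difference between hard voting and soft voting. It is plain Python with made-up probabilities for a single image; the function names and the three hypothetical models are assumptions for illustration, not the ensemble code used in the competition.&lt;/p&gt;
&lt;pre data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;python&quot;&gt;&lt;code&gt;
```python
def soft_vote(prob_lists):
    """Average class probabilities across models, then pick the argmax."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    mean = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])


def hard_vote(prob_lists):
    """Each model votes for its own argmax; the most-voted class wins."""
    n_classes = len(prob_lists[0])
    votes = [0] * n_classes
    for p in prob_lists:
        votes[max(range(n_classes), key=lambda c: p[c])] += 1
    return max(range(n_classes), key=lambda c: votes[c])


# Hypothetical per-model probabilities for one image over 3 classes.
model_a = [0.50, 0.45, 0.05]
model_b = [0.10, 0.80, 0.10]
model_c = [0.48, 0.42, 0.10]

models = [model_a, model_b, model_c]
print(hard_vote(models))  # 0: two of the three models prefer class 0
print(soft_vote(models))  # 1: class 1 has the highest mean probability
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The two schemes only disagree when one model is much more confident than the others, which may be why they scored so similarly in my experiments.&lt;/p&gt;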
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style1&quot;&gt;How do people recognize documents?&lt;/blockquote&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;754&quot; data-origin-height=&quot;644&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/MHxcU/btsMALOUfdO/yYFg7VGMYWAFd6B8LsMuK0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/MHxcU/btsMALOUfdO/yYFg7VGMYWAFd6B8LsMuK0/img.png&quot; data-alt=&quot;Abstraction of human recognition&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/MHxcU/btsMALOUfdO/yYFg7VGMYWAFd6B8LsMuK0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FMHxcU%2FbtsMALOUfdO%2FyYFg7VGMYWAFd6B8LsMuK0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;328&quot; height=&quot;280&quot; data-origin-width=&quot;754&quot; data-origin-height=&quot;644&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Abstraction of human recognition&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;848&quot; data-origin-height=&quot;642&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/vB5AW/btsMA41L4dl/A899X0qhNrmo8bI4dL0TS1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/vB5AW/btsMA41L4dl/A899X0qhNrmo8bI4dL0TS1/img.png&quot; data-alt=&quot;Abstraction of AI recognition&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/vB5AW/btsMA41L4dl/A899X0qhNrmo8bI4dL0TS1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FvB5AW%2FbtsMA41L4dl%2FA899X0qhNrmo8bI4dL0TS1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;334&quot; height=&quot;253&quot; data-origin-width=&quot;848&quot; data-origin-height=&quot;642&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Abstraction of AI recognition&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Hypothesis 7. If the title can be detected, the document can be classified without looking at the whole document.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.13.16.png&quot; data-origin-width=&quot;400&quot; data-origin-height=&quot;316&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/biq4O1/btsMyvGRmsH/nycBfOj0sihg5Kkf3OyVrk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/biq4O1/btsMyvGRmsH/nycBfOj0sihg5Kkf3OyVrk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/biq4O1/btsMyvGRmsH/nycBfOj0sihg5Kkf3OyVrk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbiq4O1%2FbtsMyvGRmsH%2FnycBfOj0sihg5Kkf3OyVrk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;170&quot; height=&quot;134&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.13.16.png&quot; data-origin-width=&quot;400&quot; data-origin-height=&quot;316&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Model: YOLOv12&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Data labeling: Label Studio&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Labeled 1,540 training images&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.13.54.png&quot; data-origin-width=&quot;1042&quot; data-origin-height=&quot;448&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ueCRd/btsMz84VCUW/0xmcqFHGpkAVqMJWxE0zVk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ueCRd/btsMz84VCUW/0xmcqFHGpkAVqMJWxE0zVk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ueCRd/btsMz84VCUW/0xmcqFHGpkAVqMJWxE0zVk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FueCRd%2FbtsMz84VCUW%2F0xmcqFHGpkAVqMJWxE0zVk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;556&quot; height=&quot;239&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.13.54.png&quot; data-origin-width=&quot;1042&quot; data-origin-height=&quot;448&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Result: the labeling was irregular and inconsistent, so it did not help performance much.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
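&lt;blockquote data-ke-style=&quot;style2&quot;&gt;Redoing the BBox labeling&lt;br /&gt;Switched to YOLO detection only for the most frequently misclassified documents&lt;br /&gt;1. Hospital admission/discharge certificate (입퇴원확인서)&lt;br /&gt;2. Medical certificate (진단서)&lt;br /&gt;3. Outpatient treatment certificate (외래_진료확인서)&lt;br /&gt;4. Medical opinion (소견서_진료소견서)&lt;/blockquote&gt;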
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Ran object detection on only the four most frequently misclassified targets&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- ResNet + EfficientNet + YOLOv12 ensemble&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.15.50.png&quot; data-origin-width=&quot;322&quot; data-origin-height=&quot;142&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b7nmcj/btsMyDx2yvx/rYIpfPZEPOknbyL39asTB0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b7nmcj/btsMyDx2yvx/rYIpfPZEPOknbyL39asTB0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b7nmcj/btsMyDx2yvx/rYIpfPZEPOknbyL39asTB0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb7nmcj%2FbtsMyDx2yvx%2FrYIpfPZEPOknbyL39asTB0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;322&quot; height=&quot;142&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.15.50.png&quot; data-origin-width=&quot;322&quot; data-origin-height=&quot;142&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Conclusion: Ensembling the Stage_2 model, which looks at the whole document, with the Stage_1 model, which searches only part of it, helps improve performance.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style2&quot;&gt;Attempt to maximize performance&lt;br /&gt;TTA (test-time augmentation)&lt;/blockquote&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Looked for the documents in the test data that were misclassified most often&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Found a high error rate on rotated or flipped images&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Tried test-time augmentation with 90-degree rotation, horizontal flip, and vertical flip&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.17.46.png&quot; data-origin-width=&quot;486&quot; data-origin-height=&quot;94&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bKnKmj/btsMyu2bU54/gNVuMXrYYfy5AL5vCD4xz0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bKnKmj/btsMyu2bU54/gNVuMXrYYfy5AL5vCD4xz0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bKnKmj/btsMyu2bU54/gNVuMXrYYfy5AL5vCD4xz0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbKnKmj%2FbtsMyu2bU54%2FgNVuMXrYYfy5AL5vCD4xz0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;486&quot; height=&quot;94&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.17.46.png&quot; data-origin-width=&quot;486&quot; data-origin-height=&quot;94&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Result: a slight improvement, 0.9617 -&amp;gt; 0.9691&lt;/p&gt;
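&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough illustration of test-time augmentation, the sketch below predicts on the original image plus its 90-degree-rotated, horizontally flipped, and vertically flipped views, then averages the class probabilities. Averaging is one common way to combine the views and is an assumption here. The toy classifier and all function names are hypothetical stand-ins for a trained model.&lt;/p&gt;
&lt;pre data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;python&quot;&gt;&lt;code&gt;
```python
def rot90(img):
    """Rotate a 2D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror the image top to bottom."""
    return img[::-1]


def predict_with_tta(predict_proba, img, transforms=(rot90, hflip, vflip)):
    """Average class probabilities over the original image and its transformed views."""
    views = [img] + [t(img) for t in transforms]
    probs = [predict_proba(v) for v in views]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c]), mean


# Hypothetical stand-in for a trained classifier: it is confident in class 1
# only when the bright pixel sits in the top-left corner.
def toy_predict_proba(img):
    return [0.2, 0.8] if img[0][0] == 1 else [0.6, 0.4]


# Bright pixel in the bottom-left: the unmodified view alone predicts class 0.
img = [[0, 0],
       [1, 0]]

label, mean = predict_with_tta(toy_predict_proba, img)
print(label, mean)
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;In this toy case the rotated and vertically flipped views move the bright pixel to the top-left, so the averaged prediction switches to class 1 even though the unmodified image alone would be labeled class 0.&lt;/p&gt;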
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;blockquote data-ke-style=&quot;style2&quot;&gt;Issue:&lt;br /&gt;Because YOLO was trained without seeing data for every target, similar-looking text gets misdetected&lt;br /&gt;Added training data for the targets where misdetections occur&lt;/blockquote&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;806&quot; data-origin-height=&quot;366&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cyaI3q/btsMBhs296s/IyZsDmB2maFmV1Wn7GGR5K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cyaI3q/btsMBhs296s/IyZsDmB2maFmV1Wn7GGR5K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cyaI3q/btsMBhs296s/IyZsDmB2maFmV1Wn7GGR5K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcyaI3q%2FbtsMBhs296s%2FIyZsDmB2maFmV1Wn7GGR5K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;806&quot; height=&quot;366&quot; data-origin-width=&quot;806&quot; data-origin-height=&quot;366&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;b&gt;Hypothesis 8. Including prescriptions, vehicle registration certificates, and resumes in training will improve classification performance.&lt;/b&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Hospital admission/discharge certificate (입퇴원확인서)&lt;/li&gt;
&lt;li&gt;Medical certificate (진단서)&lt;/li&gt;
&lt;li&gt;Outpatient treatment certificate (외래_진료확인서)&lt;/li&gt;
&lt;li&gt;Medical opinion (소견서_진료소견서)&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;color: #ee2323;&quot;&gt;Prescription (처방전) (added)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;color: #ee2323;&quot;&gt;Vehicle registration certificate (자동차등록증) (added)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;color: #ee2323;&quot;&gt;Resume (이력서) (added)&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.20.18.png&quot; data-origin-width=&quot;888&quot; data-origin-height=&quot;330&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/t0Drq/btsMxG9sPV7/gBAUiIQN2vqb2beWc6xWS1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/t0Drq/btsMxG9sPV7/gBAUiIQN2vqb2beWc6xWS1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/t0Drq/btsMxG9sPV7/gBAUiIQN2vqb2beWc6xWS1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Ft0Drq%2FbtsMxG9sPV7%2FgBAUiIQN2vqb2beWc6xWS1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;888&quot; height=&quot;330&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.20.18.png&quot; data-origin-width=&quot;888&quot; data-origin-height=&quot;330&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Conclusion:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Confirmed that misdetections dropped substantially&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;- Text patterns in documents:&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li style=&quot;list-style-type: disc; color: #000000;&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;White background with many combinations of curves and straight lines&lt;/span&gt;&lt;/li&gt;
&lt;li style=&quot;list-style-type: disc; color: #000000;&quot;&gt;&lt;span style=&quot;color: #000000;&quot;&gt;More document types and a balanced dataset are needed&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Final choice:&lt;/p&gt;
&lt;p style=&quot;text-align: center;&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;color: #fc5230;&quot;&gt;EfficientNet + ResNet + SWIN + YOLO ensemble&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.21.14.png&quot; data-origin-width=&quot;576&quot; data-origin-height=&quot;74&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/HWzwd/btsMAmaPckr/6DjZKUBgJKKRlCclGZymk1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/HWzwd/btsMAmaPckr/6DjZKUBgJKKRlCclGZymk1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/HWzwd/btsMAmaPckr/6DjZKUBgJKKRlCclGZymk1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FHWzwd%2FbtsMAmaPckr%2F6DjZKUBgJKKRlCclGZymk1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;483&quot; height=&quot;62&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.21.14.png&quot; data-origin-width=&quot;576&quot; data-origin-height=&quot;74&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;Final retrospective&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.49.05.png&quot; data-origin-width=&quot;642&quot; data-origin-height=&quot;124&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bzowCW/btsMBiZNAUN/LJ7zgcE9KOKtTRMCqUGsC0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bzowCW/btsMBiZNAUN/LJ7zgcE9KOKtTRMCqUGsC0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bzowCW/btsMBiZNAUN/LJ7zgcE9KOKtTRMCqUGsC0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbzowCW%2FbtsMBiZNAUN%2FLJ7zgcE9KOKtTRMCqUGsC0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;414&quot; height=&quot;80&quot; data-filename=&quot;스크린샷 2025-03-02 오후 5.49.05.png&quot; data-origin-width=&quot;642&quot; data-origin-height=&quot;124&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
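&lt;p data-ke-size=&quot;size16&quot;&gt;In the previous competition, the &quot;apartment competition&quot;, my final score was near the bottom and barely moved beyond the performance of the &lt;b&gt;Base Codeline&lt;/b&gt;. &lt;b&gt;Submissions were no longer possible&lt;/b&gt; once the competition ended, but I kept experimenting on my own afterward and reflected on what went wrong.&lt;/p&gt;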
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Key points from the retrospective&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;1. Fixated on the Base Codeline and did not try new approaches&lt;/b&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Relying on the existing code kept me from applying &lt;b&gt;new feature engineering and modeling ideas&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;Focused on &lt;b&gt;improving the existing framework&lt;/b&gt; rather than on creative approaches&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;2. Poor record keeping left the experiment process unclear&lt;/b&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Experiment results were not organized, so &lt;b&gt;it was hard to analyze which attempts had worked&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;Ended up inefficiently repeating the same experiments&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;3. Lack of hypothesis-driven validation made sustained model improvement difficult&lt;/b&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Tried various methods &lt;b&gt;without data analysis or hypothesis testing&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;Model improvement was unsystematic and &lt;b&gt;relied on random attempts&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Directions for improvement&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;1. Use the Base Codeline as a reference, but attempt independent model improvements&lt;/b&gt;&lt;br /&gt;&amp;rarr; Experiment with varied feature engineering, changes to the model architecture, and new approaches&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;2. Manage experiment records thoroughly&lt;/b&gt;&lt;br /&gt;&amp;rarr; Use &lt;b&gt;Weights &amp;amp; Biases&lt;/b&gt; (wandb) to record experiment logs systematically&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;3. Hypothesis-driven experiment design&lt;/b&gt;&lt;br /&gt;&amp;rarr; After exploring the data, follow the cycle &lt;b&gt;form a clear hypothesis &amp;rarr; validate &amp;rarr; incorporate feedback&lt;/b&gt;&lt;/p&gt;
&lt;hr data-ke-style=&quot;style1&quot; /&gt;
&lt;h3 data-end=&quot;33&quot; data-start=&quot;0&quot; data-ke-size=&quot;size23&quot;&gt;Goals and results for this competition, based on the previous one&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
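&lt;li&gt;Build &lt;b&gt;the habit of systematically recording and analyzing experiments&lt;/b&gt; to improve reproducibility&lt;/li&gt;
&lt;li&gt;Apply a variety of modeling techniques and &lt;b&gt;develop my own optimization strategy&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;In the next competition, focus on &lt;b&gt;building problem-solving ability itself rather than simply improving the score&lt;/b&gt;&lt;/li&gt;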
&lt;/ul&gt;
&lt;h2 data-end=&quot;273&quot; data-start=&quot;259&quot; data-ke-size=&quot;size26&quot;&gt;Challenge and growth&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-end=&quot;448&quot; data-start=&quot;275&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li data-end=&quot;320&quot; data-start=&quot;275&quot;&gt;Luckily, &lt;b&gt;I met great teammates and led the collaboration as team lead&lt;/b&gt;&lt;/li&gt;
&lt;li data-end=&quot;448&quot; data-start=&quot;321&quot;&gt;Through &lt;b&gt;sharing TILs (Today I Learned) and mutual feedback&lt;/b&gt;:
&lt;ul style=&quot;list-style-type: disc;&quot; data-end=&quot;448&quot; data-start=&quot;377&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li data-end=&quot;410&quot; data-start=&quot;377&quot;&gt;Experienced growth not only as an individual but also as a team&lt;/li&gt;
&lt;li data-end=&quot;448&quot; data-start=&quot;413&quot;&gt;Held in-depth discussions about the experiment process and how we solved problems&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-end=&quot;479&quot; data-start=&quot;455&quot; data-ke-size=&quot;size26&quot;&gt;Result: highest score across all cohorts&lt;/h2&gt;
&lt;p data-end=&quot;580&quot; data-start=&quot;481&quot; data-ke-size=&quot;size16&quot;&gt;In the previous competition I was &lt;b&gt;near the bottom&lt;/b&gt;, but this time I finished with the &lt;b&gt;highest score across all cohorts&lt;/b&gt;!&lt;br /&gt;I was able to reach my goals of &lt;b&gt;strengthening problem-solving ability and producing results through collaboration&lt;/b&gt;.&lt;/p&gt;
&lt;p data-end=&quot;580&quot; data-start=&quot;481&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-end=&quot;602&quot; data-start=&quot;587&quot; data-ke-size=&quot;size26&quot;&gt;Where to go from here&lt;/h2&gt;
&lt;p data-end=&quot;718&quot; data-start=&quot;604&quot; data-ke-size=&quot;size16&quot;&gt;Building on this experience:&lt;br /&gt;✅ &lt;b&gt;Build a more efficient collaboration process&lt;/b&gt;&lt;br /&gt;✅ &lt;b&gt;Strengthen problem-solving ability through continuous experimentation and feedback&lt;/b&gt;&lt;br /&gt;✅ &lt;b&gt;Keep growing in the next competition and take another step forward&lt;/b&gt;&lt;/p&gt;
&lt;p data-is-only-node=&quot;&quot; data-is-last-node=&quot;&quot; data-end=&quot;774&quot; data-start=&quot;720&quot; data-ke-size=&quot;size16&quot;&gt;  &lt;b&gt;Keep focusing on the process of building problem-solving ability itself, not on the score alone!&lt;/b&gt;&lt;/p&gt;</description>
      <category>프로그래밍/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/63</guid>
      <comments>https://slowin.tistory.com/63#entry63comment</comments>
      <pubDate>Sun, 2 Mar 2025 18:04:08 +0900</pubDate>
    </item>
    <item>
      <title>tensor: understanding dim</title>
      <link>https://slowin.tistory.com/62</link>
      <description>&lt;h3 data-ke-size=&quot;size23&quot;&gt;What is dim?&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;You specify dim on a Tensor so that an operation runs along a specific dimension (axis) of the data.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;In machine learning and deep learning, working with multidimensional data (Tensors) often means computing a sum, mean, max, or similar reduction along one particular dimension.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;코드 예시&lt;/h4&gt;
&lt;pre id=&quot;code_1739243893052&quot; style=&quot;background-color: #f8f8f8; color: #383a42; text-align: start;&quot; data-ke-type=&quot;codeblock&quot; data-ke-language=&quot;python&quot;&gt;&lt;code&gt;
import torch

# Create a 3x3 tensor
tensor = torch.tensor([[1, 2, 3],
                       [4, 5, 6],
                       [7, 8, 9]])

# Sum of every element
print(torch.sum(tensor))  # tensor(45)

# dim=0 collapses the rows &amp;rarr; one sum per column (vertical direction)
print(torch.sum(tensor, dim=0))  # tensor([12, 15, 18])

# dim=1 collapses the columns &amp;rarr; one sum per row (horizontal direction)
print(torch.sum(tensor, dim=1))  # tensor([ 6, 15, 24])

# Maximum along dim=1 &amp;rarr; returns (values, indices) for each row
print(torch.max(tensor, dim=1))

# Output
# tensor(45)
# tensor([12, 15, 18])
# tensor([ 6, 15, 24])
# torch.return_types.max(
# values=tensor([3, 6, 9]),
# indices=tensor([2, 2, 2]))&lt;/code&gt;&lt;/pre&gt;
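&lt;p data-ke-size=&quot;size16&quot;&gt;To see what "collapsing a dimension" means without any library, here is a minimal pure-Python sketch of the same two sums on nested lists. The helper names sum_dim0 and sum_dim1 are made up for this illustration and are not part of PyTorch.&lt;/p&gt;

```python
# Pure-Python sketch of what summing along a dim does (no PyTorch needed).
# sum_dim0 and sum_dim1 are hypothetical helper names used only for illustration.

rows = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]


def sum_dim0(matrix):
    """Collapse dim 0 (the row index): add the rows together, one total per column."""
    return [sum(column) for column in zip(*matrix)]


def sum_dim1(matrix):
    """Collapse dim 1 (the column index): add within each row, one total per row."""
    return [sum(row) for row in matrix]


print(sum_dim0(rows))  # [12, 15, 18]
print(sum_dim1(rows))  # [6, 15, 24]
```

&lt;p data-ke-size=&quot;size16&quot;&gt;In both cases a 3x3 input becomes a result of length 3: the dimension named by dim is the one that is summed away.&lt;/p&gt;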
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Understanding it with pictures&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;817&quot; data-origin-height=&quot;634&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b0k35q/btsMdIZ1n9s/jd7Va49DPkKJ08EGB5aqj1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b0k35q/btsMdIZ1n9s/jd7Va49DPkKJ08EGB5aqj1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b0k35q/btsMdIZ1n9s/jd7Va49DPkKJ08EGB5aqj1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb0k35q%2FbtsMdIZ1n9s%2Fjd7Va49DPkKJ08EGB5aqj1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;817&quot; height=&quot;634&quot; data-origin-width=&quot;817&quot; data-origin-height=&quot;634&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;956&quot; data-origin-height=&quot;634&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/GUTkx/btsMeMUGjxA/Tuakg8bJbRbjeUjoJBrftk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/GUTkx/btsMeMUGjxA/Tuakg8bJbRbjeUjoJBrftk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/GUTkx/btsMeMUGjxA/Tuakg8bJbRbjeUjoJBrftk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FGUTkx%2FbtsMeMUGjxA%2FTuakg8bJbRbjeUjoJBrftk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;956&quot; height=&quot;634&quot; data-origin-width=&quot;956&quot; data-origin-height=&quot;634&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;914&quot; data-origin-height=&quot;634&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bUm9AY/btsMeKCxGZK/LzFMGkVJZ8vcQjUV54aqw1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bUm9AY/btsMeKCxGZK/LzFMGkVJZ8vcQjUV54aqw1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bUm9AY/btsMeKCxGZK/LzFMGkVJZ8vcQjUV54aqw1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbUm9AY%2FbtsMeKCxGZK%2FLzFMGkVJZ8vcQjUV54aqw1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;914&quot; height=&quot;634&quot; data-origin-width=&quot;914&quot; data-origin-height=&quot;634&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;986&quot; data-origin-height=&quot;634&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/l5G4c/btsMec7mniY/JBKHrNSAwzFpc3LKx4WtdK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/l5G4c/btsMec7mniY/JBKHrNSAwzFpc3LKx4WtdK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/l5G4c/btsMec7mniY/JBKHrNSAwzFpc3LKx4WtdK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fl5G4c%2FbtsMec7mniY%2FJBKHrNSAwzFpc3LKx4WtdK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;986&quot; height=&quot;634&quot; data-origin-width=&quot;986&quot; data-origin-height=&quot;634&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>프로그래밍/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/62</guid>
      <comments>https://slowin.tistory.com/62#entry62comment</comments>
      <pubDate>Tue, 11 Feb 2025 12:20:47 +0900</pubDate>
    </item>
    <item>
      <title>Backpropagation</title>
      <link>https://slowin.tistory.com/61</link>
      <description>&lt;p&gt;A simple summary of &lt;strong&gt;how a multi-layer neural network is structured&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;Picturing a neural network as students standing in several rows&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Situation&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
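&lt;li&gt;&lt;strong&gt;Six students (A1, A2, B1, B2, C1, C2)&lt;/strong&gt; stand &lt;strong&gt;not in a single line but in several rows (layers)&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The rows are the &lt;strong&gt;input layer (first row), the hidden layer (middle row), and the output layer (last row)&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Each student (neuron) processes the information they receive and passes it on to the next layer.&lt;/li&gt;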
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2&gt;&lt;strong&gt;⏳ Step 1: Forward Propagation&lt;/strong&gt;&lt;/h2&gt;
&lt;h3&gt;&lt;strong&gt;The students stand in several rows (layers)&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;Problem = &lt;code&gt;Solve &amp;quot;5 + 3 = ?&amp;quot;&lt;/code&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Input Layer&lt;/th&gt;
&lt;th&gt;Hidden Layer&lt;/th&gt;
&lt;th&gt;Output Layer&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;A1: &amp;quot;Input 5&amp;quot;&lt;/td&gt;
&lt;td&gt;B1: &amp;quot;I should double this!&amp;quot;&lt;/td&gt;
&lt;td&gt;C1: &amp;quot;Now I should subtract 1!&amp;quot;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A2: &amp;quot;Input 3&amp;quot;&lt;/td&gt;
&lt;td&gt;B2: &amp;quot;I'll leave this as it is!&amp;quot;&lt;/td&gt;
&lt;td&gt;C2: &amp;quot;Pass it along unchanged!&amp;quot;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;1️⃣ &lt;strong&gt;Layer A (input layer)&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A1: &lt;code&gt;&amp;quot;I'll input 5!&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;A2: &lt;code&gt;&amp;quot;I'll input 3!&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;2️⃣ &lt;strong&gt;Layer B (hidden layer)&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;B1: &lt;code&gt;&amp;quot;A1 said 5, so let's double it and make it 10!&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;B2: &lt;code&gt;&amp;quot;A2 said 3, so I'll leave it as it is.&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;3️⃣ &lt;strong&gt;Layer C (output layer)&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;C1: &lt;code&gt;&amp;quot;B1 said 10, so let's subtract 1 and make it 9!&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;C2: &lt;code&gt;&amp;quot;B2 said 3, so I'll pass it along unchanged!&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Final answer from the output layer: &lt;code&gt;9&lt;/code&gt; (wrong)&lt;/strong&gt;&lt;br&gt;→ Now backpropagation has to run to correct it!&lt;/p&gt;
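&lt;p&gt;The forward pass above can be written out as plain arithmetic. This is only a sketch of the analogy: treating B1's "double it" as a weight and C1's "subtract 1" as a bias is an assumption made for illustration, and the variable names are made up.&lt;/p&gt;

```python
# Forward pass of the classroom analogy as plain arithmetic.
# Assumption: B1 acts like a weight and C1 acts like a bias (illustrative only).

x1, x2 = 5, 3            # layer A: the inputs
w_b1 = 2.0               # B1: "double it"
b_c1 = -1.0              # C1: "subtract 1"

hidden1 = w_b1 * x1      # B1 turns 5 into 10
hidden2 = x2             # B2 passes 3 through unchanged
output = hidden1 + b_c1  # C1 turns 10 into 9
passthrough = hidden2    # C2 passes 3 through unchanged

target = x1 + x2         # the correct answer, 8
error = output - target  # how far off the final answer is

print(output, target, error)  # 9.0 8 1.0
```

&lt;p&gt;The error of 1 is the signal that the backward step starts from.&lt;/p&gt;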
&lt;hr&gt;
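&lt;h2&gt;&lt;strong&gt;Step 2: Backward Propagation&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;✅ &lt;strong&gt;Teacher (provides the correct answer):&lt;/strong&gt; &lt;code&gt;&amp;quot;The correct answer is 8, but you said 9! Let's find out where it went wrong!&amp;quot;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;1️⃣ &lt;strong&gt;Layer C (output layer) feedback&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;C1: &lt;code&gt;&amp;quot;Ah! I should have subtracted 2, not 1!&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;C2: &lt;code&gt;&amp;quot;The information I passed on was fine!&amp;quot;&lt;/code&gt;&lt;br&gt;→ &lt;strong&gt;Layer C adjusts its value and passes the feedback to layer B&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;2️⃣ &lt;strong&gt;Layer B (hidden layer) feedback&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;B1: &lt;code&gt;&amp;quot;Ah, I should multiply by 1.5 instead of 2!&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;B2: &lt;code&gt;&amp;quot;I did fine, so I don't need to change anything.&amp;quot;&lt;/code&gt;&lt;br&gt;→ &lt;strong&gt;Layer B adjusts its value and passes the feedback toward layer A&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;3️⃣ &lt;strong&gt;Layer A (input layer) feedback&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A1: &lt;code&gt;&amp;quot;I only hand over the 5 I was given, so there is nothing for me to change.&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;A2: &lt;code&gt;&amp;quot;Nothing to fix on my side either!&amp;quot;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The students' quotes exaggerate for effect. In a real network the inputs are data and are never modified; only the weights and biases are. Each of them receives a small correction proportional to how much it contributed to the error. If C1 and B1 both applied their full corrections at once, the answer would overshoot (5 × 1.5 - 2 = 5.5), which is why each update is only a small step.&lt;/p&gt;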
&lt;hr&gt;
&lt;h2&gt;&lt;strong&gt;Step 3: Learning again with the corrected values&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Time: 2 seconds&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;All the students apply the feedback and produce an answer again.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;With each repetition the answer gets closer to the correct one!&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
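&lt;p&gt;The "feedback, then try again" loop can be sketched as gradient descent on the analogy's numbers. Everything specific here is an assumption for illustration and not something the post specifies: the model output = w * x + b (with w standing in for B1 and b for C1), the squared-error loss, the learning rate of 0.01, and the 20 repetitions.&lt;/p&gt;

```python
# Gradient-descent sketch of "feedback, then try again" for the analogy.
# Assumptions: output = w * x + b, loss = 0.5 * error**2, learning rate 0.01.

x, target = 5.0, 8.0
w, b = 2.0, -1.0          # B1's "double it" and C1's "subtract 1"
learning_rate = 0.01
errors = []

for step in range(20):
    output = w * x + b            # forward pass
    error = output - target       # d(loss)/d(output) for loss = 0.5 * error**2
    errors.append(error)
    grad_b = error                # chain rule: d(output)/d(b) = 1
    grad_w = error * x            # chain rule: d(output)/d(w) = x
    b -= learning_rate * grad_b   # C1 nudges its offset a little
    w -= learning_rate * grad_w   # B1 nudges its multiplier a little

final_output = w * x + b
print(errors[0], round(final_output, 2))
```

&lt;p&gt;The first error is 1 (9 instead of 8), and it shrinks on every pass, so the final output ends up very close to 8. The input x is never changed; only w and b are.&lt;/p&gt;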
&lt;hr&gt;
&lt;h2&gt;&lt;strong&gt;Neural network summary&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;A neural network has several layers, and each layer transforms the information a little before passing it on to the next layer!&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concept&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Input layer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Receives the input values of the problem&lt;/td&gt;
&lt;td&gt;Input 5 and 3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hidden layer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Transforms the data&lt;/td&gt;
&lt;td&gt;&amp;quot;Double it&amp;quot;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Output layer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Computes the final answer&lt;/td&gt;
&lt;td&gt;&amp;quot;Subtract 1&amp;quot; → 9 (wrong)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Backpropagation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Passes the error backward and corrects each layer's values&lt;/td&gt;
&lt;td&gt;&amp;quot;Not 2x, but 1.5x!&amp;quot;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;hr&gt;
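&lt;h2&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;A neural net is not a single line; it has &lt;strong&gt;several rows (layers)&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Forward propagation:&lt;/strong&gt; information flows input layer → hidden layer → output layer.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Backpropagation:&lt;/strong&gt; the error flows back from the output layer through the hidden layer toward the input layer, and each layer's weights are corrected along the way.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Repeated training:&lt;/strong&gt; the corrections are repeated many times, moving step by step toward the correct answer.&lt;/li&gt;
&lt;/ul&gt;</description>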
      <category>프로그래밍/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/61</guid>
      <comments>https://slowin.tistory.com/61#entry61comment</comments>
      <pubDate>Mon, 10 Feb 2025 19:01:59 +0900</pubDate>
    </item>
    <item>
      <title>The 5 Stages of Machine Learning Development</title>
      <link>https://slowin.tistory.com/60</link>
      <description>&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1426&quot; data-origin-height=&quot;4291&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/NFRwe/btsMezt98xQ/NZj0GRhPwwvRpCnPks8CSk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/NFRwe/btsMezt98xQ/NZj0GRhPwwvRpCnPks8CSk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/NFRwe/btsMezt98xQ/NZj0GRhPwwvRpCnPks8CSk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FNFRwe%2FbtsMezt98xQ%2FNZj0GRhPwwvRpCnPks8CSk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1426&quot; height=&quot;4291&quot; data-origin-width=&quot;1426&quot; data-origin-height=&quot;4291&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;</description>
      <category>프로그래밍/AI</category>
      <author>slowin</author>
      <guid isPermaLink="true">https://slowin.tistory.com/60</guid>
      <comments>https://slowin.tistory.com/60#entry60comment</comments>
      <pubDate>Mon, 10 Feb 2025 18:55:07 +0900</pubDate>
    </item>
  </channel>
</rss>