From 74503ab24b57c5400b3aaf512bb7dc833fb7c1aa Mon Sep 17 00:00:00 2001
From: happybell80
Date: Sun, 22 Mar 2026 10:06:12 +0900
Subject: [PATCH] docs(research/rag): summary of PostgreSQL Korean FTS and keyword search limits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Made-with: Cursor
---
 ...키워드검색_한계_및_대안_요약.md | 49 +++++++++++++++++++
 1 file changed, 49 insertions(+)
 create mode 100644 journey/research/rag/260323_PostgreSQL_simple_FTS_한국어_키워드검색_한계_및_대안_요약.md

diff --git a/journey/research/rag/260323_PostgreSQL_simple_FTS_한국어_키워드검색_한계_및_대안_요약.md b/journey/research/rag/260323_PostgreSQL_simple_FTS_한국어_키워드검색_한계_및_대안_요약.md
new file mode 100644
index 0000000..5c8117d
--- /dev/null
+++ b/journey/research/rag/260323_PostgreSQL_simple_FTS_한국어_키워드검색_한계_및_대안_요약.md
@@ -0,0 +1,49 @@
+---
+type: research
+tags: [research, rag, postgresql, fts, korean, pg_trgm, hybrid-search, skill-rag-file]
+status: open
+research_target: Limits of Korean keyword search with PostgreSQL simple + prefix(:*), grounding in the official docs, and a roundup of pg_trgm, morphological analysis, hybrid, and query/score remedies (reflects a Gemini conversation + production observations)
+---
+
+# PostgreSQL simple FTS and Korean keyword search: summary of limits and alternatives
+
+**Written**: 2026-03-23  
+**Nature**: A research memo that compresses an external LLM (Gemini) discussion, user confirmation, and official documentation links into one page. Not a finalized implementation decision.
+
+## 1. Symptoms (observed in production)
+
+- The `search_mode: keyword`-only path sometimes returns **0 results**.
+- At P0, `prefix(:*)` plus a low threshold returned hits for queries on **certain proper nouns**, but **other queries still returned 0 results** → prefix alone does not cover every query.
+- **Not a missing-data problem**: the source text and the `tsvector` are in the DB; the working conclusion is that results are lost to the **query/score/threshold logic of the keyword-only search path** (a search-path failure rather than an extraction failure).
+
+## 2. PostgreSQL official view (confirmed interpretation)
+
+- **Prefix `:*`**: matches only **lexeme prefixes** inside the `tsvector`. It does not perform any new morphological analysis.
+  - See: [Text Search Controls](https://www.postgresql.org/docs/current/textsearch-controls.html), [Text Search Functions](https://www.postgresql.org/docs/current/functions-textsearch.html)
+- **simple + Korean**: tokenization is whitespace-centered, so with **attached particles, compound nouns, and words written without spaces** the stored lexemes can differ from what was expected, and `@@` / rank can then diverge from expectations.
+- **pg_trgm**: the documentation describes it as useful for covering patterns that FTS cannot match directly (docs are versioned; e.g. [pg_trgm](https://www.postgresql.org/docs/current/pgtrgm.html)).
+
+→ The research's judgment along the lines of **"90% is an upper bound, not a guaranteed value"**, and its overall direction, do not conflict with the official explanation.
+
+## 3. Alternative axes (priority to be set after infra/team agreement)
+
+| Axis | Gist | Benefit | Cost |
+|----|------|------|------|
+| **pg_trgm** | 3-gram + `similarity` / `ILIKE` indexing | Covers partial-match and compound-word gaps without a dictionary | Index size, noisy candidates |
+| **Korean morphological analysis (e.g. MeCab family)** | Particle separation, lemma-centered indexing | High ceiling on search quality | OS/extension installation and operation |
+| **Hybrid** | Weighted combination of FTS rank, trgm similarity, etc. | Eases the 0-result range, improves recall | Score design and tuning |
+| **Query/score logic** | Do not insist on AND only; OR and relaxation stages; adapt the threshold after **raw score logging** | Fast experiments at the app level | Over-matching, tuning debt |
+
+## 4. Immediate checks on the search path (code/ops)
+
+- Check **which one is in use**: `plainto_tsquery` / `to_tsquery` / `websearch_to_tsquery`.
+- On 0 results, log the **pre-filter raw score** to separate "truly 0 matches" from "cut by the threshold".
+- If needed, add **relaxation stages**: strict search → relaxed threshold → OR/prefix mix → trgm fallback, and so on.
+
+## 5. Related internal documents (link if available)
+
+- Hybrid/keyword recall issue: `journey/research/rag/260321_하이브리드검색_keyword_recall0_및_grounding_실패_원인확정_리서치.md` and others (the front-matter `status` follows that file).
+
+## 6. Limitations
+
+- This document is a **summary of conversations and literature**; it **contains no verification results** for actual `EXPLAIN` output, score distributions, or 로빙 code lines on server 23. To be reinforced Truth First in follow-up work.
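
The relaxation stages listed in section 4 (strict → relaxed threshold → OR/prefix → trgm fallback) could be sketched in plain Python as below. This is a toy model under loud assumptions, not the project's search code and not PostgreSQL itself: `simple_lexemes`, `exact_score`, `prefix_score`, `trigram_overlap`, `LADDER`, the thresholds, and the sample sentences are all hypothetical. Whitespace splitting only approximates the `simple` configuration, and the overlap ratio is a simplification rather than pg_trgm's `similarity` formula. The only detail taken from the pg_trgm documentation is the padding (two spaces before and one after each word).

```python
from typing import Callable, List, Set, Tuple


def simple_lexemes(text: str) -> List[str]:
    """Rough stand-in for whitespace-centered tokenization (not PostgreSQL's parser)."""
    return text.lower().split()


def exact_score(query: str, doc: str) -> float:
    """Share of query terms equal to some document lexeme (1.0 means AND holds)."""
    terms, lexemes = simple_lexemes(query), set(simple_lexemes(doc))
    return sum(t in lexemes for t in terms) / len(terms) if terms else 0.0


def prefix_score(query: str, doc: str) -> float:
    """Share of query terms that are a prefix of some lexeme, modelling `term:*`."""
    terms, lexemes = simple_lexemes(query), simple_lexemes(doc)
    hits = sum(any(lex.startswith(t) for lex in lexemes) for t in terms)
    return hits / len(terms) if terms else 0.0


def trigrams(text: str) -> Set[str]:
    """Per-word character 3-grams, padded with two leading and one trailing space."""
    grams: Set[str] = set()
    for word in simple_lexemes(text):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_overlap(query: str, doc: str) -> float:
    """Share of the query's trigrams found in the document (a simplification)."""
    q = trigrams(query)
    return len(q & trigrams(doc)) / len(q) if q else 0.0


# Hypothetical ladder: (stage name, scoring function, minimum score to accept).
LADDER: List[Tuple[str, Callable[[str, str], float], float]] = [
    ("strict_and", exact_score, 1.0),
    ("relaxed_threshold", exact_score, 0.5),
    ("or_prefix", prefix_score, 0.01),
    ("trgm_fallback", trigram_overlap, 0.3),
]


def search(query: str, docs: List[str]) -> Tuple[str, List[Tuple[float, str]]]:
    """Return the first stage that yields hits, keeping raw scores for logging."""
    for stage, score_fn, threshold in LADDER:
        scored = [(score_fn(query, doc), doc) for doc in docs]
        hits = sorted((h for h in scored if h[0] >= threshold),
                      key=lambda h: h[0], reverse=True)
        if hits:
            return stage, hits
    return "no_match", []


docs = ["통합검색엔진을 개선했다", "키워드 검색 한계 요약"]
print(search("키워드 검색", docs)[0])   # strict_and
print(search("키워드 대안", docs)[0])   # relaxed_threshold
print(search("키워 대안", docs)[0])     # or_prefix
print(search("검색엔진", docs)[0])      # trgm_fallback
print(search("하이브리드", docs)[0])    # no_match
```

In this toy data the query `검색엔진` misses both the exact and prefix stages because it sits in the middle of the lexeme `통합검색엔진을`, and only the trigram stage recovers it. The same stage also admits the second sentence at an equal score, which is the "noisy candidates" cost noted in the table in section 3. Returning the stage name with the raw scores is one way to tell a true zero match from a threshold cut.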