From 31b8c0b7190185c6ee7c611738cd2c6f163242f2 Mon Sep 17 00:00:00 2001
From: happybell80 <happybell80@gmail.com>
Date: Tue, 5 Aug 2025 13:44:12 +0900
Subject: [PATCH] docs: add record of the successful rb10508_micro HTTP embedding migration
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Memory: 988MB → 118MB (88% reduction)
- HTTP embedding response time: 7 ms
- Results far better than expected
---
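Reviewer note: the `HTTPEmbeddingFunction` in this patch relies on a simple contract. It POSTs `{"texts": [...]}` to `/embed` and reads back `{"embeddings": [...]}`. That contract can be exercised without the real skill-embedding service. The following is a minimal sketch under stated assumptions, not the service's actual implementation: `StubEmbedHandler`, `fake_vector`, `embed`, and the 3-dimensional vectors are hypothetical, and the standard library's `urllib` stands in for `requests` so that it runs with no extra packages.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import List


def fake_vector(text: str) -> List[float]:
    # Hypothetical stand-in for a real model: a tiny deterministic 3-d vector.
    return [float(len(text)), float(text.count(" ")), 1.0]


class StubEmbedHandler(BaseHTTPRequestHandler):
    # Hypothetical stub that mimics only the JSON shape the patch expects.
    def do_POST(self):
        if self.path != "/embed":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        texts = json.loads(self.rfile.read(length))["texts"]
        body = json.dumps({"embeddings": [fake_vector(t) for t in texts]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def embed(url: str, texts: List[str], timeout: float = 30.0) -> List[List[float]]:
    # Same contract as HTTPEmbeddingFunction.__call__, but built on urllib.
    if not texts:
        return []
    request = urllib.request.Request(
        url,
        data=json.dumps({"texts": texts}).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Bypass any proxy settings so the local stub is always reached directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return json.load(response)["embeddings"]


def demo() -> List[List[float]]:
    # Port 0 asks the OS for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), StubEmbedHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/embed"
        return embed(url, ["hello world", "hi"])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` starts the stub, sends two texts, and returns one vector per text. Like `raise_for_status()` in the patch, `urllib` raises on 4xx/5xx responses, so error handling stays with the caller.
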
 ...pybell80_skill-embedding서비스구축.md | 63 ++++++++++++++++++-
 1 file changed, 62 insertions(+), 1 deletion(-)

diff --git a/troubleshooting/250805_happybell80_skill-embedding서비스구축.md b/troubleshooting/250805_happybell80_skill-embedding서비스구축.md
index cc2281c..844c8ac 100644
--- a/troubleshooting/250805_happybell80_skill-embedding서비스구축.md
+++ b/troubleshooting/250805_happybell80_skill-embedding서비스구축.md
@@ -152,4 +152,65 @@ class HTTPEmbeddingFunction(EmbeddingFunction):
 **Next steps**:
 - Modify memory.py in rb10508_micro
 - Replace ONNXEmbeddingFunction with HTTPEmbeddingFunction
-- Measure the memory savings
\ No newline at end of file
+- Measure the memory savings
+
+## 1:40 PM
+
+### rb10508_micro HTTP embedding migration: a big success
+
+**Goal**: Switch rb10508_micro from in-process ONNX embedding to HTTP-based embedding
+
+**Implementation**:
+```python
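+# Simple HTTPEmbeddingFunction added to memory.py (assumes os, requests, typing.List and the EmbeddingFunction base class are already imported)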
+class HTTPEmbeddingFunction(EmbeddingFunction):
+    def __init__(self):
+        self.url = f"{os.getenv('SKILL_EMBEDDING_URL', 'http://localhost:8015')}/embed"
+    
+    def __call__(self, input: List[str]) -> List[List[float]]:
+        if not input:
+            return []
+        response = requests.post(self.url, json={"texts": input}, timeout=30)
+        response.raise_for_status()
+        return response.json()["embeddings"]
+```
+
+**Changes**:
+1. memory.py: implement HTTPEmbeddingFunction inline (no file copied in)
+2. requirements.txt: remove onnxruntime and transformers
+3. docker-compose.yml: remove the ONNX volume, add SKILL_EMBEDDING_URL
+
+**Deployment result: dramatic memory reduction**:
+```
+Before deployment: 988.1 MiB
+After deployment:  118.4 MiB
+Saved:             about 870 MiB (88% reduction)
+```
+
+**Performance verification**:
+- Health check: OK (Up 50 seconds, healthy)
+- API responses: working normally
+- HTTP embedding: 7 ms processing time
+- skill-embedding integration: working without issues
+
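+## Lessons learned (additional)
+
+6. **Better results than expected**
+   - Target 400MB → actual 118MB (about 30% of the target)
+   - Removing ONNX alone saved 870MB
+   - The PyTorch dependency was heavier than expected
+
+7. **The power of a simple implementation**
+   - Implemented inline instead of copying a file (12 lines)
+   - Removed unnecessary abstraction
+   - Exception handling at the service level is sufficient
+
+8. **Advantages of HTTP embedding**
+   - Dramatic memory reduction (88%)
+   - 7 ms of latency is negligible
+   - Central management makes updates easy
+
+9. **Architecture validated**
+   - The strategy of splitting out the embedding service worked
+   - The same approach can be applied to the other Robings
+   - 100 Robings × 870 MiB ≈ 87,000 MiB (about 85 GiB) of memory that could be saved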
\ No newline at end of file
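
Reviewer note: the figures quoted in this entry can be re-derived from the two measurements. The sketch below is illustrative only. The function names are hypothetical, and the fleet estimate assumes every one of the 100 Robings would save the same amount as rb10508_micro, which the entry states only as a possibility.

```python
from typing import Tuple


def savings(before_mib: float, after_mib: float) -> Tuple[float, float]:
    # Returns (MiB saved, percentage reduction relative to the starting size).
    saved = before_mib - after_mib
    return saved, saved / before_mib * 100


def fleet_savings_gib(per_instance_mib: float, instances: int) -> float:
    # Assumes identical savings on every instance; 1 GiB = 1024 MiB.
    return per_instance_mib * instances / 1024


saved_mib, reduction_pct = savings(988.1, 118.4)
print(f"{saved_mib:.1f} MiB saved ({reduction_pct:.0f}% reduction)")
print(f"{fleet_savings_gib(870, 100):.1f} GiB across 100 instances")
```

In binary units, 100 instances at 870 MiB each come to roughly 85 GiB. The "87GB" shorthand corresponds to 87,000 MiB.
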