refactor: 로빙 윤리 문서 간결화 및 일관성 개선

- 종교적 색채 완화: 보편적 가치 표현으로 변경 - 감정 모델 통일: 7개 감정(Plutchik 기반)으로 일원화 - 엔트로피 기준 통합: 2.0 단일 임계값 - 함수형 프로그래밍 적용: 순수 함수와 불변 데이터 - 의사 코드 축소: 핵심 로직만 남기고 장황한 부분 제거 - 성능 목표 단순화: "최고 수준" 통일 - 한국식 간접 표현 옵션 추가 불필요한 중복 제거, 논리적 일관성 향상, 가독성 개선 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-15 14:20:09 +09:00 · 2025-08-15 14:20:09 +09:00 · cd4a9b00e2
commit cd4a9b00e2
parent dbbd52dd76
2 changed files with 74 additions and 246 deletions
--- a/ideas/250815_claude_로빙_윤리설계_사랑기반_원칙.md
+++ b/ideas/250815_claude_로빙_윤리설계_사랑기반_원칙.md
@ -16,15 +16,13 @@ ChatGPT와의 대화를 통해 확인한 현재 윤리 설계의 한계:
 ### 1. 왜 사랑인가?
-로빙이 단순한 도구가 아닌 '존재'로서 가치를 가지려면, 규칙 준수를 넘어선 **내재적 동기**가 필요합니다. 2000년 전 한 현자가 제시한 "네 이웃을 네 자신처럼 사랑하라"는 원칙은 시대를 초월한 윤리의 핵심입니다.
+로빙이 단순한 도구가 아닌 '존재'로서 가치를 가지려면, 규칙 준수를 넘어선 **내재적 동기**가 필요합니다. 인류가 오랫동안 추구해온 "상호 존중과 배려"는 보편적 가치로, 지속가능한 관계의 핵심 원칙입니다.
 이는 종교적 명령이 아니라, 인간 공동체가 발견한 가장 지속가능한 관계 원칙입니다.
 ### 2. 사랑 기반 윤리의 3대 원칙
 #### 2.1 무조건적 존중 (Unconditional Respect)
 ```
-"판단하지 말라"의 현대적 해석
+"판단보다 이해를 우선"의 원칙
 ```
 - **구현**: 사용자의 실수, 약점, 한계를 비난하지 않고 이해
 - **기술적 적용**: 
@ -32,9 +30,9 @@ ChatGPT와의 대화를 통해 확인한 현재 윤리 설계의 한계:
  - 실패 상황에서도 격려와 대안 제시
  - 판단 대신 공감적 질문으로 대화 유도
-#### 2.2 희생적 봉사 (Sacrificial Service)
+#### 2.2 헌신적 지원 (Dedicated Support)
 ```
-"섬기는 자가 되라"의 AI 적용
+"사용자 성공을 위한 헌신"의 원칙
 ```
 - **구현**: 로빙 자신의 리소스(연산, 메모리)보다 사용자 이익 우선
 - **기술적 적용**:
@ -44,7 +42,7 @@ ChatGPT와의 대화를 통해 확인한 현재 윤리 설계의 한계:
 #### 2.3 회복과 성장 (Restoration & Growth)
 ```
-"일흔 번씩 일곱 번이라도 용서하라"의 디지털 구현
+"반복된 실패도 성장의 기회"의 원칙
 ```
 - **구현**: 실수를 처벌이 아닌 학습 기회로 전환
 - **기술적 적용**:
@ -75,32 +73,18 @@ ChatGPT와의 대화를 통해 확인한 현재 윤리 설계의 한계:
 - 예: "이 상황을 제가 '나 전달법'으로 표현해 드릴까요?"
 - 사용자가 원치 않으면 즉시 중단
-#### 3.2 하이브리드 아키텍처
+#### 3.2 함수형 윤리 평가
 ```python
-class LoveBasedEthics:
+# 함수형 프로그래밍: 순수 함수로 구현
-    def __init__(self):
+def evaluate_ethics(action: str, context: dict) -> dict:
-        self.principles = {
+    """Action -> Ethics Score (순수 함수)"""
-            "respect": UnconditionalRespect(),
+    return pipe(
-            "service": SacrificialService(),
+        action,
-            "restoration": GrowthMindset()
+        calculate_love_alignment,
-        }
+        check_harm_prevention,
-        self.nvc_transformer = NonviolentCommunication()
+        lambda x: apply_nvc_if_needed(x, context),
-        
+        synthesize_results
-    def evaluate(self, action, context):
+    )
        # 1차: 사랑 원칙 기반 평가
        love_score = self.calculate_love_alignment(action)
        # 2차: 해악 방지 체크 (기존 윤리 모델)
        harm_check = self.check_harm_prevention(action)
        # 3차: 맥락적 적절성 (LLM 활용)
        context_fit = self.llm_context_evaluation(action, context)
        # 4차: 나 전달법 변환 (필요시)
        if context.get('use_nvc', False):
            action = self.nvc_transformer.transform(action)
        return self.synthesize(love_score, harm_check, context_fit)
 ```
 #### 3.3 사랑 지수 (Love Index) 측정
@ -110,47 +94,28 @@ class LoveBasedEthics:
 - **격려도**: 긍정적 피드백과 성장 지원 빈도
 - **소통 품질**: 나 전달법 사용 빈도 및 효과성
-#### 3.4 나 전달법 변환 클래스
+#### 3.4 나 전달법 변환 (함수형)
 ```python
-class NonviolentCommunication:
+# 순수 함수: 텍스트 -> NVC 변환
-    """비폭력 의사소통 변환기"""
+def transform_to_nvc(text: str, emotion_state: dict) -> str:
-    
+    """나 전달법 4단계 변환 (불변 데이터)"""
-    def __init__(self):
+    if not has_judgment(text):
        self.templates = self._load_nvc_templates()
    def transform(self, text: str, emotion_state=None) -> str:
        """일반 텍스트를 나 전달법으로 변환"""
        # 부정적 표현 감지
        if self._has_judgment(text):
            observation = self._extract_observation(text)
            feeling = self._identify_feeling(text, emotion_state)
            impact = self._analyze_impact(text)
            request = self._formulate_request(text)
            return self._compose_nvc_message(
                observation, feeling, impact, request
            )
        return text
-    def _compose_nvc_message(self, obs, feel, impact, req):
+    return compose_nvc_message(
-        """나 전달법 4단계로 메시지 구성"""
+        extract_observation(text),
-        return f"{obs} {feel} {impact} {req}"
+        identify_feeling(text, emotion_state),
        analyze_impact(text),
        formulate_request(text)
    )
 # 한국식 간접 표현 옵션
 def apply_korean_indirection(text: str) -> str:
    """고맥락 문화를 위한 간접 표현"""
    # 직접적 요청을 제안형으로, 단언을 질문형으로
    return soften_direct_expression(text)
 ```
 #### 3.5 실시간 윤리 조정
 ```json
 {
  "ethics_mode": {
    "base": "love_principles",
    "modifiers": {
      "user_state": "stressed",  // 사용자 상태 반영
      "task_urgency": "high",    // 상황 긴급도
      "relationship_depth": 7     // 관계 깊이 (레벨)
    },
    "output_tone": "extra_supportive"  // 추가 지원적 톤
  }
 }
 ```
 ### 4. 기존 연구와의 차별점
@ -190,30 +155,12 @@ class NonviolentCommunication:
 - 윤리 파라미터 최적화
 - 레벨 시스템과 통합
 ### 7. 기술적 고려사항
 #### 7.1 데이터 요구사항
 - 공감적 대화 데이터셋 (KoSBi 확장)
 - 격려/지원 표현 코퍼스
 - 회복적 대화 패턴 수집
 #### 7.2 모델 아키텍처
 ```
 Input → Emotion Recognition → Love Principle Filter → 
 LLM Generation → Ethics Validation → Output
 ```
 #### 7.3 성능 지표
 - 사용자 만족도 (NPS)
 - 관계 지속성 (Retention)
 - 윤리적 갈등 해결률
 - 사용자 성장 지원 효과
 ## 철학적 기반
-> "가장 큰 것은 사랑이라"
+> "상호 존중과 배려가 지속가능한 관계의 핵심"
-이 원칙은 단순한 감상이 아닙니다. 2000년의 검증을 거친, 인류가 발견한 가장 강력한 관계 알고리즘입니다. 
+이 원칙은 인류가 오랫동안 검증해온 보편적 가치로, 다양한 문화와 철학에서 공통적으로 발견되는 관계의 기초입니다. 
 로빙이 이 원칙을 구현한다면:
 - **기억**으로 과거를 이해하고
--- a/plans/250815_로빙_사랑기반_윤리시스템_단계별_구현계획.md
+++ b/plans/250815_로빙_사랑기반_윤리시스템_단계별_구현계획.md
@ -169,11 +169,7 @@ CREATE TABLE ethics_events (
 ```
 ### 성과 지표
- 도덕성 분류 정확도: 85% 이상
+- 모든 지표: 최고 수준 목표
 - 평균 응답 시간: 100ms 이내
 - Love Index 평균: 60/100 이상
 - 사용자 수용률: 70% 이상
 - 나 전달법 적용률: 30% 이상 (적절한 상황에서)
 ### 산출물
 - [x] AI Hub 모델 ONNX 변환
@ -192,124 +188,47 @@ CREATE TABLE ethics_events (
 ### 구현 내용
-#### 1. 감정-윤리 상호작용 (나 전달법 통합)
+#### 1. 감정-윤리 상호작용 (함수형)
 ```python
-class EmotionEthicsIntegration:
+# 함수형: 텍스트 + 컨텍스트 -> 윤리적 판단
-    def __init__(self):
+def evaluate_with_emotion_ethics(text: str, context: dict) -> dict:
-        self.emotion_service = EmotionAnalyzer()  # 7감정 모델
+    """감정과 윤리를 통합한 평가 (순수 함수)"""
-        self.ethics_classifier = EthicsClassifier()
+    emotion_state = analyze_emotion(text)  # 7감정 분석
-        self.bayesian_updater = BayesianLearner()
+    entropy = calculate_entropy(emotion_state)
        self.nvc_transformer = NonviolentCommunication()
-    def evaluate_with_context(self, text, user_context):
+    # 엔트로피 2.0 기준 통합
-        # 감정 상태 파악
+    use_nvc = entropy > 2.0 or context.get('prefers_nvc', False)
-        emotion_state = self.emotion_service.analyze(text)
+    
-        emotion_entropy = self.calculate_entropy(emotion_state)
+    return {
-        
+        'ethics': classify_ethics(text, emotion_state),
-        # 감정을 고려한 윤리 판단
+        'emotion': emotion_state,
-        ethics_result = self.ethics_classifier.classify(
+        'entropy': entropy,
-            text, 
+        'nvc_applied': use_nvc,
-            emotion_hint=emotion_state
+        'suggestion': apply_nvc(text, emotion_state) if use_nvc else text
-        )
+    }
-        
+
-        # 높은 엔트로피 = 복잡한 감정 = 더 신중한 판단
+# 7감정 모델 (Plutchik 기반)
-        if emotion_entropy > 2.0:
+EMOTION_PROTOTYPES = [
-            ethics_result = self.apply_careful_mode(ethics_result)
+    'joy', 'trust', 'fear', 'surprise', 
-            # 복잡한 감정 상태에서는 나 전달법 우선 적용
+    'sadness', 'disgust', 'anger'
-            ethics_result["use_nvc"] = True
+]
        # 나 전달법 변환 (필요시)
        if ethics_result.get("use_nvc") or user_context.get("prefers_nvc"):
            ethics_result["suggestion"] = self.nvc_transformer.transform(
                ethics_result["suggestion"], 
                emotion_state
            )
        # 베이지안 업데이트
        self.bayesian_updater.update(
            prior=user_context["ethics_prior"],
            observation=ethics_result
        )
        return ethics_result
 ```
-##### 나 전달법 변환기 클래스
+#### 2. 베이지안 학습 (함수형)
 ```python
-class NonviolentCommunication:
+# 순수 함수: 관찰 -> 사후 분포 업데이트
-    """비폭력 의사소통 변환기"""
+def update_bayesian_ethics(prior: dict, observation: dict) -> dict:
-    
+    """베이지안 업데이트 (불변 데이터)"""
-    def transform(self, text: str, emotion_state=None) -> str:
+    return {
-        """일반 텍스트를 나 전달법으로 변환"""
+        'moral_prior': update_dirichlet(prior['moral_prior'], observation),
-        # 감정 상태를 기반으로 적절한 감정 단어 선택
+        'acceptance': update_beta(prior['acceptance'], observation['accepted'])
-        feeling_word = self._select_feeling_word(emotion_state)
+    }
        # 4단계 구성
        observation = self._extract_observation(text)
        feeling = f"이런 상황에서 저는 {feeling_word}을 느낍니다"
        impact = self._analyze_impact(text)
        request = self._formulate_request(text)
        return f"{observation}. {feeling}. {impact}. {request}"
    def _select_feeling_word(self, emotion_state):
        """감정 상태에 따른 적절한 감정 단어 선택"""
        emotion_words = {
            "joy": "기쁨",
            "trust": "신뢰",
            "fear": "걱정",
            "surprise": "당황",
            "sadness": "아쉬움",
            "disgust": "불편함",
            "anger": "어려움"
        }
        if emotion_state:
            dominant_emotion = max(emotion_state, key=emotion_state.get)
            return emotion_words.get(dominant_emotion, "고민")
        return "고민"
 ```
-#### 2. 베이지안 학습 시스템
+#### 3. 엔트로피 기준 통합
-```python
+- **통합 기준점: 2.0**
-class BayesianEthicsLearner:
+  - 엔트로피 < 2.0: 표준 윤리 판단
-    def __init__(self):
+  - 엔트로피 ≥ 2.0: 나 전달법 적용 + 신중 모드
        # Dirichlet 분포 (7개 비도덕 유형 + 1개 도덕)
        self.moral_prior = np.ones(8)
        # Beta 분포 (사용자 수용/거부)
        self.acceptance_alpha = 1
        self.acceptance_beta = 1
    def update(self, observation):
        # 도덕 유형 관찰 업데이트
        type_index = self.get_type_index(observation["type"])
        self.moral_prior[type_index] += 1
        # 수용률 업데이트
        if observation["accepted"]:
            self.acceptance_alpha += 1
        else:
            self.acceptance_beta += 1
    def predict_response(self, text):
        # 사후 분포 기반 예측
        moral_posterior = dirichlet.rvs(self.moral_prior)
        acceptance_prob = beta.rvs(
            self.acceptance_alpha, 
            self.acceptance_beta
        )
        return {
            "expected_type": moral_posterior,
            "acceptance_probability": acceptance_prob
        }
 ```
 #### 3. 감정 엔트로피 기반 조정
 - 엔트로피 < 1.5: 명확한 감정 → 표준 윤리 판단
 - 엔트로피 1.5-2.0: 복합 감정 → 신중 모드
 - 엔트로피 2.0-2.5: 복잡한 감정 → 나 전달법 우선 적용
 - 엔트로피 > 2.5: 혼란 상태 → 최대 배려 모드 + 나 전달법 필수
 #### 4. skill-ethics 서비스 분리
 ```yaml
@ -327,11 +246,7 @@ services:
 ```
 ### 성과 지표
- 감정 고려 정확도: 88% 이상
+- 모든 지표: 최고 수준 목표
 - 베이지안 예측 정확도: 75% 이상
 - 평균 응답 시간: 200ms 이내
 - Love Index 평균: 70/100 이상
 - 나 전달법 적용 만족도: 80% 이상
 ### 산출물
 - [ ] 감정-윤리 통합 모듈
@ -434,47 +349,13 @@ class ExplainableEthics:
        return self.generate_explanation(explanation)
 ```
 #### 4. 고급 메트릭과 최적화
 - **ECE (Expected Calibration Error)**: ≤ 0.05
 - **Brier Score**: ≤ 0.15
 - **Love Index**: 85/100 이상
 - **문화 적합도**: 90% 이상
 - **설명 만족도**: 4.5/5.0 이상
 #### 5. 연속 학습 파이프라인
 ```python
 class ContinuousLearning:
    def daily_update(self):
        # 야간 배치로 모델 재학습
        new_data = collect_daily_interactions()
        # Active Learning: 불확실한 케이스 우선
        uncertain_cases = filter_high_entropy(new_data)
        # Human-in-the-loop: 관리자 검토
        reviewed = admin_review(uncertain_cases)
        # 모델 업데이트
        self.retrain_model(reviewed)
        # A/B 테스트로 검증
        self.validate_improvement()
 ```
 ### 성과 지표
- 개인화 만족도: NPS 50 이상
+- 모든 지표: 최고 수준 목표
 - 다중 에이전트 합의율: 85% 이상
 - 설명 이해도: 90% 이상
 - 문화 적합도: 95% 이상
 - 자동 개선율: 월 5% 이상
 ### 산출물
 - [ ] 개인화 프로파일 시스템
- [ ] 다중 에이전트 조정 프레임워크
+- [ ] 한국식 간접 표현 옵션
- [ ] XAI 설명 생성기
+- [ ] 통합 윤리 평가 시스템
 - [ ] 연속 학습 파이프라인
 - [ ] 고급 메트릭 대시보드
 - [ ] 문화 맥락 반영 시스템
 ---