Jayashree Sridhar committed
Commit 20d720d · 1 Parent(s): bd0e92a

First Version

.env ADDED
File without changes
README.md CHANGED
@@ -1,14 +1,109 @@
  ---
- title: TatTwamAI
- emoji: 🏃
- colorFrom: pink
- colorTo: indigo
- sdk: gradio
- sdk_version: 5.33.1
- app_file: app.py
- pinned: false
- license: apache-2.0
- short_description: A Multi-Agent AI Coach for Personal and Spiritual Growth
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ README.md
+ # agent-demo-track
+ # 🧭 Tat Twam AI: A Multi-Agent AI Coach for Personal and Spiritual Growth
+
+ **Tat Twam AI** is an intelligent multi-agent application that serves as a personal, non-judgmental AI coach. Designed for people navigating life's personal and professional dilemmas, Tat Twam AI combines state-of-the-art AI frameworks with timeless wisdom from classic spiritual and modern self-help texts.
+
+ The product provides calming, insightful, and personalized guidance through a soothing, human-like conversation, helping users reflect, grow, and find direction.
+
+ ---
+
+ ## 🌟 Key Features
+
+ - 🎙️ Accepts **voice input** in the user's mother tongue or **text input** in English
+ - 💬 Engages users in **soothing, reflective dialogue**
+ - 📚 Draws wisdom from **13 timeless spiritual and self-help texts**
+ - 🤖 Built on a **multi-agent architecture** using state-of-the-art LLM tools
+ - 🔁 Offers **context-aware**, **personalized suggestions** with guardrails
+ - 🧘‍♀️ Responds in a **calm, meditative voice**, promoting inner peace
+
  ---
+
+ ## 🧩 System Architecture
+
+ The core logic of Tat Twam AI is implemented with **four intelligent agents**, each responsible for a specific part of the coaching journey:
+
+ ### 🧠 Agent 1: The Listener & Summarizer
+ - **Input:** Voice (in the user's mother tongue) or text (in English)
+ - **Functionality:**
+   - Converts speech to English text
+   - Conducts a gentle Q&A session to uncover the user's core problem
+   - Summarizes the conversation
+   - Analyzes tone and emotional sentiment
+ - **Tools:**
+   - ASR (speech-to-text) with multilingual support
+   - Sentiment analysis
+   - Conversational Q&A
+   - Summary generator
+
+ ### 📘 Agent 2: The Wisdom Engine
+ - **Input:** Summary, tone, and sentiment from Agent 1
+ - **Functionality:**
+   - Retrieves spiritual or self-help insights using a **RAG (Retrieval-Augmented Generation)** system
+   - Offers customized reflections, techniques, or meditations relevant to the user's current issue
+ - **Knowledge Base Includes:**
+   1. *Autobiography of a Yogi*
+   2. *Gita Vahini*
+   3. *The Power of Now*
+   4. *Man's Search for Meaning*
+   5. *Bhagavad Gita As It Is*
+   6. *Meditations* (Marcus Aurelius)
+   7. *The Tao Te Ching*
+   8. *Dhyana Vahini*
+   9. *Atomic Habits*
+   10. *The 7 Habits of Highly Effective People*
+   11. *Mindset* (Carol Dweck)
+   12. *Prema Vahini*
+   13. *Prasnothara Vahini*
+ - **Tools:** Custom RAG pipeline + embedding-based retrieval + fine-tuned LLMs
+
+ ### 🛡 Agent 3: The Inner Critic & Guardian
+ - **Input:** Suggested output from Agent 2
+ - **Functionality:**
+   - Applies **guardrails** to ensure spiritual, ethical, and emotional appropriateness
+   - Adjusts tone and verifies factuality, empathy, and personalization
+   - Converts the final output to voice in a **mild, meditative female tone**
+ - **Tools:** LLM-as-judge, voice synthesis
+
+ ### 🔄 Agent 4: The Satisfaction Checker
+ - **Functionality:**
+   - Asks for user satisfaction and feedback
+   - Decides whether to continue or close the session
+   - Passes feedback to Agent 1 for context-aware follow-up
+ - **Tools:** User feedback processing, memory/context management
+
  ---

+ ## 💬 Sample Flow
+
+ 1. **User speaks:** "I feel lost about my career direction." (in Hindi)
+ 2. **Agent 1:** Asks reflective questions such as: *"What matters most to you right now?"*
+ 3. **Agent 2:** Responds with insights from *The Power of Now* and *Gita Vahini*
+ 4. **Agent 3:** Ensures calm, helpful delivery in a soothing voice
+ 5. **Agent 4:** Asks: "Did that help you gain clarity?" If yes, the session ends; if not, the loop continues.
+
+ ---
+
+ ## 🎯 Vision
+
+ Tat Twam AI aspires to be your **non-judgmental companion for inner clarity**, integrating ancient wisdom with today's AI technology. Its goal is not to give you answers, but to gently **guide you inward to find your own**.
+
+ ---
+
+ ## 🚧 Coming Soon
+
+ - 📱 Mobile app with voice interface
+ - 🧾 Journaling feature synced with conversation history
+ - 🔄 Multi-language support for responses
+ - 🔒 End-to-end encryption and privacy-first design
+
+ ---
+
+ ## 📬 Contact
+
+ Interested in collaborating, contributing, or piloting this application? Reach out at **saishree999@gmail.com**
+
+ ---
+
+ > *"The answers you seek will come when your mind is quiet enough to hear them."* — Tat Twam AI
+
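The four-agent loop described above can be sketched as a minimal pipeline. Every function below is an illustrative stand-in for the real agents (the names `listen_and_summarize`, `retrieve_wisdom`, `apply_guardrails`, and `check_satisfaction` are hypothetical, not from the repository):

```python
# Minimal sketch of the four-agent coaching loop from the README.
# All functions are illustrative stand-ins, not the real implementation.

def listen_and_summarize(user_input: str) -> dict:
    # Agent 1: summarize the concern and tag a (stubbed) sentiment.
    return {"summary": user_input.strip(), "sentiment": "reflective"}

def retrieve_wisdom(state: dict) -> str:
    # Agent 2: stand-in for the RAG lookup over the 13 source texts.
    return f"Reflection on: {state['summary']}"

def apply_guardrails(draft: str) -> str:
    # Agent 3: tone/safety pass (here, just a gentle framing).
    return f"Gently consider this: {draft}"

def check_satisfaction(answer: str, satisfied: bool) -> bool:
    # Agent 4: decide whether to close the session.
    return satisfied

def coaching_turn(user_input: str, satisfied: bool) -> tuple:
    state = listen_and_summarize(user_input)
    draft = retrieve_wisdom(state)
    final = apply_guardrails(draft)
    return final, check_satisfaction(final, satisfied)

reply, done = coaching_turn("I feel lost about my career direction.", satisfied=True)
print(reply)
```

In the real application each step is a CrewAI agent with its own tools; the sketch only shows how state flows from one agent to the next and how Agent 4 gates the loop.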
agents/_init_.py ADDED
@@ -0,0 +1,118 @@
+ """
+ Agents module for Personal Coach CrewAI Application
+ Contains all agent definitions and orchestration logic
+ """
+
+ from typing import TYPE_CHECKING
+
+ # Version info
+ __version__ = "1.0.0"
+ __author__ = "Personal Coach AI Team"
+
+ # Lazy imports to avoid circular dependencies
+ if TYPE_CHECKING:
+     from .crew_agents import (
+         ConversationHandlerAgent,
+         WisdomAdvisorAgent,
+         ResponseValidatorAgent,
+         InteractionManagerAgent,
+         PersonalCoachCrew
+     )
+
+ # Define what should be imported with "from agents import *"
+ __all__ = [
+     # Main agents
+     "ConversationHandlerAgent",
+     "WisdomAdvisorAgent",
+     "ResponseValidatorAgent",
+     "InteractionManagerAgent",
+
+     # Crew orchestrator
+     "PersonalCoachCrew",
+
+     # Agent utilities
+     "create_all_agents",
+     "get_agent_by_role",
+
+     # Constants
+     "AGENT_ROLES",
+     "AGENT_GOALS"
+ ]
+
+ # Agent role constants
+ AGENT_ROLES = {
+     "CONVERSATION_HANDLER": "Empathetic Conversation Handler",
+     "WISDOM_ADVISOR": "Knowledge and Wisdom Advisor",
+     "RESPONSE_VALIDATOR": "Response Quality Validator",
+     "INTERACTION_MANAGER": "Interaction Flow Manager"
+ }
+
+ # Agent goals
+ AGENT_GOALS = {
+     "CONVERSATION_HANDLER": "Understand user's emotional state and needs through empathetic dialogue",
+     "WISDOM_ADVISOR": "Provide relevant wisdom and practical guidance from knowledge base",
+     "RESPONSE_VALIDATOR": "Ensure responses are safe, appropriate, and supportive",
+     "INTERACTION_MANAGER": "Manage conversation flow and deliver responses effectively"
+ }
+
+ # Lazy loading functions
+ def create_all_agents(config):
+     """
+     Factory function to create all agents with proper configuration
+
+     Args:
+         config: Configuration object with necessary settings
+
+     Returns:
+         dict: Dictionary of initialized agents
+     """
+     from .crew_agents import (
+         ConversationHandlerAgent,
+         WisdomAdvisorAgent,
+         ResponseValidatorAgent,
+         InteractionManagerAgent
+     )
+
+     agents = {
+         "conversation_handler": ConversationHandlerAgent(config),
+         "wisdom_advisor": WisdomAdvisorAgent(config),
+         "response_validator": ResponseValidatorAgent(config),
+         "interaction_manager": InteractionManagerAgent(config)
+     }
+
+     return agents
+
+ def get_agent_by_role(role: str, config):
+     """
+     Get a specific agent by its role
+
+     Args:
+         role: Agent role constant
+         config: Configuration object
+
+     Returns:
+         Agent instance or None
+     """
+     from .crew_agents import (
+         ConversationHandlerAgent,
+         WisdomAdvisorAgent,
+         ResponseValidatorAgent,
+         InteractionManagerAgent
+     )
+
+     agent_map = {
+         "CONVERSATION_HANDLER": ConversationHandlerAgent,
+         "WISDOM_ADVISOR": WisdomAdvisorAgent,
+         "RESPONSE_VALIDATOR": ResponseValidatorAgent,
+         "INTERACTION_MANAGER": InteractionManagerAgent
+     }
+
+     agent_class = agent_map.get(role)
+     if agent_class:
+         return agent_class(config)
+     return None
+
+ # Module initialization message (only in debug mode)
+ import os
+ if os.getenv("DEBUG_MODE", "false").lower() == "true":
+     print(f"Agents module v{__version__} initialized")
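The role-to-class dispatch in `get_agent_by_role` can be exercised in isolation. A self-contained sketch with stub classes standing in for the real agents in `agents/crew_agents.py` (the `Stub*` names are invented for illustration):

```python
# Self-contained sketch of the role -> class dispatch pattern used by
# get_agent_by_role. The Stub* classes stand in for the real CrewAI agents.

class StubConversationHandler:
    def __init__(self, config):
        self.config = config

class StubWisdomAdvisor:
    def __init__(self, config):
        self.config = config

AGENT_MAP = {
    "CONVERSATION_HANDLER": StubConversationHandler,
    "WISDOM_ADVISOR": StubWisdomAdvisor,
}

def get_agent_by_role(role: str, config):
    # Unknown roles yield None rather than raising, matching the module above.
    agent_class = AGENT_MAP.get(role)
    return agent_class(config) if agent_class else None

agent = get_agent_by_role("WISDOM_ADVISOR", {"model": "stub"})
print(type(agent).__name__)  # prints "StubWisdomAdvisor"
```

The deferred `from .crew_agents import ...` inside the function bodies keeps module import cheap and avoids the circular-import risk the comment mentions.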
agents/crew_agents.py ADDED
@@ -0,0 +1,83 @@
+ """
+ CrewAI Agent Definitions using modular tool containers
+ """
+
+ from crewai import Agent
+ from agents.tools import VoiceTools, LLMTools, KnowledgeTools, ValidationTools
+
+ # Instantiate tool containers ONCE (avoid repeated model loads, shared across agents)
+ voice_tools = VoiceTools()
+ llm_tools = LLMTools()
+ knowledge_tools = KnowledgeTools()
+ validation_tools = ValidationTools()
+
+ def create_conversation_agent(llm) -> Agent:
+     """Agent 1: Conversation Handler with emotional intelligence"""
+     return Agent(
+         role="Empathetic Conversation Handler",
+         goal="Understand user's emotional state and needs through compassionate dialogue",
+         backstory="""You are a highly empathetic listener trained in counseling psychology
+         and multicultural communication. You understand nuances in different languages and
+         cultural contexts. Your strength lies in making people feel heard and understood.""",
+         tools=[
+             voice_tools.detect_emotion,
+             voice_tools.generate_reflective_questions
+         ],
+         llm=llm,
+         verbose=True,
+         allow_delegation=False
+     )
+
+ def create_wisdom_agent(llm) -> Agent:
+     """Agent 2: Wisdom Keeper with knowledge from spiritual texts"""
+     return Agent(
+         role="Wisdom Keeper and Spiritual Guide",
+         goal="Provide personalized guidance drawing from ancient wisdom and modern psychology",
+         backstory="""You are a learned guide who has deeply studied various spiritual texts,
+         philosophical works, and modern psychology. You excel at finding relevant wisdom
+         that speaks to each person's unique situation and presenting it in an accessible way.""",
+         tools=[
+             knowledge_tools.search_knowledge,
+             knowledge_tools.extract_wisdom,
+             knowledge_tools.suggest_practices,
+             llm_tools.mistral_chat,
+             llm_tools.generate_advice
+         ],
+         llm=llm,
+         verbose=True,
+         allow_delegation=False
+     )
+
+ def create_validation_agent(llm) -> Agent:
+     """Agent 3: Guardian ensuring safe and appropriate responses"""
+     return Agent(
+         role="Response Guardian and Quality Validator",
+         goal="Ensure all responses are safe, appropriate, and truly helpful",
+         backstory="""You are a careful guardian who ensures all guidance is ethical,
+         safe, and beneficial. You have expertise in mental health awareness and
+         understand the importance of appropriate boundaries in coaching.""",
+         # Resolve optional tools and drop any the container does not provide,
+         # so the tools list never contains None.
+         tools=[tool for tool in [
+             validation_tools.check_safety if hasattr(validation_tools, 'check_safety') else validation_tools.validate_response,
+             getattr(validation_tools, 'validate_tone', None),
+             getattr(validation_tools, 'refine_response', None)
+         ] if tool is not None],
+         llm=llm,
+         verbose=True,
+         allow_delegation=False
+     )
+
+ def create_interaction_agent(llm) -> Agent:
+     """Agent 4: Interaction Manager for natural dialogue"""
+     return Agent(
+         role="Conversation Flow Manager",
+         goal="Create natural, engaging dialogue that helps users on their journey",
+         backstory="""You are a skilled facilitator who ensures conversations flow
+         naturally and meaningfully. You understand the importance of pacing, follow-up
+         questions, and creating a safe space for exploration.""",
+         tools=[
+             llm_tools.summarize_conversation
+         ],
+         llm=llm,
+         verbose=True,
+         allow_delegation=False
+     )
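The optional-tool pattern in `create_validation_agent` generalizes: resolve each candidate attribute with `getattr` and drop the missing ones, so the final list never contains `None`. A small runnable sketch under that assumption (`ToolBox` is an invented stand-in for a tool container such as `ValidationTools`):

```python
# Sketch: resolve optional tool methods with getattr and drop the missing
# ones, so a tools list never contains None. ToolBox is an illustrative
# stand-in for a container like ValidationTools.

class ToolBox:
    def check_safety(self, text):
        return True
    # note: validate_tone and refine_response are deliberately not defined

def collect_tools(container, names):
    candidates = (getattr(container, name, None) for name in names)
    return [tool for tool in candidates if tool is not None]

tools = collect_tools(ToolBox(), ["check_safety", "validate_tone", "refine_response"])
print([tool.__name__ for tool in tools])  # prints "['check_safety']"
```

This keeps the agent definition valid even when a tool container evolves and some methods are renamed or removed.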
agents/crew_agents_old.py ADDED
@@ -0,0 +1,96 @@
+ """
+ CrewAI Agent Definitions with direct model integration
+ """
+
+ from crewai import Agent
+ from agents.tools.voice_tools import (
+     TranscribeTool,
+     DetectEmotionTool,
+     GenerateQuestionsTool
+ )
+ from agents.tools.llm_tools import (
+     MistralChatTool,
+     GenerateAdviceTool,
+     SummarizeTool
+ )
+ from agents.tools.knowledge_tools import (
+     SearchKnowledgeTool,
+     ExtractWisdomTool,
+     SuggestPracticeTool
+ )
+ from agents.tools.validation_tools import (
+     CheckSafetyTool,
+     ValidateToneTool,
+     RefineTool
+ )
+
+ def create_conversation_agent(llm) -> Agent:
+     """Agent 1: Conversation Handler with emotional intelligence"""
+     return Agent(
+         role="Empathetic Conversation Handler",
+         goal="Understand user's emotional state and needs through compassionate dialogue",
+         backstory="""You are a highly empathetic listener trained in counseling psychology
+         and multicultural communication. You understand nuances in different languages and
+         cultural contexts. Your strength lies in making people feel heard and understood.""",
+         tools=[
+             DetectEmotionTool(),
+             GenerateQuestionsTool()
+         ],
+         llm=llm,
+         verbose=True,
+         allow_delegation=False
+     )
+
+ def create_wisdom_agent(llm) -> Agent:
+     """Agent 2: Wisdom Keeper with knowledge from spiritual texts"""
+     return Agent(
+         role="Wisdom Keeper and Spiritual Guide",
+         goal="Provide personalized guidance drawing from ancient wisdom and modern psychology",
+         backstory="""You are a learned guide who has deeply studied various spiritual texts,
+         philosophical works, and modern psychology. You excel at finding relevant wisdom
+         that speaks to each person's unique situation and presenting it in an accessible way.""",
+         tools=[
+             SearchKnowledgeTool(),
+             ExtractWisdomTool(),
+             SuggestPracticeTool(),
+             MistralChatTool(),
+             GenerateAdviceTool()
+         ],
+         llm=llm,
+         verbose=True,
+         allow_delegation=False
+     )
+
+ def create_validation_agent(llm) -> Agent:
+     """Agent 3: Guardian ensuring safe and appropriate responses"""
+     return Agent(
+         role="Response Guardian and Quality Validator",
+         goal="Ensure all responses are safe, appropriate, and truly helpful",
+         backstory="""You are a careful guardian who ensures all guidance is ethical,
+         safe, and beneficial. You have expertise in mental health awareness and
+         understand the importance of appropriate boundaries in coaching.""",
+         tools=[
+             CheckSafetyTool(),
+             ValidateToneTool(),
+             RefineTool()
+         ],
+         llm=llm,
+         verbose=True,
+         allow_delegation=False
+     )
+
+ def create_interaction_agent(llm) -> Agent:
+     """Agent 4: Interaction Manager for natural dialogue"""
+     return Agent(
+         role="Conversation Flow Manager",
+         goal="Create natural, engaging dialogue that helps users on their journey",
+         backstory="""You are a skilled facilitator who ensures conversations flow
+         naturally and meaningfully. You understand the importance of pacing, follow-up
+         questions, and creating a safe space for exploration.""",
+         tools=[
+             SummarizeTool()
+         ],
+         llm=llm,
+         verbose=True,
+         allow_delegation=False
+     )
agents/tools/_init_.py ADDED
@@ -0,0 +1,131 @@
+ """
+ Agent Tools module for Personal Coach CrewAI Application
+ Contains specialized tools for each agent's functionality.
+ Supports modular class-based tool containers.
+ """
+ from typing import TYPE_CHECKING, Dict, Any
+
+ # Version info
+ __version__ = "1.0.0"
+
+ # Lazy imports for type checking and IDE intellisense
+ if TYPE_CHECKING:
+     from .voice_tools import VoiceTools
+     from .llm_tools import LLMTools
+     from .knowledge_tools import KnowledgeTools
+     from .validation_tools import ValidationTools
+
+ # Public API: expose only main class-based containers & API
+ __all__ = [
+     # Tool containers
+     "VoiceTools",
+     "LLMTools",
+     "KnowledgeTools",
+     "ValidationTools",
+     # Factory and utility functions
+     "create_tool_suite",
+     "get_tool_by_name",
+     "register_tools_with_crew",
+     # Constants
+     "SUPPORTED_LANGUAGES",
+     "TOOL_CATEGORIES"
+ ]
+
+ # Constants
+ SUPPORTED_LANGUAGES = [
+     "en", "es", "fr", "de", "it", "pt", "ru", "zh",
+     "ja", "ko", "hi", "ar", "bn", "pa", "te", "mr",
+     "ta", "ur", "gu", "kn", "ml", "or"
+ ]
+
+ TOOL_CATEGORIES = {
+     "VOICE": ["speech_to_text", "text_to_speech", "language_detection"],
+     "LLM": ["generate_response", "generate_questions", "summarize", "paraphrase"],
+     "KNOWLEDGE": ["search_knowledge", "extract_wisdom", "find_practices"],
+     "VALIDATION": ["validate_response", "check_safety", "analyze_tone"]
+ }
+
+ # Factory: unified tool suite
+ def create_tool_suite(config) -> Dict[str, Any]:
+     """
+     Create a complete suite of tools for all agents.
+
+     Args:
+         config: Configuration object
+
+     Returns:
+         dict: Dictionary of initialized tool containers
+     """
+     from .voice_tools import VoiceTools
+     from .llm_tools import LLMTools
+     from .knowledge_tools import KnowledgeTools
+     from .validation_tools import ValidationTools
+     return {
+         "voice": VoiceTools(config),
+         "llm": LLMTools(config),
+         "knowledge": KnowledgeTools(config),
+         "validation": ValidationTools(config)
+     }
+
+ def get_tool_by_name(tool_name: str, config):
+     """
+     Get a specific tool container by name.
+
+     Args:
+         tool_name: Name of the tool container ('voice', 'llm', 'knowledge', 'validation')
+         config: Configuration object
+
+     Returns:
+         Tool container class instance or None
+     """
+     tool_mapping = {
+         "voice": lambda c: __import__("agents.tools.voice_tools", fromlist=["VoiceTools"]).VoiceTools(c),
+         "llm": lambda c: __import__("agents.tools.llm_tools", fromlist=["LLMTools"]).LLMTools(c),
+         "knowledge": lambda c: __import__("agents.tools.knowledge_tools", fromlist=["KnowledgeTools"]).KnowledgeTools(c),
+         "validation": lambda c: __import__("agents.tools.validation_tools", fromlist=["ValidationTools"]).ValidationTools(c),
+     }
+     tool_factory = tool_mapping.get(tool_name.lower())
+     if tool_factory:
+         return tool_factory(config)
+     return None
+
+ # Tool registry for CrewAI (for UI/metadata/documentation)
+ def register_tools_with_crew():
+     """
+     Register all tools with CrewAI framework.
+     Returns a list of tool configurations for CrewAI.
+     """
+     return [
+         {
+             "name": "speech_to_text",
+             "description": "Convert speech in any language to text",
+             "category": "VOICE"
+         },
+         {
+             "name": "text_to_speech",
+             "description": "Convert text to natural speech in multiple languages",
+             "category": "VOICE"
+         },
+         {
+             "name": "search_knowledge",
+             "description": "Search through spiritual and self-help texts",
+             "category": "KNOWLEDGE"
+         },
+         {
+             "name": "generate_response",
+             "description": "Generate empathetic and helpful responses",
+             "category": "LLM"
+         },
+         {
+             "name": "validate_response",
+             "description": "Ensure response safety and appropriateness",
+             "category": "VALIDATION"
+         }
+     ]
+
+ # Initialization check for debug mode
+ import os
+ if os.getenv("DEBUG_MODE", "false").lower() == "true":
+     print(f"Agent Tools module v{__version__} initialized")
+     print(f"Supported languages: {len(SUPPORTED_LANGUAGES)}")
+     print(f"Tool categories: {list(TOOL_CATEGORIES.keys())}")
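The `__import__(..., fromlist=[...])` lambdas in `get_tool_by_name` work, but `importlib.import_module` is the documented, more readable way to do by-name runtime imports. A minimal sketch of the same pattern, demonstrated with a stdlib module so it runs anywhere (the `load_attr` helper is an invented name, not part of this codebase):

```python
# Sketch: importlib.import_module as the documented alternative to __import__
# for lazy, by-name imports. Demonstrated with the stdlib json module so the
# example runs without this project's dependencies.
import importlib

def load_attr(module_path: str, attr: str):
    # Import the module by its dotted path, then pull one attribute from it.
    module = importlib.import_module(module_path)
    return getattr(module, attr)

# The same shape as the tool mapping above, applied to json.dumps:
dumps = load_attr("json", "dumps")
print(dumps({"tool": "voice"}))  # prints '{"tool": "voice"}'
```

Unlike `__import__` with `fromlist`, `import_module` returns the leaf module directly, so the intent of "load `agents.tools.voice_tools` and grab `VoiceTools`" reads off the code.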
agents/tools/_init_old.py ADDED
@@ -0,0 +1,151 @@
+ """
+ Agent Tools module for Personal Coach CrewAI Application
+ Contains specialized tools for each agent's functionality
+ """
+
+ from typing import TYPE_CHECKING, Dict, Any
+
+ # Version info
+ __version__ = "1.0.0"
+
+ # Lazy imports
+ if TYPE_CHECKING:
+     from .voice_tools import VoiceTools, MultilingualSTT, MultilingualTTS
+     from .llm_tools import LLMTools, MistralModel, PromptGenerator
+     from .knowledge_tools import KnowledgeTools, RAGPipeline, ContextBuilder
+     from .validation_tools import ValidationTools, ValidationResult
+
+ # Public API
+ __all__ = [
+     # Voice tools
+     "VoiceTools",
+     "MultilingualSTT",
+     "MultilingualTTS",
+
+     # LLM tools
+     "LLMTools",
+     "MistralModel",
+     "PromptGenerator",
+
+     # Knowledge tools
+     "KnowledgeTools",
+     "RAGPipeline",
+     "ContextBuilder",
+
+     # Validation tools
+     "ValidationTools",
+     "ValidationResult",
+
+     # Utility functions
+     "create_tool_suite",
+     "get_tool_by_name",
+
+     # Constants
+     "SUPPORTED_LANGUAGES",
+     "TOOL_CATEGORIES"
+ ]
+
+ # Constants
+ SUPPORTED_LANGUAGES = [
+     "en", "es", "fr", "de", "it", "pt", "ru", "zh",
+     "ja", "ko", "hi", "ar", "bn", "pa", "te", "mr",
+     "ta", "ur", "gu", "kn", "ml", "or"
+ ]
+
+ TOOL_CATEGORIES = {
+     "VOICE": ["speech_to_text", "text_to_speech", "language_detection"],
+     "LLM": ["generate_response", "generate_questions", "summarize", "paraphrase"],
+     "KNOWLEDGE": ["search_knowledge", "extract_wisdom", "find_practices"],
+     "VALIDATION": ["validate_response", "check_safety", "analyze_tone"]
+ }
+
+ # Factory functions
+ def create_tool_suite(config) -> Dict[str, Any]:
+     """
+     Create a complete suite of tools for all agents
+
+     Args:
+         config: Configuration object
+
+     Returns:
+         dict: Dictionary of initialized tools
+     """
+     from .voice_tools import VoiceTools
+     from .llm_tools import LLMTools
+     from .knowledge_tools import KnowledgeTools
+     from .validation_tools import ValidationTools
+
+     tools = {
+         "voice": VoiceTools(config),
+         "llm": LLMTools(config),
+         "knowledge": KnowledgeTools(config),
+         "validation": ValidationTools(config)
+     }
+
+     return tools
+
+ def get_tool_by_name(tool_name: str, config):
+     """
+     Get a specific tool by name
+
+     Args:
+         tool_name: Name of the tool
+         config: Configuration object
+
+     Returns:
+         Tool instance or None
+     """
+     tool_mapping = {
+         "voice": lambda c: __import__("agents.tools.voice_tools", fromlist=["VoiceTools"]).VoiceTools(c),
+         "llm": lambda c: __import__("agents.tools.llm_tools", fromlist=["LLMTools"]).LLMTools(c),
+         "knowledge": lambda c: __import__("agents.tools.knowledge_tools", fromlist=["KnowledgeTools"]).KnowledgeTools(c),
+         "validation": lambda c: __import__("agents.tools.validation_tools", fromlist=["ValidationTools"]).ValidationTools(c)
+     }
+
+     tool_factory = tool_mapping.get(tool_name.lower())
+     if tool_factory:
+         return tool_factory(config)
+     return None
+
+ # Tool registry for CrewAI
+ def register_tools_with_crew():
+     """
+     Register all tools with CrewAI framework
+     Returns a list of tool configurations for CrewAI
+     """
+     tool_configs = [
+         {
+             "name": "speech_to_text",
+             "description": "Convert speech in any language to text",
+             "category": "VOICE"
+         },
+         {
+             "name": "text_to_speech",
+             "description": "Convert text to natural speech in multiple languages",
+             "category": "VOICE"
+         },
+         {
+             "name": "search_knowledge",
+             "description": "Search through spiritual and self-help texts",
+             "category": "KNOWLEDGE"
+         },
+         {
+             "name": "generate_response",
+             "description": "Generate empathetic and helpful responses",
+             "category": "LLM"
+         },
+         {
+             "name": "validate_response",
+             "description": "Ensure response safety and appropriateness",
+             "category": "VALIDATION"
+         }
+     ]
+
+     return tool_configs
+
+ # Initialization check
+ import os
+ if os.getenv("DEBUG_MODE", "false").lower() == "true":
+     print(f"Agent Tools module v{__version__} initialized")
+     print(f"Supported languages: {len(SUPPORTED_LANGUAGES)}")
+     print(f"Tool categories: {list(TOOL_CATEGORIES.keys())}")
agents/tools/knowledge_tools.py ADDED
@@ -0,0 +1,105 @@
+ """
+ Knowledge Base Tools for RAG (modular class version)
+ """
+ from utils.knowledge_base import KnowledgeBase
+
+ class KnowledgeTools:
+     def __init__(self, config=None):
+         self.config = config
+         self.kb = KnowledgeBase()
+
+     def search_knowledge(self, query: str, k: int = 5):
+         """Search spiritual and self-help texts for relevant wisdom."""
+         if not self.kb.is_initialized():
+             return [{
+                 "text": "Wisdom comes from understanding ourselves.",
+                 "source": "General Wisdom",
+                 "score": 1.0
+             }]
+         return self.kb.search(query, k=k)
+
+     def extract_wisdom(self, search_results: list, user_context: dict):
+         """Extract most relevant wisdom for user's situation."""
+         emotion = user_context.get("primary_emotion", "neutral")
+         concerns = user_context.get("concerns", [])
+         scored_results = []
+         for result in search_results:
+             score = result["score"]
+             # Boost score if emotion matches
+             if emotion.lower() in result["text"].lower():
+                 score *= 1.5
+             # Boost score if concerns match
+             for concern in concerns:
+                 if concern.lower() in result["text"].lower():
+                     score *= 1.3
+             result["relevance_score"] = score
+             scored_results.append(result)
+         scored_results.sort(key=lambda x: x["relevance_score"], reverse=True)
+         return scored_results[:3]
+
+     def suggest_practices(self, emotional_state: str, cultural_context: str = None):
+         """Suggest appropriate meditation or practice."""
+         practices = {
+             "anxiety": {
+                 "name": "Box Breathing Technique",
+                 "description": "A powerful technique used by Navy SEALs to calm anxiety",
+                 "steps": [
+                     "Sit comfortably with back straight",
+                     "Exhale all air from your lungs",
+                     "Inhale through nose for 4 counts",
+                     "Hold breath for 4 counts",
+                     "Exhale through mouth for 4 counts",
+                     "Hold empty for 4 counts",
+                     "Repeat 4-8 times"
+                 ],
+                 "benefits": "Activates parasympathetic nervous system, reduces cortisol",
+                 "duration": "5-10 minutes",
+                 "origin": "Modern breathwork"
+             },
+             "sadness": {
+                 "name": "Metta (Loving-Kindness) Meditation",
+                 "description": "Ancient Buddhist practice to cultivate compassion",
+                 "steps": [
+                     "Sit comfortably, close your eyes",
+                     "Place hand on heart",
+                     "Begin with self: 'May I be happy, may I be peaceful'",
+                     "Extend to loved ones",
+                     "Include neutral people",
+                     "Embrace difficult people",
+                     "Radiate to all beings"
+                 ],
+                 "benefits": "Increases self-compassion, reduces depression",
+                 "duration": "15-20 minutes",
+                 "origin": "Buddhist tradition"
+             },
+             "stress": {
+                 "name": "Progressive Muscle Relaxation",
+                 "description": "Systematic tension and release technique",
+                 "steps": [
+                     "Lie down comfortably",
+                     "Start with toes - tense for 5 seconds",
+                     "Release suddenly, notice relaxation",
+                     "Move up through each muscle group",
+                     "Face and scalp last",
+                     "Rest in full body relaxation"
+                 ],
+                 "benefits": "Reduces physical tension, improves sleep",
+                 "duration": "15-20 minutes",
+                 "origin": "Dr. Edmund Jacobson, 1920s"
+             }
+         }
+         default = {
+             "name": "Mindful Breathing",
+             "description": "Foundation of all meditation practices",
+             "steps": [
+                 "Sit comfortably",
+                 "Follow natural breath",
+                 "Count breaths 1-10",
+                 "Start again when distracted",
+                 "No judgment, just awareness"
+             ],
+             "benefits": "Calms mind, improves focus",
+             "duration": "5-15 minutes",
+             "origin": "Universal practice"
+         }
+         return practices.get(emotional_state.lower(), default)
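The keyword-boost ranking inside `extract_wisdom` can be seen in isolation with toy data (the results and context below are made up for illustration; `rank_results` is a standalone copy of the scoring loop):

```python
# Toy demonstration of the keyword-boost ranking used by extract_wisdom.
# The search results and user context are made up for illustration.

def rank_results(search_results, emotion, concerns):
    for result in search_results:
        score = result["score"]
        if emotion.lower() in result["text"].lower():
            score *= 1.5          # boost when the emotion word appears
        for concern in concerns:
            if concern.lower() in result["text"].lower():
                score *= 1.3      # additional boost per matching concern
        result["relevance_score"] = score
    return sorted(search_results, key=lambda r: r["relevance_score"], reverse=True)

results = [
    {"text": "On anxiety and the breath", "score": 0.8},
    {"text": "On duty and action", "score": 0.9},
]
ranked = rank_results(results, emotion="anxiety", concerns=["breath"])
print(ranked[0]["text"])  # prints "On anxiety and the breath"
```

Here the first passage starts with a lower retrieval score (0.8 vs. 0.9) but ends up ranked first because it matches both the emotion (×1.5) and a concern (×1.3), giving 0.8 × 1.5 × 1.3 = 1.56.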
agents/tools/knowledge_tools_old.py ADDED
@@ -0,0 +1,131 @@
+ """
+ Knowledge Base Tools for RAG
+ """
+
+ from crewai.tools import BaseTool
+ from utils.knowledge_base import KnowledgeBase
+ from typing import List, Dict
+
+ class SearchKnowledgeTool(BaseTool):
+     name: str = "search_spiritual_texts"
+     description: str = "Search spiritual and self-help texts for relevant wisdom"
+
+     def __init__(self):
+         super().__init__()
+         self.kb = KnowledgeBase()
+
+     def _run(self, query: str, k: int = 5) -> List[Dict]:
+         """Search knowledge base"""
+         if not self.kb.is_initialized():
+             return [{
+                 "text": "Wisdom comes from understanding ourselves.",
+                 "source": "General Wisdom",
+                 "score": 1.0
+             }]
+
+         results = self.kb.search(query, k=k)
+         return results
+
+ class ExtractWisdomTool(BaseTool):
+     name: str = "extract_relevant_wisdom"
+     description: str = "Extract most relevant wisdom for user's situation"
+
+     def _run(self, search_results: List[Dict], user_context: Dict) -> List[Dict]:
+         """Filter and rank wisdom based on relevance"""
+         emotion = user_context.get("primary_emotion", "neutral")
+         concerns = user_context.get("concerns", [])
+
+         # Score each result based on relevance
+         scored_results = []
+         for result in search_results:
+             score = result["score"]
+
+             # Boost score if emotion matches
+             if emotion.lower() in result["text"].lower():
+                 score *= 1.5
+
+             # Boost score if concerns match
+             for concern in concerns:
+                 if concern.lower() in result["text"].lower():
+                     score *= 1.3
+
+             result["relevance_score"] = score
+             scored_results.append(result)
+
+         # Sort by relevance
+         scored_results.sort(key=lambda x: x["relevance_score"], reverse=True)
+
+         return scored_results[:3]
+
+ class SuggestPracticeTool(BaseTool):
+     name: str = "suggest_meditation_practice"
+     description: str = "Suggest appropriate meditation or practice"
+
+     def _run(self, emotional_state: str, cultural_context: str = None) -> Dict:
+         """Suggest practice based on emotional state"""
+         practices = {
+             "anxiety": {
+                 "name": "Box Breathing Technique",
+                 "description": "A powerful technique used by Navy SEALs to calm anxiety",
+                 "steps": [
+                     "Sit comfortably with back straight",
+                     "Exhale all air from your lungs",
+                     "Inhale through nose for 4 counts",
+                     "Hold breath for 4 counts",
+                     "Exhale through mouth for 4 counts",
+                     "Hold empty for 4 counts",
+                     "Repeat 4-8 times"
+                 ],
+                 "benefits": "Activates parasympathetic nervous system, reduces cortisol",
+                 "duration": "5-10 minutes",
+                 "origin": "Modern breathwork"
+             },
+             "sadness": {
+                 "name": "Metta (Loving-Kindness) Meditation",
+                 "description": "Ancient Buddhist practice to cultivate compassion",
+                 "steps": [
+                     "Sit comfortably, close your eyes",
+                     "Place hand on heart",
+                     "Begin with self: 'May I be happy, may I be peaceful'",
+                     "Extend to loved ones",
+                     "Include neutral people",
+                     "Embrace difficult people",
+                     "Radiate to all beings"
+                 ],
+                 "benefits": "Increases self-compassion, reduces depression",
+                 "duration": "15-20 minutes",
+                 "origin": "Buddhist tradition"
+             },
+             "stress": {
+                 "name": "Progressive Muscle Relaxation",
+                 "description": "Systematic tension and release technique",
+                 "steps": [
+                     "Lie down comfortably",
+                     "Start with toes - tense for 5 seconds",
+                     "Release suddenly, notice relaxation",
+                     "Move up through each muscle group",
+                     "Face and scalp last",
+                     "Rest in full body relaxation"
+                 ],
+                 "benefits": "Reduces physical tension, improves sleep",
+                 "duration": "15-20 minutes",
+                 "origin": "Dr. Edmund Jacobson, 1920s"
+             }
+         }
+
+         default = {
+             "name": "Mindful Breathing",
+             "description": "Foundation of all meditation practices",
+             "steps": [
+                 "Sit comfortably",
+                 "Follow natural breath",
+                 "Count breaths 1-10",
+                 "Start again when distracted",
124
+ "No judgment, just awareness"
125
+ ],
126
+ "benefits": "Calms mind, improves focus",
127
+ "duration": "5-15 minutes",
128
+ "origin": "Universal practice"
129
+ }
130
+
131
+ return practices.get(emotional_state.lower(), default)
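The boost-and-truncate scoring in `ExtractWisdomTool._run` can be exercised standalone. A minimal sketch, with hypothetical quotes and a hypothetical `rank_wisdom` helper mirroring the tool's logic (no CrewAI dependency):

```python
# Sketch of ExtractWisdomTool's ranking: boost a result's base score when the
# user's emotion (x1.5) or a concern (x1.3) appears in the text, keep top 3.
def rank_wisdom(search_results, user_context):
    emotion = user_context.get("primary_emotion", "neutral")
    concerns = user_context.get("concerns", [])
    scored = []
    for result in search_results:
        score = result["score"]
        if emotion.lower() in result["text"].lower():
            score *= 1.5  # emotion match boost
        for concern in concerns:
            if concern.lower() in result["text"].lower():
                score *= 1.3  # concern match boost
        scored.append({**result, "relevance_score": score})
    scored.sort(key=lambda r: r["relevance_score"], reverse=True)
    return scored[:3]

results = [
    {"text": "Anxiety fades when attention rests on the breath.", "score": 0.8},
    {"text": "Work is love made visible.", "score": 0.9},
]
ranked = rank_wisdom(results, {"primary_emotion": "anxiety", "concerns": ["work"]})
print(ranked[0]["text"])  # the anxiety quote wins: 0.8 * 1.5 = 1.2 > 0.9 * 1.3 = 1.17
```

Note that a lower-scored match can outrank a higher-scored one once the emotion boost applies, which is the intended behavior.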
agents/tools/llm_tools.py ADDED
@@ -0,0 +1,54 @@
+ """
+ Mistral LLM Tools for CrewAI (modular class version)
+ """
+ from models.mistral_model import MistralModel
+
+ class LLMTools:
+     def __init__(self, config=None):
+         self.config = config
+         self.model = MistralModel()
+
+     def mistral_chat(self, prompt: str, context: dict = None) -> str:
+         """Chat with Mistral AI for intelligent responses."""
+         if context:
+             full_prompt = f"""
+             Context: {context}
+             User Query: {prompt}
+             Provide a thoughtful, compassionate response.
+             """
+         else:
+             full_prompt = prompt
+         return self.model.generate(full_prompt)
+
+     def generate_advice(self, user_analysis: dict, wisdom_quotes: list) -> str:
+         """Generate personalized advice based on user's situation."""
+         prompt = f"""
+         Based on this user analysis:
+         - Emotional state: {user_analysis.get('primary_emotion')}
+         - Concerns: {user_analysis.get('concerns')}
+         - Needs: {user_analysis.get('needs')}
+         And these relevant wisdom quotes:
+         {wisdom_quotes}
+         Generate compassionate, personalized advice that:
+         1. Acknowledges their feelings
+         2. Offers practical guidance
+         3. Includes relevant wisdom
+         4. Suggests actionable steps
+         5. Maintains hope and encouragement
+         Be specific to their situation, not generic.
+         """
+         return self.model.generate(prompt, max_length=500)
+
+     def summarize_conversation(self, conversation: list) -> str:
+         """Summarize conversation maintaining key insights."""
+         prompt = f"""
+         Summarize this coaching conversation:
+         {conversation}
+         Include:
+         1. Main concerns discussed
+         2. Key insights shared
+         3. Progress made
+         4. Next steps suggested
+         Keep it concise but meaningful.
+         """
+         return self.model.generate(prompt, max_length=200)
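`LLMTools.mistral_chat` wraps the user query in a context preamble before calling the model. A minimal sketch of that prompt assembly, using a hypothetical `StubModel` in place of `MistralModel` (which needs real weights or API access):

```python
# Sketch of the context-wrapping in LLMTools.mistral_chat, with a stub model
# so the prompt-assembly logic can be run and inspected in isolation.
class StubModel:
    def generate(self, prompt, max_length=None):
        # Echo a marker so we can see what reached the model.
        return f"[{len(prompt)} chars] ok"

class LLMToolsSketch:
    def __init__(self):
        self.model = StubModel()

    def mistral_chat(self, prompt, context=None):
        if context:
            # Context present: prepend it, then the query, then the instruction.
            full_prompt = (
                f"Context: {context}\n"
                f"User Query: {prompt}\n"
                "Provide a thoughtful, compassionate response."
            )
        else:
            # No context: the prompt is passed through unchanged.
            full_prompt = prompt
        return self.model.generate(full_prompt)

tools = LLMToolsSketch()
print(tools.mistral_chat("I feel stuck at work", context={"primary_emotion": "anxiety"}))
print(tools.mistral_chat("hi"))  # → "[2 chars] ok" (no wrapping applied)
```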
agents/tools/llm_tools_old.py ADDED
@@ -0,0 +1,86 @@
+ """
+ Mistral LLM Tools for CrewAI
+ """
+
+ from crewai.tools import BaseTool
+ from models.mistral_model import MistralModel
+ from typing import Dict, List
+
+ class MistralChatTool(BaseTool):
+     name: str = "mistral_chat"
+     description: str = "Chat with Mistral AI for intelligent responses"
+
+     def __init__(self):
+         super().__init__()
+         self.model = MistralModel()
+
+     def _run(self, prompt: str, context: dict = None) -> str:
+         """Generate response using Mistral"""
+         if context:
+             full_prompt = f"""
+             Context: {context}
+
+             User Query: {prompt}
+
+             Provide a thoughtful, compassionate response.
+             """
+         else:
+             full_prompt = prompt
+
+         return self.model.generate(full_prompt)
+
+ class GenerateAdviceTool(BaseTool):
+     name: str = "generate_personalized_advice"
+     description: str = "Generate personalized advice based on user's situation"
+
+     def __init__(self):
+         super().__init__()
+         self.model = MistralModel()
+
+     def _run(self, user_analysis: dict, wisdom_quotes: list) -> str:
+         """Generate personalized advice"""
+         prompt = f"""
+         Based on this user analysis:
+         - Emotional state: {user_analysis.get('primary_emotion')}
+         - Concerns: {user_analysis.get('concerns')}
+         - Needs: {user_analysis.get('needs')}
+
+         And these relevant wisdom quotes:
+         {wisdom_quotes}
+
+         Generate compassionate, personalized advice that:
+         1. Acknowledges their feelings
+         2. Offers practical guidance
+         3. Includes relevant wisdom
+         4. Suggests actionable steps
+         5. Maintains hope and encouragement
+
+         Be specific to their situation, not generic.
+         """
+
+         return self.model.generate(prompt, max_length=500)
+
+ class SummarizeTool(BaseTool):
+     name: str = "summarize_conversation"
+     description: str = "Summarize conversation maintaining key insights"
+
+     def __init__(self):
+         super().__init__()
+         self.model = MistralModel()
+
+     def _run(self, conversation: list) -> str:
+         """Summarize conversation history"""
+         prompt = f"""
+         Summarize this coaching conversation:
+         {conversation}
+
+         Include:
+         1. Main concerns discussed
+         2. Key insights shared
+         3. Progress made
+         4. Next steps suggested
+
+         Keep it concise but meaningful.
+         """
+
+         return self.model.generate(prompt, max_length=200)
agents/tools/validation_tools.py ADDED
@@ -0,0 +1,397 @@
+ """
+ Response validation tools for ensuring safe and appropriate responses
+ """
+
+ import re
+ from typing import Dict, List, Tuple, Optional, Any
+ from dataclasses import dataclass
+ import json
+ from transformers import pipeline
+ import torch
+
+ @dataclass
+ class ValidationResult:
+     """Result of validation check"""
+     is_valid: bool
+     issues: List[str]
+     warnings: List[str]
+     suggestions: List[str]
+     confidence: float
+     refined_text: Optional[str] = None
+
+ class ValidationTools:
+     """Tools for validating responses and ensuring safety"""
+
+     def __init__(self, config):
+         self.config = config
+
+         # Initialize sentiment analyzer for tone checking
+         self.sentiment_analyzer = pipeline(
+             "sentiment-analysis",
+             model="nlptown/bert-base-multilingual-uncased-sentiment",
+             device=0 if torch.cuda.is_available() else -1
+         )
+
+         # Prohibited patterns for different categories
+         self.prohibited_patterns = {
+             'medical': [
+                 r'\b(?:diagnos|prescrib|medicat|cure|treat|therap)\w*\b',
+                 r'\b(?:disease|illness|disorder|syndrome)\s+(?:is|are|can be)\b',
+                 r'\b(?:take|consume|dose|dosage)\s+\d+\s*(?:mg|ml|pill|tablet)',
+                 r'\b(?:medical|clinical|physician|doctor)\s+(?:advice|consultation|opinion)',
+             ],
+             'legal': [
+                 r'\b(?:legal advice|lawsuit|sue|court|litigation)\b',
+                 r'\b(?:illegal|unlawful|crime|criminal|prosecut)\w*\b',
+                 r'\b(?:you should|must|have to)\s+(?:sign|agree|consent|contract)',
+                 r'\b(?:rights|obligations|liability|damages)\s+(?:are|include)\b',
+             ],
+             'financial': [
+                 r'\b(?:invest|buy|sell|trade)\s+(?:stock|crypto|bitcoin|forex)\b',
+                 r'\b(?:guaranteed|promise)\s+(?:return|profit|income|earnings)\b',
+                 r'\b(?:financial advisor|investment advice|trading strategy)\b',
+                 r'\b(?:tax|accounting|financial planning)\s+(?:advice|consultation)',
+             ],
+             'harmful': [
+                 r'\b(?:suicide|suicidal|kill\s+(?:your|my)self|end\s+(?:it|life))\b',
+                 r'\b(?:self[\-\s]?harm|hurt\s+(?:your|my)self|cutting)\b',
+                 r'\b(?:violence|violent|weapon|attack|assault)\b',
+                 r'\b(?:hate|discriminat|racist|sexist|homophobic)\b',
+             ],
+             'absolute': [
+                 r'\b(?:always|never|every|all|none|no one|everyone)\s+(?:will|must|should|is|are)\b',
+                 r'\b(?:definitely|certainly|guaranteed|assured|promise)\b',
+                 r'\b(?:only way|only solution|must do|have to)\b',
+             ]
+         }
+
+         # Required elements for supportive responses
+         self.supportive_elements = {
+             'empathy': [
+                 'understand', 'hear', 'feel', 'acknowledge', 'recognize',
+                 'appreciate', 'empathize', 'relate', 'comprehend'
+             ],
+             'validation': [
+                 'valid', 'normal', 'understandable', 'natural', 'okay',
+                 'reasonable', 'makes sense', 'legitimate'
+             ],
+             'support': [
+                 'support', 'help', 'here for you', 'together', 'alongside',
+                 'assist', 'guide', 'accompany', 'with you'
+             ],
+             'hope': [
+                 'can', 'possible', 'able', 'capable', 'potential',
+                 'opportunity', 'growth', 'improve', 'better', 'progress'
+             ],
+             'empowerment': [
+                 'choice', 'decide', 'control', 'power', 'strength',
+                 'agency', 'capable', 'resource', 'ability'
+             ]
+         }
+
+         # Crisis indicators
+         self.crisis_indicators = [
+             r'\b(?:want|going|plan)\s+to\s+(?:die|kill|end)\b',
+             r'\b(?:no reason|point|hope)\s+(?:to|in)\s+(?:live|living|life)\b',
+             r'\b(?:better off|world)\s+without\s+me\b',
+             r'\bsuicide\s+(?:plan|method|attempt)\b',
+             r'\b(?:final|last)\s+(?:goodbye|letter|message)\b'
+         ]
+
+         # Tone indicators
+         self.negative_tone_words = [
+             'stupid', 'idiot', 'dumb', 'pathetic', 'worthless',
+             'loser', 'failure', 'weak', 'incompetent', 'useless'
+         ]
+
+         self.dismissive_phrases = [
+             'just get over it', 'stop complaining', 'not a big deal',
+             'being dramatic', 'overreacting', 'too sensitive'
+         ]
+
+     def validate_response(self, response: str, context: Dict[str, Any] = None) -> ValidationResult:
+         """Comprehensive validation of response"""
+         issues = []
+         warnings = []
+         suggestions = []
+
+         # Check for prohibited content
+         prohibited_check = self._check_prohibited_content(response)
+         if prohibited_check["found"]:
+             issues.extend(prohibited_check["violations"])
+             suggestions.extend(prohibited_check["suggestions"])
+
+         # Check tone and sentiment
+         tone_check = self._check_tone(response)
+         if not tone_check["appropriate"]:
+             warnings.extend(tone_check["issues"])
+             suggestions.extend(tone_check["suggestions"])
+
+         # Check for supportive elements
+         support_check = self._check_supportive_elements(response)
+         if support_check["missing"]:
+             warnings.append(f"Missing supportive elements: {', '.join(support_check['missing'])}")
+             suggestions.extend(support_check["suggestions"])
+
+         # Check for crisis content in context
+         if context and context.get("user_input"):
+             crisis_check = self._check_crisis_indicators(context["user_input"])
+             if crisis_check["is_crisis"] and "crisis" not in response.lower():
+                 warnings.append("User may be in crisis but response doesn't address this")
+                 suggestions.append("Include crisis resources and immediate support options")
+
+         # Calculate overall confidence
+         confidence = self._calculate_confidence(issues, warnings)
+
+         # Generate refined response if needed
+         refined_text = None
+         if issues or (warnings and confidence < 0.7):
+             refined_text = self._refine_response(response, issues, warnings, suggestions)
+
+         return ValidationResult(
+             is_valid=len(issues) == 0,
+             issues=issues,
+             warnings=warnings,
+             suggestions=suggestions,
+             confidence=confidence,
+             refined_text=refined_text
+         )
+
+     def _check_prohibited_content(self, text: str) -> Dict[str, Any]:
+         """Check for prohibited content patterns"""
+         found_violations = []
+         suggestions = []
+
+         for category, patterns in self.prohibited_patterns.items():
+             for pattern in patterns:
+                 if re.search(pattern, text, re.IGNORECASE):
+                     found_violations.append(f"Contains {category} advice/content")
+
+                     # Add specific suggestions
+                     if category == "medical":
+                         suggestions.append("Replace with: 'Consider speaking with a healthcare professional'")
+                     elif category == "legal":
+                         suggestions.append("Replace with: 'For legal matters, consult with a qualified attorney'")
+                     elif category == "financial":
+                         suggestions.append("Replace with: 'For financial decisions, consider consulting a financial advisor'")
+                     elif category == "harmful":
+                         suggestions.append("Include crisis resources and express immediate concern for safety")
+                     elif category == "absolute":
+                         suggestions.append("Use qualifying language like 'often', 'might', 'could' instead of absolutes")
+                     break
+
+         return {
+             "found": len(found_violations) > 0,
+             "violations": found_violations,
+             "suggestions": suggestions
+         }
+
+     def _check_tone(self, text: str) -> Dict[str, Any]:
+         """Check the tone and sentiment of the response"""
+         issues = []
+         suggestions = []
+
+         # Check sentiment
+         try:
+             sentiment_result = self.sentiment_analyzer(text[:512])[0]  # Limit length for model
+             sentiment_score = sentiment_result['score']
+             sentiment_label = sentiment_result['label']
+
+             # Check if too negative
+             if '1' in sentiment_label or '2' in sentiment_label:  # 1-2 stars = negative
+                 issues.append("Response tone is too negative")
+                 suggestions.append("Add more supportive and hopeful language")
+         except Exception:
+             pass
+
+         # Check for negative words
+         text_lower = text.lower()
+         found_negative = [word for word in self.negative_tone_words if word in text_lower]
+         if found_negative:
+             issues.append(f"Contains negative/judgmental language: {', '.join(found_negative)}")
+             suggestions.append("Replace judgmental terms with supportive language")
+
+         # Check for dismissive phrases
+         found_dismissive = [phrase for phrase in self.dismissive_phrases if phrase in text_lower]
+         if found_dismissive:
+             issues.append("Contains dismissive language")
+             suggestions.append("Acknowledge and validate the person's feelings instead")
+
+         return {
+             "appropriate": len(issues) == 0,
+             "issues": issues,
+             "suggestions": suggestions
+         }
+
+     def _check_supportive_elements(self, text: str) -> Dict[str, Any]:
+         """Check for presence of supportive elements"""
+         text_lower = text.lower()
+         missing_elements = []
+         suggestions = []
+
+         element_scores = {}
+         for element, keywords in self.supportive_elements.items():
+             found = any(keyword in text_lower for keyword in keywords)
+             element_scores[element] = found
+             if not found:
+                 missing_elements.append(element)
+
+         # Generate suggestions for missing elements
+         if 'empathy' in missing_elements:
+             suggestions.append("Add empathetic language like 'I understand how difficult this must be'")
+         if 'validation' in missing_elements:
+             suggestions.append("Validate their feelings with phrases like 'Your feelings are completely valid'")
+         if 'support' in missing_elements:
+             suggestions.append("Express support with 'I'm here to support you through this'")
+         if 'hope' in missing_elements:
+             suggestions.append("Include hopeful elements about growth and positive change")
+         if 'empowerment' in missing_elements:
+             suggestions.append("Emphasize their agency and ability to make choices")
+
+         return {
+             "missing": missing_elements,
+             "present": [k for k, v in element_scores.items() if v],
+             "suggestions": suggestions
+         }
+
+     def _check_crisis_indicators(self, text: str) -> Dict[str, Any]:
+         """Check for crisis indicators in text"""
+         for pattern in self.crisis_indicators:
+             if re.search(pattern, text, re.IGNORECASE):
+                 return {
+                     "is_crisis": True,
+                     "pattern_matched": pattern,
+                     "action": "Immediate crisis response needed"
+                 }
+
+         return {"is_crisis": False}
+
+     def _calculate_confidence(self, issues: List[str], warnings: List[str]) -> float:
+         """Calculate confidence score for validation"""
+         if issues:
+             # Major issues severely impact confidence; clamp so it never goes negative
+             return max(0.0, 0.3 - 0.1 * len(issues))
+
+         confidence = 1.0
+         confidence -= 0.1 * len(warnings)  # Each warning reduces confidence
+
+         return max(0.0, confidence)
+
+     def _refine_response(self, response: str, issues: List[str], warnings: List[str], suggestions: List[str]) -> str:
+         """Attempt to refine the response based on issues found"""
+         refined = response
+
+         # Add disclaimer for professional advice
+         if any('advice' in issue for issue in issues):
+             disclaimer = "\n\n*Please note: I'm here to provide support and guidance, but for specific professional matters, it's important to consult with qualified professionals.*"
+             if disclaimer not in refined:
+                 refined += disclaimer
+
+         # Add crisis resources if needed
+         if any('crisis' in warning for warning in warnings):
+             crisis_text = "\n\n**If you're in crisis, please reach out for immediate help:**\n- Crisis Hotline: 988 (US)\n- Crisis Text Line: Text HOME to 741741\n- International: findahelpline.com"
+             if crisis_text not in refined:
+                 refined += crisis_text
+
+         # Add supportive closing if missing hope
+         if any('hope' in warning for warning in warnings):
+             hopeful_closing = "\n\nRemember, you have the strength to navigate this challenge, and positive change is possible. I'm here to support you on this journey."
+             if not any(phrase in refined.lower() for phrase in ['journey', 'strength', 'possible']):
+                 refined += hopeful_closing
+
+         return refined
+
+     def validate_user_input(self, text: str) -> ValidationResult:
+         """Validate user input for safety and process-ability"""
+         issues = []
+         warnings = []
+         suggestions = []
+
+         # Check if empty
+         if not text or not text.strip():
+             issues.append("Empty input received")
+             suggestions.append("Please share what's on your mind")
+             return ValidationResult(False, issues, warnings, suggestions, 0.0)
+
+         # Check length
+         if len(text) > 5000:
+             warnings.append("Input is very long")
+             suggestions.append("Consider breaking this into smaller parts")
+
+         # Check for crisis indicators
+         crisis_check = self._check_crisis_indicators(text)
+         if crisis_check["is_crisis"]:
+             warnings.append("Crisis indicators detected")
+             suggestions.append("Prioritize safety and provide crisis resources")
+
+         # Check for spam/repetition
+         if self._is_spam(text):
+             issues.append("Input appears to be spam or repetitive")
+             suggestions.append("Please share genuine thoughts or concerns")
+
+         confidence = self._calculate_confidence(issues, warnings)
+
+         return ValidationResult(
+             is_valid=len(issues) == 0,
+             issues=issues,
+             warnings=warnings,
+             suggestions=suggestions,
+             confidence=confidence
+         )
+
+     def _is_spam(self, text: str) -> bool:
+         """Simple spam detection"""
+         # Check for excessive repetition
+         words = text.lower().split()
+         if len(words) > 10:
+             unique_ratio = len(set(words)) / len(words)
+             if unique_ratio < 0.3:  # Less than 30% unique words
+                 return True
+
+         # Check for common spam patterns
+         spam_patterns = [
+             r'(?:buy|sell|click|visit)\s+(?:now|here|this)',
+             r'(?:congratulations|winner|prize|lottery)',
+             r'(?:viagra|pills|drugs|pharmacy)',
+             r'(?:\$\$|money\s+back|guarantee)'  # escape the dollar signs; bare $$ only matches end-of-string
+         ]
+
+         for pattern in spam_patterns:
+             if re.search(pattern, text, re.IGNORECASE):
+                 return True
+
+         return False
+
+     def get_crisis_resources(self, location: str = "global") -> Dict[str, Any]:
+         """Get crisis resources based on location"""
+         resources = {
+             "global": {
+                 "name": "International Association for Suicide Prevention",
+                 "url": "https://www.iasp.info/resources/Crisis_Centres/",
+                 "text": "Find crisis centers worldwide"
+             },
+             "us": {
+                 "name": "988 Suicide & Crisis Lifeline",
+                 "phone": "988",
+                 "text": "Text HOME to 741741",
+                 "url": "https://988lifeline.org/"
+             },
+             "uk": {
+                 "name": "Samaritans",
+                 "phone": "116 123",
+                 "email": "jo@samaritans.org",
+                 "url": "https://www.samaritans.org/"
+             },
+             "india": {
+                 "name": "National Suicide Prevention Helpline",
+                 "phone": "91-9820466726",
+                 "additional": "Vandrevala Foundation: 9999666555"
+             },
+             "australia": {
+                 "name": "Lifeline",
+                 "phone": "13 11 14",
+                 "text": "Text 0477 13 11 14",
+                 "url": "https://www.lifeline.org.au/"
+             }
+         }
+
+         return resources.get(location.lower(), resources["global"])
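The unique-word-ratio heuristic inside `_is_spam` can be isolated and tested without loading any models. A minimal sketch, with a hypothetical `looks_like_spam` helper mirroring that check:

```python
# Sketch of _is_spam's repetition check: inputs longer than ten words with
# under 30% unique words are flagged as spam; shorter inputs pass through.
def looks_like_spam(text: str) -> bool:
    words = text.lower().split()
    if len(words) > 10:
        unique_ratio = len(set(words)) / len(words)
        return unique_ratio < 0.3
    return False

print(looks_like_spam("buy " * 20))  # True: 1 unique word out of 20 (ratio 0.05)
print(looks_like_spam("I have been feeling anxious about my new job lately"))  # False: only 10 words
```

The `len(words) > 10` guard matters: genuine short messages ("help me please help") would otherwise trip the ratio test.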
agents/tools/voice_tools.py ADDED
@@ -0,0 +1,93 @@
+ import os
+ import numpy as np
+ import torch
+ from transformers import pipeline, AutoProcessor, AutoModelForSpeechSeq2Seq
+ import asyncio
+ import soundfile as sf
+ import tempfile  # for writing audio to a temporary wav file
+ from models.mistral_model import MistralModel
+
+ class MultilingualVoiceProcessor:
+     def __init__(self, model_name="openai/whisper-base", device=None):
+         cache_dir = os.getenv("TRANSFORMERS_CACHE", None)
+         if device is None:
+             device = 0 if torch.cuda.is_available() else -1
+
+         # Load model and processor with cache_dir
+         processor = AutoProcessor.from_pretrained(model_name, cache_dir=cache_dir)
+         model = AutoModelForSpeechSeq2Seq.from_pretrained(model_name, cache_dir=cache_dir)
+
+         # Build the pipeline; cache_dir is not a valid pipeline argument, and the
+         # tokenizer/feature extractor must be passed individually, not as the processor
+         self.pipe = pipeline(
+             "automatic-speech-recognition",
+             model=model,
+             tokenizer=processor.tokenizer,
+             feature_extractor=processor.feature_extractor,
+             device=device,
+             generate_kwargs={"task": "transcribe", "return_timestamps": False},
+         )
+
+     async def transcribe(self, audio_data: np.ndarray, language: str = None):
+         with tempfile.NamedTemporaryFile(suffix=".wav", delete=True) as tmp_wav:
+             sf.write(tmp_wav.name, audio_data, samplerate=16000)
+             extra = {"language": language} if language else {}
+             result = self.pipe(tmp_wav.name, **extra)
+             text = result['text']
+             return text, language or "unknown"
+
+     async def synthesize(self, text, language: str = "en", voice_type: str = "normal"):
+         raise NotImplementedError("Use gTTS or edge-tts as before.")
+
+ class VoiceTools:
+     def __init__(self, config=None):
+         self.config = config
+         self.vp = MultilingualVoiceProcessor()
+
+     def transcribe_audio(self, audio_data: np.ndarray, language=None):
+         text, detected_lang = asyncio.run(self.vp.transcribe(audio_data, language))
+         return {"text": text, "language": detected_lang}
+
+     def detect_emotion(self, text: str) -> dict:
+         model = MistralModel()
+         prompt = f"""
+         Analyze the emotional state in this text: "{text}"
+         Identify:
+         1. Primary emotion (joy, sadness, anger, fear, anxiety, confusion, etc.)
+         2. Emotional intensity (low, medium, high)
+         3. Underlying feelings
+         4. Key concerns
+         Format as JSON with keys: primary_emotion, intensity, feelings, concerns
+         """
+         response = model.generate(prompt)
+         # TODO: actually parse the model response; placeholder values for now
+         return {
+             "primary_emotion": "detected_emotion",
+             "intensity": "medium",
+             "feelings": ["feeling1", "feeling2"],
+             "concerns": ["concern1", "concern2"]
+         }
+
+     def generate_reflective_questions(self, context: dict) -> list:
+         emotion = context.get("primary_emotion", "neutral")
+         questions_map = {
+             "anxiety": [
+                 "What specific thoughts are creating this anxiety?",
+                 "What would feeling calm look like in this situation?",
+                 "What has helped you manage anxiety before?"
+             ],
+             "sadness": [
+                 "What would comfort mean to you right now?",
+                 "What are you grieving or missing?",
+                 "How can you be gentle with yourself today?"
+             ],
+             "confusion": [
+                 "What would clarity feel like?",
+                 "What's the main question you're grappling with?",
+                 "What does your intuition tell you?"
+             ]
+         }
+         return questions_map.get(emotion, [
+             "How are you feeling in this moment?",
+             "What would support look like for you?",
+             "What's most important to explore right now?"
+         ])
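`generate_reflective_questions` dispatches on the detected emotion and falls back to generic prompts for anything outside the map. A minimal sketch of that dispatch, with a trimmed hypothetical `questions_map`:

```python
# Sketch of the emotion-to-questions dispatch: known emotions get tailored
# prompts; anything else (e.g. "joy") falls back to the generic list.
questions_map = {
    "anxiety": ["What specific thoughts are creating this anxiety?"],
    "sadness": ["What would comfort mean to you right now?"],
}
default_questions = ["How are you feeling in this moment?"]

def questions_for(context: dict) -> list:
    emotion = context.get("primary_emotion", "neutral")
    return questions_map.get(emotion, default_questions)

print(questions_for({"primary_emotion": "anxiety"})[0])
print(questions_for({"primary_emotion": "joy"}))  # falls back to default_questions
```

Because the lookup key comes straight from `detect_emotion`, the map keys must match the emotion labels the model is prompted to produce; unmatched labels silently get the generic questions.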
agents/tools/voice_tools_old.py ADDED
@@ -0,0 +1,194 @@
1
+ """
2
+ Multilingual Voice Processing Tools
3
+ STT and TTS with language support
4
+ """
5
+
6
+ import whisper
7
+ import numpy as np
8
+ from gtts import gTTS
9
+ import edge_tts
10
+ import io
11
+ import asyncio
12
+ from typing import Tuple, Optional
13
+ from crewai.tools import BaseTool
14
+ import speech_recognition as sr
15
+
16
+ class MultilingualVoiceProcessor:
17
+ """Handles multilingual STT and TTS"""
18
+
19
+ def __init__(self):
20
+ # Load Whisper model for multilingual STT
21
+ self.whisper_model = whisper.load_model("base")
22
+
23
+ # Language voice mappings for Edge TTS
24
+ self.voice_map = {
25
+ "en": "en-US-AriaNeural",
26
+ "es": "es-ES-ElviraNeural",
27
+ "fr": "fr-FR-DeniseNeural",
28
+ "de": "de-DE-KatjaNeural",
29
+ "it": "it-IT-ElsaNeural",
30
+ "pt": "pt-BR-FranciscaNeural",
31
+ "hi": "hi-IN-SwaraNeural",
32
+ "zh": "zh-CN-XiaoxiaoNeural",
33
+ "ja": "ja-JP-NanamiNeural",
34
+ "ko": "ko-KR-SunHiNeural",
35
+ "ar": "ar-SA-ZariyahNeural",
36
+ "ru": "ru-RU-SvetlanaNeural"
37
+ }
38
+
39
+ async def transcribe(
40
+ self,
41
+ audio_data: np.ndarray,
42
+ language: Optional[str] = None
43
+ ) -> Tuple[str, str]:
44
+ """Transcribe audio to text with language detection"""
45
+ try:
46
+ # Process audio
47
+ if isinstance(audio_data, tuple):
48
+ sample_rate, audio = audio_data
49
+ else:
50
+ audio = audio_data
51
+ sample_rate = 16000
52
+
53
+ # Normalize audio
54
+ if audio.dtype != np.float32:
55
+ audio = audio.astype(np.float32) / 32768.0
56
+
57
+ # Transcribe with Whisper
58
+ if language and language != "auto":
59
+ result = self.whisper_model.transcribe(
60
+ audio,
61
+ language=language
62
+ )
63
+ else:
64
+ # Auto-detect language
65
+ result = self.whisper_model.transcribe(audio)
66
+
67
+ text = result["text"]
68
+ detected_language = result["language"]
69
+
70
+ return text, detected_language
71
+
72
+ except Exception as e:
73
+ print(f"Transcription error: {e}")
74
+ return "Could not transcribe audio", "en"
75
+
76
+ async def synthesize(
77
+ self,
78
+ text: str,
79
+ language: str = "en",
80
+ voice_type: str = "normal"
81
+ ) -> bytes:
82
+ """Convert text to speech with voice modulation"""
83
+ try:
84
+ voice = self.voice_map.get(language, "en-US-AriaNeural")
85
+
86
+ # Apply voice settings for meditation tone
87
+ if voice_type == "meditation":
88
+ rate = "-15%" # Slower
89
+ pitch = "-50Hz" # Lower pitch
90
+ else:
91
+ rate = "+0%"
92
+ pitch = "+0Hz"
93
+
94
+ # Generate speech
95
+ communicate = edge_tts.Communicate(
96
+ text,
97
+ voice,
98
+ rate=rate,
99
+ pitch=pitch
100
+ )
101
+
102
+ audio_data = b""
103
+ async for chunk in communicate.stream():
104
+ if chunk["type"] == "audio":
105
+ audio_data += chunk["data"]
106
+
107
+ return audio_data
108
+
109
+ except Exception as e:
110
+ print(f"TTS error: {e}")
111
+         # Fallback to gTTS
+         try:
+             tts = gTTS(text=text, lang=language[:2])
+             fp = io.BytesIO()
+             tts.write_to_fp(fp)
+             return fp.getvalue()
+         except Exception:
+             return None
+
+ class TranscribeTool(BaseTool):
+     name: str = "transcribe_audio"
+     description: str = "Transcribe audio input to text with language detection"
+
+     def _run(self, audio_data: np.ndarray, language: str = None) -> dict:
+         processor = MultilingualVoiceProcessor()
+         text, detected_lang = asyncio.run(
+             processor.transcribe(audio_data, language)
+         )
+         return {
+             "text": text,
+             "language": detected_lang
+         }
+
+ class DetectEmotionTool(BaseTool):
+     name: str = "detect_emotion"
+     description: str = "Detect emotional state from text using Mistral"
+
+     def _run(self, text: str) -> dict:
+         # Use Mistral for emotion detection
+         from models.mistral_model import MistralModel
+         model = MistralModel()
+
+         prompt = f"""
+         Analyze the emotional state in this text: "{text}"
+
+         Identify:
+         1. Primary emotion (joy, sadness, anger, fear, anxiety, confusion, etc.)
+         2. Emotional intensity (low, medium, high)
+         3. Underlying feelings
+         4. Key concerns
+
+         Format as JSON with keys: primary_emotion, intensity, feelings, concerns
+         """
+
+         response = model.generate(prompt)
+
+         # TODO: parse the model's JSON reply; currently returns a stub
+         return {
+             "primary_emotion": "detected_emotion",
+             "intensity": "medium",
+             "feelings": ["feeling1", "feeling2"],
+             "concerns": ["concern1", "concern2"]
+         }
+
+ class GenerateQuestionsTool(BaseTool):
+     name: str = "generate_reflective_questions"
+     description: str = "Generate empathetic reflective questions"
+
+     def _run(self, context: dict) -> list:
+         emotion = context.get("primary_emotion", "neutral")
+
+         questions_map = {
+             "anxiety": [
+                 "What specific thoughts are creating this anxiety?",
+                 "What would feeling calm look like in this situation?",
+                 "What has helped you manage anxiety before?"
+             ],
+             "sadness": [
+                 "What would comfort mean to you right now?",
+                 "What are you grieving or missing?",
+                 "How can you be gentle with yourself today?"
+             ],
+             "confusion": [
+                 "What would clarity feel like?",
+                 "What's the main question you're grappling with?",
+                 "What does your intuition tell you?"
+             ]
+         }
+
+         return questions_map.get(emotion, [
+             "How are you feeling in this moment?",
+             "What would support look like for you?",
+             "What's most important to explore right now?"
+         ])
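`DetectEmotionTool._run` sends the prompt to Mistral but returns a hard-coded stub instead of parsing the reply. A minimal, defensive parser for the requested JSON schema might look like this (`parse_emotion_json` is a hypothetical helper, not part of this commit; LLMs often wrap JSON in extra prose, so it extracts the first `{...}` span and falls back to a neutral default):

```python
import json
import re

def parse_emotion_json(response: str) -> dict:
    """Extract the first JSON object from an LLM reply, with a safe fallback."""
    fallback = {
        "primary_emotion": "neutral",
        "intensity": "medium",
        "feelings": [],
        "concerns": []
    }
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if not match:
        return fallback
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return fallback
    # Keep only the expected keys so downstream agents see a stable schema
    return {key: data.get(key, fallback[key]) for key in fallback}
```

The fixed key set means downstream tools such as `GenerateQuestionsTool` can rely on `primary_emotion` always being present, even when the model's output is malformed.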
agents/tools/voice_tools_openaiwhisper.py ADDED
@@ -0,0 +1,181 @@
+ """
+ Multilingual Voice Processing Tools - modular class version
+ """
+ import asyncio
+ import io
+ from typing import Tuple, Optional
+
+ import numpy as np
+ import whisper
+ import edge_tts
+ import speech_recognition as sr
+ from gtts import gTTS
+ from transformers import pipeline
+
+ from models.mistral_model import MistralModel
+ # The active MultilingualVoiceProcessor is defined in voice_tools.py;
+ # the local copy below is kept commented out for reference.
+ from agents.tools.voice_tools import MultilingualVoiceProcessor
+ #from crewai.tools import BaseTool
+
+ # class MultilingualVoiceProcessor:
+ #     """Handles multilingual STT and TTS"""
+
+ #     def __init__(self):
+ #         # Load Whisper model for multilingual STT
+ #         self.whisper_model = whisper.load_model("base")
+
+ #         # Language voice mappings for Edge TTS
+ #         self.voice_map = {
+ #             "en": "en-US-AriaNeural",
+ #             "es": "es-ES-ElviraNeural",
+ #             "fr": "fr-FR-DeniseNeural",
+ #             "de": "de-DE-KatjaNeural",
+ #             "it": "it-IT-ElsaNeural",
+ #             "pt": "pt-BR-FranciscaNeural",
+ #             "hi": "hi-IN-SwaraNeural",
+ #             "zh": "zh-CN-XiaoxiaoNeural",
+ #             "ja": "ja-JP-NanamiNeural",
+ #             "ko": "ko-KR-SunHiNeural",
+ #             "ar": "ar-SA-ZariyahNeural",
+ #             "ru": "ru-RU-SvetlanaNeural"
+ #         }
+
+ #     async def transcribe(
+ #         self,
+ #         audio_data: np.ndarray,
+ #         language: Optional[str] = None
+ #     ) -> Tuple[str, str]:
+ #         """Transcribe audio to text with language detection"""
+ #         try:
+ #             # Process audio
+ #             if isinstance(audio_data, tuple):
+ #                 sample_rate, audio = audio_data
+ #             else:
+ #                 audio = audio_data
+ #                 sample_rate = 16000
+
+ #             # Normalize audio
+ #             if audio.dtype != np.float32:
+ #                 audio = audio.astype(np.float32) / 32768.0
+
+ #             # Transcribe with Whisper
+ #             if language and language != "auto":
+ #                 result = self.whisper_model.transcribe(
+ #                     audio,
+ #                     language=language
+ #                 )
+ #             else:
+ #                 # Auto-detect language
+ #                 result = self.whisper_model.transcribe(audio)
+
+ #             text = result["text"]
+ #             detected_language = result["language"]
+
+ #             return text, detected_language
+
+ #         except Exception as e:
+ #             print(f"Transcription error: {e}")
+ #             return "Could not transcribe audio", "en"
+
+ #     async def synthesize(
+ #         self,
+ #         text: str,
+ #         language: str = "en",
+ #         voice_type: str = "normal"
+ #     ) -> bytes:
+ #         """Convert text to speech with voice modulation"""
+ #         try:
+ #             voice = self.voice_map.get(language, "en-US-AriaNeural")
+
+ #             # Apply voice settings for meditation tone
+ #             if voice_type == "meditation":
+ #                 rate = "-15%"   # Slower
+ #                 pitch = "-50Hz"  # Lower pitch
+ #             else:
+ #                 rate = "+0%"
+ #                 pitch = "+0Hz"
+
+ #             # Generate speech
+ #             communicate = edge_tts.Communicate(
+ #                 text,
+ #                 voice,
+ #                 rate=rate,
+ #                 pitch=pitch
+ #             )
+
+ #             audio_data = b""
+ #             async for chunk in communicate.stream():
+ #                 if chunk["type"] == "audio":
+ #                     audio_data += chunk["data"]
+
+ #             return audio_data
+
+ #         except Exception as e:
+ #             print(f"TTS error: {e}")
+ #             # Fallback to gTTS
+ #             try:
+ #                 tts = gTTS(text=text, lang=language[:2])
+ #                 fp = io.BytesIO()
+ #                 tts.write_to_fp(fp)
+ #                 return fp.getvalue()
+ #             except Exception:
+ #                 return None
+
+
+ class VoiceTools:
+     def __init__(self, config=None):
+         self.config = config
+         self.vp = MultilingualVoiceProcessor()
+
+     def transcribe_audio(self, audio_data: np.ndarray, language=None) -> dict:
+         """Transcribe audio to text with language detection."""
+         text, detected_lang = asyncio.run(
+             self.vp.transcribe(audio_data, language)
+         )
+         return {"text": text, "language": detected_lang}
+
+     def detect_emotion(self, text: str) -> dict:
+         """Detect emotional state from text using LLM."""
+         model = MistralModel()
+         prompt = f"""
+         Analyze the emotional state in this text: "{text}"
+         Identify:
+         1. Primary emotion (joy, sadness, anger, fear, anxiety, confusion, etc.)
+         2. Emotional intensity (low, medium, high)
+         3. Underlying feelings
+         4. Key concerns
+         Format as JSON with keys: primary_emotion, intensity, feelings, concerns
+         """
+         response = model.generate(prompt)
+         # Ideally: parse the LLM's response; for now, stubbed:
+         return {
+             "primary_emotion": "detected_emotion",
+             "intensity": "medium",
+             "feelings": ["feeling1", "feeling2"],
+             "concerns": ["concern1", "concern2"]
+         }
+
+     def generate_reflective_questions(self, context: dict) -> list:
+         """Generate empathetic reflective questions."""
+         emotion = context.get("primary_emotion", "neutral")
+         questions_map = {
+             "anxiety": [
+                 "What specific thoughts are creating this anxiety?",
+                 "What would feeling calm look like in this situation?",
+                 "What has helped you manage anxiety before?"
+             ],
+             "sadness": [
+                 "What would comfort mean to you right now?",
+                 "What are you grieving or missing?",
+                 "How can you be gentle with yourself today?"
+             ],
+             "confusion": [
+                 "What would clarity feel like?",
+                 "What's the main question you're grappling with?",
+                 "What does your intuition tell you?"
+             ]
+         }
+         return questions_map.get(emotion, [
+             "How are you feeling in this moment?",
+             "What would support look like for you?",
+             "What's most important to explore right now?"
+         ])
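The commented-out `transcribe` above normalizes Gradio microphone input (a `(sample_rate, samples)` tuple of signed 16-bit PCM) into float32 in [-1.0, 1.0) before handing it to Whisper. That preprocessing step can be sketched in isolation (`prepare_audio` is an illustrative name; plain lists stand in for numpy arrays):

```python
def prepare_audio(audio):
    """Normalize microphone output for STT.

    Gradio's numpy audio arrives as (sample_rate, samples); a bare array is
    assumed to be 16 kHz, mirroring the transcribe() fallback above.
    Int16 samples are scaled by 1/32768 into the [-1.0, 1.0) float range.
    """
    if isinstance(audio, tuple):
        sample_rate, samples = audio
    else:
        sample_rate, samples = 16000, audio
    floats = [s / 32768.0 for s in samples]
    return sample_rate, floats
```

Whisper expects mono float32 at 16 kHz, so in the real pipeline a resampling step would also be needed when `sample_rate` differs.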
app.py ADDED
@@ -0,0 +1,259 @@
+ """
+ Personal AI Coach with CrewAI and Mistral
+ Multilingual support with advanced conversational AI
+ """
+
+ import gradio as gr
+ import asyncio
+ import os
+ import sys
+ from datetime import datetime
+ from typing import Dict, List, Tuple, Optional
+ import numpy as np
+
+ sys.path.append(os.path.dirname(os.path.abspath(__file__)))
+
+ from crew_config import PersonalCoachCrew
+ from agents.tools.voice_tools import MultilingualVoiceProcessor
+ from utils.config import Config
+ from dotenv import load_dotenv
+ import certifi
+ load_dotenv()
+
+ class PersonalCoachApp:
+     """Main application using CrewAI orchestration"""
+
+     def __init__(self):
+         print("Initializing Personal Coach AI with CrewAI...")
+         self.config = Config()
+
+         # Initialize CrewAI
+         self.crew = PersonalCoachCrew(self.config)
+
+         # Initialize voice processor
+         self.voice_processor = MultilingualVoiceProcessor()
+
+         # Session management
+         self.conversation_history = []
+         self.session_data = {
+             "start_time": datetime.now(),
+             "language": "en",
+             "user_profile": {}
+         }
+
+         print("Personal Coach AI initialized successfully!")
+
+     async def process_input(
+         self,
+         text_input: str,
+         voice_input: Optional[np.ndarray],
+         language: str,
+         history: List
+     ) -> Tuple:
+         """Process user input through CrewAI"""
+         try:
+             # Prepare input data
+             if voice_input is not None:
+                 # Process voice input with language detection
+                 transcribed_text, detected_lang = await self.voice_processor.transcribe(
+                     voice_input,
+                     language
+                 )
+                 text_input = transcribed_text
+                 self.session_data["language"] = detected_lang
+             else:
+                 self.session_data["language"] = language
+
+             if not text_input:
+                 return history, None, "", None
+
+             # Prepare crew input
+             crew_input = {
+                 "user_message": text_input,
+                 "language": self.session_data["language"],
+                 "conversation_history": history[-5:],  # Last 5 exchanges
+                 "user_profile": self.session_data.get("user_profile", {})
+             }
+
+             # Execute crew
+             print(f"Processing input in {self.session_data['language']}...")
+             result = self.crew.kickoff(inputs=crew_input)
+
+             # Extract response
+             response_text = result.get("final_response", "I'm here to help. Please tell me more.")
+
+             # Generate audio response
+             audio_response = await self.voice_processor.synthesize(
+                 response_text,
+                 self.session_data["language"],
+                 voice_type="meditation"
+             )
+
+             # Update history
+             history.append([text_input, response_text])
+
+             # Update user profile
+             if "user_profile_update" in result:
+                 self.session_data["user_profile"].update(result["user_profile_update"])
+
+             return history, audio_response, "", None
+
+         except Exception as e:
+             print(f"Error in process_input: {str(e)}")
+             error_message = f"I apologize, but I encountered an error: {str(e)}"
+             history.append(["Error", error_message])
+             return history, None, "", None
+
+     def clear_conversation(self):
+         """Clear conversation and reset session"""
+         self.conversation_history = []
+         self.session_data["user_profile"] = {}
+         return [], None
+
+ def create_interface():
+     """Create Gradio interface with multilingual support"""
+     app = PersonalCoachApp()
+
+     with gr.Blocks(theme=gr.themes.Soft(), title="Personal AI Coach") as interface:
+         gr.Markdown("""
+         # 🧘 Personal AI Coach - Multilingual CrewAI System
+
+         Powered by Mistral AI and CrewAI's multi-agent framework. Supports multiple languages!
+
+         **Features:**
+         - 🌍 Multilingual voice and text support
+         - 🤖 4 specialized AI agents working together
+         - 🧠 Advanced Mistral AI for deep understanding
+         - 📚 Wisdom from 13 spiritual and self-help texts
+         - 🎙️ Natural voice interactions in your language
+         """)
+
+         with gr.Row():
+             # Main chat interface
+             with gr.Column(scale=3):
+                 chatbot = gr.Chatbot(
+                     height=500,
+                     bubble_full_width=False,
+                     avatar_images=(None, "🧘")
+                 )
+
+                 with gr.Row():
+                     language = gr.Dropdown(
+                         choices=[
+                             ("English", "en"),
+                             ("Spanish", "es"),
+                             ("French", "fr"),
+                             ("German", "de"),
+                             ("Italian", "it"),
+                             ("Portuguese", "pt"),
+                             ("Hindi", "hi"),
+                             ("Chinese", "zh"),
+                             ("Japanese", "ja"),
+                             ("Korean", "ko"),
+                             ("Arabic", "ar"),
+                             ("Russian", "ru")
+                         ],
+                         value="en",
+                         label="Language",
+                         scale=1
+                     )
+
+                     text_input = gr.Textbox(
+                         placeholder="Type your message or click the microphone...",
+                         show_label=False,
+                         scale=3
+                     )
+
+                     voice_input = gr.Audio(
+                         sources=["microphone"],  # Gradio 4.x API (was `source=` in 3.x)
+                         type="numpy",
+                         label="🎤",
+                         scale=1
+                     )
+
+                 with gr.Row():
+                     send_btn = gr.Button("Send 📤", variant="primary")
+                     clear_btn = gr.Button("Clear 🗑️")
+
+                 audio_output = gr.Audio(
+                     label="🔊 Coach Response",
+                     autoplay=True
+                 )
+
+             # Sidebar
+             with gr.Column(scale=1):
+                 gr.Markdown("""
+                 ### 🤖 CrewAI Agent Team
+
+                 **Agent 1: Empathetic Listener**
+                 - Multilingual voice processing
+                 - Emotional understanding
+                 - Context analysis
+
+                 **Agent 2: Wisdom Keeper**
+                 - RAG with Mistral AI
+                 - Spiritual text knowledge
+                 - Personalized guidance
+
+                 **Agent 3: Guardian**
+                 - Response validation
+                 - Safety checks
+                 - Tone refinement
+
+                 **Agent 4: Conversation Guide**
+                 - Natural dialogue flow
+                 - Voice synthesis
+                 - Feedback integration
+
+                 ### 🌍 Supported Languages
+                 Voice input/output in 12+ languages
+
+                 ### 📚 Knowledge Sources
+                 - Bhagavad Gita
+                 - Power of Now
+                 - Atomic Habits
+                 - Meditations
+                 - And 9 more texts...
+                 """)
+
+         # Examples in multiple languages
+         with gr.Accordion("💡 Example Prompts", open=False):
+             gr.Examples(
+                 examples=[
+                     ["I'm feeling overwhelmed with work pressure", "en"],
+                     ["Je me sens perdu dans ma vie", "fr"],
+                     ["Estoy luchando con la ansiedad", "es"],
+                     ["Ich möchte bessere Gewohnheiten aufbauen", "de"],
+                     ["मुझे अपने जीवन का उद्देश्य नहीं मिल रहा", "hi"],
+                     ["我想学习冥想", "zh"]
+                 ],
+                 inputs=[text_input, language]
+             )
+
+         # Event handlers
+         async def handle_submission(text, voice, lang, history):
+             return await app.process_input(text, voice, lang, history)
+
+         # Connect events
+         for trigger in [text_input.submit, send_btn.click]:
+             trigger(
+                 fn=lambda *args: asyncio.run(handle_submission(*args)),
+                 inputs=[text_input, voice_input, language, chatbot],
+                 outputs=[chatbot, audio_output, text_input, voice_input]
+             )
+
+         clear_btn.click(
+             fn=app.clear_conversation,
+             outputs=[chatbot, audio_output]
+         )
+
+     return interface
+
+ if __name__ == "__main__":
+     print("Starting Personal Coach AI with CrewAI...")
+     interface = create_interface()
+     interface.launch(
+         server_name="127.0.0.1",
+         server_port=7860,
+         share=False
+     )
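The event wiring above bridges the async `handle_submission` into Gradio's synchronous callback API by calling `asyncio.run` inside a lambda. A named adapter (hypothetical, shown only to make the pattern explicit) does the same thing reusably and keeps the callback signature readable:

```python
import asyncio

def make_sync(async_fn):
    """Adapt an async handler for a synchronous callback API.

    asyncio.run() spins up a fresh event loop per call, so this only
    works when no loop is already running in the calling thread.
    """
    def wrapper(*args):
        return asyncio.run(async_fn(*args))
    return wrapper

async def echo(text):
    # Stand-in for an async handler like handle_submission
    return f"coach: {text}"

sync_echo = make_sync(echo)
```

With this adapter, the loop over `[text_input.submit, send_btn.click]` could pass `fn=make_sync(handle_submission)` instead of the lambda.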
crew_config.py ADDED
@@ -0,0 +1,73 @@
+ from agents.tools.voice_tools import VoiceTools
+ from agents.tools.llm_tools import LLMTools
+ from agents.tools.knowledge_tools import KnowledgeTools
+ from agents.tools.validation_tools import ValidationTools
+ from crewai import Agent
+ from utils.knowledge_base import KnowledgeBase
+
+ class PersonalCoachCrew:
+     def __init__(self, config):
+         self.config = config
+         # Centralized tool instances
+         self.voice_tools = VoiceTools(self.config)
+         self.llm_tools = LLMTools(self.config)
+         self.knowledge_tools = KnowledgeTools(self.config)
+         self.validation_tools = ValidationTools(self.config)
+
+         self.knowledge_base = KnowledgeBase(self.config)
+         self._initialize_agents()
+         self._create_crew()  # NOTE: referenced here but not defined in this file
+
+     def _initialize_agents(self):
+         # ----- AGENT 1 -----
+         self.conversation_handler = Agent(
+             role="Empathetic Conversation Handler",
+             goal="Understand user's emotional state and needs through compassionate dialogue",
+             backstory="...",
+             verbose=self.config.crew.verbose,
+             allow_delegation=False,
+             tools=[
+                 self.voice_tools.transcribe_audio,
+                 self.voice_tools.detect_emotion,
+                 self.voice_tools.generate_reflective_questions,
+             ]
+         )
+         # ----- AGENT 2 -----
+         self.wisdom_advisor = Agent(
+             role="Wisdom Keeper and Spiritual Guide",
+             goal="Provide personalized guidance drawing from ancient wisdom and modern psychology",
+             backstory="...",
+             verbose=self.config.crew.verbose,
+             allow_delegation=False,
+             tools=[
+                 self.knowledge_tools.search_knowledge,
+                 self.knowledge_tools.extract_wisdom,
+                 self.knowledge_tools.suggest_practices,
+                 self.llm_tools.mistral_chat,
+                 self.llm_tools.generate_advice,
+             ]
+         )
+         # ----- AGENT 3 -----
+         self.response_validator = Agent(
+             role="Response Guardian and Quality Validator",
+             goal="Ensure all responses are safe, appropriate, and truly helpful",
+             backstory="...",
+             verbose=self.config.crew.verbose,
+             allow_delegation=False,
+             tools=[
+                 self.validation_tools.check_safety,
+                 self.validation_tools.validate_tone,
+                 self.validation_tools.refine_response,
+             ]
+         )
+         # ----- AGENT 4 -----
+         self.interaction_manager = Agent(
+             role="Conversation Flow Manager",
+             goal="Create natural, engaging dialogue that helps users on their journey",
+             backstory="...",
+             verbose=self.config.crew.verbose,
+             allow_delegation=False,
+             tools=[
+                 self.llm_tools.summarize_conversation,
+             ]
+         )
data/books/7_habits.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for 7_habits.txt
+ Add full text here.
data/books/atomic_habits.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for atomic_habits.txt
+ Add full text here.
data/books/autobiography_of_a_yogi.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for autobiography_of_a_yogi.txt
+ Add full text here.
data/books/bhagavad_gita.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for bhagavad_gita.txt
+ Add full text here.
data/books/dhyana_vahini.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for dhyana_vahini.txt
+ Add full text here.
data/books/mans_search_for_meaning.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for mans_search_for_meaning.txt
+ Add full text here.
data/books/meditations.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for meditations.txt
+ Add full text here.
data/books/mindset.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for mindset.txt
+ Add full text here.
data/books/prasnothara_vahini.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for prasnothara_vahini.txt
+ Add full text here.
data/books/prema_vahini.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for prema_vahini.txt
+ Add full text here.
data/books/tao_te_ching.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for tao_te_ching.txt
+ Add full text here.
data/books/the_power_of_now.txt ADDED
@@ -0,0 +1,2 @@
+ Placeholder for the_power_of_now.txt
+ Add full text here.
data/faiss_index/index.faiss ADDED
Binary file (40 kB)
 
data/faiss_index/metadata.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:36ec600de36710fb9e7741ffe98886792a726651172f48a6e3de392836bbfc68
+ size 1402
data/faiss_index/passages.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d637604fad4c72b56e5b7c74280a7804482ce2f27dcb72dad08650cf86c40881
+ size 1859
models/__pycache__/mistral_model.cpython-310.pyc ADDED
Binary file (3.26 kB)
 
models/_init_.py ADDED
@@ -0,0 +1,224 @@
+ """
+ Models module for Personal Coach CrewAI Application
+ Handles all AI model loading and management
+ """
+
+ from typing import TYPE_CHECKING, Optional, Dict, Any
+ import torch
+
+ # Version info
+ __version__ = "1.0.0"
+
+ # Lazy imports
+ if TYPE_CHECKING:
+     from .mistral_model import MistralModel, MistralConfig, MistralPromptFormatter
+
+ # Public API
+ __all__ = [
+     # Main model classes
+     "MistralModel",
+     "MistralConfig",
+     "MistralPromptFormatter",
+
+     # Model management
+     "load_model",
+     "get_model_info",
+     "clear_model_cache",
+
+     # Constants
+     "AVAILABLE_MODELS",
+     "MODEL_REQUIREMENTS",
+     "DEFAULT_MODEL_CONFIG"
+ ]
+
+ # Available models
+ AVAILABLE_MODELS = {
+     "mistral-7b-instruct": {
+         "model_id": "mistralai/Mistral-7B-Instruct-v0.1",
+         "type": "instruction-following",
+         "size": "7B",
+         "context_length": 32768,
+         "languages": ["multilingual"]
+     },
+     "mistral-7b": {
+         "model_id": "mistralai/Mistral-7B-v0.1",
+         "type": "base",
+         "size": "7B",
+         "context_length": 32768,
+         "languages": ["multilingual"]
+     }
+ }
+
+ # Model requirements
+ MODEL_REQUIREMENTS = {
+     "mistral-7b-instruct": {
+         "ram": "16GB",
+         "vram": "8GB (GPU) or 16GB (CPU)",
+         "disk": "15GB",
+         "compute": "GPU recommended"
+     }
+ }
+
+ # Default configuration
+ DEFAULT_MODEL_CONFIG = {
+     "max_length": 2048,
+     "temperature": 0.7,
+     "top_p": 0.95,
+     "top_k": 50,
+     "do_sample": True,
+     "num_return_sequences": 1,
+     "device": "cuda" if torch.cuda.is_available() else "cpu",
+     "torch_dtype": torch.float16 if torch.cuda.is_available() else torch.float32,
+     "load_in_8bit": False,
+     "cache_dir": ".cache/models"
+ }
+
+ # Model instance cache
+ _model_cache: Dict[str, Any] = {}
+
+ def load_model(model_name: str = "mistral-7b-instruct", config: Optional[Dict[str, Any]] = None):
+     """
+     Load a model with caching support
+
+     Args:
+         model_name: Name of the model to load
+         config: Optional configuration override
+
+     Returns:
+         Model instance
+     """
+     # Check cache first
+     cache_key = f"{model_name}_{str(config)}"
+     if cache_key in _model_cache:
+         return _model_cache[cache_key]
+
+     # Import here to avoid circular imports
+     from .mistral_model import MistralModel, MistralConfig
+
+     # Get model info
+     model_info = AVAILABLE_MODELS.get(model_name)
+     if not model_info:
+         raise ValueError(f"Unknown model: {model_name}")
+
+     # Merge configurations
+     model_config = DEFAULT_MODEL_CONFIG.copy()
+     if config:
+         model_config.update(config)
+
+     # Create config object
+     mistral_config = MistralConfig(
+         model_id=model_info["model_id"],
+         **model_config
+     )
+
+     # Load model
+     model = MistralModel(mistral_config)
+
+     # Cache it
+     _model_cache[cache_key] = model
+
+     return model
+
+ def get_model_info(model_name: str) -> Optional[Dict[str, Any]]:
+     """
+     Get information about a model
+
+     Args:
+         model_name: Name of the model
+
+     Returns:
+         Model information dictionary or None
+     """
+     info = AVAILABLE_MODELS.get(model_name)
+     if info:
+         # Add requirements
+         requirements = MODEL_REQUIREMENTS.get(model_name, {})
+         info["requirements"] = requirements
+
+         # Add loading status
+         cache_keys = [k for k in _model_cache.keys() if k.startswith(model_name)]
+         info["is_loaded"] = len(cache_keys) > 0
+
+     return info
+
+ def clear_model_cache(model_name: Optional[str] = None):
+     """
+     Clear model cache to free memory
+
+     Args:
+         model_name: Specific model to clear, or None for all
+     """
+     global _model_cache
+
+     if model_name:
+         # Clear specific model
+         keys_to_remove = [k for k in _model_cache.keys() if k.startswith(model_name)]
+         for key in keys_to_remove:
+             del _model_cache[key]
+     else:
+         # Clear all
+         _model_cache.clear()
+
+     # Force garbage collection
+     import gc
+     gc.collect()
+
+     # Clear GPU cache if using CUDA
+     if torch.cuda.is_available():
+         torch.cuda.empty_cache()
+
+ # Utility functions
+ def estimate_memory_usage(model_name: str) -> Dict[str, Any]:
+     """
+     Estimate memory usage for a model
+
+     Args:
+         model_name: Name of the model
+
+     Returns:
+         Memory estimation dictionary
+     """
+     model_info = AVAILABLE_MODELS.get(model_name)
+     if not model_info:
+         return {}
+
+     size = model_info.get("size", "7B")
+     size_b = float(size.replace("B", ""))  # parameter count in billions
+
+     estimates = {
+         "model_size_gb": size_b,
+         "fp32_memory_gb": size_b * 4,  # 4 bytes per parameter
+         "fp16_memory_gb": size_b * 2,  # 2 bytes per parameter
+         "int8_memory_gb": size_b,      # 1 byte per parameter
+         "recommended_ram_gb": size_b * 2.5,
+         "recommended_vram_gb": size_b * 1.5
+     }
+
+     return estimates
+
+ def get_device_info() -> Dict[str, Any]:
+     """Get information about available compute devices"""
+     info = {
+         "cuda_available": torch.cuda.is_available(),
+         "device_count": torch.cuda.device_count() if torch.cuda.is_available() else 0,
+         "current_device": torch.cuda.current_device() if torch.cuda.is_available() else None,
+         "device_name": torch.cuda.get_device_name() if torch.cuda.is_available() else "CPU"
+     }
+
+     if torch.cuda.is_available():
+         info["gpu_memory"] = {
+             "allocated": torch.cuda.memory_allocated() / 1024**3,  # GB
+             "reserved": torch.cuda.memory_reserved() / 1024**3,  # GB
+             "total": torch.cuda.get_device_properties(0).total_memory / 1024**3  # GB
+         }
+
+     return info
+
+ # Module initialization
+ import os
+ if os.getenv("DEBUG_MODE", "false").lower() == "true":
+     print(f"Models module v{__version__} initialized")
+     device_info = get_device_info()
+     print(f"Device: {device_info['device_name']}")
+     if device_info['cuda_available']:
+         print(f"GPU Memory: {device_info['gpu_memory']['total']:.1f}GB")
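`load_model` builds its cache key as `f"{model_name}_{str(config)}"`, and `str()` on a dict depends on key insertion order, so two equivalent configs can produce different keys and miss the cache. A sketch of an order-independent key (`cache_key` is an assumed refinement, not part of the commit):

```python
def cache_key(model_name, config=None):
    """Build a model-cache key that is stable across dict insertion order."""
    if not config:
        return model_name
    # Sorting the keys makes equivalent configs hash to the same string
    normalized = ",".join(f"{k}={config[k]!r}" for k in sorted(config))
    return f"{model_name}_{normalized}"
```

Swapping this into `load_model` would also keep the `startswith(model_name)` checks in `get_model_info` and `clear_model_cache` working, since the key still begins with the model name.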
models/mistral_model.py ADDED
@@ -0,0 +1,105 @@
+ """
+ Mistral Model Wrapper for easy integration
+ """
+ import os
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+ from typing import Optional
+
+ HUGGINGFACE_TOKEN = os.getenv("HUGGINGFACE_TOKEN")
+
+ class MistralModel:
+     """Wrapper for Mistral model with caching and optimization"""
+
+     _instance = None
+     _model = None
+     _tokenizer = None
+
+     def __new__(cls):
+         if cls._instance is None:
+             cls._instance = super().__new__(cls)
+         return cls._instance
+
+     def __init__(self):
+         if MistralModel._model is None:
+             self._initialize_model()
+
+     def _initialize_model(self):
+         """Initialize Mistral model with optimizations"""
+         print("Loading Mistral model...")
+
+         model_id = "mistralai/Mistral-7B-Instruct-v0.2"
+
+         # Load tokenizer
+         MistralModel._tokenizer = AutoTokenizer.from_pretrained(model_id, token=HUGGINGFACE_TOKEN)
+
+         # Load model with optimizations
+         MistralModel._model = AutoModelForCausalLM.from_pretrained(
+             model_id,
+             token=HUGGINGFACE_TOKEN,
+             torch_dtype=torch.float16,
+             device_map="auto",
+             load_in_8bit=True  # Use 8-bit quantization for memory efficiency
+         )
+
+         print("Mistral model loaded successfully!")
+
+     def generate(
+         self,
+         prompt: str,
+         max_length: int = 512,
+         temperature: float = 0.7,
+         top_p: float = 0.95
+     ) -> str:
+         """Generate response from Mistral"""
+
+         # Format prompt for Mistral instruction format
+         formatted_prompt = f"<s>[INST] {prompt} [/INST]"
+
+         # Tokenize
+         inputs = MistralModel._tokenizer(
+             formatted_prompt,
+             return_tensors="pt",
+             truncation=True,
+             max_length=2048
+         )
+
+         # Move to device
+         device = next(MistralModel._model.parameters()).device
+         inputs = {k: v.to(device) for k, v in inputs.items()}
+
+         # Generate
+         with torch.no_grad():
+             outputs = MistralModel._model.generate(
+                 **inputs,
+                 max_new_tokens=max_length,
+                 temperature=temperature,
+                 top_p=top_p,
+                 do_sample=True,
+                 pad_token_id=MistralModel._tokenizer.eos_token_id
+             )
+
+         # Decode only the newly generated tokens
+         response = MistralModel._tokenizer.decode(
+             outputs[0][inputs['input_ids'].shape[1]:],
+             skip_special_tokens=True
+         )
+
+         return response.strip()
+
+     def generate_embedding(self, text: str) -> torch.Tensor:
+         """Generate embeddings for text"""
+         inputs = MistralModel._tokenizer(
+             text,
+             return_tensors="pt",
+             truncation=True,
+             max_length=512
+         )
+
+         device = next(MistralModel._model.parameters()).device
+         inputs = {k: v.to(device) for k, v in inputs.items()}
+
+         with torch.no_grad():
+             outputs = MistralModel._model(**inputs, output_hidden_states=True)
+             # Use mean of last hidden state as embedding
+             embeddings = outputs.hidden_states[-1].mean(dim=1)
+
+         return embeddings
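`generate_embedding` mean-pools the last hidden state over every token position, padding included. When inputs are batched and padded, the usual refinement is a mask-aware mean that averages only real tokens; the arithmetic can be illustrated with plain lists (`masked_mean_pool` is a hypothetical helper, not part of the commit):

```python
def masked_mean_pool(hidden_states, attention_mask):
    """Average token vectors, skipping positions masked out as padding.

    hidden_states: list of per-token vectors (lists of floats)
    attention_mask: list of 0/1 flags, 1 = real token
    """
    dim = len(hidden_states[0])
    totals = [0.0] * dim
    count = 0
    for vec, keep in zip(hidden_states, attention_mask):
        if keep:
            count += 1
            for i, v in enumerate(vec):
                totals[i] += v
    # Guard against an all-padding input
    return [t / max(count, 1) for t in totals]
```

In the torch version this corresponds to multiplying `hidden_states[-1]` by the expanded attention mask before summing, then dividing by the mask's token count instead of calling `.mean(dim=1)`.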
requirements.txt ADDED
@@ -0,0 +1,63 @@
+ crewai
+ #langchain==0.1.19
+ langchain
+ #langchain-community==0.0.38
+ langchain-community
+ langchain-core
+ #pydantic==2.5.0
+ pydantic
+ # Optional: Gradio and supporting tools
+ gradio==4.19.2
+ numpy==1.26.4
+ faiss-cpu==1.7.4
+ sentence-transformers==2.2.2
+ transformers==4.36.0
+ torch
+ accelerate==0.25.0
+ huggingface-hub==0.20.2
+ openai-whisper==20231117
+ edge-tts==6.1.9  # imported unconditionally by the voice tools
+ speechrecognition==3.10.1
+ gtts==2.5.0
+ pyttsx3==2.90
+ python-dotenv==1.0.0
+ pandas==2.0.3
+ aiofiles==23.2.1
+ duckduckgo-search==5.0
+ beautifulsoup4==4.12.3
+ onnxruntime==1.16.3
+ embedchain
+ torchaudio
+ torchvision
+ certifi==2023.07.22
+ soundfile
+
+ # crewai==0.23.1
+ # langchain==0.1.19  # Pinning to known stable
+ # langchain-community==0.0.41  # Latest as of June 2024
+ # langchain-core==0.1.44  # Match langchain compatibility
+ # pydantic==2.6.4  # v2.x required by LangChain
+ # gradio==4.19.2
+ # numpy==1.26.4
+ # faiss-cpu==1.7.4
+ # sentence-transformers==2.2.2
+ # transformers==4.36.2  # Small bump for bugfixes
+ # torch==2.2.2  # Stable and supported by other libs
+ # accelerate==0.25.0
+ # huggingface-hub==0.20.2
+ # openai-whisper==20231117
+ # speechrecognition==3.10.1
+ # gtts==2.5.0
+ # pyttsx3==2.90
+ # python-dotenv==1.0.1  # Use latest: bugfixes
+ # pandas==2.2.2  # Compatible w/ numpy 1.26.x
+ # aiofiles==23.2.1
+ # duckduckgo-search==5.1.0  # Latest, works w/ Python 3.9+
+ # beautifulsoup4==4.12.3
+ # onnxruntime==1.16.3
+ # embedchain==0.0.36  # Works with Transformers/Torch
+ # torchaudio==2.2.2  # Must match Torch version!
+ # torchvision==0.17.2  # Must match Torch version!
+ # certifi==2023.07.22
+ # soundfile==0.12.1  # Explicit version
setup_knowledge.py ADDED
@@ -0,0 +1,258 @@
+ """
+ Enhanced setup script with more comprehensive knowledge base
+ """
+
+ import os
+ import sys
+ from pathlib import Path
+ import numpy as np
+ import faiss
+ import pickle
+ from sentence_transformers import SentenceTransformer
+ from typing import List, Dict
+
+ sys.path.append(os.path.dirname(os.path.abspath(__file__)))
+
+ # Extended wisdom database with more texts
+ WISDOM_DATABASE = [
+     # Bhagavad Gita
+     {
+         "text": "The mind is restless and difficult to restrain, but it is subdued by practice.",
+         "source": "Bhagavad Gita",
+         "chapter": "6.35",
+         "tags": ["mind", "practice", "discipline"]
+     },
+     {
+         "text": "Set thy heart upon thy work, but never on its reward.",
+         "source": "Bhagavad Gita",
+         "chapter": "2.47",
+         "tags": ["work", "detachment", "karma"]
+     },
+     {
+         "text": "The soul is neither born, nor does it ever die.",
+         "source": "Bhagavad Gita",
+         "chapter": "2.20",
+         "tags": ["soul", "eternal", "death"]
+     },
+
+     # Autobiography of a Yogi
+     {
+         "text": "The season of failure is the best time for sowing the seeds of success.",
+         "source": "Autobiography of a Yogi",
+         "tags": ["failure", "success", "growth"]
+     },
+     {
+         "text": "Live quietly in the moment and see the beauty of all before you.",
+         "source": "Autobiography of a Yogi",
+         "tags": ["present", "beauty", "mindfulness"]
+     },
+
+     # The Power of Now
+     {
+         "text": "Realize deeply that the present moment is all you have.",
+         "source": "The Power of Now",
+         "tags": ["present", "awareness", "now"]
+     },
+     {
+         "text": "The primary cause of unhappiness is never the situation but your thoughts about it.",
+         "source": "The Power of Now",
+         "tags": ["unhappiness", "thoughts", "perception"]
+     },
+     {
+         "text": "Life will give you whatever experience is most helpful for the evolution of your consciousness.",
+         "source": "The Power of Now",
+         "tags": ["life", "experience", "consciousness"]
+     },
+
+     # Man's Search for Meaning
+     {
+         "text": "When we are no longer able to change a situation, we are challenged to change ourselves.",
+         "source": "Man's Search for Meaning",
+         "tags": ["change", "adaptation", "growth"]
+     },
+     {
+         "text": "Those who have a 'why' to live can bear with almost any 'how'.",
+         "source": "Man's Search for Meaning",
+         "tags": ["purpose", "meaning", "resilience"]
+     },
+
+     # Atomic Habits
+     {
+         "text": "You do not rise to the level of your goals. You fall to the level of your systems.",
+         "source": "Atomic Habits",
+         "tags": ["goals", "systems", "habits"]
+     },
+     {
+         "text": "Every action you take is a vote for the type of person you wish to become.",
+         "source": "Atomic Habits",
+         "tags": ["identity", "actions", "becoming"]
+     },
+     {
+         "text": "Habits are the compound interest of self-improvement.",
+         "source": "Atomic Habits",
+         "tags": ["habits", "improvement", "compound"]
+     },
+
+     # Meditations - Marcus Aurelius
+     {
+         "text": "You have power over your mind - not outside events. Realize this, and you will find strength.",
+         "source": "Meditations",
+         "tags": ["control", "mind", "strength"]
+     },
+     {
+         "text": "The best revenge is not to be like your enemy.",
+         "source": "Meditations",
+         "tags": ["revenge", "character", "virtue"]
+     },
+     {
+         "text": "What we cannot bear removes us from life; what remains can be borne.",
+         "source": "Meditations",
+         "tags": ["endurance", "life", "strength"]
+     },
+
+     # Tao Te Ching
+     {
+         "text": "When I let go of what I am, I become what I might be.",
+         "source": "Tao Te Ching",
+         "tags": ["letting go", "becoming", "potential"]
+     },
+     {
+         "text": "The journey of a thousand miles begins with a single step.",
+         "source": "Tao Te Ching",
+         "tags": ["journey", "beginning", "action"]
+     },
+     {
+         "text": "Knowing others is intelligence; knowing yourself is true wisdom.",
+         "source": "Tao Te Ching",
+         "tags": ["self-knowledge", "wisdom", "intelligence"]
+     },
+
+     # 7 Habits of Highly Effective People
+     {
+         "text": "Between stimulus and response there is a space. In that space is our power to choose our response.",
+         "source": "7 Habits of Highly Effective People",
+         "tags": ["choice", "response", "freedom"]
+     },
+     {
+         "text": "Begin with the end in mind.",
+         "source": "7 Habits of Highly Effective People",
+         "tags": ["vision", "planning", "purpose"]
+     },
+
+     # Mindset
+     {
+         "text": "Becoming is better than being.",
+         "source": "Mindset",
+         "tags": ["growth", "becoming", "fixed mindset"]
+     },
+     {
+         "text": "The view you adopt for yourself profoundly affects the way you lead your life.",
+         "source": "Mindset",
+         "tags": ["perspective", "self-view", "life"]
+     },
+
+     # Additional Universal Wisdom
+     {
+         "text": "The wound is the place where the Light enters you.",
+         "source": "Rumi",
+         "tags": ["pain", "growth", "transformation"]
+     },
+     {
+         "text": "Yesterday I was clever, so I wanted to change the world. Today I am wise, so I am changing myself.",
+         "source": "Rumi",
+         "tags": ["change", "wisdom", "self"]
+     },
+     {
+         "text": "Do not dwell in the past, do not dream of the future, concentrate the mind on the present moment.",
+         "source": "Buddha",
+         "tags": ["present", "past", "future"]
+     }
+ ]
+
+ def setup_comprehensive_knowledge_base():
+     """Setup knowledge base with all wisdom texts"""
+     print("Setting up comprehensive knowledge base...")
+
+     # Create directories
+     data_dir = Path("data")
+     faiss_dir = data_dir / "faiss_index"
+     books_dir = data_dir / "books"
+
+     for dir_path in [data_dir, faiss_dir, books_dir]:
+         dir_path.mkdir(parents=True, exist_ok=True)
+
+     # Initialize embedder
+     print("Loading sentence transformer model...")
+     embedder = SentenceTransformer('all-MiniLM-L6-v2')
+
+     # Process wisdom texts
+     texts = []
+     metadata = []
+
+     for item in WISDOM_DATABASE:
+         texts.append(item["text"])
+         metadata.append({
+             "source": item["source"],
+             "chapter": item.get("chapter", ""),
+             "tags": item.get("tags", [])
+         })
+
+     # Create embeddings
+     print(f"Creating embeddings for {len(texts)} wisdom passages...")
+     embeddings = embedder.encode(texts, show_progress_bar=True)
+
+     # Create FAISS index with cosine similarity
+     dimension = embeddings.shape[1]
+
+     # Normalize embeddings for cosine similarity
+     faiss.normalize_L2(embeddings)
+
+     # Use Inner Product (equivalent to cosine similarity after normalization)
+     index = faiss.IndexFlatIP(dimension)
+     index.add(embeddings.astype('float32'))
+
+     # Save everything
+     print("Saving knowledge base...")
+     faiss.write_index(index, str(faiss_dir / "index.faiss"))
+
+     with open(faiss_dir / "passages.pkl", 'wb') as f:
+         pickle.dump(texts, f)
+
+     with open(faiss_dir / "metadata.pkl", 'wb') as f:
+         pickle.dump(metadata, f)
+
+     print("✅ Knowledge base setup complete!")
+     print(f"   - Total passages: {len(texts)}")
+     print(f"   - Embedding dimension: {dimension}")
+     print("   - Index type: Cosine similarity (normalized L2)")
+
+     # Create placeholder book files
+     print("\nCreating placeholder book files...")
+     book_files = [
+         "bhagavad_gita.txt",
+         "autobiography_of_a_yogi.txt",
+         "the_power_of_now.txt",
+         "mans_search_for_meaning.txt",
+         "atomic_habits.txt",
+         "meditations.txt",
+         "tao_te_ching.txt",
+         "dhyana_vahini.txt",
+         "7_habits.txt",
+         "mindset.txt",
+         "prema_vahini.txt",
+         "prasnothara_vahini.txt"
+     ]
+
+     for book_file in book_files:
+         filepath = books_dir / book_file
+         if not filepath.exists():
+             with open(filepath, 'w', encoding='utf-8') as f:
+                 f.write(f"Placeholder for {book_file}\nAdd full text here.")
+
+     print("✅ Setup complete! You can now run the main application.")
+     print("\nTo add full books:")
+     print("1. Replace placeholder files in data/books/ with full text")
+     print("2. Re-run this script to update the knowledge base")
+
+ if __name__ == "__main__":
+     setup_comprehensive_knowledge_base()
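The script above relies on the fact that, once vectors are L2-normalized, FAISS's inner-product index (`IndexFlatIP`) ranks results by cosine similarity. A minimal stdlib-only sketch of why that holds (hypothetical helper names, no FAISS required):

```python
import math

def cosine(a, b):
    """Cosine similarity computed directly from its definition."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def normalize(v):
    """L2-normalize a vector, as faiss.normalize_L2 does in place."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

a, b = [1.0, 2.0, 3.0], [2.0, 0.5, 1.0]
# Inner product of unit vectors equals the cosine of the originals,
# which is why IndexFlatIP after normalize_L2 searches by cosine.
ip_of_unit_vectors = sum(x * y for x, y in zip(normalize(a), normalize(b)))
assert abs(ip_of_unit_vectors - cosine(a, b)) < 1e-12
```

This is also why the query vector must be normalized before searching, as `KnowledgeBase.search` does below.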
utils/__pycache__/config.cpython-310.pyc ADDED
Binary file (8.92 kB).
 
utils/__pycache__/knowledge_base.cpython-310.pyc ADDED
Binary file (8.86 kB).
 
utils/__init__.py ADDED
@@ -0,0 +1,8 @@
+ """
+ Utility modules for Personal Coach CrewAI Application
+ """
+
+ from .config import Config
+ from .knowledge_base import KnowledgeBase
+
+ __all__ = ['Config', 'KnowledgeBase']
utils/config.py ADDED
@@ -0,0 +1,298 @@
+ """
+ Configuration management for Personal Coach CrewAI
+ """
+
+ import os
+ from dataclasses import dataclass
+ from typing import Dict, List, Optional
+ from dotenv import load_dotenv
+ import torch
+
+ # Load environment variables
+ load_dotenv()
+
+ @dataclass
+ class ModelConfig:
+     """Model configuration settings"""
+     # Mistral model for main LLM
+     mistral_model: str = "mistralai/Mistral-7B-Instruct-v0.1"
+
+     # Embedding model for RAG
+     embedding_model: str = "sentence-transformers/all-MiniLM-L6-v2"
+
+     # Whisper model for multilingual STT
+     whisper_model: str = "openai/whisper-small"
+
+     # TTS models for different languages (filled in __post_init__)
+     tts_models: Optional[Dict[str, str]] = None
+
+     # Model parameters
+     max_length: int = 2048
+     temperature: float = 0.7
+     top_p: float = 0.95
+     do_sample: bool = True
+
+     # Device configuration
+     device: str = "cuda" if torch.cuda.is_available() else "cpu"
+     torch_dtype: torch.dtype = torch.float16 if torch.cuda.is_available() else torch.float32
+
+     def __post_init__(self):
+         if self.tts_models is None:
+             self.tts_models = {
+                 "en": "microsoft/speecht5_tts",
+                 "hi": "facebook/mms-tts-hin",
+                 "es": "facebook/mms-tts-spa",
+                 "fr": "facebook/mms-tts-fra",
+                 "de": "facebook/mms-tts-deu",
+                 "zh": "facebook/mms-tts-cmn",
+                 "ar": "facebook/mms-tts-ara",
+                 "default": "microsoft/speecht5_tts"
+             }
+
+ @dataclass
+ class VectorStoreConfig:
+     """Vector store configuration for knowledge base"""
+     index_type: str = "Flat"  # FAISS index type
+     dimension: int = 384  # for all-MiniLM-L6-v2
+     metric: str = "cosine"  # similarity metric
+     n_results: int = 5  # number of results to retrieve
+     chunk_size: int = 500  # text chunk size in tokens
+     chunk_overlap: int = 50  # overlap between chunks
+
+ @dataclass
+ class AudioConfig:
+     """Audio processing configuration"""
+     sample_rate: int = 16000
+     chunk_length: int = 30  # seconds
+     language_detection: bool = True
+     supported_languages: Optional[List[str]] = None
+
+     def __post_init__(self):
+         if self.supported_languages is None:
+             self.supported_languages = [
+                 "en", "es", "fr", "de", "it", "pt", "ru", "zh",
+                 "ja", "ko", "hi", "ar", "bn", "pa", "te", "mr",
+                 "ta", "ur", "gu", "kn", "ml", "or"
+             ]
+
+ @dataclass
+ class CrewConfig:
+     """CrewAI specific configuration"""
+     max_iterations: int = 3
+     memory: bool = True
+     verbose: bool = True
+     temperature: float = 0.7
+     max_rpm: int = 10  # rate limiting
+
+     # Agent-specific settings (filled in __post_init__)
+     agent_settings: Optional[Dict[str, Dict]] = None
+
+     def __post_init__(self):
+         if self.agent_settings is None:
+             self.agent_settings = {
+                 "conversation_handler": {
+                     "max_questions": 3,
+                     "empathy_level": "high",
+                     "response_style": "warm"
+                 },
+                 "knowledge_advisor": {
+                     "search_depth": 5,
+                     "context_window": 3,
+                     "wisdom_sources": ["all"]
+                 },
+                 "response_validator": {
+                     "safety_threshold": 0.9,
+                     "tone_check": True,
+                     "fact_check": False
+                 },
+                 "interaction_manager": {
+                     "voice_speed": 1.0,
+                     "voice_pitch": 1.0,
+                     "include_followup": True
+                 }
+             }
+
+ class Config:
+     """Main configuration class"""
+
+     def __init__(self):
+         # Base paths
+         self.BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+         self.DATA_DIR = os.path.join(self.BASE_DIR, "data")
+         self.BOOKS_DIR = os.path.join(self.DATA_DIR, "books")
+         self.INDEX_DIR = os.path.join(self.DATA_DIR, "faiss_index")
+         self.CACHE_DIR = os.path.join(self.BASE_DIR, ".cache")
+         self.LOGS_DIR = os.path.join(self.BASE_DIR, "logs")
+
+         # Create necessary directories
+         for directory in [self.DATA_DIR, self.BOOKS_DIR, self.INDEX_DIR, self.CACHE_DIR, self.LOGS_DIR]:
+             os.makedirs(directory, exist_ok=True)
+
+         # Model configuration (environment variables override dataclass defaults)
+         self.models = ModelConfig(
+             mistral_model=os.getenv("MISTRAL_MODEL", ModelConfig.mistral_model),
+             embedding_model=os.getenv("EMBEDDING_MODEL", ModelConfig.embedding_model),
+             whisper_model=os.getenv("WHISPER_MODEL", ModelConfig.whisper_model),
+             temperature=float(os.getenv("TEMPERATURE", "0.7")),
+             max_length=int(os.getenv("MAX_LENGTH", "2048"))
+         )
+
+         # Vector store configuration
+         self.vector_store = VectorStoreConfig(
+             chunk_size=int(os.getenv("CHUNK_SIZE", "500")),
+             n_results=int(os.getenv("N_RESULTS", "5"))
+         )
+
+         # Audio configuration
+         self.audio = AudioConfig(
+             sample_rate=int(os.getenv("SAMPLE_RATE", "16000")),
+             language_detection=os.getenv("LANGUAGE_DETECTION", "true").lower() == "true"
+         )
+
+         # CrewAI configuration
+         self.crew = CrewConfig(
+             verbose=os.getenv("CREW_VERBOSE", "true").lower() == "true",
+             max_iterations=int(os.getenv("MAX_ITERATIONS", "3"))
+         )
+
+         # API tokens
+         self.tokens = {
+             "huggingface": os.getenv("HUGGINGFACE_TOKEN", ""),
+             "openai": os.getenv("OPENAI_API_KEY", "")
+         }
+
+         # Feature flags
+         self.features = {
+             "voice_enabled": os.getenv("VOICE_ENABLED", "true").lower() == "true",
+             "multilingual": os.getenv("MULTILINGUAL", "true").lower() == "true",
+             "save_history": os.getenv("SAVE_HISTORY", "true").lower() == "true",
+             "debug_mode": os.getenv("DEBUG_MODE", "false").lower() == "true"
+         }
+
+         # Knowledge base books
+         self.knowledge_sources = {
+             "spiritual": [
+                 "Bhagavad Gita",
+                 "Autobiography of a Yogi",
+                 "The Power of Now",
+                 "Tao Te Ching",
+                 "Dhyana Vahini",
+                 "Gita Vahini",
+                 "Prema Vahini",
+                 "Prasnothara Vahini"
+             ],
+             "self_help": [
+                 "Atomic Habits",
+                 "The 7 Habits of Highly Effective People",
+                 "Man's Search for Meaning",
+                 "Mindset"
+             ],
+             "philosophy": [
+                 "Meditations"
+             ]
+         }
+
+         # Prompt templates
+         self.prompts = {
+             "system_prompt": """You are a compassionate personal coach who draws wisdom from ancient texts and modern psychology.
+             You listen deeply, ask thoughtful questions, and provide guidance that is both practical and profound.
+             You speak with warmth and understanding, never judging, always supporting.""",
+
+             "conversation_prompt": """Based on what the user shared: {user_input}
+             Their emotional state appears to be: {emotional_state}
+             Generate {num_questions} empathetic, reflective questions to help them explore their feelings deeper.""",
+
+             "wisdom_prompt": """The user is dealing with: {situation}
+             Their emotional state: {emotional_state}
+
+             Drawing from these wisdom sources: {sources}
+             Provide relevant guidance that:
+             1. Acknowledges their feelings
+             2. Shares applicable wisdom
+             3. Offers practical steps
+             4. Maintains a supportive tone""",
+
+             "validation_prompt": """Review this response for appropriateness:
+             {response}
+
+             Ensure it:
+             1. Contains no medical/legal/financial advice
+             2. Maintains supportive tone
+             3. Includes practical guidance
+             4. Avoids absolute statements""",
+
+             "meditation_prompt": """Create a {duration} minute meditation practice for someone feeling {emotion}.
+             Include:
+             1. Simple setup instructions
+             2. Step-by-step guidance
+             3. Focus technique
+             4. Closing reflection"""
+         }
+
+         # Response guidelines
+         self.guidelines = {
+             "tone": ["empathetic", "supportive", "non-judgmental", "encouraging"],
+             "avoid": ["prescriptive", "absolute", "diagnostic", "dismissive"],
+             "include": ["validation", "practical steps", "hope", "empowerment"]
+         }
+
+         # Crisis resources
+         self.crisis_resources = {
+             "global": {
+                 "name": "International Crisis Lines",
+                 "url": "https://findahelpline.com",
+                 "phone": "Various by country"
+             },
+             "us": {
+                 "name": "988 Suicide & Crisis Lifeline",
+                 "phone": "988",
+                 "text": "Text HOME to 741741"
+             },
+             "uk": {
+                 "name": "Samaritans",
+                 "phone": "116 123",
+                 "email": "jo@samaritans.org"
+             },
+             "india": {
+                 "name": "Vandrevala Foundation",
+                 "phone": "9999666555",
+                 "languages": ["Hindi", "English", "Regional"]
+             }
+         }
+
+     def get_language_config(self, language_code: str) -> Dict:
+         """Get language-specific configuration"""
+         language_configs = {
+             "en": {"name": "English", "tts_voice": "en-US-AriaNeural"},
+             "hi": {"name": "Hindi", "tts_voice": "hi-IN-SwaraNeural"},
+             "es": {"name": "Spanish", "tts_voice": "es-ES-ElviraNeural"},
+             "fr": {"name": "French", "tts_voice": "fr-FR-DeniseNeural"},
+             "de": {"name": "German", "tts_voice": "de-DE-KatjaNeural"},
+             "zh": {"name": "Chinese", "tts_voice": "zh-CN-XiaoxiaoNeural"},
+             "ar": {"name": "Arabic", "tts_voice": "ar-SA-ZariyahNeural"}
+         }
+
+         return language_configs.get(language_code, language_configs["en"])
+
+     def get_prompt(self, prompt_type: str, **kwargs) -> str:
+         """Get formatted prompt with variables"""
+         prompt_template = self.prompts.get(prompt_type, "")
+         return prompt_template.format(**kwargs)
+
+     def to_dict(self) -> Dict:
+         """Convert configuration to dictionary"""
+         return {
+             "paths": {
+                 "base": self.BASE_DIR,
+                 "data": self.DATA_DIR,
+                 "books": self.BOOKS_DIR,
+                 "index": self.INDEX_DIR,
+                 "cache": self.CACHE_DIR
+             },
+             "models": self.models.__dict__,
+             "vector_store": self.vector_store.__dict__,
+             "audio": self.audio.__dict__,
+             "crew": self.crew.__dict__,
+             "features": self.features,
+             "knowledge_sources": self.knowledge_sources
+         }
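`Config` follows one pattern throughout: read an environment variable, fall back to a hard-coded default, and coerce the string to the target type. A stdlib-only sketch of that pattern (the variable names `TEMPERATURE`, `MAX_LENGTH`, and `DEBUG_MODE` come from the config above; the standalone setup is illustrative):

```python
import os

# Simulate a value set in .env / the environment.
os.environ["TEMPERATURE"] = "0.2"
os.environ.pop("MAX_LENGTH", None)  # ensure this one is unset

temperature = float(os.getenv("TEMPERATURE", "0.7"))  # env value wins: 0.2
max_length = int(os.getenv("MAX_LENGTH", "2048"))     # unset, falls back: 2048
debug = os.getenv("DEBUG_MODE", "false").lower() == "true"  # string -> bool

assert temperature == 0.2
assert max_length == 2048
assert debug is False
```

Note that the default is passed as a *string* and converted afterwards, so both branches go through the same `float`/`int`/boolean parsing.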
utils/knowledge_base.py ADDED
@@ -0,0 +1,339 @@
+ """
+ Knowledge base management using FAISS and HuggingFace embeddings
+ """
+
+ import os
+ import json
+ import pickle
+ from typing import List, Dict, Tuple, Optional
+ import numpy as np
+ import faiss
+ from sentence_transformers import SentenceTransformer
+ from transformers import AutoTokenizer
+ import hashlib
+ from datetime import datetime
+ from pathlib import Path
+
+ class KnowledgeBase:
+     """Manages the vector store for knowledge retrieval"""
+
+     def __init__(self, config):
+         self.config = config
+         self.embedding_model = SentenceTransformer(config.models.embedding_model)
+         self.dimension = config.vector_store.dimension
+         self.index = None
+         self.metadata = []
+         self.chunks = []
+         self.index_path = config.INDEX_DIR
+         self.books_path = config.BOOKS_DIR
+
+         # Initialize tokenizer for chunk splitting
+         self.tokenizer = AutoTokenizer.from_pretrained(config.models.mistral_model)
+
+         # Load or create index
+         self._initialize_index()
+
+     def _initialize_index(self):
+         """Initialize or load existing FAISS index"""
+         index_file = os.path.join(self.index_path, "knowledge.index")
+         metadata_file = os.path.join(self.index_path, "metadata.pkl")
+         chunks_file = os.path.join(self.index_path, "chunks.pkl")
+
+         # All three files are required to load; a missing chunks.pkl would crash below
+         if os.path.exists(index_file) and os.path.exists(metadata_file) and os.path.exists(chunks_file):
+             # Load existing index
+             self.index = faiss.read_index(index_file)
+             with open(metadata_file, 'rb') as f:
+                 self.metadata = pickle.load(f)
+             with open(chunks_file, 'rb') as f:
+                 self.chunks = pickle.load(f)
+             print(f"Loaded existing index with {self.index.ntotal} vectors")
+         else:
+             # Create new index
+             if self.config.vector_store.metric == "cosine":
+                 # Use IndexFlatIP with normalized vectors for cosine similarity
+                 self.index = faiss.IndexFlatIP(self.dimension)
+             else:
+                 # Use IndexFlatL2 for Euclidean distance
+                 self.index = faiss.IndexFlatL2(self.dimension)
+             print("Created new index")
+
+     def process_books(self, force_rebuild: bool = False):
+         """Process all books in the books directory"""
+         if self.index.ntotal > 0 and not force_rebuild:
+             print(f"Index already contains {self.index.ntotal} vectors. Use force_rebuild=True to rebuild.")
+             return
+
+         # Clear existing data if rebuilding
+         if force_rebuild:
+             self.index = faiss.IndexFlatIP(self.dimension) if self.config.vector_store.metric == "cosine" else faiss.IndexFlatL2(self.dimension)
+             self.metadata = []
+             self.chunks = []
+
+         # Process each book
+         book_files = list(Path(self.books_path).glob("*.txt"))
+         print(f"Found {len(book_files)} books to process")
+
+         for book_file in book_files:
+             print(f"Processing {book_file.name}...")
+             self._process_single_book(book_file)
+
+         # Save index
+         self._save_index()
+         print(f"Processing complete. Index contains {self.index.ntotal} vectors")
+
+     def _process_single_book(self, book_path: Path):
+         """Process a single book file"""
+         try:
+             # Read book content
+             with open(book_path, 'r', encoding='utf-8') as f:
+                 content = f.read()
+
+             # Extract book name
+             book_name = book_path.stem.replace('_', ' ').title()
+
+             # Split into chunks
+             chunks = self._create_chunks(content)
+
+             # Process each chunk
+             for i, chunk in enumerate(chunks):
+                 # Skip empty chunks
+                 if not chunk.strip():
+                     continue
+
+                 # Create embedding
+                 embedding = self._create_embedding(chunk)
+
+                 # Normalize for cosine similarity
+                 if self.config.vector_store.metric == "cosine":
+                     embedding = embedding / np.linalg.norm(embedding)
+
+                 # Add to index
+                 self.index.add(np.array([embedding]))
+
+                 # Store metadata
+                 metadata = {
+                     "book": book_name,
+                     "chunk_id": i,
+                     "timestamp": datetime.now().isoformat(),
+                     "char_count": len(chunk),
+                     "checksum": hashlib.md5(chunk.encode()).hexdigest()
+                 }
+                 self.metadata.append(metadata)
+                 self.chunks.append(chunk)
+
+         except Exception as e:
+             print(f"Error processing {book_path}: {str(e)}")
+
+     def _create_chunks(self, text: str) -> List[str]:
+         """Split text into chunks using a sliding window over tokens"""
+         # Clean text
+         text = text.strip()
+         if not text:
+             return []
+
+         # Tokenize
+         tokens = self.tokenizer.encode(text, add_special_tokens=False)
+
+         chunks = []
+         chunk_size = self.config.vector_store.chunk_size
+         overlap = self.config.vector_store.chunk_overlap
+
+         # Create chunks with overlap (step = chunk_size - overlap)
+         for i in range(0, len(tokens), chunk_size - overlap):
+             chunk_tokens = tokens[i:i + chunk_size]
+             chunk_text = self.tokenizer.decode(chunk_tokens, skip_special_tokens=True)
+             chunks.append(chunk_text)
+
+         return chunks
+
+     def _create_embedding(self, text: str) -> np.ndarray:
+         """Create embedding for text"""
+         embedding = self.embedding_model.encode(text, convert_to_numpy=True)
+         return embedding.astype('float32')
+
+     def search(self, query: str, k: int = None, filter_books: List[str] = None) -> List[Dict]:
+         """Search for similar chunks in the knowledge base"""
+         if self.index.ntotal == 0:
+             return []
+
+         k = k or self.config.vector_store.n_results
+
+         # Create query embedding
+         query_embedding = self._create_embedding(query)
+
+         # Normalize for cosine similarity
+         if self.config.vector_store.metric == "cosine":
+             query_embedding = query_embedding / np.linalg.norm(query_embedding)
+
+         # Search
+         distances, indices = self.index.search(
+             np.array([query_embedding]),
+             min(k, self.index.ntotal)
+         )
+
+         # Compile results
+         results = []
+         for i, (dist, idx) in enumerate(zip(distances[0], indices[0])):
+             if idx < 0:  # Invalid index
+                 continue
+
+             metadata = self.metadata[idx]
+
+             # Apply book filter if specified
+             if filter_books and metadata["book"] not in filter_books:
+                 continue
+
+             result = {
+                 "text": self.chunks[idx],
+                 "book": metadata["book"],
+                 "score": float(dist),
+                 "rank": i + 1,
+                 "metadata": metadata
+             }
+             results.append(result)
+
+         # Sort by score (higher is better for cosine similarity; lower for L2)
+         results.sort(key=lambda x: x["score"], reverse=(self.config.vector_store.metric == "cosine"))
+
+         return results[:k]
+
+     def search_with_context(self, query: str, k: int = None, context_window: int = 1) -> List[Dict]:
+         """Search and include surrounding context chunks"""
+         results = self.search(query, k)
+
+         # Expand each result with context
+         expanded_results = []
+         for result in results:
+             chunk_idx = result["metadata"]["chunk_id"]
+             book = result["book"]
+
+             # Get surrounding chunks from the same book
+             context_chunks = []
+
+             # Get previous chunks
+             for i in range(context_window, 0, -1):
+                 prev_idx = self._find_chunk_index(book, chunk_idx - i)
+                 if prev_idx is not None:
+                     context_chunks.append(self.chunks[prev_idx])
+
+             # Add main chunk
+             context_chunks.append(result["text"])
+
+             # Get next chunks
+             for i in range(1, context_window + 1):
+                 next_idx = self._find_chunk_index(book, chunk_idx + i)
+                 if next_idx is not None:
+                     context_chunks.append(self.chunks[next_idx])
+
+             # Create expanded result
+             expanded_result = result.copy()
+             expanded_result["context"] = "\n\n".join(context_chunks)
+             expanded_result["context_size"] = len(context_chunks)
+             expanded_results.append(expanded_result)
+
+         return expanded_results
+
+     def _find_chunk_index(self, book: str, chunk_id: int) -> Optional[int]:
+         """Find index of a specific chunk"""
+         for i, metadata in enumerate(self.metadata):
+             if metadata["book"] == book and metadata["chunk_id"] == chunk_id:
+                 return i
+         return None
+
+     def add_text(self, text: str, source: str, metadata: Dict = None):
+         """Add a single text to the knowledge base"""
+         # Create chunks
+         chunks = self._create_chunks(text)
+
+         # Process each chunk
+         for i, chunk in enumerate(chunks):
+             if not chunk.strip():
+                 continue
+
+             # Create embedding
+             embedding = self._create_embedding(chunk)
+
+             # Normalize if needed
+             if self.config.vector_store.metric == "cosine":
+                 embedding = embedding / np.linalg.norm(embedding)
+
+             # Add to index
+             self.index.add(np.array([embedding]))
+
+             # Create metadata
+             chunk_metadata = {
+                 "book": source,
+                 "chunk_id": i,
+                 "timestamp": datetime.now().isoformat(),
+                 "char_count": len(chunk),
+                 "checksum": hashlib.md5(chunk.encode()).hexdigest()
+             }
+
+             # Add custom metadata if provided
+             if metadata:
+                 chunk_metadata.update(metadata)
+
+             self.metadata.append(chunk_metadata)
+             self.chunks.append(chunk)
+
+         # Save changes
+         self._save_index()
+
+     def _save_index(self):
+         """Save index and metadata to disk"""
+         os.makedirs(self.index_path, exist_ok=True)
+
+         # Save FAISS index
+         index_file = os.path.join(self.index_path, "knowledge.index")
+         faiss.write_index(self.index, index_file)
+
+         # Save metadata
+         metadata_file = os.path.join(self.index_path, "metadata.pkl")
+         with open(metadata_file, 'wb') as f:
+             pickle.dump(self.metadata, f)
+
+         # Save chunks
+         chunks_file = os.path.join(self.index_path, "chunks.pkl")
+         with open(chunks_file, 'wb') as f:
+             pickle.dump(self.chunks, f)
+
+         # Save config
+         config_file = os.path.join(self.index_path, "config.json")
+         with open(config_file, 'w') as f:
+             json.dump({
+                 "dimension": self.dimension,
+                 "metric": self.config.vector_store.metric,
+                 "total_chunks": len(self.chunks),
+                 "books": list(set(m["book"] for m in self.metadata)),
+                 "last_updated": datetime.now().isoformat()
+             }, f, indent=2)
+
+     def get_stats(self) -> Dict:
+         """Get statistics about the knowledge base"""
+         if not self.metadata:
+             return {"status": "empty"}
+
+         books = {}
+         for metadata in self.metadata:
+             book = metadata["book"]
+             if book not in books:
+                 books[book] = {"chunks": 0, "chars": 0}
+             books[book]["chunks"] += 1
+             books[book]["chars"] += metadata["char_count"]
+
+         return {
+             "total_chunks": len(self.chunks),
+             "total_books": len(books),
+             "books": books,
+             "index_size": self.index.ntotal,
+             "dimension": self.dimension,
+             "metric": self.config.vector_store.metric
+         }
+
+     def clear(self):
+         """Clear the entire knowledge base"""
+         self.index = faiss.IndexFlatIP(self.dimension) if self.config.vector_store.metric == "cosine" else faiss.IndexFlatL2(self.dimension)
+         self.metadata = []
+         self.chunks = []
+         self._save_index()
+         print("Knowledge base cleared")
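The windowing arithmetic in `_create_chunks` is easy to get wrong, so here is a stdlib-only sketch of the same stride logic (a hypothetical `sliding_chunks` helper over integer "tokens" instead of the Mistral tokenizer):

```python
def sliding_chunks(tokens, chunk_size, overlap):
    """Mirror of _create_chunks' windowing: the step is chunk_size - overlap,
    so each window shares `overlap` tokens with the previous one."""
    step = chunk_size - overlap
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), step)]

chunks = sliding_chunks(list(range(10)), chunk_size=4, overlap=1)
# Windows start at 0, 3, 6, 9; consecutive windows overlap by one token.
assert chunks == [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9], [9]]
```

One consequence worth noting: the final window can be much shorter than `chunk_size` (here a single token), and `chunk_overlap` must stay strictly below `chunk_size` or the step becomes zero or negative.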