Data Sources Management¶
Data sources are the knowledge bases that AI assistants draw on. They supply the information the AI retrieves to answer questions accurately.
📚 Overview¶
Data sources come in three types:
- Model Data: Information from your Odoo records
- URL Content: Web pages and documentation
- Text Data: Manual content and documents
Each type serves different purposes and has specific setup requirements.
🧠 Understanding Data Sources¶
How Data Sources Work¶
- Indexing Process: Data is processed and converted into a searchable format
- Vector Embeddings: Content is converted to numerical vectors for semantic search
- Knowledge Retrieval: AI searches these vectors to find relevant information
- Context Integration: Retrieved data is included in AI responses
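The four steps above can be illustrated with a toy sketch. It makes no claim about how the module is actually implemented: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, the in-memory `index` list stands in for the vector store, and `build_prompt` is an assumed name for the context integration step.

```python
import math
from collections import Counter

def embed(text):
    """Hypothetical stand-in for an embedding model: word counts as a sparse vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Steps 1 and 2: index the content and store one vector per item.
documents = [
    "Standard shipping takes 3-5 business days",
    "Customers can return products within 30 days of purchase",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(question, top_k=1):
    """Step 3: rank indexed items by similarity to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

def build_prompt(question):
    """Step 4: place the retrieved text in front of the question."""
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

A real embedding model captures meaning rather than exact word overlap, so it would also match paraphrased questions that this toy version misses.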
Data Source Types¶
Model Data Sources
- Index information from your Odoo records
- Automatically updates when records change
- Supports non-transient Odoo models and their fields
- Best for business data and records
URL Data Sources
- Index content from websites and documentation
- Processes sitemaps automatically
- Supports regex patterns for URL filtering
- Best for external documentation and websites
Text Data Sources
- Manual content entry and management
- Flexible for any type of information
- Easy to update and maintain
- Best for policies, procedures, and custom content
🛠️ Creating Model Data Sources¶
Basic Setup¶
- Access Data Sources
  - Go to Settings → Technical → AI → Data Sources
  - Click Create to add a new data source
- Basic Information
  - Name: Descriptive name (e.g., "Product Catalog")
  - Type: Select "Model"
  - Description: Brief description of the data source
- Model Selection
  - Model: Choose the Odoo model to index
  - Common models: product.template, res.partner, sale.order
  - Only non-transient models are available
Field Configuration¶
Field Selection
- Model Fields: Select specific fields to index
- Leave empty to index all non-binary fields
- Focus on relevant fields for better performance
- Exclude sensitive or unnecessary fields
Recommended Fields by Model Type:
Product Catalog (product.template)
- name - Product name
- description - Product description
- list_price - Product price
- categ_id - Product category
- default_code - Product code
Customer Database (res.partner)
- name - Customer name
- email - Email address
- phone - Phone number
- street - Address
- comment - Notes
Sales Orders (sale.order)
- name - Order reference
- partner_id - Customer
- amount_total - Total amount
- state - Order status
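The field selection rule described above (use the selected fields, or fall back to every non-binary field when none are selected) can be sketched as follows. This is an illustration only: `fields_to_index` is a hypothetical helper, and the `product_fields` dictionary is hand-written sample metadata that loosely mimics the name-to-type shape of field descriptions in Odoo, with illustrative type values.

```python
def fields_to_index(field_meta, selected=None):
    """Return the field names to index.

    field_meta: mapping of field name -> {"type": ...}
    selected:   explicit field names; empty or None means "all non-binary fields".
    """
    if selected:
        # Keep only selected names that actually exist on the model.
        return [name for name in selected if name in field_meta]
    return sorted(
        name for name, meta in field_meta.items() if meta["type"] != "binary"
    )

# Hand-written sample metadata (assumption, not read from a real database).
product_fields = {
    "name": {"type": "char"},
    "description": {"type": "text"},
    "list_price": {"type": "float"},
    "image_1920": {"type": "binary"},
}

all_fields = fields_to_index(product_fields)
chosen = fields_to_index(product_fields, ["name", "list_price"])
```

Selecting a short, explicit list keeps each indexed item small, which is the performance benefit the recommendations above are aiming for.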
Record Filtering¶
Domain Configuration
- Record Domain: Filter which records to include
- Use Odoo domain syntax
- Examples:
- [('active', '=', True)] - Active records only
- [('state', 'in', ['draft', 'confirmed'])] - Specific states
- [('create_date', '>=', '2024-01-01')] - Recent records
- Note: In addition to the record domain, when interacting with a user, the AI can only access records that the user has permission to view.
Common Domain Examples:
Active Products Only
Confirmed Sales Orders
Customers with Email
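The three cases above could be written as domains roughly like this. The field names (`active`, `state`, `email`) and the `"sale"` state value are assumptions based on the standard product, sales, and contact models, so check them against your own database. The `matches` helper is hypothetical and only imitates the implicit-AND behaviour of a domain on plain dictionaries; it is not Odoo's evaluator.

```python
import operator

# Active Products Only (product.template)
active_products = [("active", "=", True)]

# Confirmed Sales Orders (sale.order); assumes "sale" marks a confirmed order
confirmed_orders = [("state", "=", "sale")]

# Customers with Email (res.partner); unset fields are assumed to read as False
customers_with_email = [("email", "!=", False)]

OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "in": lambda value, options: value in options,
}

def matches(record, domain):
    """Simplified check: every (field, operator, value) condition must hold."""
    return all(OPS[op](record.get(field), value) for field, op, value in domain)
```

For example, `matches({"state": "draft"}, confirmed_orders)` is false, so a draft order would be left out of the index under that domain.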
🌐 Creating URL Data Sources¶
Basic Setup¶
- Access Data Sources
  - Go to Settings → Technical → AI → Data Sources
  - Click Create to add a new data source
- Basic Information
  - Name: Descriptive name (e.g., "Product Documentation")
  - Type: Select "URL"
  - Description: Brief description of the content
URL Pattern Configuration¶
URL Regex Patterns
- Define which URLs to index
- Use regex patterns for flexible matching
- Separate multiple patterns with line breaks
Pattern Examples:
Specific Pages
Section Patterns
External Websites
Mixed Patterns
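The four kinds of pattern above might look like the following sketch. All URLs use the reserved example.com and example.org domains and are purely illustrative. Whether the module matches patterns against the full URL or searches within it is not stated here, so the use of `re.search` with explicit `^` anchors is an assumption, as are the helper names.

```python
import re

# One pattern per line, as entered in the URL regex field.
patterns_field = r"""
^https://www\.example\.com/docs/getting-started$
^https://www\.example\.com/docs/.*
^https://docs\.example\.org/.*
"""
# Line 1: a specific page.
# Line 2: a whole section of the same site.
# Line 3: an external website.
# Together the three lines form a mixed pattern set.

def compile_patterns(text):
    """Split the field on line breaks and compile every non-empty line."""
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]

def url_matches(url, patterns):
    """True if at least one pattern matches the URL."""
    return any(p.search(url) for p in patterns)

patterns = compile_patterns(patterns_field)
```

Escaping the dots (`\.`) matters: an unescaped `.` matches any character, so `www.example.com` would also match `wwwXexample.com`.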
Sitemap Processing¶
How It Works
- System automatically processes sitemaps
- Matches URLs against your regex patterns
- Indexes content from matching pages
- Handles both internal and external websites
Sitemap Requirements
- Website must have a sitemap.xml
- URLs must be accessible
- Content should be in HTML format
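As a rough illustration of the sitemap step, the sketch below reads the `<loc>` entries from a sitemap document and keeps the URLs that match at least one pattern. It works on an inline sample string rather than fetching anything, and the function names are hypothetical; the module's own crawler may behave differently.

```python
import re
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def urls_from_sitemap(xml_text):
    """Return every <loc> value found under <url> entries."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]

def filter_urls(urls, patterns):
    """Keep URLs that match at least one regex pattern."""
    compiled = [re.compile(p) for p in patterns]
    return [u for u in urls if any(c.search(u) for c in compiled)]

# Illustrative sitemap content (example.com is a reserved domain).
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/docs/install</loc></url>
  <url><loc>https://www.example.com/blog/news</loc></url>
</urlset>"""

to_index = filter_urls(urls_from_sitemap(sample), [r"/docs/"])
```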
📝 Creating Text Data Sources¶
Basic Setup¶
- Access Data Sources
  - Go to Settings → Technical → AI → Data Sources
  - Click Create to add a new data source
- Basic Information
  - Name: Descriptive name (e.g., "Company Policies")
  - Type: Select "Text"
  - Description: Brief description of the content
Adding Content¶
Content Structure
- Title: Brief description of the content
- Content: The actual text to be indexed
- Add multiple entries for different topics
Content Examples:
Company Policies
Title: Vacation Policy
Content: Employees are entitled to 20 days of vacation per year. Vacation requests must be submitted at least 2 weeks in advance.
Title: Remote Work Policy
Content: Employees can work remotely up to 3 days per week. Remote work requires manager approval and stable internet connection.
Product Information
Title: Product Return Policy
Content: Customers can return products within 30 days of purchase. Returns must be in original condition with all packaging.
Title: Shipping Information
Content: Standard shipping takes 3-5 business days. Express shipping is available for an additional fee.
FAQ Content
Title: How to Reset Password
Content: To reset your password, click the "Forgot Password" link on the login page. You will receive an email with reset instructions.
Title: Contact Support
Content: For technical support, email [email protected] or call 1-800-SUPPORT during business hours.
🔍 Indexing Data Sources¶
Manual Indexing Process¶
- Prepare Data Source
  - Ensure all configuration is complete
  - Verify data is accessible
  - Check field selections and domains
- Start Indexing
  - Go to the data source record
  - Click the Index button
  - Monitor progress in the logs
- Monitor Progress
  - Check Data Item Count to see indexed items
  - Review logs for any indexing errors
  - Verify data quality in the data items list
Indexing Process Details¶
Batch Processing
- System processes records in batches of 100
- Creates vector embeddings for semantic search
- Stores data in PostgreSQL with pgvector
- Handles large datasets efficiently
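Batching in groups of 100 can be pictured with the short sketch below. It only shows how a list of record IDs splits into batches; the `batched` helper is a hypothetical name, and the embedding and storage steps are left out because their implementation is not described here.

```python
def batched(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# 250 illustrative record IDs split into batches of 100, 100, and 50.
record_ids = list(range(1, 251))
batch_sizes = [len(batch) for batch in batched(record_ids)]
```

Working in fixed-size batches keeps memory use bounded regardless of how many records the data source covers.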
Performance Considerations
- Large datasets may take time to index
- Monitor system resources during indexing
- Consider indexing during off-peak hours
- Check database storage space
Monitoring and Maintenance¶
Data Quality Checks
- Review indexed data items
- Verify content accuracy
- Check for missing or incorrect data
- Update data sources as needed
Regular Maintenance
- Re-index when data changes significantly
- Monitor data source performance
- Update content regularly
- Remove outdated information
📋 Best Practices¶
Data Source Design¶
Choose the Right Type
- Use Model sources for business data
- Use URL sources for external documentation
- Use Text sources for policies and procedures
Content Quality
- Ensure information is accurate and up-to-date
- Use clear, concise language
- Organize content logically
- Include relevant keywords
Performance Optimization
- Index only necessary fields
- Use appropriate record domains
- Limit URL patterns to relevant content
- Monitor indexing performance
Security Considerations¶
Data Privacy
- Avoid indexing sensitive information
- Use record domains to exclude private data
- Review field selections carefully
- Monitor data access permissions
Access Control
- Limit data source access appropriately
- Review assistant permissions
- Monitor data usage patterns
- Run regular security audits
🔧 Troubleshooting¶
Common Issues¶
Indexing Failures
- Check model permissions
- Verify URL accessibility
- Review domain syntax
- Check system resources
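When reviewing domain syntax, a quick structural check outside Odoo can catch common mistakes such as an unclosed bracket or a condition with a missing element. The sketch below is a hypothetical helper, not part of the module: it verifies only the overall shape of the domain and does not check that fields or operators exist on the model.

```python
import ast

def check_domain(text):
    """Return None if the text looks like a well-formed domain, else a message."""
    try:
        domain = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        return f"not a valid Python literal: {exc}"
    if not isinstance(domain, list):
        return "domain must be a list"
    for term in domain:
        if isinstance(term, str):
            # Prefix logical operators used in domain expressions.
            if term not in ("&", "|", "!"):
                return f"unknown logical operator {term!r}"
        elif not (isinstance(term, (list, tuple)) and len(term) == 3):
            return f"condition must have 3 elements: {term!r}"
    return None
```

`ast.literal_eval` is used instead of `eval` so that the text is parsed as data and never executed.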
Poor Search Results
- Review field selections
- Check content quality
- Verify indexing completion
- Test with simple queries
Performance Issues
- Optimize record domains
- Reduce field selections
- Monitor system resources
- Consider batch processing
Getting Help¶
- Check the troubleshooting guide for detailed solutions
- Review system logs for error messages
- Contact support with specific error details
- Verify all prerequisites are met
This guide covers the essential aspects of data source management. Properly configured data sources are crucial for AI assistant performance and accuracy.