Data Sources Management

Data sources are the knowledge bases that AI assistants draw on. They supply the information the AI uses to answer questions accurately.

📚 Overview

Data sources come in three types:

  • Model Data: Information from your Odoo records
  • URL Content: Web pages and documentation
  • Text Data: Manual content and documents

Each type serves different purposes and has specific setup requirements.

🧠 Understanding Data Sources

How Data Sources Work

  1. Indexing Process: Data is processed and converted into a searchable format
  2. Vector Embeddings: Content is converted to numerical vectors for semantic search
  3. Knowledge Retrieval: AI searches these vectors to find relevant information
  4. Context Integration: Retrieved data is included in AI responses
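
The four steps above can be sketched in a few lines of Python. This is only an illustration of the general retrieval idea, not the module's implementation: the `embed` function is a toy word-count stand-in for a real embedding model, and all function names here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents: list) -> list:
    """Steps 1-2: process each document and store it next to its vector."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index: list, question: str, top_k: int = 1) -> list:
    """Step 3: rank stored documents by similarity to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]


def build_prompt(question: str, context: list) -> str:
    """Step 4: place the retrieved text in front of the question."""
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question


question = "How many days do I have to return products?"
index = build_index([
    "Standard shipping takes 3-5 business days.",
    "Customers can return products within 30 days of purchase.",
])
context = retrieve(index, question)
prompt = build_prompt(question, context)
```

A real embedding model captures meaning rather than exact word overlap, but the flow is the same: index once, then search the vectors at question time.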

Data Source Types

Model Data Sources

  • Index information from your Odoo records
  • Automatically updates when records change
  • Supports any non-transient Odoo model and, by default, all of its non-binary fields
  • Best for business data and records

URL Data Sources

  • Index content from websites and documentation
  • Processes sitemaps automatically
  • Supports regex patterns for URL filtering
  • Best for external documentation and websites

Text Data Sources

  • Manual content entry and management
  • Flexible for any type of information
  • Easy to update and maintain
  • Best for policies, procedures, and custom content

🛠️ Creating Model Data Sources

Basic Setup

  1. Access Data Sources

    • Go to Settings → Technical → AI → Data Sources
    • Click Create to add a new data source
  2. Basic Information

    • Name: Descriptive name (e.g., "Product Catalog")
    • Type: Select "Model"
    • Description: Brief description of the data source
  3. Model Selection

    • Model: Choose the Odoo model to index
    • Common models: product.template, res.partner, sale.order
    • Only non-transient models are available

Field Configuration

Field Selection

  • Model Fields: Select specific fields to index
  • Leave empty to index all non-binary fields
  • Focus on relevant fields for better performance
  • Exclude sensitive or unnecessary fields
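
The "leave empty" rule can be pictured as follows. This is a minimal sketch, assuming field metadata shaped like the dictionary returned by Odoo's `fields_get()`; the helper name `fields_to_index` and the sample data are hypothetical and not part of the module.

```python
from typing import Optional


def fields_to_index(field_info: dict, selected: Optional[list] = None) -> list:
    """Return the selected fields, or every non-binary field when none are selected."""
    if selected:
        # Keep only selected names that actually exist on the model.
        return [name for name in selected if name in field_info]
    return [name for name, attrs in field_info.items() if attrs.get("type") != "binary"]


# Hypothetical metadata for illustration only.
product_fields = {
    "name": {"type": "char"},
    "list_price": {"type": "float"},
    "image_1920": {"type": "binary"},
}

all_indexable = fields_to_index(product_fields)
only_name = fields_to_index(product_fields, ["name"])
```

An explicit selection keeps the index smaller and makes it easier to see that sensitive fields are left out.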

Recommended Fields by Model Type:

Product Catalog (product.template)

  • name - Product name
  • description - Product description
  • list_price - Product price
  • categ_id - Product category
  • default_code - Product code

Customer Database (res.partner)

  • name - Customer name
  • email - Email address
  • phone - Phone number
  • street - Address
  • comment - Notes

Sales Orders (sale.order)

  • name - Order reference
  • partner_id - Customer
  • amount_total - Total amount
  • state - Order status

Record Filtering

Domain Configuration

  • Record Domain: Filter which records to include
  • Use Odoo domain syntax
  • Examples:
    • [('active', '=', True)] - Active records only
    • [('state', 'in', ['draft', 'confirmed'])] - Specific states
    • [('create_date', '>=', '2024-01-01')] - Recent records
  • Note: In addition to the record domain, when interacting with a user, the AI can only access records that the user has permission to view.

Common Domain Examples:

Active Products Only

[('active', '=', True)]

Confirmed Sales Orders

[('state', 'in', ['sale', 'done'])]

Customers with Email

[('email', '!=', False)]
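
To see how such a domain narrows a record set, here is a small evaluator over plain dictionaries. It is a simplified sketch, not Odoo's domain engine: it treats the conditions as an implicit AND and ignores prefix operators such as '|' and '!' as well as dotted field paths. The `matches` helper and the sample records are hypothetical.

```python
import operator

# Subset of domain operators supported by this sketch.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "in": lambda value, options: value in options,
    "not in": lambda value, options: value not in options,
}


def matches(record: dict, domain: list) -> bool:
    """True when the record satisfies every (field, operator, value) condition."""
    return all(
        OPERATORS[op](record.get(field, False), value)
        for field, op, value in domain
    )


orders = [
    {"name": "S00001", "state": "draft"},
    {"name": "S00002", "state": "sale"},
    {"name": "S00003", "state": "done"},
]
partners = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Bob", "email": False},  # empty fields read as False
]

confirmed = [o["name"] for o in orders if matches(o, [("state", "in", ["sale", "done"])])]
with_email = [p["name"] for p in partners if matches(p, [("email", "!=", False)])]
```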

🌐 Creating URL Data Sources

Basic Setup

  1. Access Data Sources

    • Go to Settings → Technical → AI → Data Sources
    • Click Create to add a new data source
  2. Basic Information

    • Name: Descriptive name (e.g., "Product Documentation")
    • Type: Select "URL"
    • Description: Brief description of the content

URL Pattern Configuration

URL Regex Patterns

  • Define which URLs to index
  • Use regex patterns for flexible matching
  • Separate multiple patterns with line breaks

Pattern Examples:

Specific Pages

/contactus
/about-us
/help/faq

Section Patterns

/products/*
/support/*
/documentation/*

External Websites

https://example.com/help/*
https://docs.company.com/*

Mixed Patterns

/contactus
/products/*
https://support.company.com/*
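
The examples above use `*` the way a wildcard would. This page does not say how the module evaluates patterns, so the following sketch only shows what standard regular expressions would do, assuming one pattern per line and Python's `re.search`. Under those assumptions, `/*` means "zero or more slashes", and `.*` is the form that matches any remainder. The `select_urls` helper and the URLs are hypothetical.

```python
import re


def select_urls(urls: list, patterns_text: str) -> list:
    """Keep URLs matched by at least one pattern (one regex per non-empty line)."""
    patterns = [
        re.compile(line.strip())
        for line in patterns_text.splitlines()
        if line.strip()
    ]
    return [url for url in urls if any(p.search(url) for p in patterns)]


PATTERNS = r"""
/contactus
/products/.*
https://support\.company\.com/.*
"""

urls = [
    "https://example.com/contactus",
    "https://example.com/products/widget",
    "https://example.com/blog/post-1",
    "https://support.company.com/tickets",
]
selected = select_urls(urls, PATTERNS)

# In strict regex terms, "/*" allows zero slashes, so this also matches
# an unrelated path that merely starts with "/products".
loose = re.search("/products/*", "https://example.com/products-archive")
strict = re.search("/products/.*", "https://example.com/products-archive")
```

If a pattern selects more pages than expected, testing it against a handful of known URLs in this way is a quick check before indexing.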

Sitemap Processing

How It Works

  • System automatically processes sitemaps
  • Matches URLs against your regex patterns
  • Indexes content from matching pages
  • Handles both internal and external websites

Sitemap Requirements

  • Website must have a sitemap.xml
  • URLs must be accessible
  • Content should be in HTML format
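
As an illustration of the sitemap step, the sketch below extracts page URLs from a sitemap document and keeps those matching a pattern list. It uses only the Python standard library and an inline sample sitemap; the function name and sample URLs are hypothetical, and the module's actual crawler may work differently.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/help/faq</loc></url>
  <url><loc>https://example.com/products/widget</loc></url>
  <url><loc>https://example.com/blog/news</loc></url>
</urlset>"""


def urls_from_sitemap(xml_text: str) -> list:
    """Return the <loc> value of every <url> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


all_urls = urls_from_sitemap(SAMPLE_SITEMAP)

# Keep only the pages that match one of the configured patterns.
patterns = [re.compile(p) for p in (r"/help/", r"/products/")]
to_index = [url for url in all_urls if any(p.search(url) for p in patterns)]
```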

📝 Creating Text Data Sources

Basic Setup

  1. Access Data Sources

    • Go to Settings → Technical → AI → Data Sources
    • Click Create to add a new data source
  2. Basic Information

    • Name: Descriptive name (e.g., "Company Policies")
    • Type: Select "Text"
    • Description: Brief description of the content

Adding Content

Content Structure

  • Title: Brief description of the content
  • Content: The actual text to be indexed
  • Add multiple entries for different topics

Content Examples:

Company Policies

Title: Vacation Policy
Content: Employees are entitled to 20 days of vacation per year. Vacation requests must be submitted at least 2 weeks in advance.

Title: Remote Work Policy
Content: Employees can work remotely up to 3 days per week. Remote work requires manager approval and stable internet connection.

Product Information

Title: Product Return Policy
Content: Customers can return products within 30 days of purchase. Returns must be in original condition with all packaging.

Title: Shipping Information
Content: Standard shipping takes 3-5 business days. Express shipping is available for an additional fee.

FAQ Content

Title: How to Reset Password
Content: To reset your password, click the "Forgot Password" link on the login page. You will receive an email with reset instructions.

Title: Contact Support
Content: For technical support, email [email protected] or call 1-800-SUPPORT during business hours.

🔍 Indexing Data Sources

Manual Indexing Process

  1. Prepare Data Source

    • Ensure all configuration is complete
    • Verify data is accessible
    • Check field selections and domains
  2. Start Indexing

    • Go to the data source record
    • Click the Index button
    • Monitor progress in the logs
  3. Monitor Progress

    • Check Data Item Count to see indexed items
    • Review logs for any indexing errors
    • Verify data quality in the data items list

Indexing Process Details

Batch Processing

  • System processes records in batches of 100
  • Creates vector embeddings for semantic search
  • Stores data in PostgreSQL with pgvector
  • Handles large datasets efficiently
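
Batching of this kind can be sketched as a small generator. The batch size of 100 comes from the description above; the `batched` helper and the record IDs are hypothetical, and the embedding and storage steps are left out.

```python
from itertools import islice
from typing import Iterable, Iterator

BATCH_SIZE = 100


def batched(items: Iterable, size: int = BATCH_SIZE) -> Iterator[list]:
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# 250 hypothetical record IDs are split into batches of 100, 100, and 50.
record_ids = range(1, 251)
batch_sizes = [len(batch) for batch in batched(record_ids)]
```

Working in fixed-size batches keeps memory use bounded, since only one batch of records and embeddings needs to be held at a time.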

Performance Considerations

  • Large datasets may take time to index
  • Monitor system resources during indexing
  • Consider indexing during off-peak hours
  • Check database storage space

Monitoring and Maintenance

Data Quality Checks

  • Review indexed data items
  • Verify content accuracy
  • Check for missing or incorrect data
  • Update data sources as needed

Regular Maintenance

  • Re-index when data changes significantly
  • Monitor data source performance
  • Update content regularly
  • Remove outdated information

📋 Best Practices

Data Source Design

Choose the Right Type

  • Use Model sources for business data
  • Use URL sources for external documentation
  • Use Text sources for policies and procedures

Content Quality

  • Ensure information is accurate and up-to-date
  • Use clear, concise language
  • Organize content logically
  • Include relevant keywords

Performance Optimization

  • Index only necessary fields
  • Use appropriate record domains
  • Limit URL patterns to relevant content
  • Monitor indexing performance

Security Considerations

Data Privacy

  • Avoid indexing sensitive information
  • Use record domains to exclude private data
  • Review field selections carefully
  • Monitor data access permissions

Access Control

  • Limit data source access appropriately
  • Review assistant permissions
  • Monitor data usage patterns
  • Run regular security audits

🔧 Troubleshooting

Common Issues

Indexing Failures

  • Check model permissions
  • Verify URL accessibility
  • Review domain syntax
  • Check system resources

Poor Search Results

  • Review field selections
  • Check content quality
  • Verify indexing completion
  • Test with simple queries

Performance Issues

  • Optimize record domains
  • Reduce field selections
  • Monitor system resources
  • Consider batch processing

Getting Help

  • Check the troubleshooting guide for detailed solutions
  • Review system logs for error messages
  • Contact support with specific error details
  • Verify all prerequisites are met

This guide covers the essential aspects of data source management. Properly configured data sources are crucial for AI assistant performance and accuracy.