Scheduled Crawling
Set up automatic crawls to keep your knowledge base synchronized with website changes.
Creating a Schedule
- Go to "Knowledge Base" > "Schedules"
- Click "Create Schedule"
- Configure the schedule:
  - URL to crawl
  - Crawl mode and options
  - Frequency
- Click "Save"
Schedule Frequencies
| Frequency | Description | Best For |
|---|---|---|
| Hourly | Every hour | Rapidly changing content |
| Daily | Once per day | Regular updates |
| Weekly | Once per week | Stable content |
| Monthly | Once per month | Documentation |
Note: Higher frequencies use more resources and may count against your plan limits.
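As a rough illustration of how frequency affects resource usage, the sketch below maps each frequency in the table to a fixed interval and counts how many crawls it produces over 30 days. The names `INTERVALS`, `next_run`, and `runs_per_30_days` are hypothetical and not part of the product, and treating "Monthly" as 30 days is an assumption; the actual scheduler may use calendar months.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from the frequency names in the table to intervals.
# "monthly" is approximated as 30 days (an assumption for illustration).
INTERVALS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
}


def next_run(last_run: datetime, frequency: str) -> datetime:
    """Return the next run time after last_run for a given frequency."""
    return last_run + INTERVALS[frequency.lower()]


def runs_per_30_days(frequency: str) -> int:
    """Count how many complete intervals fit into a 30-day window."""
    return int(timedelta(days=30) / INTERVALS[frequency.lower()])


if __name__ == "__main__":
    for name in INTERVALS:
        print(name, runs_per_30_days(name))
```

Under these assumptions an hourly schedule runs 720 times in 30 days, compared with 30 for a daily one, which is why higher frequencies reach plan limits sooner.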
Schedule Options
Update Mode
| Mode | Behavior |
|---|---|
| Replace | Delete old content, add new |
| Merge | Keep existing, add new pages |
| Diff | Only update changed pages |
Diff mode is recommended for most use cases: it is faster because only changed pages are updated, and it preserves existing content.
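The difference between the three modes can be sketched with plain dictionaries that map a page URL to its content. This is only one possible reading of the table, not the product's implementation: `apply_update` is a hypothetical function, and the sketch assumes that Merge never overwrites an existing page and that Diff also adds pages it has not seen before. Whether Diff removes pages that have disappeared from the site is not stated, so the sketch leaves them in place.

```python
def apply_update(existing: dict, crawled: dict, mode: str) -> dict:
    """Return the knowledge base contents after applying one crawl.

    existing: pages currently stored, keyed by URL (hypothetical shape)
    crawled:  pages returned by the latest crawl, keyed by URL
    mode:     "replace", "merge", or "diff"
    """
    if mode == "replace":
        # Delete old content, add new: only the crawled pages remain.
        return dict(crawled)

    if mode == "merge":
        # Keep existing pages untouched, add pages that are new.
        merged = dict(existing)
        for url, content in crawled.items():
            merged.setdefault(url, content)
        return merged

    if mode == "diff":
        # Update only pages whose content differs (assumed to include new pages).
        updated = dict(existing)
        for url, content in crawled.items():
            if existing.get(url) != content:
                updated[url] = content
        return updated

    raise ValueError(f"unknown update mode: {mode}")


if __name__ == "__main__":
    stored = {"/a": "old", "/b": "same"}
    latest = {"/a": "new", "/c": "added"}
    for mode in ("replace", "merge", "diff"):
        print(mode, apply_update(stored, latest, mode))
```

In this example, Replace drops `/b` because it was not in the latest crawl, Merge keeps the old version of `/a`, and Diff updates `/a` while leaving `/b` as it was.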
Notifications
Get notified about crawl results:
- On completion: when the crawl finishes
- On failure: when the crawl fails
- On changes: when new content is found
Managing Schedules
View Schedules
- Go to "Knowledge Base" > "Schedules"
- See all active schedules with:
  - Next run time
  - Last run status
  - Page count
Edit a Schedule
- Click on the schedule
- Modify settings
- Click "Save"
Pause a Schedule
Temporarily stop scheduled crawls:
- Click on the schedule
- Click "Pause"
Delete a Schedule
- Click on the schedule
- Click "Delete"
- Confirm deletion
Note: Deleting a schedule doesn't delete already crawled content.
Crawl History
View past crawl runs:
- Go to "Knowledge Base" > "Schedules"
- Click on a schedule
- View the "History" tab
History shows:
- Run date and time
- Pages crawled
- Errors encountered
- Duration
URL Management
Adding URLs
Add multiple URLs to a single schedule:
- Edit the schedule
- Add URLs to the list
Each URL is crawled with the same settings.
URL Patterns
Use patterns to include related URLs:
https://docs.example.com/v1/*
https://docs.example.com/v2/*
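To check which URLs a pattern would cover before saving a schedule, you can approximate the matching locally. The sketch below assumes the patterns behave like shell-style globs, where `*` matches any run of characters including `/`; the product's exact matching rules are not specified here, and `matches_any` is a hypothetical helper. It uses `fnmatchcase` from Python's standard library.

```python
from fnmatch import fnmatchcase

# The two example patterns from this section.
PATTERNS = [
    "https://docs.example.com/v1/*",
    "https://docs.example.com/v2/*",
]


def matches_any(url: str, patterns: list = PATTERNS) -> bool:
    """Return True if the URL matches at least one glob-style pattern."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)


if __name__ == "__main__":
    candidates = [
        "https://docs.example.com/v1/guide/install",
        "https://docs.example.com/v2/",
        "https://docs.example.com/v1",
        "https://docs.example.com/v3/guide",
    ]
    for url in candidates:
        print(url, matches_any(url))
```

Under glob semantics, `https://docs.example.com/v1` (no trailing slash) does not match `https://docs.example.com/v1/*`, which is a common reason for a page being skipped unexpectedly.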
Best Practices
Choosing Frequency
Consider:
- How often the content changes
- Plan limits and resource usage
- The impact of stale content on your users
Monitoring
- Review crawl history regularly
- Check for failed crawls
- Verify content is being updated
Organization
- Group related URLs in one schedule
- Use descriptive schedule names
- Document schedule purposes
Troubleshooting
Scheduled Crawl Not Running
Check that:
- The schedule is not paused
- The time zone settings are correct
- Plan limits have not been exceeded
Content Not Updating
Check that:
- The update mode is set as intended
- URL patterns match the pages you expect
- The website is not blocking the crawler