Scheduled Crawling

Set up automatic crawls to keep your knowledge base synchronized with website changes.

Creating a Schedule

  1. Go to "Knowledge Base" > "Schedules"
  2. Click "Create Schedule"
  3. Configure the schedule:
    • URL to crawl
    • Crawl mode and options
    • Frequency
  4. Click "Save"
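For teams that keep schedule definitions in version control alongside the dashboard, the settings from step 3 can be sketched as a small data structure. This is an illustrative sketch only: the class name `CrawlSchedule`, its field names, and the lowercase frequency and mode strings are assumptions, not the product's actual API or export format.

```python
from dataclasses import dataclass

# Hypothetical values mirroring the options described in this guide.
FREQUENCIES = ("hourly", "daily", "weekly", "monthly")
UPDATE_MODES = ("replace", "merge", "diff")


@dataclass
class CrawlSchedule:
    """Illustrative record of one schedule (field names are assumptions)."""
    name: str
    urls: list[str]
    frequency: str = "daily"
    update_mode: str = "diff"
    paused: bool = False

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the schedule looks valid."""
        problems = []
        if not self.urls:
            problems.append("at least one URL is required")
        for url in self.urls:
            if not url.startswith(("http://", "https://")):
                problems.append(f"not an http(s) URL: {url}")
        if self.frequency not in FREQUENCIES:
            problems.append(f"unknown frequency: {self.frequency}")
        if self.update_mode not in UPDATE_MODES:
            problems.append(f"unknown update mode: {self.update_mode}")
        return problems


schedule = CrawlSchedule(
    name="Docs weekly",
    urls=["https://docs.example.com/v1/*"],
    frequency="weekly",
)
print(schedule.validate())  # an empty list when nothing is wrong
```

Validating before saving catches typos in the frequency or update mode early, rather than at the first scheduled run.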

Schedule Frequencies

Frequency | Description    | Best For
Hourly    | Every hour     | Rapidly changing content
Daily     | Once per day   | Regular updates
Weekly    | Once per week  | Stable content
Monthly   | Once per month | Documentation
Note: Higher frequencies use more resources and may count against your plan limits.
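To get a rough sense of how much more an hourly schedule costs than a weekly one, you can estimate the number of runs in a 30-day period. The sketch below is a back-of-the-envelope calculation; the function name, the 30-day approximation for "monthly", and the pages-per-crawl figure are assumptions, and how runs or pages actually count against a plan depends on your plan.

```python
from datetime import timedelta

# Approximate interval between runs for each frequency in this guide.
INTERVALS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),  # approximation: months vary in length
}


def runs_per_period(frequency: str, period: timedelta = timedelta(days=30)) -> int:
    """Number of complete runs that fit into the period."""
    return period // INTERVALS[frequency]


def pages_per_period(frequency: str, pages_per_crawl: int) -> int:
    """Upper-bound estimate of pages fetched if every run crawls every page."""
    return runs_per_period(frequency) * pages_per_crawl


for freq in INTERVALS:
    print(freq, runs_per_period(freq), pages_per_period(freq, pages_per_crawl=200))
```

With these assumptions, an hourly schedule runs 720 times in 30 days, compared with 30 for daily, 4 for weekly, and 1 for monthly.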

Schedule Options

Update Mode

Mode    | Behavior
Replace | Delete old content, add new
Merge   | Keep existing, add new pages
Diff    | Only update changed pages

Diff mode is recommended for most use cases: it is faster because only changed pages are reprocessed, and it preserves existing content.
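The difference between the three modes is easiest to see on a small example. The sketch below models a knowledge base as a dictionary mapping page paths to content. It is a simplified model, not the product's implementation; in particular, it assumes that Diff mode also adds newly discovered pages and leaves pages missing from the crawl untouched, which this guide does not specify.

```python
def apply_update(existing: dict[str, str], crawled: dict[str, str], mode: str) -> dict[str, str]:
    """Return the knowledge base contents after a crawl, for a given update mode."""
    if mode == "replace":
        # Delete old content, add new: only the freshly crawled pages remain.
        return dict(crawled)
    if mode == "merge":
        # Keep existing, add new pages: existing entries win on conflict.
        return {**crawled, **existing}
    if mode == "diff":
        # Only update changed pages (assumption: new pages are added too).
        result = dict(existing)
        for path, content in crawled.items():
            if existing.get(path) != content:
                result[path] = content
        return result
    raise ValueError(f"unknown update mode: {mode}")


existing = {"/a": "old a", "/b": "b"}
crawled = {"/a": "new a", "/c": "c"}

for mode in ("replace", "merge", "diff"):
    print(mode, apply_update(existing, crawled, mode))
```

In this model, Replace drops `/b`, Merge keeps the stale copy of `/a`, and Diff refreshes `/a` while keeping `/b`.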

Notifications

Get notified about crawl results:

  • On completion - When crawl finishes
  • On failure - When crawl fails
  • On changes - When new content is found
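The three triggers can overlap: a successful crawl that finds new content matches both "On completion" and "On changes". The sketch below shows one way to reason about which notifications a run would produce. The event names and the rule that "On completion" fires only for successful runs are assumptions made for illustration; the product may define completion differently.

```python
def events_for_run(succeeded: bool, new_pages: int) -> set[str]:
    """Events a single crawl run would raise under the assumptions above."""
    events = {"on_completion"} if succeeded else {"on_failure"}
    if new_pages > 0:
        events.add("on_changes")
    return events


def notifications_to_send(subscribed: set[str], succeeded: bool, new_pages: int) -> list[str]:
    """Intersect raised events with the ones the user subscribed to."""
    return sorted(subscribed & events_for_run(succeeded, new_pages))


# A user who only wants to hear about failures and new content:
subscribed = {"on_failure", "on_changes"}
print(notifications_to_send(subscribed, succeeded=True, new_pages=3))
print(notifications_to_send(subscribed, succeeded=True, new_pages=0))
print(notifications_to_send(subscribed, succeeded=False, new_pages=0))
```

Subscribing only to failures and changes keeps notification volume low on high-frequency schedules.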

Managing Schedules

View Schedules

  1. Go to "Knowledge Base" > "Schedules"
  2. See all active schedules with:
    • Next run time
    • Last run status
    • Page count

Edit a Schedule

  1. Click on the schedule
  2. Modify settings
  3. Click "Save"

Pause a Schedule

Temporarily stop scheduled crawls:

  1. Click on the schedule
  2. Click "Pause"

Delete a Schedule

  1. Click on the schedule
  2. Click "Delete"
  3. Confirm deletion
Note: Deleting a schedule does not delete content that has already been crawled.

Crawl History

View past crawl runs:

  1. Go to "Knowledge Base" > "Schedules"
  2. Click on a schedule
  3. View the "History" tab

History shows:

  • Run date and time
  • Pages crawled
  • Errors encountered
  • Duration
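If you copy history entries into your own tooling, the four fields above are enough for a quick health summary. The sketch below is illustrative: the `CrawlRun` class and its field names are assumptions, and this guide does not describe an export format or API for history data.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CrawlRun:
    """One history entry (field names are hypothetical)."""
    started_at: datetime
    pages_crawled: int
    errors: int
    duration_seconds: float


def summarize(runs: list[CrawlRun]) -> dict:
    """Aggregate a list of runs into a few headline numbers."""
    total = len(runs)
    return {
        "runs": total,
        "runs_with_errors": sum(1 for r in runs if r.errors > 0),
        "total_pages": sum(r.pages_crawled for r in runs),
        "avg_duration_seconds": (
            sum(r.duration_seconds for r in runs) / total if total else 0.0
        ),
    }


history = [
    CrawlRun(datetime(2024, 1, 1, 9, 0), pages_crawled=120, errors=0, duration_seconds=60.0),
    CrawlRun(datetime(2024, 1, 2, 9, 0), pages_crawled=118, errors=3, duration_seconds=90.0),
]
print(summarize(history))
```

A sudden drop in pages crawled or a jump in duration between runs is often the first sign that the target site has changed.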

URL Management

Adding URLs

Add multiple URLs to a single schedule:

  1. Edit the schedule
  2. Add URLs to the list
  3. Each URL will be crawled with the same settings

URL Patterns

Use patterns to include related URLs:

https://docs.example.com/v1/*
https://docs.example.com/v2/*
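Before saving a pattern, it can help to check which URLs it would match. The sketch below uses Python's standard `fnmatch` module as a stand-in, assuming `*` matches any run of characters, including `/`. The product's actual matching rules are not described in this guide and may differ. Note also that `fnmatch` treats `?` and `[` as wildcards, so the sketch is only suitable for patterns without those characters.

```python
from fnmatch import fnmatchcase

# The two example patterns from this guide.
PATTERNS = [
    "https://docs.example.com/v1/*",
    "https://docs.example.com/v2/*",
]


def matches_any(url: str, patterns: list[str] = PATTERNS) -> bool:
    """True if the URL matches at least one pattern (case-sensitive)."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)


for url in (
    "https://docs.example.com/v1/guide/intro",
    "https://docs.example.com/v2/api",
    "https://docs.example.com/v3/guide",
    "https://docs.example.com/v1",  # no trailing slash, so no match here
):
    print(url, matches_any(url))
```

Under these assumptions, a URL without the trailing slash (`.../v1`) does not match `.../v1/*`, which is a common reason a pattern appears to miss a section's landing page.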

Best Practices

Choosing Frequency

Consider:

  • How often does the content change?
  • Plan limits and resource usage
  • Impact of stale content

Monitoring

  • Review crawl history regularly
  • Check for failed crawls
  • Verify content is being updated

Organization

  • Group related URLs in one schedule
  • Use descriptive schedule names
  • Document schedule purposes

Troubleshooting

Scheduled Crawl Not Running

Check:

  • Schedule is not paused
  • Correct time zone settings
  • Plan limits not exceeded

Content Not Updating

Check:

  • Update mode settings
  • URL patterns matching correctly
  • Website not blocking crawls
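One common way a website blocks crawls is through its `robots.txt` file. The sketch below uses Python's standard `urllib.robotparser` to check whether a given URL is disallowed for a crawler. The user agent name `ExampleCrawler` is a placeholder, since this guide does not state the crawler's actual user agent or whether it honors `robots.txt`; sites can also block crawlers in other ways, such as IP filtering or login walls.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against the text of a robots.txt file (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample robots.txt content; paste the target site's real file here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

for url in (
    "https://docs.example.com/v1/intro",
    "https://docs.example.com/private/notes",
):
    print(url, is_allowed(ROBOTS_TXT, "ExampleCrawler", url))
```

If the pages you expect are disallowed, the fix is on the website's side: the site owner needs to permit the crawler in `robots.txt`.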