LogoLogo
TwitterWebsite
  • Getting Started
    • Introduction
    • Human UI
    • Examples
    • Monitoring
    • Workflows
    • Getting Help
  • Documentation
    • Humans
      • Introduction
      • Prompts
      • Tools
      • Knowledge
      • Memory
      • Storage
      • Structured Output
      • Reasoning
      • Teams
    • Models
      • Introduction
      • Open AI
      • Open AI Like
      • Anthropic Claude
      • AWS Bedrock Claude
      • Azure
      • Cohere
      • DeepSeek
      • Fireworks
      • Gemini
      • Gemini - VertexAI
      • Groq
      • HuggingFace
      • Mistral
      • Nvidia
      • Ollama
      • OpenRouter
      • Sambanova
      • Together
      • xAI
    • Tools
      • Introduction
      • Functions
      • Writing your own Toolkit
      • Airflow
      • Apify
      • Arxiv
      • AWS Lambda
      • BaiduSearch
      • Calculator
      • Cal.com
      • Composio
      • Crawl4AI
      • CSV
      • Dalle
      • DuckDb
      • DuckDuckGo
      • Email
      • Exa
      • Fal
      • File
      • Firecrawl
      • Giphy
      • Github
      • Google Calendar
      • Google Search
      • Hacker News
      • Jina Reader
      • Jira
      • Linear
      • Lumalabs
      • MLX Transcribe
      • ModelsLabs
      • Newspaper
      • Newspaper4k
      • OpenBB
      • Bitca
      • Postgres
      • Pubmed
      • Pyton
      • Replicate
      • Resend
      • Searxng
      • Serpapi
      • Shell
      • Slack
      • Sleep
      • Spider
      • SQL
      • Tavily
      • Twitter
      • Website
      • Yfinance
      • Zendesk
    • Knowledges
      • Introduction
      • ArXiv Knowledge Base
      • Combined KnowledgeBase
      • CSV Knowledge Base
      • CSV URL Knowledge Base
      • Docx Knowledge Base
      • Document Knowledge Base
      • JSON Knowledge Base
      • LangChain Knowledge Base
      • LlamaIndex Knowledge Base
      • PDF Knowledge Base
      • PDF URL Knowledge Base
      • S3 PDF Knowledge Base
      • S3 Text Knowledge Base
      • Text Knowledge Base
      • Website Knowledge Base
    • Chunking
      • Fixed Size Chunking
      • Agentic Chunking
      • Semantic Chunking
      • Recursive Chunking
      • Document Chunking
    • VectorDBS
      • Introduction
      • PgVector Agent Knowledge
      • Qdrant Agent Knowledge
      • Pinecone Agent Knowledge
      • LanceDB Agent Knowledge
      • ChromaDB Agent Knowledge
      • SingleStore Agent Knowledge
    • Storage
      • Introduction
      • Postgres Agent Storage
      • Sqlite Agent Storage
      • Singlestore Agent Storage
      • DynamoDB Agent Storage
      • JSON Agent Storage
      • YAML Agent Storage
    • Embeddings
      • Introduction
      • OpenAI Embedder
      • Gemini Embedder
      • Ollama Embedder
      • Voyage AI Embedder
      • Azure OpenAI Embedder
      • Mistral Embedder
      • Fireworks Embedder
      • Together Embedder
      • HuggingFace Embedder
      • Qdrant FastEmbed Embedder
      • SentenceTransformers Embedder
    • Workflows
      • Introduction
      • Session State
      • Streaming
      • Advanced Example - News Report Generator
  • How To
    • Install & Upgrade
    • Upgrade to v2.5.0
Powered by GitBook
LogoLogo

© 2025 Bitca. All rights reserved.

On this page
  • ​Prerequisites
  • ​Example
  • ​Toolkit Params
  • ​Toolkit Functions
Export as PDF
  1. Documentation
  2. Tools

Spider

PreviousSleepNextSQL

Last updated 4 months ago

SpiderTools is an open source web Scraper & Crawler that returns LLM-ready data. To start using Spider, you need an API key from the .

Prerequisites

The following example requires the spider-client library.

pip install -U spider-client

Example

The following agent will run a search query to get the latest news in USA and scrape the first search result. The agent will return the scraped data in markdown format.

cookbook/tools/spider_tools.py

from bitca.agent import Agent
from bitca.tools.spider import SpiderTools

agent = Agent(tools=[SpiderTools()])
agent.print_response('Can you scrape the first search result from a search on "news in USA"?', markdown=True)

Toolkit Params

Parameter
Type
Default
Description

max_results

int

-

The maximum number of search results to return

url

str

-

The url to be scraped or crawled

Toolkit Functions

Function
Description

search

Searches the web for the given query.

scrape

Scrapes the given url.

crawl

Crawls the given url.

Spider dashboard
​
​
​
​