refactor: migrate crawler to Scrapy framework

Refactored the CERT-Bund security advisory crawler from a standalone requests-based Python script into a Scrapy project to improve scalability and maintainability.

Changes:

  • Replaced simple requests-based crawler with Scrapy spider
  • Implemented proper Scrapy architecture (spiders, items, pipelines)
  • Added comprehensive .gitignore for Python, Scrapy, IDE, and OS files
  • Moved database operations to dedicated pipeline class
  • Updated README with Scrapy-specific usage and configuration
  • Added configurable fetch size and database path via Scrapy settings
  • Maintained all existing features (change tracking, database structure)
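
To illustrate the pipeline and settings changes listed above, here is a minimal sketch of how a dedicated pipeline class might read the database path from Scrapy settings and keep change tracking. It is not the code from this merge request: the class name AdvisoryPipeline, the DATABASE_PATH setting name, the advisories table layout, and the hash-based change detection are all assumptions made for illustration, and items are treated as plain dicts.

```python
import hashlib
import sqlite3
from collections import Counter


class AdvisoryPipeline:
    """Hypothetical pipeline: stores advisories in SQLite and tracks changes."""

    def __init__(self, db_path):
        self.db_path = db_path
        self.conn = None
        self.counts = Counter()  # tallies of new / changed / unchanged items

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook to build the pipeline from the crawler.
        # DATABASE_PATH is an assumed setting name, not a Scrapy built-in.
        return cls(crawler.settings.get("DATABASE_PATH", "advisories.db"))

    def open_spider(self, spider):
        self.conn = sqlite3.connect(self.db_path)
        # Assumed schema; the real database structure is not shown in the source.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS advisories ("
            "id TEXT PRIMARY KEY, title TEXT, content_hash TEXT)"
        )

    def close_spider(self, spider):
        self.conn.commit()
        self.conn.close()

    def process_item(self, item, spider):
        # Change tracking (assumed approach): compare a hash of the content
        # with the hash stored for the same advisory id.
        digest = hashlib.sha256(item["title"].encode("utf-8")).hexdigest()
        row = self.conn.execute(
            "SELECT content_hash FROM advisories WHERE id = ?", (item["id"],)
        ).fetchone()
        if row is None:
            status = "new"
        elif row[0] != digest:
            status = "changed"
        else:
            status = "unchanged"
        if status != "unchanged":
            self.conn.execute(
                "INSERT OR REPLACE INTO advisories (id, title, content_hash) "
                "VALUES (?, ?, ?)",
                (item["id"], item["title"], digest),
            )
        self.counts[status] += 1
        return item
```

A pipeline like this would be enabled through Scrapy's ITEM_PIPELINES setting in settings.py, next to the assumed DATABASE_PATH entry. Keeping the SQL in one class means the spider only yields items and never touches the database directly.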

Benefits:

  • More maintainable and scalable architecture
  • Better separation of concerns (spider, items, pipelines)
  • Leverages Scrapy's built-in features for crawling and data handling
  • Easier to extend with additional spiders or export formats
  • Improved logging and error handling through Scrapy framework
