#21 closed enhancement (fixed)
crawler: add support to re-crawl zones
Reported by: | wander | Owned by: | |
---|---|---|---|
Priority: | important | Component: | crawler |
Keywords: | | Cc: | |
Description
When a zone has been completely crawled, we should re-crawl it from time to time to find new NSEC3 hashes. Unfortunately, we have to send all n queries to the server again, but at least we already have the query names and do not need to redo the brute-force (BF) range computation.
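For context, the NSEC3 hash that a re-crawl would recompute for each stored query name is the iterated, salted SHA-1 of the owner name defined in RFC 5155. A minimal sketch, assuming nothing about the crawler's actual code (function names are illustrative):

```python
import hashlib

def to_wire(name: str) -> bytes:
    """Convert a dotted domain name to canonical (lowercase) DNS wire format."""
    labels = name.lower().rstrip(".").split(".")
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"

def nsec3_hash(name: str, salt: bytes, iterations: int) -> bytes:
    """Iterated, salted SHA-1 as specified in RFC 5155, section 5."""
    digest = hashlib.sha1(to_wire(name) + salt).digest()
    for _ in range(iterations):
        digest = hashlib.sha1(digest + salt).digest()
    return digest
```

Since the query names are already stored, re-crawling only needs to re-run this hash (or re-send the queries), not enumerate new candidate names.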
Attachments (2)
Change History (9)
comment:1 Changed 12 years ago by
Component: | nsec3breaker → crawler |
---|---|
Changed 12 years ago by
Attachment: | vc-20130506.txt added |
---|---|
Changed 12 years ago by
Attachment: | vc-historic.txt added |
---|---|
comment:2 Changed 12 years ago by
comment:4 Changed 12 years ago by
comment:5 Changed 11 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
(In [1017]) added re-crawl feature, closes #21
design choices:
- when using re-crawl, all current NSEC entries are marked as obsolete, which duplicates every database entry even when nothing has changed
- all obsolete NSEC entries are re-hashed at *EVERY STARTUP* with non-optimized Python code to identify gap cutters without brute force (TODO: we should either move this into OpenCL or at least make it a CLI argument)
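The second design choice above, re-hashing obsolete entries to find the gaps they cut, could look roughly like the following sketch. The function name, data layout, and toy hash interface are assumptions for illustration, not the ticket's actual code:

```python
import bisect

def find_gap_cutters(current_hashes, obsolete_names, hash_fn):
    """Recompute the hash of each obsolete entry and keep those whose
    hash is absent from the current chain, i.e. names that now fall
    into a gap and can be re-queried without brute-forcing new
    candidate names."""
    ring = sorted(current_hashes)
    cutters = []
    for name in obsolete_names:
        h = hash_fn(name)
        i = bisect.bisect_left(ring, h)
        if i == len(ring) or ring[i] != h:
            cutters.append((name, h))
    return cutters
```

Running this over every obsolete entry at each startup is O(n log n) in pure Python, which is why the commit message suggests moving it into OpenCL or gating it behind a CLI flag.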
See the attached dump files of vc: one from today (vc-20130506.txt) and one from some point in the past (vc-historic.txt). Both appear to be complete crawls produced without the re-crawl feature.