$ bin/nutch readdb tmp-crawldb -stats
CrawlDb statistics start: tmp-crawldb
Statistics for CrawlDb: tmp-crawldb
TOTAL urls: 400
retry 0: 400
min score: 1.0
avg score: 1.0
max score: 1.0
status 1 (db_unfetched): 400
CrawlDb statistics: done
$ touch tmp-segments/20070703074227/fetcher.done
$ bin/nutch updatedb tmp-crawldb -dir tmp-segments -filter -noAdditions
...
$ bin/nutch readdb tmp-crawldb -stats
CrawlDb statistics start: tmp-crawldb
Statistics for CrawlDb: tmp-crawldb
TOTAL urls: 400
retry 0: 399
retry 1: 1
min score: 1.0
avg score: 1.014
max score: 1.092
status 1 (db_unfetched): 54
status 2 (db_fetched): 346
CrawlDb statistics: done
$