See the blog post: S3 Throughput: Scans vs Indexes.
A benchmark comparing the speed of reading WARC entries from S3 in bulk (one sequential scan) versus individually (one indexed lookup per entry).
mvn clean install assembly:single # Build the JAR
NUM_RECORDS=100000 NUM_CORES=16 java -Xmx20g -Dhttp.maxConnections=1000 -cp target/batch-vs-index-warc-1.0-SNAPSHOT-jar-with-dependencies.jar com.code402.Single # Read entries individually
NUM_RECORDS=100000 NUM_CORES=16 java -Xmx20g -Dhttp.maxConnections=1000 -cp target/batch-vs-index-warc-1.0-SNAPSHOT-jar-with-dependencies.jar com.code402.Batch # Read entries in bulk
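For readers new to the two access patterns, here is a minimal sketch of the difference. It is not the code in this repository. The class `ScanVsIndexSketch` and its methods are hypothetical, an in-memory byte array stands in for an S3 object, and the 4-byte length prefix is a simplified framing chosen for illustration (real WARC files are laid out differently).

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScanVsIndexSketch {
    // Build a blob of records stored back to back, each preceded by a
    // 4-byte big-endian length. The start offset of every record is
    // appended to offsetsOut, which plays the role of the index.
    static byte[] buildBlob(List<String> records, List<Integer> offsetsOut) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String r : records) {
            byte[] body = r.getBytes(StandardCharsets.UTF_8);
            offsetsOut.add(out.size());
            out.write(ByteBuffer.allocate(4).putInt(body.length).array(), 0, 4);
            out.write(body, 0, body.length);
        }
        return out.toByteArray();
    }

    // Bulk access: one pass over the whole blob, decoding every record in order.
    static List<String> scan(byte[] blob) {
        List<String> result = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(blob);
        while (buf.remaining() >= 4) {
            int len = buf.getInt();
            byte[] body = new byte[len];
            buf.get(body);
            result.add(new String(body, StandardCharsets.UTF_8));
        }
        return result;
    }

    // Individual access: jump straight to a known offset and decode one record.
    static String readAt(byte[] blob, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(blob);
        buf.position(offset);
        int len = buf.getInt();
        byte[] body = new byte[len];
        buf.get(body);
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("alpha", "beta", "gamma");
        List<Integer> offsets = new ArrayList<>();
        byte[] blob = buildBlob(records, offsets);

        System.out.println(scan(blob));                    // every record, one pass
        System.out.println(readAt(blob, offsets.get(1)));  // one record, by offset
    }
}
```

Against a remote store, the scan corresponds to a single large read, while each `readAt` call would be its own request. That per-request overhead versus wasted bytes trade-off is what the benchmark measures.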