python - DynamoDB Parallel Scan not splitting results
I'm using the Segment and TotalSegments parameters to split a DynamoDB scan across multiple workers (as shown in the Parallel Scan section of the developer guide).
However, all of the results are returned by one worker. What is the issue here? Is there perhaps an issue with how I've implemented threading?
    import threading

    import boto3


    def scan_foo_table(bar, segment, total_segments):
        print('Looking at segment ' + str(segment))

        # Each thread creates its own session, since sessions are not shared safely across threads
        session = boto3.session.Session()
        dynamodb_client = session.client('dynamodb')

        response = dynamodb_client.scan(
            TableName='footable',
            FilterExpression='bar=:bar',
            ExpressionAttributeValues={
                ':bar': {'S': bar}
            },
            Segment=segment,
            TotalSegments=total_segments,
        )

        print('Segment ' + str(segment) + ' returned ' + str(len(response['Items'])) + ' items')


    def create_threads(bar):
        thread_list = []
        total_threads = 3

        for i in range(total_threads):
            # Instantiate and store the thread
            thread = threading.Thread(target=scan_foo_table, args=(bar, i, total_threads))
            thread_list.append(thread)

        # Start threads
        for thread in thread_list:
            thread.start()

        # Block main thread until all threads are finished
        for thread in thread_list:
            thread.join()


    def lambda_handler(event, context):
        create_threads('123')
Output:

    Looking at segment 0
    Looking at segment 1
    Looking at segment 2
    Segment 1 returned 0 items
    Segment 2 returned 0 items
    Segment 0 returned 10000 items
One thing that jumps out at me is your filter expression.
It is possible that the items that match your filter expression are all located in the first segment.
It is worth noting that a parallel scan doesn't split the items; it splits the key space that is searched for items. Think of it as dividing a large highway into multiple lanes. It's possible that all the cars are in the fast lane, and you won't see any cars in the other lanes.
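The highway picture can be sketched with a toy model. This is only an illustration: `segment_of` is a hypothetical stand-in that hashes a partition key with MD5, which is not how DynamoDB actually assigns keys to segments, and the item layout (`pk`, `bar`) is made up. It shows that every segment scans part of the key space, yet items sharing one partition key all land in a single segment.

```python
import hashlib


def segment_of(partition_key, total_segments):
    """Toy mapping of a partition key to a segment (NOT DynamoDB's real hashing)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % total_segments


def toy_parallel_scan(items, total_segments, predicate):
    """Return one (scanned, matched) tuple per segment."""
    counts = [[0, 0] for _ in range(total_segments)]
    for item in items:
        seg = segment_of(item["pk"], total_segments)
        counts[seg][0] += 1          # every item in the segment is read
        if predicate(item):
            counts[seg][1] += 1      # the filter is applied after the read
    return [tuple(c) for c in counts]


# Hypothetical data: 100 partition keys with 10 items each; "bar" mirrors the key.
items = [{"pk": "pk-%d" % (n % 100), "bar": "pk-%d" % (n % 100)} for n in range(1000)]

counts = toy_parallel_scan(items, 3, lambda item: item["bar"] == "pk-7")
for segment, (scanned, matched) in enumerate(counts):
    print("segment %d scanned %d items, matched %d" % (segment, scanned, matched))
```

In this model all three segments do scanning work, but only one reports matches, because the ten matching items share a partition key and therefore the same lane.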
Though in this case it seems more likely that the filter expression is causing only one segment to return items.
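One way to check this is to compare the Count and ScannedCount fields that each Scan response carries: ScannedCount is the number of items read before the filter was applied, and Count is the number left afterwards. The helper below is a minimal sketch, not part of the original code. The name `scan_segment_counts` is made up here, and it assumes `client` is a boto3 DynamoDB client such as the one created in the question. It also follows LastEvaluatedKey, since a single Scan call may return only one page of a segment.

```python
def scan_segment_counts(client, table_name, segment, total_segments, **scan_kwargs):
    """Page through one segment and return (matched, scanned) totals.

    Extra keyword arguments (for example FilterExpression and
    ExpressionAttributeValues) are passed through to client.scan().
    """
    kwargs = dict(scan_kwargs)
    kwargs.update(TableName=table_name, Segment=segment, TotalSegments=total_segments)

    matched = 0
    scanned = 0
    while True:
        response = client.scan(**kwargs)
        matched += response["Count"]          # items that passed the filter
        scanned += response["ScannedCount"]   # items read before filtering

        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return matched, scanned
        # More pages remain in this segment; continue from where this page ended.
        kwargs["ExclusiveStartKey"] = last_key
```

If segments 1 and 2 report a non-zero scanned total with zero matches, the workers are splitting the scan as intended and the filter is what leaves them empty.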