Performance gap between OCI Python SDK and boto3 for object downloads #755

Open
@dreamtalen

Environment details

  • Python version: 3.9.18
  • pip version: 23.2.1
  • oci version: 2.111.0

Issue

We are comparing the download performance of the OCI Python SDK and boto3 (AWS SDK). For the same objects stored in an OCI bucket, we’ve observed that the OCI SDK is approximately 20% to 50% slower than boto3 when downloading to memory.

Methods Tested with OCI SDK

  1. Using response.data.content:
response = self._oci_client.get_object(
    namespace_name=self._namespace, bucket_name=bucket, object_name=key, range=bytes_range
)
return response.data.content 
  2. Using response.data.raw.stream
    We took the idea from this issue. This method is ~60% faster than method 1 but still ~20% slower than boto3:
response = self._oci_client.get_object(
    namespace_name=self._namespace, bucket_name=bucket, object_name=key, range=bytes_range
)
content = bytearray()
for chunk in response.data.raw.stream(1024 * 1024, decode_content=False):  # 1MB chunks
    content.extend(chunk)
return bytes(content)

Note: We tested various chunk sizes, but they did not yield further improvements.
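
One further variant that may be worth measuring is reading the body into a single preallocated buffer instead of growing a bytearray chunk by chunk. The sketch below is an assumption, not a confirmed optimization, and it has not been run against OCI: it assumes response.data.raw behaves like a urllib3 HTTPResponse (which provides readinto) and that the Content-Length header is present on the response. read_into_buffer is a hypothetical helper name.

```python
import io


def read_into_buffer(raw, size, chunk_size=1024 * 1024):
    """Read up to `size` bytes from a file-like `raw` into one preallocated buffer.

    `raw` is assumed to support readinto(), as io.BufferedIOBase objects do.
    """
    buf = bytearray(size)          # allocate the full buffer once
    view = memoryview(buf)         # slices of a memoryview do not copy
    pos = 0
    while pos < size:
        n = raw.readinto(view[pos:pos + chunk_size])
        if not n:                  # stream ended before `size` bytes arrived
            break
        pos += n
    return bytes(view[:pos])


# Local check with an in-memory stream standing in for the HTTP body.
payload = b"x" * (3 * 1024 * 1024 + 17)
result = read_into_buffer(io.BytesIO(payload), len(payload))

# Hypothetical use with the OCI response from method 2 (untested assumption):
#   size = int(response.headers["Content-Length"])
#   data = read_into_buffer(response.data.raw, size)
```

Whether this closes any of the gap is unknown; it only removes the repeated bytearray growth from method 2.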

boto3 Baseline Implementation

response = s3_client.get_object(Bucket=bucket_name, Key=key)
return response['Body'].read()

Performance Results

With ThreadPoolExecutor(max_workers=16), we measured the following average throughput when downloading 1000 objects of 64 MB each from the same OCI bucket to memory:

  • boto3 get_object: 9.8 Gbps
  • OCI SDK response.data.content: 4.1 Gbps
  • OCI SDK response.data.raw.stream: 6.8 Gbps

The gap remains consistent across multiple test runs, including various multithreaded and multiprocessed setups.
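
For anyone trying to reproduce the numbers, a measurement of this kind can be sketched roughly as follows. This is a minimal illustration, not the exact harness used for the results above: run_benchmark, throughput_gbps, and percent_slower are hypothetical names, and download stands for any callable that takes an object key and returns the object's bytes (for example, one of the methods shown earlier).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def throughput_gbps(total_bytes, seconds):
    """Convert a byte count and a duration into gigabits per second."""
    return total_bytes * 8 / seconds / 1e9


def run_benchmark(download, keys, max_workers=16):
    """Download every key concurrently and return aggregate throughput in Gbps."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields each downloaded payload; only the sizes are kept.
        total_bytes = sum(len(data) for data in pool.map(download, keys))
    elapsed = time.perf_counter() - start
    return throughput_gbps(total_bytes, elapsed)


def percent_slower(baseline_gbps, measured_gbps):
    """Relative slowdown of `measured_gbps` against `baseline_gbps`, in percent."""
    return (1 - measured_gbps / baseline_gbps) * 100


# Stand-in download function so the sketch runs without network access.
def fake_download(key):
    return b"\0" * 1024


gbps = run_benchmark(fake_download, range(100))
```

Applied to the averages listed above, percent_slower(9.8, 4.1) is about 58 and percent_slower(9.8, 6.8) is about 31.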

Questions

  1. Is this performance gap expected?
  2. Are there any recommended optimizations or best practices for improving download performance with the OCI Python SDK?
  3. Are there any internal differences between how OCI serves downloads through its S3-compatible API and through its native API that might explain the performance gap?

Thanks!

Labels: SDK (Issue pertains to the SDK itself and not specific to any service)