Amazon Ion
Amazon Ion (amazon-ion.github.io/ion-docs) is Amazon's open-source, richly typed, self-describing data serialization format. It has interchangeable text and binary representations and a formal type system spanning strings, symbols, numerics (including arbitrary-precision decimals), timestamps, lists, and structs, plus type annotations. It originated inside Amazon as the format for exchanging structured data between services at company scale, and it is used by several AWS offerings (QLDB stored its journal as Ion) as well as in internal Amazon data pipelines.
Role for this wiki
Ion showed up in the post-Oracle BI migration at Amazon as the schema wrapper applied to the 50+ PB of Oracle table data copied into systems/aws-s3:
"they successfully copied over 50PB of Oracle table data to S3, converting it to universally-consumable content types like delimited text and wrapping each table in a more generic schema based on the Amazon Ion type system enroute." (Source: sources/2024-07-29-aws-amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-ec2)
Ion's rich type system (arbitrary-precision decimals, native timestamps that carry a local UTC offset, annotations) addresses the kinds of edge cases the BDT migration later cited as reasons not to compare table outputs byte-for-byte between Spark and Ray, such as decimal rounding and timezone representation drift. A rich canonical type system tackles that problem upstream, where the data is defined rather than where it is compared; Ion was Amazon's bet.
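To make the byte-for-byte point concrete, here is a minimal sketch using only the Python standard library. It does not use an Ion library and does not reproduce Ion's own equivalence rules or the BDT team's comparison tooling; the values are made up for illustration. It only shows how equal typed values can serialize to different text, and how a decimal can drift when it passes through a binary float.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# Decimal rounding: the same sum computed with binary floats and with
# exact decimals serializes to different text.
float_text = repr(0.1 + 0.2)                             # binary float path
decimal_text = str(Decimal("0.1") + Decimal("0.2"))      # exact decimal path
texts_match = float_text == decimal_text                 # False: float drifted

# Timezone representation drift: one instant written with two UTC offsets.
# (Hypothetical timestamps, chosen only for illustration.)
as_utc = datetime(2018, 1, 1, 12, 0, tzinfo=timezone.utc)
as_minus_8 = datetime(2018, 1, 1, 4, 0, tzinfo=timezone(timedelta(hours=-8)))

same_text = as_utc.isoformat() == as_minus_8.isoformat()  # False: text differs
same_instant = as_utc == as_minus_8                       # True: same moment

print(float_text, decimal_text, texts_match)
print(as_utc.isoformat(), as_minus_8.isoformat(), same_text, same_instant)
```

A comparison on the serialized text reports a mismatch in both cases, while a comparison on typed values reports one only in the float case, where precision was actually lost.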
Seen in
- sources/2024-07-29-aws-amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-ec2: Ion as the generic schema wrapper for 50+ PB of Oracle table data migrated to S3 during Amazon's BI-stack migration (2016–2018).
Related
- systems/aws-s3: where Ion-wrapped table data landed.