Read Schema mismatch
How does Carpet behave when the file schema does not exactly match the record types?
Nullable column mapped to primitive type
By default, Carpet doesn't fail when a column is defined as optional but the corresponding record field is a primitive.
This parquet schema:
message MyRecord {
  required binary id (STRING);
  required binary name (STRING);
  optional int32 age;
}
is compatible with a record that declares age as a primitive int, for example record MyRecord(String id, String name, int age).
When a null value appears in a file, the field is filled with the default value of the primitive (0, 0.0 or false).
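The default-value behavior can be sketched in plain Java without the library. This is only an illustration: the record shape is assumed from the schema above, and `ageOrDefault` is a hypothetical helper, not part of Carpet's API.

```java
class PrimitiveDefaults {
    // Assumed record shape: age is a primitive int, so it cannot hold null.
    record MyRecord(String id, String name, int age) {}

    // Hypothetical helper (not Carpet API): mimics replacing a null
    // column value with the default value of the primitive type.
    static int ageOrDefault(Integer columnValue) {
        return columnValue == null ? 0 : columnValue;
    }

    public static void main(String[] args) {
        // Row where the optional age column holds a value.
        MyRecord withAge = new MyRecord("1", "Ann", ageOrDefault(42));
        // Row where the optional age column is null in the file.
        MyRecord withoutAge = new MyRecord("2", "Bob", ageOrDefault(null));
        System.out.println(withAge);
        System.out.println(withoutAge);
    }
}
```

Note that after reading, a real age of 0 and a null age are indistinguishable, which is the main reason to consider a stricter setting or an `Integer` field.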
If you want the application to fail when a null value is read from an optional column mapped to a primitive field, enable the FailOnNullForPrimitives flag:
List<MyRecord> data = new CarpetReader<>(file, MyRecord.class)
    .withFailOnNullForPrimitives(true)
    .toList();
By default, FailOnNullForPrimitives is false.
Missing fields
When a record field has no matching column in the Parquet file schema, Carpet throws an exception.
For example, a file schema with no age column is not compatible with a record that declares an additional int age field.
If you are forced to read the file with an incompatible record, you can disable the schema compatibility check by setting the FailOnMissingColumn flag to false:
List<MyRecord> data = new CarpetReader<>(file, MyRecord.class)
    .withFailOnMissingColumn(false)
    .toList();
Carpet will skip the schema verification and fill each missing field with null for object types, or with the default value for primitives (0, 0.0 or false).
By default, FailOnMissingColumn is true.
If a column that exists in the file is not present in the record, Carpet will ignore it and will not throw an exception because it's considered a projection.
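The two rules above (a missing column fails unless the check is disabled, an extra column is ignored) can be sketched as a small simulation. Everything here is hypothetical: the file is assumed to contain the columns id, name and email, and `read` is a stand-in written for illustration, not Carpet's implementation.

```java
import java.util.Map;
import java.util.Set;

class MissingColumnSketch {
    // Record with an age field that the assumed file schema lacks.
    record MyRecord(String id, String name, int age) {}

    // Hypothetical stand-in for the reader, not Carpet's implementation.
    static MyRecord read(Set<String> fileColumns, Map<String, Object> row,
                         boolean failOnMissingColumn) {
        // Every record field must exist in the file when the check is on.
        for (String field : new String[] {"id", "name", "age"}) {
            if (failOnMissingColumn && !fileColumns.contains(field)) {
                throw new IllegalStateException("Missing column: " + field);
            }
        }
        // Columns not declared in the record (email) are never looked up,
        // which is what makes the read a projection.
        Object age = row.get("age");
        return new MyRecord((String) row.get("id"), (String) row.get("name"),
                age == null ? 0 : (Integer) age);
    }

    public static void main(String[] args) {
        Set<String> columns = Set.of("id", "name", "email");
        Map<String, Object> row =
                Map.of("id", "1", "name", "Ann", "email", "ann@example.com");
        // Check disabled: the missing age field gets the primitive default.
        System.out.println(read(columns, row, false));
        try {
            // Check enabled: the missing age column is reported.
            read(columns, row, true);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```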
Narrowing numeric values
By default, Carpet converts between numeric types:
- Any integer type can be converted to another integer type of a different size: byte <-> short <-> int <-> long.
- Any floating-point type can be converted to another floating-point type of a different size: float <-> double.
This schema
is compatible with this record:
Carpet will cast numeric types using Narrowing Primitive Conversion rules from Java.
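To see what Java's narrowing primitive conversion does to out-of-range values, here is a short standalone sketch using only the standard library. The class and method names are illustrative; `Math.toIntExact` is shown only as an analogy for a strict, fail-on-overflow conversion, not as the way Carpet implements its check.

```java
class NarrowingDemo {
    // long -> int keeps only the low 32 bits, so large values wrap around.
    static int toInt(long value) {
        return (int) value;
    }

    // int -> byte keeps only the low 8 bits.
    static byte toByte(int value) {
        return (byte) value;
    }

    // double -> float loses precision and overflows to infinity.
    static float toFloat(double value) {
        return (float) value;
    }

    // Strict alternative: throws ArithmeticException on overflow.
    static int toIntChecked(long value) {
        return Math.toIntExact(value);
    }

    public static void main(String[] args) {
        System.out.println(toInt(3_000_000_000L)); // -1294967296
        System.out.println(toByte(200));           // -56
        System.out.println(toFloat(1e40));         // Infinity
        try {
            toIntChecked(3_000_000_000L);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

Silent wrap-around like this is the risk accepted by leaving the default behavior on when the file holds values wider than the record field.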
If you want the application to fail when a value would be converted to a narrower type, enable the FailNarrowingPrimitiveConversion flag:
List<MyRecord> data = new CarpetReader<>(file, MyRecord.class)
    .withFailNarrowingPrimitiveConversion(true)
    .toList();
By default, FailNarrowingPrimitiveConversion is false.