-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support of References for AVRO Schemas #926
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleanup commits to single/coherent commits. Add tests for better coverage.
@jjaakola-aiven Could you suggest the tests that should be added? |
f0f9450
to
0167a75
Compare
@jjaakola-aiven I'm still waiting for your suggestions regarding additional tests for the new functionality. Please keep in mind that we already have integration tests covering various aspects, including:
Additionally, the new code has successfully passed all existing tests related to Avro functionality. I also added a few unit tests for the AvroMerge class today. |
@libretto shall we rebase this work so we get upstream changes and then let's discuss around the comments if you can tell me when you'd like to, we can spare sometime and discuss in one thread here. |
9c26f46
to
0167a75
Compare
@libretto The unit test for |
9cac0ba
to
f4346ea
Compare
f4346ea
to
5ab6ba0
Compare
rebased |
@libretto Hi, sorry for the delay, this took some time to properly review. I think that the implementation using Avro unions is risky and might not work in all cases (hard to tell). I have opened a draft PR implementing the merging using Avro lib here: Please have a look and consider that approach instead. |
@davide-armand Hi, As I understand, the union approach concept has been confirmed by our teams and has passed all existing Karapace tests, including the references tests I provided. This demonstrates that the concept works well. I will try using Your approach with the Avro library to eliminate the need for the union method. I expect the results to be consistent. Regarding tests, I noticed that You added a few tests for recursion. However, the direct recursion test You included is on the Avro-level parser and should be tested within the Avro tests, outside of the Karapace project. Indirect recursion (where Schema A references Schema B, which in turn references Schema A) is not possible in Karapace's design, as all schemas are added simultaneously. A schema cannot be added if any of its references are unresolved, which effectively prevents indirect recursion between schemas. I will include the Your code for the test_avro_reference() test in the my branch tests. |
@davide-armand I utilized the Avro library functionality as you suggested. References now work in the native Avro way. |
#987 implemented there |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@libretto Hi, I left a few comments, please have a look
src/karapace/schema_models.py
Outdated
|
||
return merge | ||
|
||
def resolve(self) -> list: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolve()
and build()
can be merged in a single method, since builder()
is only called by resolve()
.
Maybe the whole AvroResolver
class can be just a static method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davide-armand After this change, we have a pylint issue but all tests are passed. Please check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@libretto Do you mean the error No overload variant of "reversed"
?
It can be fixed by passing a list to reversed
Hello, Thanks, Carlos |
src/karapace/schema_models.py
Outdated
merged_schema = None | ||
for schema in schemas_list: | ||
# Merge dep with all previously merged ones | ||
merged_schema = make_avsc_object(json.loads(schema), names) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validate_avro_enum_symbols
and validate_avro_names
should be passed to make_avsc_object()
.
I missed that in my PR
tests/test_avro_references.py
Outdated
{ | ||
"type": "record", | ||
"name": "Country", | ||
"namespace": "com.netapp", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use reference to real companies (com.netapp
).
Applies to other fixtures as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Will look into this!
Fixed accordingly to the review. Hope I didn't miss anything, let me know! Thanks for the help! |
This change adds support for references for AVRO schemas
We aim to add support for references for AVRO schemas to Karapace using a straightforward approach by leveraging the same API used in Protobuf references. Although Avro does not natively support references, we can implement this functionality by utilizing the property of Avro schemas that allows combining types. We utilize Names from avro.name and make_avsc_object to create an Avro schema that includes referenced schemas within as explained in #987.