-
-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Node class in C, fix memory management #8
Conversation
This prepares for the next commit.
Most methods of Node are implemented directly in C avoiding the overhead of a Ruby wrapper. The wrappers mostly added error handling which is now done in the C functions.
This commit makes sure that cmark nodes are never freed if they're still referenced from Ruby. Take the following code example: doc = Node.parse_string("Hi *there*") text = doc.first_child.last_child.first_child doc = nil GC.start It must be assured that `doc` isn't freed after a GC run because the document is still referenced by `text`. This is solved by implementing a "mark" function that is run during Ruby's GC and marks the parent node of every node that is still alive. Marking the parent node will propagate up to the root of the tree. In order to get the Ruby VALUE of the parent node, a 1:1 mapping between cmark_nodes and Ruby objects is enforced. Upon creation, Ruby objects are stored in the user data field of a cmark_node. If the cmark_node is passed to Ruby again, the cached value is used. This also improves performance if a node is accessed from Ruby multiple times. If Ruby objects are cached, it must be made sure that they're not freed by Ruby's GC as long as the cmark_node is still alive. For this reason, all children of a node are also marked during GC. This essentially means that all Ruby objects referencing any node of a tree are kept alive until the whole tree is destroyed. Last but not least, the finalizer must only be run on root nodes that can be freed safely. This is achieved by updating the finalizer whenever a node's parent is changed. The previous check for document nodes in `rb_free_c_struct` was unsafe because it might access a child node's data that was already freed (and it failed to free root nodes that aren't documents).
Node.new now accepts a symbol.
Wow. Holy cow. This branch name is the understatement of the year. My one complaint is that this is lacking Thank you so much for this contribution! |
Implement Node class in C, fix memory management
Calling |
When it comes down to jumping into C, I don't trust anyine passing in anything. 😺 |
See the commits for comments.