Use footnote name for reference id #300

digitalmoksha · 2023-05-03T01:38:29Z

For example

one[^zz]
[^zz]: two

should generate

<p>one<sup class="footnote-ref"><a href="#fn-zz" id="fnref-zz" data-footnote-ref>1</a></sup></p>
<section class="footnotes" data-footnotes>
<ol>
<li id="fn-zz">
<p>two <a href="#fnref-zz" class="footnote-backref" data-footnote-backref data-footnote-backref-idx="1" aria-label="Back to reference 1">↩</a></p>
</li>
</ol>
</section>

kivikakk · 2023-05-04T02:23:36Z

Thanks so much for this! I've adjusted the AST format to use a struct rather than using a tuple. That's also fixed the example build failure in 46a89b7. There's still a number of CI tests failing -- would you be able to adjust them?

digitalmoksha · 2023-05-04T04:50:27Z

Oh cool, great idea on the struct. Yes I can update the tests.

Still a couple more things we need to solve. Percent encoding of footnote names, and detection of footnote names that begin with _ (and probably other chars).

So this

first[^1] and second[^😄second] and twenty[^_twenty]

[^1]: one
[^😄second]: two
[^_twenty]: twenty

should yield this

<p>first<sup class="footnote-ref"><a href="#fn-1" id="fnref-1" data-footnote-ref>1</a></sup> and second<sup class="footnote-ref"><a href="#fn-%F0%9F%98%84second" id="fnref-%F0%9F%98%84second" data-footnote-ref>2</a></sup> and twenty<sup class="footnote-ref"><a href="#fn-_twenty" id="fnref-_twenty" data-footnote-ref>3</a></sup></p>
<section class="footnotes" data-footnotes>
<ol>
<li id="fn-1">
<p>one <a href="#fnref-1" class="footnote-backref" data-footnote-backref data-footnote-backref-idx="1" aria-label="Back to reference 1">↩</a></p>
</li>
<li id="fn-%F0%9F%98%84second">
<p>two <a href="#fnref-%F0%9F%98%84second" class="footnote-backref" data-footnote-backref data-footnote-backref-idx="2" aria-label="Back to reference 2">↩</a></p>
</li>
<li id="fn-_twenty">
<p>twenty <a href="#fnref-_twenty" class="footnote-backref" data-footnote-backref data-footnote-backref-idx="3" aria-label="Back to reference 3">↩</a></p>
</li>
</ol>
</section>

according to cmark-gfm.

Not sure if I'll get a chance tomorrow, but hopefully over the next couple of days.

digitalmoksha · 2023-05-11T16:07:53Z

@kivikakk wondering if I could ask a question, I'm still a newbie with Rust.

I notice in cm.rs that in methods such as fn format_text(&mut self, literal: &[u8] or fn format_code(&mut self, literal: &[u8], you're passing in a &[u8] for literal.

Why not use a &String (or String - I'm still figuring out when to reference and not). Strings are UTF8 encoded, and it doesn't seem necessary to be operating at the byte level. It also makes debugging a little more challenging, as inspecting the variables shows a list of bytes, rather than a string of characters.

I need to change format_footnote_definition to pass in the name, and I'm wondering why fn format_footnote_definition(&mut self, name: &String would not be the way to do it. 🤔

digitalmoksha · 2023-05-11T17:00:17Z

I see that in outc you are checking for specific characters, such as [, but I'm not sure why you're checking at the byte level rather than the character level.

digitalmoksha · 2023-05-11T22:09:38Z

vendor/cmark-gfm/test/extensions.txt needs to get updated in order for the footnote tests to start passing. I still need to add support for handling multiple references to the same footnote, and encoding.

Would it be worth updating to the latest cmark-gfm in a separate PR, which can focus on any general fixes that might be needed? I'll continue to work on getting tests to pass with the local copy of extensions.txt.

digitalmoksha · 2023-05-18T18:36:22Z

@kivikakk I like to keep PRs as small and focused as possible. I'll create another PR to handle encoding the footnote name, and also a capitalization issue.

So I think this is ready for a review if you have time. It passes the latest footnote spec from upstream, except for encoding the name.

kivikakk · 2023-05-19T01:22:48Z

Why not use a &String (or String - I'm still figuring out when to reference and not). Strings are UTF8 encoded, and it doesn't seem necessary to be operating at the byte level. It also makes debugging a little more challenging, as inspecting the variables shows a list of bytes, rather than a string of characters.

It's usually one of two reasons in Comrak: performance, or to match upstream behaviour where it does checks at a byte level instead of character. (Fwiw, you'll either want String if you're taking ownership of the copy, or &str — very rarely is &String wanted.) It varies from place to place, but in ~95% of cases the choice is intentional.

In the case of cm.rs, we never need to consider any character that would be multibyte in UTF-8, and we need to index buffers in a few inner loops. Getting the nth character (including e.g. the 'last' character) is O(n), and it would be for no gain here.

kivikakk · 2023-05-19T01:58:49Z

So I think this is ready for a review if you have time.

To be honest, I don't have time for as thorough a review as I'd like, and that seems unlikely to change in the short- to medium-term future. But I've looked through, and it seems sound. I'll run fuzzing on it for a while after to be sure, but I'm happy with these changes — thank you!

digitalmoksha · 2023-05-19T16:01:17Z

To be honest, I don't have time for as thorough a review as I'd like, and that seems unlikely to change in the short- to medium-term future

Completely understand! Thanks for getting this merged 🚀

digitalmoksha and others added 2 commits May 4, 2023 12:15

Use footnote name for reference id

6a681f9

nodes: add NodeFootnoteReference

358de5d

kivikakk force-pushed the bw-fix-footnote-numbering branch from 46a89b7 to 358de5d Compare May 4, 2023 02:23

Fix footnote tests

3501ab0

digitalmoksha added 2 commits May 12, 2023 19:37

Output information when a footnote is referenced multiple times

1ab6c30

Fix clippy and rustfmt

03ff9c9

cm: &String -> &str

bf2069e

kivikakk enabled auto-merge May 19, 2023 01:58

kivikakk disabled auto-merge May 19, 2023 02:05

kivikakk merged commit 127da5c into kivikakk:main May 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use footnote name for reference id #300

Use footnote name for reference id #300

digitalmoksha commented May 3, 2023

kivikakk commented May 4, 2023

digitalmoksha commented May 4, 2023

digitalmoksha commented May 11, 2023

digitalmoksha commented May 11, 2023

digitalmoksha commented May 11, 2023

digitalmoksha commented May 18, 2023

kivikakk commented May 19, 2023

kivikakk commented May 19, 2023

digitalmoksha commented May 19, 2023

Use footnote name for reference id #300

Use footnote name for reference id #300

Conversation

digitalmoksha commented May 3, 2023

kivikakk commented May 4, 2023

digitalmoksha commented May 4, 2023

digitalmoksha commented May 11, 2023

digitalmoksha commented May 11, 2023

digitalmoksha commented May 11, 2023

digitalmoksha commented May 18, 2023

kivikakk commented May 19, 2023

kivikakk commented May 19, 2023

digitalmoksha commented May 19, 2023