Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: hide JSX code in notebook view #30171

Closed

Conversation

hesreallyhim
Copy link

@hesreallyhim hesreallyhim commented Mar 7, 2025

I think a simpler solution for now is just to comment out the JSX code in the notebooks, and remove the comment markers in the notebook_convert script.

Description:

The Jupyter notebooks in the docs section are extremely useful and critical for widespread adoption of LangChain amongst new developers. However, because they are also converted to MDX and used to build the HTML for the Docusaurus site, they contain JSX code that degrades readability when opened in a "notebook" setting (local notebook server, google colab, etc.). For instance, here we see the website, with a nice React tab component for installation instructions (pip vs conda):

Screenshot 2025-03-07 at 2 07 15 PM

Now, here is the same notebook viewed in colab:

Screenshot 2025-03-07 at 2 08 41 PM

Note that the text following "To install LangChain run:" contains snippets of JSX code that is (i) confusing, (ii) bad for readability, (iii) potentially misleading for a novice developer, who might take it literally to mean that "to install LangChain I should run import Tabs from..." and then an ill-formed command which mixes the pip and conda installation instructions.

This is due to the way that Docusaurus renders the MDX/JSX code - the import statements appear directly, and the tab contents are mixed together in a confusing way. This is, at the very least, an ugly distraction, and may be a road-block for developers new to the ecosystem or to Python, React, etc. (For technical context, see the bottom of this PR)

Ideally, we would like to have a system that presents a similar/equivalent UI when viewing the notebooks on the documentation site, or when interacting with them in a notebook setting - or, at a minimum, we should not present ill-formed JSX snippets to someone trying to execute the notebooks. As the documentation itself states, running the notebooks yourself is a great way to learn the tools. Therefore, these distracting and ill-formed snippets are contrary to that goal.

## Fixes:
* Achieving an isomorphic UI on the website and in a notebook context is possible but slightly non-trivial. Therefore, in this PR, I attempt to implement a partial solution, which hides the JSX code when viewing the notebook in a notebook setting, while maintaining the UI as-is on the website. The solution works as follows:
* Docusaurus offers a global namespace for importing components under the file @theme/MDXComponents. Any component that is exported there will be available in any other page without requiring an explicit import statement. That means we can simply move all the imports to MDXComponents and delete or comment them out from the .ipynb files. Since there are a relatively small number of components that we use, this should not degrade performance or introduce any other defects. So, in this PR, I create the MDXComponents file, import/export all of the React components that are used within any .ipynb file, and then delete/comment out the import statements from the .ipynb. (I began by deleting them - in a second pass, I decided to comment them out - I'm not sure which approach is better, so I can easily refactor this according to the maintainers' preferences.)
* Jupyter notebooks tend to automatically hide "HTML" code inside of Markdown blocks, so much of the React code can be left untouched. E.g., <MyComponent myProp={"Hello"} /> will already be invisible when you open the notebook in an editor. However, there are some quirks:
- Components with innerText: the JSX code will be invisible, but any innerText will be visible. That's why in the colab screenshot above, you see the contents of the <Tab> components (pip install... etc.), even though you don't see <Tab><CodeBlock>pip install...</CodeBlock></Tab>.
- Components that take objects as props as are also problematic, because, for obscure reasons, the presence of the : character forces the renderer to display the whole component. So, for example, a component like: <ChatModelTabs overrideParams={{openai: {model: \"gpt-4\"}}} /> will not be hidden for the simple reason that it contains a : (which is not completely encapsulated in a string). One potential solution to this would be stringify the prop, and then JSON.parse it in the consumer, but this would require more code and a change to the React component propTypes. Instead, I found that wrapping the problematic component(s) inside a <Fragment> sufficed to get the MDX renderer to ignore this quirk. So, any component that takes an object as a prop gets wrapped in a Fragment. (Fragment is also added to the MDXComponent file, so everything renders as correct React code.)
- To fix the innerText problem, the best solution I found was to create a dummy component, which I call <Div value={"..."} />and to move the innerText into the value of this component (which just renders a div with the value as the innerText).
- As a further enhancement, because of the limited number of places where this pattern appears (in particular, the <Tabs> component that contains the pip vs conda instructions), I added more HTML code to the Markdown blocks in the .ipynb files which displays a clear presentation of the relevant code snippets, and which is wrapped inside of a tag with style={{ display: "none" }}. The effect of this is that when viewed in a notebook context, the innerText with the installation instructions is presented to the user in a clear and meaningful way - while, at the same time, when it gets converted to Docusaurus HTML, that some code is hidden because of the display: none style. This not only hides the illegible JSX in the notebook context, but provides useful instructions to the notebook user, while having no visible effect on the website HTML.

### TL/DR:
* Snippets of meaningless or misleading JSX code littered in the .ipynb files degrades readability and usability of the notebooks when opened in a Jupyter notebook setting. This can be a roadblock to adoption for new users, which is contrary to the goals of the documentation, and is possibly harmful to the LangChain project as a whole - executing a notebook is, in my opinion, far more helpful than just reading it. While this might seem like a lot of code changes for a "cosmetic" improvement, I would argue that the problem is worth taking seriously, and the solution is in fact pretty simple once understood, with most of the diffs being pretty much the same across files, and not "hacky" or over-engineered.
* The solution consists in:
- Moving all React imports to the global MDXComponents theme file (part of the official Docusaurus API), and deleting or commenting out the imports from the .ipynb files (this should probably be consistent, but I don't know which strategy is preferred, can easily switch to either one).
- Wrapping any exposed innerText in a simple wrapper component that puts them in a prop instead of innerText, thereby hiding them in the .ipynb files.
- Most React components are automatically invisible in the notebook context, except for those with complex props - for these cases, wrapping them inside a <Fragment> fixes the problem in a way that does not interfere at all with the rendered website.
- ENHANCEMENT: Add some extra HTML that supplants the hidden installation instructions in the <Tabs> components, using div's that have display: "none" and are thereby visible in the notebook, but hidden on the website.

- [X] Lint and test: Run make format, make lint and make test from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/

~### ADDITIONAL COMMENTS:
* It's possible that my refactor involving the CodeBlock broke the "API Reference" links inserted below codeblocks that have import statements. I haven't been able to confirm this.
* Generally could use a second check on the lint/format/etc. Makefile commands, I found that some of them did not work as expected, and I couldn't get the Docker container to work either. :(
* Need decision about whether to comment our or just delete the (now unnecessary) React import statements that were moved to the global MDXComponents.

### FINAL REMARK:
* Maybe the community thinks this is overkill or unnecessary. I've tried to make the case, but it's an open question. However, going further, if we do think this is a good direction, there are options to implement a more dynamic UI within the notebooks which more or less perfectly replicates the UI in the website - e.g. IPython widgets, %%html magic-strings, etc. I think it would be nice if the notebooks were as pretty as the website, but this is also up for debate. I invite any comments or discussion, please let me know if I should open an issue instead.

Copy link

vercel bot commented Mar 7, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 8, 2025 2:33pm

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. 🤖:docs Changes to documentation and examples, like .md, .rst, .ipynb files. Changes to the docs/ folder labels Mar 7, 2025
@hesreallyhim
Copy link
Author

going to suggest a better solution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:docs Changes to documentation and examples, like .md, .rst, .ipynb files. Changes to the docs/ folder size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant