-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specialize OsString::push
and OsString as From
for UTF-8
#137777
Conversation
Is this a performance optimization or a correctness fix? |
Performance. When the LHS ends with a leading surrogate and the RHS starts with a trailing surrogate, then they need to be joined. A WTF-8 |
d078403
to
74d7607
Compare
When concatenating two WTF-8 strings, surrogate pairs at the boundaries need to be joined. However, since UTF-8 strings cannot contain surrogate halves, this check can be skipped when one string is UTF-8. Specialize `OsString::push` to use a more efficient concatenation in this case. Unfortunately, a specialization for `T: AsRef<str>` conflicts with `T: AsRef<OsStr>`, so stamp out string types with a macro.
The WTF-8 version of `OsString` tracks whether it is known to be valid UTF-8 with its `is_known_utf8` field. Specialize `From<AsRef<OsStr>>` so this can be set for UTF-8 string types.
74d7607
to
83407b8
Compare
OsString::push
for stringsOsString::push
and OsString as From
for UTF-8
I have now pushed another similar specialization. Since the WTF-8 |
Great idea! |
…joboet Specialize `OsString::push` and `OsString as From` for UTF-8 When concatenating two WTF-8 strings, surrogate pairs at the boundaries need to be joined. However, since UTF-8 strings cannot contain surrogate halves, this check can be skipped when one string is UTF-8. Specialize `OsString::push` to use a more efficient concatenation in this case. The WTF-8 version of `OsString` tracks whether it is known to be valid UTF-8 with its `is_known_utf8` field. Specialize `From<AsRef<OsStr>>` so this can be set for UTF-8 string types. Unfortunately, a specialization for `T: AsRef<str>` conflicts with `T: AsRef<OsStr>`, so stamp out string types with a macro. r? `@ChrisDenton`
…kingjubilee Rollup of 12 pull requests Successful merges: - rust-lang#136667 (Revert vita's c_char back to i8) - rust-lang#136780 (std: move stdio to `sys`) - rust-lang#137107 (Override default `Write` methods for cursor-like types) - rust-lang#137363 (compiler: factor Windows x86-32 ABI impl into its own file) - rust-lang#137528 (Windows: Fix error in `fs::rename` on Windows 1607) - rust-lang#137537 (Prevent `rmake.rs` from using unstable features, and fix 3 run-make tests that currently do) - rust-lang#137777 (Specialize `OsString::push` and `OsString as From` for UTF-8) - rust-lang#137832 (Fix crash in BufReader::peek()) - rust-lang#137904 (Improve the generic MIR in the default `PartialOrd::le` and friends) - rust-lang#138115 (Suggest typo fix for static lifetime) - rust-lang#138125 (Simplify `printf` and shell format suggestions) - rust-lang#138129 (Stabilize const_char_classify, const_sockaddr_setters) r? `@ghost` `@rustbot` modify labels: rollup
…iaskrgr Rollup of 8 pull requests Successful merges: - rust-lang#136667 (Revert vita's c_char back to i8) - rust-lang#137107 (Override default `Write` methods for cursor-like types) - rust-lang#137777 (Specialize `OsString::push` and `OsString as From` for UTF-8) - rust-lang#137832 (Fix crash in BufReader::peek()) - rust-lang#137904 (Improve the generic MIR in the default `PartialOrd::le` and friends) - rust-lang#138115 (Suggest typo fix for static lifetime) - rust-lang#138125 (Simplify `printf` and shell format suggestions) - rust-lang#138129 (Stabilize const_char_classify, const_sockaddr_setters) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#137777 - thaliaarchi:os_string-push-str, r=joboet Specialize `OsString::push` and `OsString as From` for UTF-8 When concatenating two WTF-8 strings, surrogate pairs at the boundaries need to be joined. However, since UTF-8 strings cannot contain surrogate halves, this check can be skipped when one string is UTF-8. Specialize `OsString::push` to use a more efficient concatenation in this case. The WTF-8 version of `OsString` tracks whether it is known to be valid UTF-8 with its `is_known_utf8` field. Specialize `From<AsRef<OsStr>>` so this can be set for UTF-8 string types. Unfortunately, a specialization for `T: AsRef<str>` conflicts with `T: AsRef<OsStr>`, so stamp out string types with a macro. r? ``@ChrisDenton``
When concatenating two WTF-8 strings, surrogate pairs at the boundaries need to be joined. However, since UTF-8 strings cannot contain surrogate halves, this check can be skipped when one string is UTF-8. Specialize
OsString::push
to use a more efficient concatenation in this case.The WTF-8 version of
OsString
tracks whether it is known to be valid UTF-8 with itsis_known_utf8
field. SpecializeFrom<AsRef<OsStr>>
so this can be set for UTF-8 string types.Unfortunately, a specialization for
T: AsRef<str>
conflicts withT: AsRef<OsStr>
, so stamp out string types with a macro.r? @ChrisDenton