changeset 85:52d59a351bf5

add post about unicode in schism
author Paper <paper@paper.us.eu.org>
date Sun, 09 Jun 2024 18:20:45 -0400
parents 5914d06f72b4
children 1fed81c848a5
files _posts/2024-05-19-how-do-I-blog.html _posts/2024-06-09-schism-unicode-and-you.html css/style.css index.html media/blog/schism-spanish-file-listing.png
diffstat 5 files changed, 39 insertions(+), 2 deletions(-)
--- a/_posts/2024-05-19-how-do-I-blog.html	Wed Jun 05 21:02:44 2024 -0400
+++ b/_posts/2024-05-19-how-do-I-blog.html	Sun Jun 09 18:20:45 2024 -0400
@@ -8,4 +8,3 @@
 <span>anyway, with the blog stuff, I'll write here when I actually have things to blog about ;)</span>
 <br><br>
 <span>P.S.: you can also get a feed of these posts under <a class="prettylink" href="/blog/feed.xml">/blog/feed.xml</a></span>
-<br>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/_posts/2024-06-09-schism-unicode-and-you.html	Sun Jun 09 18:20:45 2024 -0400
@@ -0,0 +1,30 @@
+---
+layout: post
+author: Paper
+title: 'Schism Tracker, Unicode, and you'
+---
+<span>Recently I've taken on adding real Unicode-awareness to Schism, and it was <i>surprisingly</i> easy, to say the least.</span>
+<br><br>
+<span>I was expecting to have to convert lots of things to be real Unicode, but nope! All that really needed to be done was to convert UTF-8 to CP437 where necessary to actually *draw* the data while keeping the internal form pure UTF-8, and then bundle everything up into a neat macro to keep everything consistent:</span>
+<figure><pre><code>#define CHARSET_EASY_MODE_EX(MOD, in, inset, outset, x) \
+	do { \
+		MOD uint8_t* out; \
+		charset_error_t err = charset_iconv(in, (uint8_t**)&out, inset, outset); \
+		if (err) \
+			out = in; \
+	\
+		x \
+	\
+		if (!err) \
+			free((uint8_t*)out); \
+	} while (0)
+</code></pre></figure>
+<span>I just shoved this macro anywhere necessary and it works perfectly fine for loading any Unicode path. For example, the Spanish word "mañana" gets displayed correctly now:</span>
+<br><br>
+<img class="drop-shadow-box center-image" src="/media/blog/schism-spanish-file-listing.png">
+<br>
+<span>The file sorting algorithms were a different beast though, and even now strverscmp doesn't have a real charset-independent variant. For strcasecmp, I had to implement (simple) Unicode case folding, which meant having a <a class="prettylink" href="https://github.com/schismtracker/schismtracker/blob/b858a5917ee7e83f7cb4da1ad698dd24159f241b/schism/charset_data.c#L183">switch statement that is almost 1500 lines long</a> and takes up about 20K of space in the binary.</span>
+<br><br>
+<span>Schism currently does not do any Unicode normalization when comparing strings. This is primarily a problem with decomposed strings (which will likely not get converted properly), though with filenames that probably shouldn't exist anyway...</span>
+<br><br>
+<span>anyway, Unicode is easy, if you can't use it properly it's a skill issue :p</span>
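
The convert-or-fall-back pattern that the post wraps in `CHARSET_EASY_MODE_EX` can be sketched as a plain function. This is only an illustration under stated assumptions: `toy_utf8_to_cp437`, `draw_with_fallback` and `remember_text` are hypothetical names, and the toy converter stands in for Schism's `charset_iconv`, whose real behaviour is not reproduced here. It handles ASCII plus ñ/Ñ and treats anything it cannot decode as an error.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy converter (NOT Schism's charset_iconv): UTF-8 -> CP437.
 * Returns 0 on success and stores a malloc'd, NUL-terminated string in *out.
 * Returns nonzero and leaves *out untouched on allocation failure or on any
 * byte sequence this sketch does not decode (3- and 4-byte sequences, stray
 * continuation bytes, truncated input). */
int toy_utf8_to_cp437(const uint8_t *in, uint8_t **out)
{
	assert(in != NULL && out != NULL);

	size_t len = strlen((const char *)in);
	uint8_t *buf = malloc(len + 1); /* output is never longer than input */
	if (!buf)
		return 1;

	size_t o = 0;
	for (size_t i = 0; i < len; ) {
		uint8_t c = in[i];
		if (c < 0x80) {
			buf[o++] = c; /* ASCII is identical in CP437 */
			i += 1;
		} else if ((c & 0xE0) == 0xC0 && i + 1 < len
				&& (in[i + 1] & 0xC0) == 0x80) {
			uint32_t cp = ((uint32_t)(c & 0x1F) << 6) | (in[i + 1] & 0x3F);
			if (cp == 0xF1)
				buf[o++] = 0xA4; /* U+00F1 (ñ) is 0xA4 in CP437 */
			else if (cp == 0xD1)
				buf[o++] = 0xA5; /* U+00D1 (Ñ) is 0xA5 in CP437 */
			else
				buf[o++] = '?'; /* no mapping in this toy table */
			i += 2;
		} else {
			free(buf);
			return 1;
		}
	}
	buf[o] = '\0';
	*out = buf;
	return 0;
}

/* Same shape as the macro: try to convert, fall back to the input bytes on
 * error, run the drawing code, and free only what the converter allocated. */
void draw_with_fallback(const uint8_t *in,
		void (*draw)(const uint8_t *text, void *ctx), void *ctx)
{
	uint8_t *converted = NULL;
	int err = toy_utf8_to_cp437(in, &converted);
	const uint8_t *text = err ? in : converted;

	draw(text, ctx);

	if (!err)
		free(converted);
}

/* Example callback: copies what it was asked to draw into a buffer that
 * ctx is assumed to point at (at least 16 bytes). */
void remember_text(const uint8_t *text, void *ctx)
{
	strncpy((char *)ctx, (const char *)text, 15);
	((char *)ctx)[15] = '\0';
}
```

A function with a callback is easier to step through in a debugger than a macro with a statement argument, at the cost of having to pass state through `ctx`; the macro in the post avoids that by pasting the drawing code inline.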
--- a/css/style.css	Wed Jun 05 21:02:44 2024 -0400
+++ b/css/style.css	Sun Jun 09 18:20:45 2024 -0400
@@ -158,3 +158,8 @@
 .blog-date-right {
 	float: inline-end;
 }
+
+.center-image {
+	display: block;
+	margin: 0 auto;
+}
--- a/index.html	Wed Jun 05 21:02:44 2024 -0400
+++ b/index.html	Sun Jun 09 18:20:45 2024 -0400
@@ -23,7 +23,10 @@
 	<h2 class="drop-shadow-text">Socials</h2>
 	<ul class="index-socials-list">
 		<li class="drop-shadow-text">
-			E-mail (<a class="prettylink" href="mailto:paper@paper.us.eu.org">paper@paper.us.eu.org</a>); please do not send me <a class="prettylink" href="https://useplaintext.email/">HTML-infested garbage</a>
+			E-mail (<a class="prettylink" href="mailto:paper@paper.us.eu.org">paper@paper.us.eu.org</a>); please send me <a class="prettylink" href="https://useplaintext.email/">plain text email</a> if possible.
+		</li>
+		<li class="drop-shadow-text">
+			IRC; you can usually find me in <a class="prettylink" href="irc://irc.libera.chat/openmpt">#openmpt on Libera Chat</a>.
 		</li>
 		<li class="drop-shadow-text">
 			Discord (@slipofpaper)
Binary file media/blog/schism-spanish-file-listing.png has changed
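
For the case-insensitive comparison the post describes, a very small sketch of simple case folding might look like the following. It is an assumption-laden toy, not the code in `charset_data.c`: `toy_casefold` and `toy_casecmp` are hypothetical names, the table covers only ASCII, the Latin-1 uppercase letters and three single mappings from Unicode's CaseFolding.txt, and strings are taken as NUL-terminated arrays of already-decoded code points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily truncated simple case folding table. The real
 * thing needs an entry for every mapping in CaseFolding.txt, which is why
 * the switch statement linked in the post is so long. */
uint32_t toy_casefold(uint32_t cp)
{
	if (cp >= 'A' && cp <= 'Z')
		return cp + 0x20; /* ASCII uppercase */
	if (cp >= 0xC0 && cp <= 0xDE && cp != 0xD7)
		return cp + 0x20; /* Latin-1 uppercase; U+00D7 is the multiplication sign */

	switch (cp) {
	case 0x00B5: return 0x03BC; /* micro sign -> Greek small mu */
	case 0x017F: return 0x0073; /* long s -> s */
	case 0x03C2: return 0x03C3; /* final sigma -> sigma */
	default:     return cp;
	}
}

/* strcasecmp-like comparison over NUL-terminated code point arrays:
 * fold each side, then compare the folded values. No normalization is
 * performed, so composed and decomposed forms compare as different. */
int toy_casecmp(const uint32_t *a, const uint32_t *b)
{
	assert(a != NULL && b != NULL);

	for (;; a++, b++) {
		uint32_t fa = toy_casefold(*a);
		uint32_t fb = toy_casefold(*b);

		if (fa != fb)
			return fa < fb ? -1 : 1;
		if (fa == 0)
			return 0;
	}
}
```

With this sketch, "MAÑANA" and "mañana" compare equal, while a decomposed "n" followed by U+0303 does not compare equal to the precomposed U+00F1, which is the normalization gap the post mentions.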